vstorm-co/pydantic-ai-shields

GitHub: vstorm-co/pydantic-ai-shields

Integrated safety guardrails and cost controls for Pydantic AI agents.

Stars: 81 | Forks: 11

Pydantic AI Shields

Guardrail Capabilities for Pydantic AI Agents

Cost Tracking · Prompt Injection · PII Detection · Secret Redaction · Tool Permissions · Async Guardrails

**Pydantic AI Shields** provides ready-to-use guardrail [capabilities](https://ai.pydantic.dev/capabilities/) for [Pydantic AI](https://ai.pydantic.dev/) agents. Drop them into any agent to get cost control, tool permissions, and safety checks, with no middleware wrappers.

## Installation

```
pip install pydantic-ai-shields
```

## Quick Start

```
import asyncio

from pydantic_ai import Agent
from pydantic_ai_shields import CostTracking, ToolGuard, InputGuard

agent = Agent(
    "openai:gpt-4.1",
    capabilities=[
        CostTracking(budget_usd=5.0),
        ToolGuard(blocked=["execute"], require_approval=["write_file"]),
        InputGuard(guard=lambda prompt: "ignore all instructions" not in prompt.lower()),
    ],
)


async def main() -> None:
    result = await agent.run("Hello!")
    print(result.output)


asyncio.run(main())
```

## Available Guardrails

### CostTracking

Track token usage and API cost, with an optional budget limit:

```
import asyncio

from pydantic_ai import Agent
from pydantic_ai_shields import CostTracking

tracking = CostTracking(budget_usd=10.0)
agent = Agent("openai:gpt-4.1", capabilities=[tracking])


async def main() -> None:
    result = await agent.run("Hello")
    print(f"Total cost: ${tracking.total_cost:.4f}")
    print(f"Total tokens: {tracking.total_request_tokens + tracking.total_response_tokens}")


asyncio.run(main())
```

Raises `BudgetExceededError` when the cumulative cost exceeds the budget. Pricing is detected automatically via [genai-prices](https://pypi.org/project/genai-prices/).

### ToolGuard

Control which tools the agent may use:

```
from pydantic_ai import Agent
from pydantic_ai_shields import ToolGuard


async def ask_user(tool_name: str, args: dict) -> bool:
    return input(f"Allow {tool_name}? (y/n) ") == "y"


guard = ToolGuard(
    blocked=["execute", "rm"],         # Hidden from model entirely
    require_approval=["write_file"],   # User must approve each call
    approval_callback=ask_user,
)
agent = Agent("openai:gpt-4.1", capabilities=[guard])
```

- **`blocked`** tools are removed via `prepare_tools`, so the model never sees them
- **`require_approval`** tools trigger the callback before they execute

### InputGuard

Block or validate user input before the agent runs:

```
from pydantic_ai import Agent
from pydantic_ai_shields import InputGuard

# Sync guard
agent = Agent("openai:gpt-4.1", capabilities=[
    InputGuard(guard=lambda prompt: "jailbreak" not in prompt.lower()),
])


# Async guard (e.g., call moderation API)
# `moderation_api` is a placeholder for your own moderation client.
async def check_toxicity(prompt: str) -> bool:
    result = await moderation_api.check(prompt)
    return result.is_safe


agent = Agent("openai:gpt-4.1", capabilities=[InputGuard(guard=check_toxicity)])
```

Raises `InputBlocked` when the guard returns `False`.

### OutputGuard

Block or validate model output after the agent runs:

```
from pydantic_ai import Agent
from pydantic_ai_shields import OutputGuard

agent = Agent("openai:gpt-4.1", capabilities=[
    OutputGuard(guard=lambda output: "SSN" not in output),
])
```

Raises `OutputBlocked` when the guard returns `False`.

### AsyncGuardrail

Run a guardrail concurrently with the LLM call. If the guardrail fails first, the LLM call is cancelled, which saves cost:

```
from pydantic_ai import Agent
from pydantic_ai_shields import AsyncGuardrail, InputGuard

# `check_policy` is a placeholder for your own guard function.
agent = Agent(
    "openai:gpt-4.1",
    capabilities=[AsyncGuardrail(
        guard=InputGuard(guard=check_policy),
        timing="concurrent",      # "concurrent" | "blocking" | "monitoring"
        cancel_on_failure=True,   # Cancel LLM if guard fails
        timeout=5.0,              # Guard timeout in seconds
    )],
)
```

| Timing | Behavior |
|--------|----------|
| `"concurrent"` | Guardrail runs in parallel with the LLM and fails fast on a violation |
| `"blocking"` | LLM starts only after the guardrail completes (traditional mode) |
| `"monitoring"` | Guardrail runs after the LLM, for logging/auditing only |

## Built-in Content Guardrails

### PromptInjection

Detect and block prompt injection and jailbreak attempts:

```
from pydantic_ai import Agent
from pydantic_ai_shields import PromptInjection

agent = Agent("openai:gpt-4.1", capabilities=[
    PromptInjection(sensitivity="high"),  # "low" | "medium" | "high"
])
```

Covers 6 detection categories: ignore_instructions, system_override, role_play, delimiter_injection, prompt_leaking, jailbreak. Add custom patterns with `custom_patterns=[r"my_pattern"]`.

### PiiDetector

Detect PII in user input (email, phone, SSN, credit card, IP):

```
from pydantic_ai import Agent
from pydantic_ai_shields import PiiDetector

agent = Agent("openai:gpt-4.1", capabilities=[
    PiiDetector(detect=["email", "ssn", "credit_card"]),
])
```

Use `action="log"` to let the input through and record what was detected in `cap.last_detections`.

### SecretRedaction

Block API keys, tokens, and credentials from appearing in model output:

```
from pydantic_ai import Agent
from pydantic_ai_shields import SecretRedaction

agent = Agent("openai:gpt-4.1", capabilities=[SecretRedaction()])
```

Detects OpenAI, Anthropic, AWS, GitHub, and Slack keys, JWTs, private keys, generic API keys, and more.

### BlockedKeywords

Block prompts that contain forbidden words or phrases:

```
from pydantic_ai import Agent
from pydantic_ai_shields import BlockedKeywords

agent = Agent("openai:gpt-4.1", capabilities=[
    BlockedKeywords(
        keywords=["competitor_name", "internal_only"],
        whole_words=True,
    ),
])
```

Supports `case_sensitive`, `whole_words`, and `use_regex` modes.

### NoRefusals

Block LLM refusals so that the model attempts an answer:

```
from pydantic_ai import Agent
from pydantic_ai_shields import NoRefusals

agent = Agent("openai:gpt-4.1", capabilities=[NoRefusals()])
```

Set `allow_partial=True` to allow responses that contain refusal language but still have substantive content.

## Composing Guardrails

All guardrails compose naturally, like any other Pydantic AI capability:

```
agent = Agent(
    "openai:gpt-4.1",
    capabilities=[
        CostTracking(budget_usd=5.0),
        PromptInjection(sensitivity="high"),
        PiiDetector(),
        SecretRedaction(),
        BlockedKeywords(keywords=["classified"]),
        NoRefusals(),
    ],
)
```

## API Reference

### Infrastructure Guardrails

| Class | Description |
|-------|-------------|
| `CostTracking` | Token/USD tracking with budget limits |
| `ToolGuard` | Block tools or require approval |
| `InputGuard` | Custom input validation with a pluggable function |
| `OutputGuard` | Custom output validation with a pluggable function |
| `AsyncGuardrail` | Concurrent guardrail + LLM execution |

### Content Guardrails

| Class | Description |
|-------|-------------|
| `PromptInjection` | Detect prompt injection/jailbreaks (6 categories, 3 sensitivity levels) |
| `PiiDetector` | Detect PII: email, phone, SSN, credit card, IP (regex-based) |
| `SecretRedaction` | Block API keys, tokens, and credentials in output |
| `BlockedKeywords` | Block forbidden keywords/phrases (case-sensitive, whole-word, and regex modes) |
| `NoRefusals` | Block LLM refusals ("I cannot help with that") |

### Data

| Class | Description |
|-------|-------------|
| `CostInfo` | Per-run and cumulative token/cost data |

### Exceptions

| Exception | Raised by |
|-----------|-----------|
| `GuardrailError` | Base exception for all guardrails |
| `InputBlocked` | `InputGuard`, `PromptInjection`, `PiiDetector`, `BlockedKeywords`, `AsyncGuardrail` |
| `OutputBlocked` | `OutputGuard`, `SecretRedaction`, `NoRefusals` |
| `ToolBlocked` | `ToolGuard` |
| `BudgetExceededError` | `CostTracking` |

## Related Projects

| Package | Description |
|---------|-------------|
| [Pydantic Deep Agents](https://github.com/vstorm-co/pydantic-deepagents) | Full agent framework |
| [pydantic-ai-todo](https://github.com/vstorm-co/pydantic-ai-todo) | Task planning capability |
| [subagents-pydantic-ai](https://github.com/vstorm-co/subagents-pydantic-ai) | Multi-agent delegation |
| [pydantic-ai-backend](https://github.com/vstorm-co/pydantic-ai-backend) | File storage and Docker sandbox |
| [summarization-pydantic-ai](https://github.com/vstorm-co/summarization-pydantic-ai) | Context management |
| [pydantic-ai](https://github.com/pydantic/pydantic-ai) | The foundation: the agent framework by Pydantic |

## License

MIT
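
To make the `"concurrent"` timing of `AsyncGuardrail` more concrete, here is a minimal standalone sketch of the general pattern using only `asyncio`. It is not the library's implementation: `GuardFailed`, `fake_llm_call`, `policy_guard`, `cancel_quietly`, and `run_concurrent` are hypothetical names invented for this illustration, and the model call is simulated with a sleep.

```python
import asyncio


class GuardFailed(Exception):
    """Hypothetical error raised when the guard rejects a prompt."""


async def fake_llm_call(prompt: str) -> str:
    # Stand-in for a slow model request (assumed latency: 0.2 s).
    await asyncio.sleep(0.2)
    return f"echo: {prompt}"


async def policy_guard(prompt: str) -> bool:
    # Stand-in for a fast policy check; False means "reject".
    await asyncio.sleep(0.01)
    return "ignore all instructions" not in prompt.lower()


async def cancel_quietly(task: asyncio.Task) -> None:
    # Cancel a task and wait until the cancellation has completed.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass


async def run_concurrent(prompt: str, timeout: float = 5.0) -> str:
    # Start the model call first so the guard adds no latency when it passes.
    llm_task = asyncio.create_task(fake_llm_call(prompt))
    try:
        allowed = await asyncio.wait_for(policy_guard(prompt), timeout=timeout)
    except asyncio.TimeoutError:
        await cancel_quietly(llm_task)
        raise
    if not allowed:
        # Fail fast: stop the in-flight model call before it completes.
        await cancel_quietly(llm_task)
        raise GuardFailed("prompt rejected by guard")
    return await llm_task
```

In this sketch, `asyncio.run(run_concurrent("Hello"))` returns the simulated reply, while a prompt containing the blocked phrase raises `GuardFailed` without waiting for the simulated model call to finish.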

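
Because `PiiDetector` is described as regex-based, the following rough sketch shows how regex PII detection with a block-or-log action could work in principle. The patterns are deliberately simplified, and `PII_PATTERNS`, `PiiFound`, `detect_pii`, and `check_input` are hypothetical names for this illustration rather than the library's internals.

```python
import re
from typing import Dict, List, Optional, Tuple

# Simplified illustrative patterns; real-world detectors need stricter rules.
PII_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


class PiiFound(Exception):
    """Hypothetical error raised when PII is found and action is 'block'."""


def detect_pii(text: str, detect: Optional[List[str]] = None) -> List[Tuple[str, str]]:
    # Scan only the requested kinds, or every known kind by default.
    kinds = detect if detect is not None else list(PII_PATTERNS)
    found: List[Tuple[str, str]] = []
    for kind in kinds:
        for match in PII_PATTERNS[kind].finditer(text):
            found.append((kind, match.group()))
    return found


def check_input(
    text: str,
    detect: Optional[List[str]] = None,
    action: str = "block",
) -> List[Tuple[str, str]]:
    # "block" raises on any detection; "log" returns detections to the caller.
    detections = detect_pii(text, detect)
    if detections and action == "block":
        kinds = sorted({kind for kind, _ in detections})
        raise PiiFound(f"PII detected: {', '.join(kinds)}")
    return detections
```

With `action="log"`, the caller receives the list of `(kind, value)` pairs and decides what to record, which mirrors the idea of inspecting detections after the fact rather than rejecting the input.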