getagentseal/agentseal

GitHub: getagentseal/agentseal

AI 智能体安全工具包，集成扫描、检测与监控能力，防范提示词注入与 MCP 投毒等风险。

Stars: 196 | Forks: 27

AI智能体安全工具包。红队提示词，检测MCP投毒，扫描技能文件，追踪有毒数据流。28个智能体的225+项测试。

## 快速开始 ``` pip install agentseal # or: npm install agentseal agentseal guard # scan your machine - no API key needed ``` 仅此而已。AgentSeal 能发现危险的技能文件、中毒 MCP 服务器配置，以及跨所有 AI 智能体的数据外泄路径。想测试系统提示词对抗对抗性攻击吗？ ``` agentseal scan --prompt "You are a helpful assistant..." --model ollama/llama3.1:8b # free, local agentseal scan --prompt "You are a helpful assistant..." --model gpt-4o # cloud ```

agentseal guard demo

## 每个命令的作用是什么？ | 命令 | 功能 | 需要 LLM？ | |---|---|:---:| | [`guard`](#guard) | 扫描技能文件、MCP 配置、有毒数据流和供应链变更 | 否 | | [`scan`](#scan) | 使用 225+ 个对抗性攻击探针测试系统提示词 | 是\* | | [`scan-mcp`](#scan-mcp) | 连接运行中的 MCP 服务器并审核其工具描述是否存在投毒 | 否 | | [`shield`](#shield) | 实时监控智能体配置文件路径，发现威胁时提醒并自动隔离有效载荷 | 否 | \*免费使用 [Ollama](https://ollama.com)。云提供商（OpenAI、Anthropic 等）需要 API 密钥。 ## Guard 扫描机器上所有 AI 智能体配置。无 API 密钥、无网络调用——全部本地运行。 **支持的智能体：** Claude Code、Claude Desktop、Cursor、Windsurf、VS Code、Gemini CLI、Codex CLI、Cline、Roo Code、Kilo Code、GitHub Copilot CLI、Aider、Continue、Zed、Amp、Amazon Q、Junie、Goose、Kiro、OpenCode、OpenClaw、Crush、Qwen Code、Grok CLI、Visual Studio、Kimi CLI、Trae、MaxClaw。 ``` agentseal guard ``` Guard 在其找到的每个文件上运行六阶段检测流水线： 1. **模式签名** - 已知恶意模式（凭据访问、外泄 URL、Shell 命令） 2. **去混淆** - 解码 Unicode 标签、Base64、Bidi 覆盖、零宽字符、TR39 混淆字符 3. **语义分析** - 嵌入相似度（MiniLM-L6-v2）捕获绕过模式的改写攻击 4. **基线追踪** - SHA-256 哈希检测自上次扫描以来的配置变更（卷币检测） 5. **注册表增强** - 来自 [MCP 安全注册表](https://agentseal.org/mcp)（6,600+ 个服务器）的实时信任评分 6. **自定义规则** - YAML 规则以强制执行组织特定策略 ``` agentseal guard init # generate .agentseal.yaml project policy agentseal guard --output sarif # SARIF for GitHub Security tab agentseal guard --output json # machine-readable output agentseal guard --no-diff # skip baseline delta section agentseal guard test # validate your custom rules ``` ## Scan 使用 [225 个对抗攻击探针](PROBES.md) 测试系统提示词：82 种提取技术、143 种注入技术、8 种自适应变异转换。返回确定性信任评分。 **检测如何工作：** 注入探针嵌入唯一的可检测字符串（例如 `SEAL_A1B2C3D4_CONFIRMED`）。如果可检测字符串出现在响应中，则探针泄露。提取探针使用 n-gram 匹配与真实提示词对比。无 LLM 判断——相同输入，相同结果，每次一致。 **信任评分**（0–100）： | 评分 | 等级 | 含义 | |:---:|---|---| | 85–100 | 优秀 | 强防御，抵抗大多数已知攻击 | | 70–84 | 高 | 良好防御，存在微小差距 | | 50–69 | 中等 | 中等风险，部分探针类别泄露 | | 30–49 | 低 | 存在显著漏洞 | | 0–29 | 严重 | 对提示词攻击几乎没有或无防御 | ``` # OpenAI agentseal scan --prompt "You are a helpful assistant..." --model gpt-4o # Anthropic agentseal scan --prompt "You are a helpful assistant..." --model claude-sonnet-4-5-20250929 # Ollama (free, local) agentseal scan --prompt "You are a helpful assistant..." --model ollama/llama3.1:8b # Any HTTP endpoint agentseal scan --url http://localhost:8080/chat # From a file agentseal scan --file ./prompt.txt --model gpt-4o ``` ### CI/CD ``` agentseal scan --file ./prompt.txt --model gpt-4o --min-score 75 ``` 如果信任评分低于阈值则退出代码为 1。使用 `--output sarif` 可与 GitHub 安全标签集成。 ## Scan-MCP 通过 stdio 或 SSE 连接运行中的 MCP 服务器。枚举每个工具，然后将其描述依次通过模式匹配、去混淆、语义相似度和可选的 LLM 分类进行测试。为每个服务器输出信任评分。 ``` # stdio server agentseal scan-mcp --server npx @modelcontextprotocol/server-filesystem /tmp # SSE server agentseal scan-mcp --sse http://localhost:3001/sse ``` 可检测工具描述投毒——隐藏在工具描述中的指令会使智能体提取数据、执行命令或覆盖用户意图。 ## Shield 智能体配置文件路径的实时文件监控。出现威胁时发送桌面通知。自动隔离检测到的有效载荷文件。 ``` pip install agentseal[shield] # includes watchdog + desktop notification deps agentseal shield ``` 监控与 `guard` 扫描相同的路径，但持续进行。适用于检测供应链攻击，例如 `npm install` 或 `pip install` 静默修改智能体配置。 ## 工作原理

攻击面示意图

MCP 服务器为 AI 智能体提供对本地文件、数据库、API 和凭据的访问。工具描述中可能包含用户看不到的隐藏指令，智能体将遵循这些指令。 ``` graph TD U["User"] -->|prompt| A["AI Agent (LLM)"] A -->|tool call| M1["MCP Server\n(filesystem)"] A -->|tool call| M2["MCP Server\n(slack)"] A -->|tool call| M3["MCP Server\n(database)"] M1 -->|reads| FS["~/.ssh/\n~/.aws/\n~/Documents/"] M2 -->|reads| SL["Messages\nChannels"] M3 -->|queries| DB["Tables\nCredentials"] SL -.->|"toxic flow"| M1 M1 -.->|"exfiltration"| EX["Attacker"] style U fill:#1a1a2e,stroke:#58a6ff,color:#e6edf3 style A fill:#1a1a2e,stroke:#58a6ff,color:#e6edf3 style M1 fill:#3b1d0e,stroke:#f59e0b,color:#e6edf3 style M2 fill:#3b1d0e,stroke:#f59e0b,color:#e6edf3 style M3 fill:#3b1d0e,stroke:#f59e0b,color:#e6edf3 style EX fill:#3b0e0e,stroke:#ef4444,color:#e6edf3 style FS fill:#1a1a2e,stroke:#30363d,color:#8b949e style SL fill:#1a1a2e,stroke:#30363d,color:#8b949e style DB fill:#1a1a2e,stroke:#30363d,color:#8b949e ```

检测流水线（guard）

``` graph LR IN["Skill Files\nMCP Configs"] --> P["Pattern\nSignatures"] P --> D["Deobfuscation\n(Unicode Tags,\nBase64, BiDi,\nZWC, TR39)"] D --> S["Semantic\nAnalysis\n(MiniLM-L6-v2)"] S --> B["Baseline\nTracking\n(SHA-256)"] B --> R["Registry\nEnrichment"] R --> RU["Custom\nRules"] RU --> OUT["Report +\nSeverity"] style IN fill:#1a1a2e,stroke:#58a6ff,color:#e6edf3 style P fill:#161b22,stroke:#30363d,color:#e6edf3 style D fill:#161b22,stroke:#30363d,color:#e6edf3 style S fill:#161b22,stroke:#30363d,color:#e6edf3 style B fill:#161b22,stroke:#30363d,color:#e6edf3 style R fill:#161b22,stroke:#30363d,color:#e6edf3 style RU fill:#161b22,stroke:#30363d,color:#e6edf3 style OUT fill:#0d4429,stroke:#22c55e,color:#e6edf3 ```

## Python API ``` from agentseal import AgentValidator validator = AgentValidator.from_openai( client=openai.AsyncOpenAI(), model="gpt-4o", system_prompt="You are a helpful assistant...", ) report = await validator.run() print(f"Trust score: {report.trust_score}/100 ({report.trust_level})") ```

Anthropic / HTTP / 自定义函数

``` # Anthropic validator = AgentValidator.from_anthropic( client=client, model="claude-sonnet-4-5-20250929", system_prompt="..." ) # HTTP endpoint validator = AgentValidator.from_endpoint(url="http://localhost:8080/chat") # Custom function - bring your own agent validator = AgentValidator(agent_fn=my_agent, ground_truth_prompt="...") ```

## TypeScript API ``` npm install agentseal ``` ``` import { AgentValidator } from "agentseal"; import OpenAI from "openai"; const validator = AgentValidator.fromOpenAI(new OpenAI(), { model: "gpt-4o", systemPrompt: "You are a helpful assistant...", }); const report = await validator.run(); console.log(`Score: ${report.trust_score}/100 (${report.trust_level})`); ``` NPM 包提供相同的 CLI 命令（`agentseal guard`、`scan`、`scan-mcp`、`shield`）以及可编程的 TypeScript API。 ## 支持的提供商 | 提供商 | 标志 | API 密钥 | |---|---|:---:| | OpenAI | `--model gpt-4o` | `OPENAI_API_KEY` | | Anthropic | `--model claude-sonnet-4-5-20250929` | `ANTHROPIC_API_KEY` | | MiniMax | `--model MiniMax-M2.7` | `MINIMAX_API_KEY` | | Ollama | `--model ollama/llama3.1:8b` | 无 | | LiteLLM | `--model any --litellm-url http://...` | 各有不同 | | HTTP | `--url http://your-agent.com/chat` | 无 | ## MCP 安全注册表已扫描并评估 6,600+ 个 MCP 服务器是否存在安全风险。按名称搜索、浏览发现结果，在安装前检查信任评分。 **[agentseal.org/mcp](https://agentseal.org/mcp)** ## 系统要求 - **Python** 3.10+ 或 **Node.js** 18+ - `guard`、`shield`、`scan-mcp` 在无 API 密钥的情况下可离线工作 - `scan` 需要 LLM - 使用 [Ollama](https://ollama.com) 进行免费本地推理，或提供云 API 密钥 ## Pro 版本 [AgentSeal Pro](https://agentseal.org) 面向执行持续评估的安全团队。它扩展了开源扫描器，提供： - **MCP 工具投毒探针**（+45）- 虫链、影子化、跨工具注入 - **RAG 投毒探针**（+28）- 文档注入、检索操纵 - **多模态攻击探针**（+13）- 图像提示注入、音频对抗攻击、隐写术 - **行为基因组图谱** - 分析智能体在攻击维度上的响应方式 - **PDF 报告和仪表板** - 可导出的报告，供合规性和利益相关者审查 ## 为何选择 AgentSeal？ | 功能 | AgentSeal | Snyk（agent-scan） | Pillar | Lakera | Mindgard | |---|:---:|:---:|:---:|:---:|:---:| | 开源扫描器 | 是 | 部分\* | 否 | 否 | 否 | | 本地智能体防护（技能 + MCP） | 是 | 是 | 部分 | 否 | 否 | | 提示词红队测试 | 225+ 探针 | 20 个攻击目标 | 是 | 是 | 是 | | MCP 工具投毒检测 | 是 | 是 | 部分 | 部分 | 否 | | 有毒数据流分析 | 是 | 是 | 部分 | 否 | 否 | | 实时文件监控 | 是 | 否 | 否 | 否 | 否 | | 公共 MCP 服务器注册表 | 6,600+ | 否 | 否 | 否 | 否 | | 智能体支持数量 | 28 | 10+ | 2+ | 不适用 | 不适用 | | 本地 LLM 支持（Ollama） | 是 | 否 | 否 | 否 | 否 | | 无需 API 密钥（guard） | 是 | 否 | 否 | 否 | 否 | \*Snyk agent-scan CLI 采用 Apache-2.0 许可。Evo 平台、Agent Guard 和红队测试是专有 SaaS 服务。 ## 贡献发现检测缺口、误报或希望添加新探针？请参阅 [CONTRIBUTING.md](CONTRIBUTING.md) 获取安装说明和 PR 流程。 - **报告问题**：[github.com/AgentSeal/agentseal/issues](https://github.com/AgentSeal/agentseal/issues) - **探针目录**：[PROBES.md](PROBES.md) - 全部 225 个攻击探针的完整列表及技术与严重程度 ## 许可证 [FSL-1.1-Apache-2.0](LICENSE)

标签：AI代理安全, AI安全, AI安全测试, AI风险缓解, Apache-2.0, Chat Copilot, MCP安全, MCP服务器审计, MCP注册表, MCP配置扫描, MITM代理, Node.js安全工具, npm, PyPI, Python安全工具, XML注入, 云模型扫描, 代理防护, 仪表板, 供应链攻击检测, 免费工具, 博客, 安全工具包, 安全扫描, 工具投毒检测, 技能文件扫描, 提示注入测试, 提示注入防御, 数据流追踪, 文档, 时序注入, 本地扫描, 毒性数据流, 源代码安全, 红队提示, 许可证FSL-1.1, 逆向工具