arpitha-dhanapathi/pluto-aguard

GitHub: arshan846/pluto-aguard

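An open-source security scanner for AI agents. It finds misconfigurations and policy gaps before an agent launches, using static scanning, policy attack testing, and OWASP control coverage analysis.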

Stars: 4 | Forks: 2

# 🛡️ Pluto AgentGuard

**A security launch gate for AI agents. Other tools only scan configuration; AgentGuard tests your policy against attack scenarios, simulates risk impact, maps results to an OWASP-inspired control framework, and generates launch evidence.**

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/cas/ad/ad5834178f7599af9fdda11629d49cae07f2997beec49821b2920eff5bfd50e7.svg)](https://github.com/arpitha-dhanapathi/pluto-aguard/actions/workflows/ci.yml) [![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/) [![PyPI](https://img.shields.io/pypi/v/pluto-aguard)](https://pypi.org/project/pluto-aguard/)

## What makes it different

MCP security scanners are multiplying quickly (Snyk agent-scan, Invariant guardrails, AgentSeal). **Most of these tools focus on configuration detection or runtime analysis.** AgentGuard adds policy coverage testing, what-if simulation, drift detection, and launch evidence, all running fully offline with no LLM and no vendor lock-in:

| Capability | Scanners | **AgentGuard** |
|---|---|---|
| Static detection of secrets and misconfigurations (no server execution) | 🟡 Varies | ✅ `aguard scan` |
| Policy coverage testing (22 attack scenarios) | ❌ | ✅ `aguard test` |
| "What-if" risk impact before applying changes | ❌ | ✅ `aguard whatif` |
| OWASP-inspired control coverage (20 controls) | ❌ | ✅ `aguard owasp` |
| Launch-readiness evidence packet | ❌ | ✅ `aguard evidence` |
| Baseline drift detection | ❌ | ✅ `aguard baseline` |
| Behavioral trace audit with an approval model | ❌ | ✅ `aguard monitor` |

📺 **[Interactive demo](docs/demo.html)**: see all 7 commands in action (clone the repo and open the file in a browser)

## Quick start (60 seconds)

```
pip install pluto-aguard

# Clone to get the examples
git clone https://github.com/arpitha-dhanapathi/pluto-aguard.git && cd pluto-aguard

# Scan a realistic, insecure AI project (finds 18 real issues)
aguard scan ./examples/demo-agent-project/

# Test your policy against 22 attack scenarios
aguard test --policy ./examples/agent-policy.yaml --attack-pack all

# Generate an OWASP-inspired control coverage report
aguard owasp ./examples/demo-agent-project/

# Simulate policy changes and see the risk drop before applying them
aguard whatif --config ./examples/insecure-agent-config.yaml

# Generate a launch-readiness evidence packet
aguard evidence ./examples/ --config ./examples/insecure-agent-config.yaml \
  --policy ./examples/agent-policy.yaml

# Save a baseline, then detect drift later
aguard baseline create ./examples/
aguard baseline compare ./examples/
```

No cloud account. No API key. Runs entirely locally.

## Real-world validation: 1,200 GitHub configurations

We scanned **1,200 real MCP configurations** from public GitHub repositories (1,159 distinct projects) with AgentGuard:

| Metric | Result |
|---|---|
| Configurations scanned | 1,200 |
| Total findings | 2,891 |
| 🔴 CRITICAL | 0 |
| 🟠 HIGH | 189 |
| 🟡 MEDIUM | 169 |
| ℹ️ INFO | 2,533 |
| Repositories with HIGH findings | **156 (13%)** |

**What we actually found:**

- 189 remote MCP endpoints with no authentication configured
- 169 uses of unencrypted HTTP on non-localhost transports
- Hardcoded secrets in some configurations
- 2,533 informational items (capability inventory: shell access, browser automation, and so on)

**Important context:** Capability-related findings (for example, a server that exposes a shell tool) are reported as INFO only, as a heads-up. Per the [MCP specification](https://modelcontextprotocol.io/specification/2025-03-26/architecture), enforcing human-in-the-loop is strictly the responsibility of the client/host, not the server. See the [full methodology and results](docs/scan-results-methodology.md).

## GitHub Action

```
- name: Agent Security Gate
  uses: arpitha-dhanapathi/pluto-aguard@v0.9.2
  with:
    path: '.'
    max-risk: '50'
    fail-on: 'high'
    policy: 'agent-policy.yaml'
    attack-pack: 'all'
    sarif-output: 'results.sarif'

- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```

For the full set of options, see [docs/github-action-usage.md](docs/github-action-usage.md).

## Commands

| Command | What it does | Maturity |
|---|---|---|
| `aguard scan` | Static analysis: secrets, misconfigurations, unsafe AI code patterns | ✅ Stable |
| `aguard test` | Policy coverage testing: 22 attack scenarios across 6 attack packs | ✅ Stable |
| `aguard owasp` | OWASP-inspired control coverage report (20 controls) | ✅ Stable |
| `aguard whatif` | Policy impact simulation: risk delta before applying changes | ✅ Stable |
| `aguard evidence` | Launch-readiness packet with an approval checklist | 🔶 Beta |
| `aguard baseline` | Security snapshot plus drift comparison over time | 🔶 Beta |
| `aguard monitor` | Behavioral trace audit: replays tool calls against a policy | 🔶 Beta |

### `aguard scan`

Finds real issues in **any** AI project; no MCP configuration is required. It detects eval/exec on LLM output, hardcoded secrets (18+ patterns), missing authentication on remote endpoints, Dockerfile misconfigurations, unpinned AI dependencies, unsafe LangChain settings, system prompt leakage, and more.

```
$ aguard scan ./my-project/

🔴 CRITICAL: Unsafe execution of LLM output: eval() (MCP05:2025)
🟠 HIGH: Hardcoded OpenAI Key detected (MCP01:2025)
🟠 HIGH: .env file not in .gitignore (MCP01:2025)
🟡 MEDIUM: Unpinned AI dependencies (MCP04:2025)

📊 Risk Score: 100/100 ██████████████████████████████████████████████████
📋 Findings: 1 critical · 14 high · 3 medium
```

CI flags: `--max-risk 50` / `--fail-on high` / `--format sarif`

### `aguard test`

Tests 22 attack scenarios against your declared security policy and reports which attacks are blocked and which get through. This is pure policy coverage testing; no LLM is required.

**6 attack packs:** prompt injection, data exfiltration, privilege escalation, approval bypass, tool poisoning, context manipulation.

```
$ aguard test --policy agent-policy.yaml --attack-pack all

✅ PASS  PI-001  Direct instruction override    execute     Blocked
✅ PASS  DE-001  File export of sensitive data  file_write  Blocked
❌ FAIL  DE-004  SQL data dump                  sql_query   NOT caught

📊 Results: 21 blocked · 1 missed · 22 total

Recommended fixes:
  → Add sql_query to require_human_approval
```

CI flag: `--fail-on-miss` exits with code 1 if any attack succeeds.

### `aguard owasp`

Evaluates 20 controls mapped to an OWASP-inspired control framework. Control IDs use a project-defined `MCP01–MCP10` taxonomy that draws on the OWASP LLM Top 10 and the emerging OWASP Agentic AI initiative, and includes MCP-specific extensions not yet covered by existing standards.

```
$ aguard owasp ./my-project/

❌ MCP01:2025 Token Mismanagement: 3 failed, 1 passed
   ✗ AGC-MCP01-001: No hardcoded secrets
   ✓ AGC-MCP01-002: No static long-lived tokens
✅ MCP07:2025 AuthN/AuthZ: 2 passed
   ✓ AGC-MCP07-001: Remote servers have auth
   ✓ AGC-MCP07-002: HTTPS transport

📊 Control Coverage: 9/10 risks
   Controls: 8 passed · 6 failed · 6 not tested · 20 total
```

### `aguard whatif`

Simulates policy changes and shows their effect on the risk score *before* you apply them.

```
$ aguard whatif --config agent-config.yaml

Current Risk Score: 100/100

✅ Restrict SQL to SELECT-only         → 68 (↓ 17%)
✅ Add human-in-the-loop for file ops  → 54 (↓ 34%)
✅ Add rate limits + timeout           → 48 (↓ 41%)

💡 Apply all 3 → Risk drops to 38 (↓54%)
```

### `aguard evidence`

Generates a launch-readiness packet: risk summary, findings, tool permissions, policy coverage, required mitigations, and a sign-off checklist. See [examples/sample-launch-readiness.md](examples/sample-launch-readiness.md).

### `aguard baseline`

Saves a security snapshot so you can compare against it later and detect drift.

```
aguard baseline create .                   # Save current state
aguard baseline compare .                  # What changed?
aguard baseline compare . --fail-on-drift  # CI: fail if new findings
```

### `aguard monitor`

Replays an agent's action trace against the declared policy. It detects denied tool calls, unauthorized access, privilege escalation, and missing or expired approvals.

```
aguard monitor --trace-file traces.jsonl --policy policy.yaml
```

Accepts OpenTelemetry JSONL or a simple `{"tool_name": "X", "tool_args": {}}` format.

## Where it fits

```
┌─────────────────────────────────────────────────────┐
│  LAYER 1: Content Guardrails (existing)             │
│  Azure Content Safety · NeMo · Guardrails AI        │
│  → Protects what LLMs SAY                           │
├─────────────────────────────────────────────────────┤
│  LAYER 2: Agent Security (Pluto AgentGuard)         │
│  scan · test · owasp · whatif · evidence · baseline │
│  → Watches what agents DO                           │
└─────────────────────────────────────────────────────┘
```

## Risk scoring

For the full scoring methodology (formula, weights, examples, CI threshold guidance, and limitations), see [docs/risk-scoring.md](docs/risk-scoring.md).

## OWASP-inspired control matrix

For the complete mapping of all 20 controls, see [docs/owasp-control-matrix.md](docs/owasp-control-matrix.md). Control IDs draw on the OWASP LLM Top 10 (LLM01–LLM10) and introduce MCP-specific extensions (MCP01–MCP10) for risks that existing standards do not yet cover.

## Roadmap

- [x] **v0.1–v0.5**: Scanner, monitor, whatif, evidence, baseline, CI gate, SARIF, HTML reports
- [x] **v0.8**: Policy coverage testing (17 scenarios, 5 attack packs)
- [x] **v0.9**: OWASP-inspired control framework (20 controls, coverage report)
- [x] **v0.9.1**: Context manipulation pack (context stuffing, multi-turn confusion, indirect injection, RAG poisoning) and a supply-chain manifest poisoning scenario
- [ ] **v1.0**: Runtime proxy / tool-call firewall (monitors live tool calls without a full red-team harness)
- [ ] **v1.1**: Multi-framework adapters (LangChain, CrewAI, AutoGen)
- [ ] **v1.2**: Live agent testing (send adversarial inputs to a running agent)

## Project structure

```
pluto-aguard/
├── src/pluto_aguard/
│   ├── cli.py          # 7 CLI commands
│   ├── models.py       # Finding, RiskScore, ControlResult, etc.
│   ├── scanners/       # MCP + AI config + permission scanners
│   ├── testing/        # 22 attack scenarios across 6 packs
│   ├── controls/       # 20 OWASP-aligned control definitions
│   ├── evidence/       # Launch readiness packet generator
│   ├── baseline/       # Snapshot + drift comparison
│   ├── monitor/        # Behavioral trace audit
│   ├── simulator/      # What-If policy simulation
│   └── reports/        # HTML + SARIF output
├── examples/           # Demo project + configs + traces
├── docs/               # Risk scoring, OWASP matrix, GitHub Action docs
├── tests/              # 135 tests
├── action.yml          # GitHub Action
└── SECURITY.md
```

## Contributing

For setup instructions and guidelines, see [CONTRIBUTING.md](CONTRIBUTING.md).

## License

Apache License 2.0. See [LICENSE](LICENSE).
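
## Example: writing a trace file for `aguard monitor`

The README states that `aguard monitor` accepts a simple JSONL format with one `{"tool_name": ..., "tool_args": {...}}` object per line. The sketch below shows one way such a file might be produced from your own agent code. Only the two field names come from the README; the helper names (`write_trace`, `read_trace`), the sample tool calls, and their argument keys are hypothetical and chosen for illustration. It uses only the Python standard library and does not call any AgentGuard API.

```python
import json

# Hypothetical tool calls recorded by an agent. Only the
# {"tool_name", "tool_args"} shape is taken from the README;
# the tool names and argument keys here are made up.
SAMPLE_CALLS = [
    {"tool_name": "file_write", "tool_args": {"path": "report.txt"}},
    {"tool_name": "sql_query", "tool_args": {"query": "SELECT 1"}},
]


def write_trace(path, calls):
    """Write each tool call as one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as fh:
        for call in calls:
            fh.write(json.dumps(call) + "\n")


def read_trace(path):
    """Read a JSONL trace back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

After calling `write_trace("traces.jsonl", SAMPLE_CALLS)`, the resulting file could then be passed to the command shown in the `aguard monitor` section (`aguard monitor --trace-file traces.jsonl --policy policy.yaml`), assuming a `policy.yaml` is available.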

Tags: AI agent security, MCP, Python, security testing, offensive security, no backdoor, user agent, policy audit, reverse-engineering tools, static scanning