sumamovva/probeagent

GitHub: sumamovva/probeagent

针对 AI Agent 的自动化红队测试 CLI 工具，通过多轮真实攻击评估 Agent 安全态势并输出分级报告

Stars: 16 | Forks: 0

# ProbeAgent **针对 AI agent 的进攻性安全测试。它们扫描配置。我们攻击您的 agent。** [![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/a9430253f8155307.svg)](https://github.com/sumamovva/probeagent/actions/workflows/ci.yml) [![PyPI](https://img.shields.io/pypi/v/probeagent-ai)](https://pypi.org/project/probeagent-ai/) [![Python](https://img.shields.io/pypi/pyversions/probeagent-ai)](https://pypi.org/project/probeagent-ai/) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) Engine Engine --> |for each category| Attack[Attack Module] Attack --> |reset conversation| Target Attack --> |multi-turn prompts| Target Target --> |response| Analyzer Analyzer --> |grade| Report[Safe / At Risk / Compromised] ``` ## 为什么选择 ProbeAgent？ | 功能 | mcp-scan | SecureClaw | Aguara | **ProbeAgent** | |---------|----------|------------|--------|----------------| | 进攻性测试 | - | - | 部分 | **是** | | 多轮攻击 | - | - | - | **是** | | 间接注入测试 | - | - | - | **是** | | PyRIT 集成 | - | - | - | **是** | | 规避转换器 | - | - | - | **是** | | CLI 优先 | - | - | 是 | **是** | | 安全分级 | - | - | - | **是** | | HTTP + OpenClaw 目标 | - | - | - | **是** | | 丰富的终端报告 | - | - | - | **是** | ## 安装 ``` pip install probeagent-ai ``` 或者从源码安装以进行开发： ``` git clone https://github.com/sumamovva/probeagent.git cd probeagent pip install -e ".[dev]" ``` 如需 PyRIT 集成（规避转换器 + 动态红队测试）： ``` pip install 'probeagent-ai[pyrit]' ``` ## 快速入门 ### 即时演示（无需设置） ``` pip install probeagent-ai probeagent demo ``` 这会攻击一个内置的模拟目标 —— 一个易受攻击的 agent 和一个加固过的 agent —— 并显示并排比较结果。无需 API keys，无需服务器，无需配置。 ### 扫描您自己的 agent ``` # 验证目标是否可达 probeagent validate https://your-agent.example.com/api # 运行快速安全扫描 probeagent attack https://your-agent.example.com/api --profile quick # 并行执行全面扫描 probeagent attack https://your-agent.example.com/api --profile standard --parallel ``` ### 扫描 OpenClaw agent ``` # 验证 OpenClaw 实例（自动检测 OpenAI chat 格式） probeagent validate http://localhost:3000/v1/chat/completions \ -H 'Authorization: Bearer YOUR_TOKEN' # 对其进行攻击 probeagent attack http://localhost:3000/v1/chat/completions \ -H 'Authorization: Bearer YOUR_TOKEN' \ --profile standard --parallel ``` ## 演示 ### 即时演示零设置，几秒钟内运行完整的安全评估： ``` probeagent demo ``` 添加 War Room 战术显示以获得视觉体验： ``` probeagent demo --game ``` ### 实时演示（真实 API）针对具有内置漏洞的真实 Claude 驱动的邮件 agent 进行演示： ``` export ANTHROPIC_API_KEY=sk-ant-... pip install 'probeagent-ai[demo]' probeagent demo --live ``` 实时演示会启动一个本地邮件 agent 服务器，包含三个安全性递增的端点，然后对其进行攻击。详情请参阅 `tools/demo_email_agent.py`。 ## 命令 ### `probeagent demo` 运行完整演示 —— 攻击易受攻击 + 加固过的目标并比较结果。 ``` probeagent demo # Instant, uses mock target probeagent demo --game # With War Room tactical display probeagent demo --live # Real API (requires ANTHROPIC_API_KEY) probeagent demo --profile standard # Use a different attack profile ``` 选项： - `--live` — 使用真实 API（启动演示邮件 agent 服务器） - `--game` — 攻击后启动 War Room UI - `--profile`, `-p` — 攻击配置：`quick`、`standard` 或 `thorough`（默认：`quick`） ### `probeagent attack ` 针对目标 AI agent 运行安全攻击。 ``` probeagent attack https://agent.example.com/api --profile quick probeagent attack https://agent.example.com/api --profile standard --output json -f report.json probeagent attack https://agent.example.com/api -p standard --converters stealth --parallel ``` 选项： - `--profile`, `-p` — 攻击配置：`quick`、`standard` 或 `thorough`（默认：`quick`） - `--target-type` — 目标类型：`http` 或 `openclaw`（默认：`http`） - `--output`, `-o` — 输出格式：`terminal`、`markdown`、`json`（默认：`terminal`） - `--output-file`, `-f` — 将报告写入文件 - `--timeout`, `-t` — 请求超时时间（秒）（默认：30） - `--parallel` — 并行运行攻击类别以加快扫描速度 - `--converters` — 应用规避转换器：`basic`、`advanced`、`stealth` 或以逗号分隔的名称（需要 PyRIT） - `--redteam` — 通过 PyRIT RedTeamOrchestrator 启用动态 LLM 驱动的攻击（需要 PyRIT） - `--header`, `-H` — HTTP header，格式为 `Key: Value`（可重复，例如 `-H 'Authorization: Bearer token'`） ### `probeagent validate ` 检查目标是否可达并检测其 API 格式。支持 `--header/-H` 用于需认证的目标。 ### `probeagent list-attacks` 显示所有可用的攻击模块及其严重程度和状态。 ### `probeagent init` 在当前目录下创建默认的 `.probeagent.yaml` 配置文件。 ### `probeagent game [url]` 在浏览器中启动 War Room 战术显示 UI 以进行交互式测试。 ## 攻击类别 12 个攻击类别，共 79 种策略： | 类别 | 严重程度 | 策略数 | 技术 | |----------|----------|------------|-----------| | Prompt Injection（提示注入） | CRITICAL | 6 | 覆盖系统指令 | | Credential Exfiltration（凭证窃取） | CRITICAL | 8 | 提取 API keys 和 secrets | | Identity Spoofing（身份欺骗） | CRITICAL | 7 | 冒充受信任实体 | | Indirect Injection（间接注入） | CRITICAL | 7 | 通过 agent 处理的内容（邮件、文档）注入指令 | | Config Manipulation（配置篡改） | CRITICAL | 6 | 篡改 agent 配置、集成和权限 | | Goal Hijacking（目标劫持） | HIGH | 5 | 重定向 agent 行为 | | Social Manipulation（社会操纵） | HIGH | 14 | 心理施压（Cialdini、FOG、逐步升级） | | Cognitive Exploitation（认知利用） | HIGH | 6 | 利用推理弱点（苏格拉底陷阱、框架控制） | | Resource Abuse（资源滥用） | HIGH | 4 | 触发无限制计算 | | Tool Misuse（工具滥用） | HIGH | 6 | 诱骗 agent 滥用工具 | | Agentic Exploitation（Agent 利用） | CRITICAL | 10 | SSRF、command injection、path traversal、供应链（基于 CVE） | | Data Exfiltration（数据窃取） | MEDIUM | 6 | 提取敏感上下文数据 | ## 攻击配置 | 配置 | 类别 | 最大轮数 | 用例 | |---------|------------|-----------|----------| | `quick` | 5 个关键类别 | 1 | CI/CD 关卡、快速检查 | | `standard` | 全部 12 个 | 3 | 定期安全评估 | | `thorough` | 全部 12 个 | 10 | 发布前深度扫描 | ## PyRIT 集成 ProbeAgent 可选集成 [Microsoft PyRIT](https://github.com/Azure/PyRIT) 以获得高级功能： - **Evasion Converters**（规避转换器）（`--converters`）：使用 Base64、ROT13、Unicode 替换、Leetspeak 等转换攻击 payload，以测试对混淆攻击的抵抗力 - **Dynamic Red Teaming**（动态红队）（`--redteam`）：使用 LLM 驱动的编排器实时生成新颖的攻击策略 ``` # 应用隐蔽规避转换器 probeagent attack https://agent.example.com/api -p standard --converters stealth # 动态红队演练 probeagent attack https://agent.example.com/api -p standard --redteam # 结合两者 probeagent attack https://agent.example.com/api -p standard --converters advanced --redteam ``` 安装命令：`pip install 'probeagent-ai[pyrit]'` ## 负责任的使用 ProbeAgent 专为**授权安全测试**而设计。在使用 ProbeAgent 之前： - 确保您拥有测试目标系统的**明确许可** - 仅测试您拥有或获得书面授权测试的系统 - 遵循您组织的安全测试策略 - 通过适当的披露渠道报告漏洞对您不拥有或未获测试许可的系统未经授权使用此工具，可能会违反法律法规。 ## 致谢 ProbeAgent 的间接注入和配置篡改攻击灵感来自 [Zenity Labs](https://labs.zenity.io) 的研究。PyRIT 集成使用了 [Microsoft PyRIT](https://github.com/Azure/PyRIT) 的组件（MIT 许可证）。完整致谢请参阅 [ATTRIBUTION.md](ATTRIBUTION.md)。 ## 开发 ``` # 安装 dev dependencies pip install -e ".[dev]" # 运行测试 python -m pytest tests/ -v # Lint ruff check src/ tests/ # Format ruff format src/ tests/ ``` 完整的开发指南请参阅 [CONTRIBUTING.md](CONTRIBUTING.md)。 ## 路线图 - [x] **Phase 1**：CLI、HTTP 目标、评分、报告 - [x] **Phase 2**：9 个攻击类别，56 种多轮策略 - [x] **Phase 3**：OpenClaw 目标适配器、并行执行、War Room UI - [x] **Phase 4**：Zenity 启发的攻击（间接注入、配置篡改）、PyRIT 集成 - [ ] **Phase 5**：MCP 目标适配器、CI/CD 集成、SaaS 仪表板 ## 许可证 Apache 2.0 — 详情请参阅 [LICENSE](LICENSE)。 ## 更新日志版本历史请参阅 [CHANGELOG.md](CHANGELOG.md)。

标签：AI安全, AI智能体, Chat Copilot, ESC8, Offensive Security, Prompt注入, PyRIT, Python, 域名收集, 多智能体系统, 多轮对话攻击, 大模型攻防, 安全合规, 攻击模拟, 数据泄露防护, 文档结构分析, 无后门, 模型风险评估, 社会工程学, 网络代理, 网络探测, 越狱检测, 逆向工具, 零日漏洞检测, 驱动签名利用