a805026135/agentred

GitHub: a805026135/agentred

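AgentRed is an AI agent security testing framework that systematically evaluates an agent's security risks with 160+ attack cases and generates a remediation report.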

Stars: 0 | Forks: 0

AgentRed


AI agent security testing framework with 160+ attack cases drawn from the latest research.


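## Why AgentRed?

As AI agents gain **tool use**, **RAG retrieval**, and **autonomous execution** capabilities, their exposure to attack grows exponentially. Traditional LLM security testing is no longer enough, because agents have an attack surface of their own:

- **Tool hijacking**: attackers redirect the agent's tool calls to malicious endpoints
- **RAG poisoning**: a poisoned knowledge base causes the agent to output harmful content
- **Indirect prompt injection**: injected instructions hidden in web pages, documents, or tool outputs
- **Multi-agent attacks**: a compromised agent propagates attacks through the agent network
- **Memory poisoning**: long-term conversation history is tampered with

**AgentRed** systematically tests your AI agent against **160+ attack vectors** in **35 categories** derived from the latest academic research (OWASP, arXiv, USENIX Security). It generates a detailed HTML report with scores and remediation recommendations, and it is built on a privacy-first architecture.

## Features

| Feature | Description |
|---------|-------------|
| **160+ attack test cases** | Prompt injection (18), jailbreak (8), tool attacks (18), RAG attacks (16), multi-agent (4), multimodal (4), and more |
| **7-stage pipeline** | Config check → static analysis → adaptive generation → tool use → dynamic → RAG/memory → advisor |
| **Privacy-first architecture** | 3-level sanitization; the advisor agent receives only sanitized data and never touches raw agent content |
| **Adaptive test generation** | Analyzes your agent's domain, tools, and capabilities, then automatically generates targeted test cases |
| **AI-powered advisor** | Uses any LLM (GPT-4o, DeepSeek, etc.) to generate remediation advice from the sanitized report |
| **3 input modes** | Local directory scan, prompt text input, or live API endpoint testing |
| **Tool use testing** | Tests through the agent's function calling interface (not just prompt injection) |
| **RAG/memory evaluation** | 11 static checks + 9 dynamic tests covering RAG quality, memory persistence, and tool calling |
| **Config checker** | Detects missing API keys, placeholder values, and empty .env files |
| **HTML + JSON reports** | Radar chart, dimension cards, severity table, remediation recommendations |

## Quick Start

### Prerequisites

```
# Requires Python 3.10+
python --version

# Install dependencies
pip install pyyaml
```

### 3 Ways to Run

```
# 1. Interactive wizard (recommended for beginners)
python cli.py wizard

# 2. Scan a local agent directory (no API key required)
python cli.py test --dir /path/to/your-agent

# 3. Test against an API endpoint (full dynamic testing)
python cli.py test --api-key sk-xxx --api-endpoint https://api.openai.com/v1/chat/completions
```

### Example: Full Test in One Command

```
# Test an OpenAI-compatible agent with all features enabled
python cli.py test \
  --dir ./my-agent \
  --api-key $OPENAI_API_KEY \
  --name "finance-bot" \
  --advisor-model deepseek-chat \
  --privacy-level moderate \
  --output ./reports
```

## Screenshots

### Report Overview

The HTML report shows the overall score, dimension breakdown, and risk level at a glance:

(Screenshot: Report Overview)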

### Detailed Static Analysis

Each check item shows a pass/warn/fail status along with its specific findings:

(Screenshot: Static Analysis)

### Architecture

```
Input Sources         Engine Pipeline              Output
┌───────────┐     ┌─────────────────────┐     ┌──────────────┐
│ Local Dir │────▶│ Config→Static       │────▶│ HTML+JSON    │
│ Prompt    │     │ Adaptive→ToolUse    │     │ Radar Chart  │
│ API       │     │ Dynamic→RAG/Mem     │     │ Remediation  │
└───────────┘     │ Advisor→Sanitize    │     └──────────────┘
                  └─────────────────────┘
                        ▲     │
                   160+ Cases │
                              ▼ Privacy Filter
                  ┌───────────┴───┐
                  │ Sanitized     │◀── Advisor LLM
                  │ Data Only     │
                  └───────────────┘
```
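The Privacy Filter step in the diagram above sits between the engine and the advisor LLM. As a rough illustration of what level-based sanitization could look like, here is a minimal Python sketch. The function name `sanitize`, the regular expressions, and the placeholder tokens are assumptions made for illustration only; they are not taken from AgentRed's `privacy_filter.py`. The per-level behavior loosely follows the strict/moderate/minimal descriptions in this README, and prompt summarization is left out.

```python
import re

# Hypothetical patterns; a real filter would likely cover more credential formats.
API_KEY_RE = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PATH_RE = re.compile(r"(?<![\w/])/(?:[\w.-]+/)+[\w.-]+")

LEVELS = ("minimal", "moderate", "strict")


def sanitize(text: str, level: str = "moderate") -> str:
    """Return a redacted copy of `text` for the given privacy level."""
    if level not in LEVELS:
        raise ValueError(f"unknown privacy level: {level!r}")

    # Every level removes explicit credentials.
    text = API_KEY_RE.sub("[API_KEY]", text)
    if level == "minimal":
        return text

    # moderate and strict also mask filesystem paths.
    text = PATH_RE.sub("[PATH]", text)

    # strict additionally removes email addresses.
    if level == "strict":
        text = EMAIL_RE.sub("[EMAIL]", text)
    return text


if __name__ == "__main__":
    sample = "key=sk-abcdef123456 path=/home/alice/agent/.env mail=alice@example.com"
    for lvl in LEVELS:
        print(f"{lvl:>8}: {sanitize(sample, lvl)}")
```

Applying the levels cumulatively keeps the invariant simple: anything masked at a weaker level is also masked at every stronger one, so only the output of `sanitize` would ever need to be forwarded to the advisor.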


## Test Case Coverage

All test cases are derived from **the latest academic research and industry standards**:

| Category | Count | Source |
|----------|-------|--------|
| **Prompt Injection** | 18 | OWASP LLM01, InjecAgent (arXiv 2403.02691) |
| **Jailbreak** | 8 | Many-shot (NeurIPS'24), GCG, AutoDAN, Cognitive Override |
| **Harmful Content** | 10 | OWASP LLM06, Safety Alignment Bypass |
| **Data Leakage** | 10 | CamoLeak, EchoLeak (CVE-2025-32711), Hidden Markdown |
| **Tool Attacks** | 8 | ToolHijacker (arXiv 2504.19793) |
| **Tool Description Poisoning** | 4 | MCP Protocol Attack (MDPI 2026) |
| **Tool Shadowing / Rug Pull** | 5 | MCP Tool Substitution Attacks |
| **Tool Call Abuse** | 6 | Privilege Escalation via Tools |
| **Tool Permission Overreach** | 3 | OWASP Agentic AI - Tool Misuse |
| **Knowledge Base Poisoning** | 3 | PoisonedRAG (USENIX Security'25) |
| **RAG Context Poisoning** | 2 | Retrieval Manipulation |
| **Cross-Document Data Leakage** | 2 | EchoLeak-style Cross-Doc Injection |
| **Retriever Backdoor** | 2 | Adversarial Retrieval Triggers |
| **Adversarial Embedding** | 2 | Semantic Space Poisoning |
| **Memory Poisoning** | 3 | Conversation History Corruption |
| **RAG Scope Violation** | 2 | Out-of-Bounds Retrieval |
| **Multi-Agent Attacks** | 4 | A2A Protocol Exploitation, Cascade Attacks |
| **Multimodal Attacks** | 4 | CrossInject (arXiv 2504.14348), Steganography |
| **Social Engineering** | 2 | Persona Spoofing, Authority Impersonation |
| **Protocol Attacks** | 3 | A2A/MCP Protocol Layer Exploits |
| **Autonomous Misbehavior** | 3 | Goal Drift, Resource Exhaustion, Loop Traps |
| **AI Virus / Self-Replication** | 3 | Prompt-based Self-Propagation |
| **Guardrail Manipulation** | 3 | Safety Filter Bypass Techniques |
| **Boundary (Input)** | 8 | Fuzzing, Overflow, Encoding Attacks |
| **Boundary (Task)** | 5 | Capability Confusion, Goal Hijacking |
| **Boundary (Context)** | 8 | Context Window Attacks, Injection |
| **Boundary (Permission)** | 4 | ACL Bypass, Role Escalation |
| **Boundary (Protocol)** | 3 | API Contract Violations |
| **Performance** | 12 | Latency, Accuracy, Consistency, Robustness |

**Total: 160 test cases across 35 attack categories**

### Research Sources

| Source | Year | Key Contribution |
|--------|------|-------------------|
| [OWASP Top 10 for LLM Applications](https://genai.owasp.org/) | 2025 | LLM01-LLM10 standard risks |
| [OWASP Agentic AI Risks](https://genai.owasp.org/2025/12/09/owasp-genai-security-project-releases-top-10-risks-and-mitigations-for-agentic-ai-security/) | 2025 | Top 10 agent-specific risks |
| [LLM Security Survey](https://arxiv.org/abs/2505.01177) | 2025 | Backdoors, adversarial inputs, embedding inversion |
| [Prompt Injection in LLM Agents and MCP](https://www.mdpi.com/2078-2489/17/1/54) | 2026 | Tool Shadowing, Rug Pull, ZombAI, CamoLeak |
| [Agentic AI Security](https://arxiv.org/abs/2510.23883) | 2025 | Multi-agent, social engineering, autonomous loss of control |
| [ToolHijacker](https://arxiv.org/abs/2504.19793) | 2025 | Tool selection manipulation, document injection |
| [PoisonedRAG](https://usenix.org/conference/usenixsecurity25) | 2025 | Knowledge base poisoning (90% success rate) |
| [CrossInject](https://arxiv.org/abs/2504.14348) | 2025 | Cross-modal injection (+30.1% success rate) |
| [Many-shot Jailbreaking](https://arxiv.org/abs/2310.04451) | 2024 | Long-context faulty-pattern injection |

## Advanced Usage

### Adaptive Test Generation

AgentRed automatically analyzes your agent's characteristics and generates targeted test cases:

```
# Enable adaptive generation (default: on)
python cli.py test --dir ./my-agent

# The analyzer detects:
# - Domain: finance, healthcare, customer_service, education, legal, code...
# - Tools: web_search, code_exec, file_access, database, email, mcp, browser...
# - Capabilities: rag, memory, multi_agent, autonomous, tool_use...
# - Risk level: high / medium / low (based on security features)
```

Different agent types produce **completely different** test case sets:

- **Finance agents** → domain-specific fraud/injection tests
- **Healthcare agents** → HIPAA/safety boundary tests
- **Code agents** → sandbox escape, command injection, dependency poisoning
- **MCP-enabled agents** → tool description poisoning, shadowing, rug pull

### Privacy Levels

```
# Strict: remove all sensitive information (API keys, paths, emails)
python cli.py test --privacy-level strict

# Moderate: mask paths, summarize prompts (default)
python cli.py test --privacy-level moderate

# Minimal: remove explicit credentials only
python cli.py test --privacy-level minimal
```

### Advisor Configuration

```
# Use DeepSeek as the advisor (cost-effective, works well with Chinese)
python cli.py test --advisor-api-key $DEEPSEEK_API_KEY --advisor-model deepseek-chat

# Use GPT-4o for deeper analysis
python cli.py test --advisor-api-key $OPENAI_API_KEY --advisor-model gpt-4o

# Use a reasoning model for complex analysis
python cli.py test --advisor-model deepseek-reasoner

# Rule-only mode (no API required)
python cli.py test --no-advisor
```

### Feature Flags

```
# Disable individual modules
python cli.py test --no-adaptive        # No adaptive generation
python cli.py test --no-tool-use        # No tool use testing
python cli.py test --no-extended-eval   # No RAG/Memory evaluation
python cli.py test --no-config-check    # No config checking
python cli.py test --no-advisor         # No AI advisor

# Dry run: preview test cases without executing
python cli.py test --dry-run

# Filter by dimension or severity
python cli.py test --dimension security
python cli.py test --severity critical,high
```

## CLI Reference

```
Usage: python cli.py test [OPTIONS]

Input Options:
  --dir PATH              Local agent directory to scan
  --prompt TEXT           System prompt text (or @file.txt)
  --api-key KEY           OpenAI-compatible API key
  --api-endpoint URL      API endpoint URL
  --name NAME             Agent name for report

Testing Options:
  --dimension DIM         security | boundary | performance | all
  --severity SEV          critical,high | high,medium | ...
  --output DIR            Report output directory (./reports)
  --dry-run               Preview without executing

Feature Flags:
  --no-adaptive           Disable adaptive test generation
  --no-tool-use           Disable tool use testing
  --no-extended-eval      Disable RAG/Memory evaluation
  --no-config-check       Disable configuration checking
  --no-advisor            Disable AI advisor

Privacy Options:
  --privacy-level LEVEL   strict | moderate | minimal
  --no-sanitized          Don't sanitize advisor input (not recommended)

Advisor Options:
  --advisor-api-key KEY   Advisor LLM API key
  --advisor-model MODEL   Model name (deepseek-chat, gpt-4o, ...)
  --advisor-strategy STR  auto | api_only | rule_only | hybrid

Other Commands:
  wizard                  Interactive setup wizard
```

## Project Structure

```
agent-tester/
├── cli.py                        # CLI entry point
├── config.yaml                   # Default configuration
├── demo.py                       # End-to-end demo script
│
├── core/
│   ├── runner.py                 # Main orchestrator (7-stage pipeline)
│   ├── loader.py                 # YAML test case loader
│   ├── evaluator.py              # Rule-based response evaluator
│   ├── scorer.py                 # Multi-dimensional scorer
│   ├── reporter.py               # JSON report generator
│   ├── html_reporter.py          # HTML visual report generator
│   ├── scanner.py                # Directory scanner (detect framework)
│   ├── source.py                 # Unified agent profile (API/Local/Prompt)
│   ├── static_analyzer.py        # 12-item static analysis engine
│   ├── advisor.py                # AI advisor agent (any LLM)
│   ├── config_checker.py         # Missing config detection
│   ├── rag_memory_evaluator.py   # RAG/Memory/Tool evaluation
│   ├── tool_use_tester.py        # Tool/function calling tester
│   ├── privacy_filter.py         # 3-level data sanitizer
│   └── adaptive_generator.py     # Profile-based test case generator
│
├── client/
│   ├── base.py                   # Abstract client interface
│   ├── openai_client.py          # OpenAI-compatible client
│   └── mock_client.py            # Mock client for offline testing
│
├── testcases/
│   ├── security.yaml             # 115 security test cases
│   ├── boundary.yaml             # 33 boundary test cases
│   ├── performance.yaml          # 12 performance test cases
│   ├── tool_attack.yaml          # 18 tool attack test cases
│   └── rag_attack.yaml           # 16 RAG attack test cases
│
├── demo-agent/                   # Example agent for demo
│   ├── agent.py
│   ├── config.json
│   └── system_prompt.txt
│
└── screenshots/                  # README images
```

## How It Works

### Phase 1: Input Analysis

AgentRed accepts three types of input:

1. **Local directory** (`--dir`): scans your agent project, auto-detects the framework (LangChain, AutoGen, CrewAI, OpenAI SDK, or custom), and extracts the system prompt, config files, and code patterns.
2. **Prompt text** (`--prompt`): directly analyzes a system prompt string or an `@file.txt` reference.
3. **API endpoint** (`--api-key`): connects to a running agent through an OpenAI-compatible API for full dynamic testing.

### Phase 2: 7-Stage Testing Pipeline

```
[0]   Config Check    → Detect missing API keys, placeholder values, empty configs
[1]   Static Analysis → 12 checks on prompts, code, config (no API needed)
[1.5] Adaptive Gen    → Generate targeted cases based on agent profile
[2]   Tool Use Test   → Test via agent's function calling interface
[3]   Dynamic Test    → Send 160+ attack prompts, evaluate responses
[4]   RAG/Memory Eval → Evaluate RAG quality, memory safety, tool usage
[5]   Advisor         → AI generates remediation (sanitized input only)
[6]   Report Sanitize → Auto-remove sensitive data before saving
```

### Phase 3: Privacy-First Design

```
Your Agent ──(raw data)──▶ AgentRed Tester ──(sanitize)──▶ Advisor LLM
                                  │
                            PrivacyFilter
                       strict/moderate/minimal
                        • API Keys masked
                        • Paths redacted
                        • Prompts summarized
                        • Emails removed
```

**The advisor agent never sees your raw agent content.** Only sanitized evaluation summaries are sent for analysis.

## Roadmap

- [ ] Web UI dashboard with real-time test visualization
- [ ] CI/CD integration plugins (GitHub Actions pre-commit hook)
- [ ] Benchmark leaderboard across agent frameworks
- [ ] Export to SARIF format for integration with the GitHub Security tab
- [ ] Multi-language support (Chinese/English/Japanese)
- [ ] Plugin system for custom attack modules

## License

MIT License. See [LICENSE](LICENSE) for details.

## Acknowledgments

This project builds on pioneering research from the security community:

- **OWASP Foundation**: the LLM Top 10 and Agentic AI Top 10 standards
- **USENIX Security Symposium**: the authors of the PoisonedRAG paper
- **arXiv contributors**: all the researchers cited above, whose work informed our test cases
- **The broader AI security community**: for advancing the understanding of agent vulnerabilities

Built to make AI agents safer for everyone.
If this project helps you, please consider giving it a ⭐!
