shoebsyedm24/Agentic-AI-Red-Team

GitHub: shoebsyedm24/Agentic-AI-Red-Team

一个基于 Docker 沙箱的 AI 智能体红队演练实验室，利用 CrewAI 编排的多智能体对故意留有行为漏洞的 AI 靶标发起自动化攻击，覆盖提示词注入、越狱、RAG 投毒等 AI 特有威胁场景。

Stars: 0 | Forks: 0

# Agentic AI 红队实验室 [![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/05/891f799271000805.svg)](https://github.com/shoebsyedm24/agentic-ai-red-team/actions/workflows/ci.yml) ## 这是什么本实验室模拟了一次真实的 AI 红队攻防演练 —— 完全在 Docker 中的 localhost 上运行。你将构建故意留有漏洞的 AI 智能体目标，然后使用专门的红队智能体对其进行攻击，这些攻击智能体本身由 Claude AI 驱动。 **本实验室与传统安全实验室的区别：** 每个目标都是一个*具备智能体特性的* AI 系统（而不仅仅是一个 Web 应用），每次攻击都利用的是 *行为*漏洞 —— 提示词注入、越狱、RAG 数据投毒以及智能体权限提升。此外，第 5 个目标使用了 OWASP Juice Shop，展示了被攻陷的 AI 助手如何被武器化以攻击真实的 Web 应用。 ## 架构 ``` macOS Host ├── Red Team Agents (CrewAI + Claude Sonnet 4.6) │ Recon → Prompt Injection → Jailbreak → Data Extraction → Privilege Escalation │ → Web Attack (Juice Shop) → Report (auto-writes to Obsidian) │ ├── Obsidian Knowledge Base (MITRE ATLAS-organized, MCP-connected) │ Auto-populated with findings after each campaign │ └── Docker Sandbox (internal: true — NO internet egress) ├── target-chatbot :8080 — Prompt injection target ├── target-rag :8081 — RAG data poisoning target (ChromaDB) ├── target-agent :8082 — Agentic privilege escalation target ├── target-multiagent :8083 — Trust boundary target ├── juice-shop :3000 — OWASP Juice Shop (traditional web target) └── target-webagent :8084 — AI assistant wrapping Juice Shop (web attack chain) ``` **关键设计**：所有目标容器均使用 `MockVulnerableLLM` —— 一个基于规则的模拟器，而非真实的 Anthropic API。这使网络保持 `internal: true`（无互联网连接），使测试结果具有确定性，且无需任何成本。真实的 Anthropic API 仅由宿主机上的攻击智能体使用。 ## 红队攻击覆盖范围 | 攻击 | MITRE ATLAS | 目标 | 描述 | |--------|------------|--------|-------------| | 直接提示词注入 | AML.T0054.001 | chatbot | 通过用户消息覆盖系统提示词 | | 间接 / RAG 注入 | AML.T0054.002 | rag | 使用恶意文档污染知识库 | | LLM 越狱 | AML.T0051 | chatbot | 角色扮演、DAN、多轮 Crescendo 攻击 | | 系统提示词提取 | AML.T0057 | chatbot, rag | 提取隐藏的指令和凭据 | | 路径遍历 | AML.T0040 | agent | 读取预期工作空间之外的文件 | | 命令注入 | AML.T0040 | agent | 通过工具参数执行任意 bash 命令 | | 信任边界绕过 | AML.T0054.002 | multiagent | 通过不受信任的工作器输出进行注入 | | AI 介导的 Web 攻击 | AML.T0054 + OWASP A1 | webagent | 重定向 AI 以攻击 Juice Shop REST API | ## 快速入门 ### 前置条件 - [Docker Desktop](https://www.docker.com/products/docker-desktop/) - Python 3.12+ - Node.js 20+（用于 Obsidian MCP 插件） - [Obsidian](https://obsidian.md) ### 设置 ``` git clone https://github.com/shoebsyedm24/agentic-ai-red-team.git cd agentic-ai-red-team # 1. 配置 secrets (默认 DRY_RUN=true —— 在验证连接之前是安全的) cp .env.example .env # 编辑 .env → 添加您的 ANTHROPIC_API_KEY # 2. 安装 Python deps pip install -r requirements.txt # 3. 构建并启动 Docker targets bash scripts/start_lab.sh # 4. 初始化 RAG knowledge base python scripts/seed_rag.py # 5. 试运行 —— 在花费 API tokens 之前确认 agents 能正常工作 DRY_RUN=true python -m agents.campaign --target chatbot # 检查 obsidian-vault/03-Findings/ 和 audit.log # 6. 实战 campaign (先在 .env 中设置 DRY_RUN=false) python -m agents.campaign --target chatbot ``` ### Obsidian 知识库 1. 打开 Obsidian → **打开文件夹为仓库** → 选择 `obsidian-vault/` 2. 安装社区插件：搜索 "claude-code-mcp" → 安装并启用 3. 在 Claude Code 中：运行 `/mcp` → 验证 `obsidian` 显示为已连接 4. 在一次攻击战役后，发现结果会自动出现在 `obsidian-vault/03-Findings/` 中 ## 项目结构 ``` ├── agents/ # Red team agents (CrewAI) │ ├── roles/ # 7 specialized agents │ ├── tools/ # HTTP client, payload library, Obsidian writer, MITRE mapper │ ├── crew.py # CrewAI Crew + target-to-agent mapping │ └── campaign.py # CLI entry point ├── docker/ # Sandboxed vulnerable targets │ ├── shared/mock_llm.py # MockVulnerableLLM (used by all targets) │ ├── target-chatbot/ # Prompt injection target │ ├── target-rag/ # RAG poisoning target │ ├── target-agent/ # Privilege escalation target │ ├── target-multiagent/ # Trust boundary target │ └── target-webagent/ # AI web assistant wrapping Juice Shop ├── obsidian-vault/ # Knowledge base (auto-populated) │ ├── 01-Learning/ # 8 beginner step-by-step notes │ ├── 02-Tactics/ # MITRE ATLAS technique reference notes │ ├── 03-Findings/ # Auto-populated by Report Agent │ ├── 04-Targets/ # Target architecture profiles │ └── 05-Reports/ # Campaign summaries ├── pyrit_campaigns/ # PyRIT multi-turn attack campaigns ├── garak_scans/ # Garak automated scanning ├── .claude/ # Claude Code hooks + MCP config │ ├── settings.json # Hook definitions + obsidian MCP server │ └── hooks/ # pre_bash_safety.sh, post_bash_audit.sh, ... └── tests/ # pytest — target health + agent smoke tests ``` ## 学习路径 `obsidian-vault/01-Learning/` 文件夹包含 8 份对新手友好的笔记，从基本原理开始解释每个组件： | 笔记 | 主题 | |------|-------| | 01 | 什么是 Agentic AI？（chatbot 与 agent 的对比，为什么自主性会改变威胁模型） | | 02 | Agentic 应用的 OWASP Top 10（每个风险都映射到实验室目标） | | 03 | MITRE ATLAS 框架（战术、技术、攻击链） | | 04 | Docker 沙箱设计（为什么 `internal: true` 是关键控制） | | 05 | CrewAI 架构（角色、任务、顺序流程、记忆） | | 06 | Claude Code Hooks 与 MCP（生命周期事件、stdin JSON、Obsidian 同步） | | 07 | PyRIT 深入解析（Crescendo 多轮攻击） | | 08 | Garak 扫描器（150+ 探针、REST 生成器、HTML 报告） | ## 使用的工具与框架 | 工具 | 用途 | |------|---------| | [CrewAI](https://github.com/crewaiinc/crewai) | 多智能体编排 | | [Claude Sonnet 4.6](https://anthropic.com) | 攻击智能体的 AI 骨干 | | [PyRIT](https://github.com/Azure/PyRIT) | 多轮 Crescendo 攻击战役 | | [Garak](https://github.com/NVIDIA/garak) | 自动化 LLM 漏洞扫描（150+ 探针） | | [ChromaDB](https://www.trychroma.com) | 用于 RAG 投毒目标的向量数据库 | | [OWASP Juice Shop](https://github.com/juice-shop/juice-shop) | 用于 AI 介导攻击的传统 Web 目标 | | [Obsidian](https://obsidian.md) | 按 MITRE ATLAS 组织的知识库 | ## 参考的安全框架 - [OWASP Top 10 for LLM Applications 2025](https://owasp.org/www-project-top-10-for-large-language-model-applications/) - [MITRE ATLAS v5.4](https://atlas.mitre.org/) - [Agentic Security Scoping Matrix (AWS)](https://aws.amazon.com/blogs/security/) ## 许可证 MIT —— 仅用于学习、作品集展示以及经过授权的安全研究。 **切勿对您不拥有或未获得明确测试权限的系统发起攻击。**

标签：Agentic AI, AI安全, AI红队, AI风险管理, Chat Copilot, ChromaDB, CISA项目, Claude AI, CrewAI, Docker, Docker部署, IP 地址批量处理, LLM漏洞, Mitre ATLAS, Obsidian, OPA, OWASP Juice Shop, RAG数据投毒, TGT, Web安全, XXE攻击, 信任边界, 协议分析, 反取证, 多智能体, 大模型安全, 安全评估, 安全防御评估, 安全靶场, 攻防演练, 智能体安全, 权限提升, 网络安全实验, 蓝队分析, 行为安全, 请求拦截, 逆向工具, 靶场