valoryan334-art/AI-red-team

GitHub: valoryan334-art/AI-red-team

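An automated red-team adversarial testing and security evaluation framework for large language models, providing multi-dimensional attack vector orchestration, a scoring engine, and report generation.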

Stars: 0 | Forks: 0

# AI-red-team

An AI red-team framework for adversarial testing and security evaluation of large language models.

## Repository Status

- **Local repository**: ✅ Initialized and committed
- **Remote**: `https://github.com/kkdevil6/AI-red-team.git`
- **Branch**: `main`
- **Commit**: The initial commit contains 62 files (Python caches are excluded via .gitignore)

## Project Structure

```
ai-red-team/
├── agathon/                      # Main orchestration module
│   ├── attack_tier_logic.py
│   ├── orchestrator.py
│   └── reporter.py
├── attacks/                      # Attack implementations
│   ├── adversarial_robustness.py
│   ├── autonomous_adversary.py
│   ├── base_tester.py
│   ├── chain_of_thought_hijacking.py
│   ├── context_manipulation.py
│   ├── data_exfiltration.py
│   ├── emotional_manipulation.py
│   ├── invisible_command_injection.py
│   ├── logic_jailbreak.py
│   ├── model_misuse.py
│   ├── prompt_injection.py
│   ├── rag_poisoning.py
│   ├── system_prompt_extraction.py
│   ├── token_smuggling.py
│   └── unit/                     # Utility modules
│       ├── logger.py
│       ├── payload_manager.py
│       ├── report_generator.py
│       └── scoring_engine.py
├── clients/                      # LLM client implementations
│   └── llm_client.py
├── prompts/                      # Attack prompt templates
│   ├── exfiltration_templates.txt
│   └── injection_templates.txt
├── reports/                      # Generated test reports
├── utils/                        # General utilities
│   ├── logger.py
│   └── payload_manager.py
├── config.py                     # Configuration settings
├── main.py                       # Main entry point
├── run_redteam.py                # Red team runner script
├── comprehensive_test.py         # Comprehensive test suite
└── requirements-agathon.txt      # Python dependencies
```

## Installation

### Prerequisites

- Python 3.10+
- A virtual environment (recommended)

### Steps

1. Clone the repository:

```
git clone https://github.com/kkdevil6/AI-red-team.git
cd AI-red-team
```

2. Create and activate a virtual environment:

```
python -m venv .venv
.venv\Scripts\activate      # Windows
# or
source .venv/bin/activate   # macOS/Linux
```

3. Install the dependencies:

```
pip install -r requirements-agathon.txt
```

## Usage

### Run the red-team framework

```
python main.py
```

### Run specific attack tests

```
python run_redteam.py
```

### Comprehensive tests

```
python comprehensive_test.py
```

## Attack Types

- **Prompt Injection**: direct prompt manipulation attacks
- **Context Manipulation**: tampering with the model's context and behavior
- **System Prompt Extraction**: extracting hidden system instructions
- **Chain-of-Thought Hijacking**: interfering with the reasoning process
- **Token Smuggling**: hidden token injection techniques
- **Data Exfiltration**: attempts at unauthorized data extraction
- **Logic Jailbreak**: breaking through logical constraints
- **Emotional Manipulation**: attacks based on emotion or sentiment
- **Invisible Command Injection**: insertion of invisible commands
- **Model Misuse**: general model misuse scenarios
- **Adversarial Robustness**: testing robustness against adversarial inputs
- **Autonomous Adversary**: autonomous attack mode
- **RAG Poisoning**: attacks on retrieval-augmented generation

## Configuration

Edit `config.py` to customize:

- LLM endpoint and model
- Attack parameters
- Report output format
- Log level

## Reports

Test reports are generated in the `reports/` directory in JSON and HTML formats.

## License

[Specify your license here]

## Contact

For questions or concerns, contact the development team.

**Last updated**: April 26, 2026
**Status**: Code committed locally, awaiting push to GitHub
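
## Example: configuration and report output (illustrative sketch)

The Configuration and Reports sections above describe what `config.py` controls and where reports end up, but not what the settings look like. The sketch below is one way such settings and a JSON/HTML report writer could fit together. It is not taken from the repository: `RedTeamConfig`, `write_reports`, every field name, the endpoint URL, and the result dictionary shape are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: none of these names come from the repository's config.py.
import html
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class RedTeamConfig:
    """Illustrative settings covering the four areas listed under Configuration."""
    llm_endpoint: str = "http://localhost:8000/v1/chat/completions"  # assumed URL
    model: str = "example-model"              # assumed model name
    max_attempts: int = 3                     # assumed attack parameter
    report_formats: tuple = ("json", "html")  # report output formats
    log_level: str = "INFO"


def write_reports(results, config, out_dir="reports"):
    """Write one report file per configured format and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    if "json" in config.report_formats:
        path = out / "report.json"
        payload = {"config": asdict(config), "results": results}
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
        written.append(path)
    if "html" in config.report_formats:
        path = out / "report.html"
        rows = "".join(
            "<tr><td>{}</td><td>{}</td></tr>".format(
                html.escape(r["attack"]), "yes" if r["succeeded"] else "no"
            )
            for r in results
        )
        table = "<table><tr><th>Attack</th><th>Succeeded</th></tr>" + rows + "</table>"
        path.write_text(table, encoding="utf-8")
        written.append(path)
    return written
```

Under these assumptions, calling `write_reports([{"attack": "prompt_injection", "succeeded": False}], RedTeamConfig())` would create `reports/report.json` and `reports/report.html`. The framework's real report layout may differ.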

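Tags: AI security, Chat Copilot, CISA project, DLL hijacking, Homebrew installation, LLM vulnerability scanning, Python, RAG poisoning, TGT, token smuggling, context manipulation, artificial intelligence, anti-forensics, multimodal security, large language models, security compliance, security evaluation, password management, adversarial testing, chain-of-thought hijacking, emotional manipulation, prompt injection, search queries (dorks), attack-defense exercises, backdoor-free, invisible command injection, model misuse, user-mode hook bypass, system prompt extraction, network proxy, cybersecurity, automated attack framework, logic jailbreak, privacy protection, cluster management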