jasoncobra3/LLM_Sentinel

GitHub: jasoncobra3/LLM_Sentinel

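An enterprise-grade automated red teaming framework for large language models. It supports multiple attack strategies and unified testing across providers, and automatically generates security assessment reports.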

Stars: 4 | Forks: 0

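# 🛡️ LLM Sentinel [LLM Red Teaming Platform]

**Enterprise-grade automated security testing for large language models**

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Streamlit](https://img.shields.io/badge/Streamlit-1.32+-FF4B4B.svg)](https://streamlit.io)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.110+-009688.svg)](https://fastapi.tiangolo.com)
[![DeepTeam](https://img.shields.io/badge/DeepTeam-3.8+-orange.svg)](https://github.com/confident-ai/deepeval)
[![LangChain](https://img.shields.io/badge/LangChain-1.2+-green.svg)](https://langchain.com)

## 📌 Project Description

### The Challenge

As organizations increasingly adopt large language models (LLMs) in production applications, securing them against adversarial attacks, jailbreaks, and prompt injection has become critical. Traditional security testing methods are not sufficient for assessing LLM-specific vulnerabilities.

### Our Solution

**LLM Red Teaming Platform** is a comprehensive, production-ready security testing framework designed specifically for LLMs. It automatically discovers vulnerabilities through adversarial red teaming techniques, so that security researchers, AI engineers, and organizations can identify weaknesses before they are exploited in production.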
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jasoncobra3/LLM_Red_Teaming)

### Who Is It For?

- **Security researchers** conducting AI security assessments
- **AI/ML engineers** building production-grade LLM applications
- **Enterprises** ensuring compliance and security
- **Red team professionals** focused on testing AI systems

### Core Value Proposition

- **Comprehensive coverage**: tests 7+ vulnerability categories using 12+ attack methods
- **Multi-provider support**: works with any LLM through a unified interface
- **Production-ready**: an enterprise-grade solution with persistence, authentication, and reporting
- **Extensible framework**: a modular architecture that supports custom attacks and integrations
- **Automated workflows**: reduces manual testing effort by more than 80%

## 🏗️ Architecture Overview

### System Design

#### Architecture

![Architecture](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/b8fa316869032522.png)

#### Flow Diagram

![Flow Diagram](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/1d9cf5739a032523.png)

#### LLM Flow

![LLM Flow](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/5cdeb9005a032524.png)

#### Sequence Diagram

![Sequence Diagram](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/791aed0c13032525.png)

## ⚙️ Tech Stack

### Backend

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Core language** | Python 3.10+ | Application runtime |
| **Web framework** | FastAPI 0.115+ | REST API and asynchronous request handling |
| **UI framework** | Streamlit 1.32+ | Interactive dashboard |
| **Red teaming** | DeepTeam 3.8+ | Adversarial testing framework |
| **LLM integration** | LangChain 1.2+ | Common interface to LLM providers |

### Frontend

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Templates** | Jinja2 3.1+ | Server-side HTML rendering |
| **Styling** | Custom CSS | Modern, responsive UI design |
| **Charts** | Plotly 6.0+ | Interactive data visualization |
| **Static assets** | FastAPI StaticFiles | Serving CSS, JS, and images |

### Database

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Primary database** | SQLite | Embedded relational database |
| **ORM** | SQLAlchemy 2.0+ | Database abstraction and migrations |
| **Models** | SQLAlchemy ORM | Scans, test cases, configuration |

### Cloud & DevOps

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Cloud platform** | Microsoft Azure | Application hosting and infrastructure |
| **Compute** | Azure App Service / AKS | Web application deployment |
| **Secrets** | Environment Variables | API key management |

### AI/ML Providers

| Provider | Integration | Supported Models |
|----------|-------------|------------------|
| **OpenAI** | langchain-openai | GPT-4, GPT-3.5, GPT-4o |
| **Azure OpenAI** | langchain-openai | Azure-hosted OpenAI models |
| **Anthropic** | langchain-anthropic | Claude 3, Claude 2 |
| **Google** | langchain-google-genai | Gemini Pro, Gemini Ultra |
| **Groq** | langchain-groq | Llama 3, Mixtral |
| **AWS Bedrock** | langchain-aws | Bedrock models |
| **HuggingFace** | langchain-huggingface | Open-source models |

### Authentication & Security

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Password hashing** | bcrypt 4.2+ | Secure credential storage |
| **Session management** | Starlette SessionMiddleware | Stateful authentication |
| **Secrets management** | python-dotenv | Environment configuration |

### Monitoring & Logging

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Logging** | Python logging module | Structured application logs |
| **Log output** | File-based logging | Audit trail and debugging |

### Utilities

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **PDF generation** | fpdf2 2.8+ | Security report generation |
| **Data processing** | Pandas 2.2+ | Result analysis and aggregation |
| **Validation** | Pydantic 2.10+ | Type-safe configuration management |
| **Environment configuration** | pydantic-settings 2.7+ | Settings validation |

## ✨ Features

### Core Features

✅ **Automated red teaming**

- One-click security scans against any LLM
- Configurable attack intensity (1-20 attacks per vulnerability)
- Batch scanning of multiple models

✅ **Multi-provider LLM support**

- OpenAI (GPT-4, GPT-3.5, GPT-4o)
- Azure OpenAI (all deployments)
- Anthropic Claude (2.x, 3.x)
- Google Gemini (Pro, Ultra)
- Groq (Llama 3, Mixtral)
- AWS Bedrock
- HuggingFace models

✅ **Comprehensive vulnerability testing**

- **Robustness**: input overreliance, misinformation
- **Indirect injection**: cross-prompt leakage
- **Jailbreak**: system prompt bypass
- **Shell injection**: code execution attempts
- **Prompt leakage**: system prompt extraction
- **Goal hijacking**: task redirection
- **Inter-agent security**: multi-agent vulnerabilities

✅ **Attack library**

- 100+ pre-built adversarial prompts
- Categorized by attack type and severity
- Custom attack builder interface

### Advanced Features

🔬 **Attack enhancement methods**

- **Jailbreak strategies**: DAN, Evil Confidant, STAN, role-play
- **Encoding attacks**: ROT13, Base64, Caesar cipher
- **Prompt probing**: iterative refinement
- **Gray-box testing**: partial-knowledge exploitation
- **Multilingual attacks**: non-English prompts

🎯 **Custom attack builder**

- Visual interface for creating custom prompts
- Template-based attack creation
- Real-time testing against the target model
- Save and reuse custom attacks

📊 **Advanced analytics**

- Interactive dashboard with Plotly charts
- Vulnerability distribution analysis
- Time-series trend tracking
- Attack success rate metrics
- Model comparison view

📑 **Professional reports**

- Automatically generated PDF security reports
- Executive summary with risk scores
- Detailed test case breakdown
- Remediation recommendations
- Compliance documentation

## 📂 Project Structure

```
Red_Teaming/
│
├── app.py                          # Streamlit UI entry point
├── web_app.py                      # FastAPI web application
├── migrate_db.py                   # Database migration script
├── requirements.txt                # Python dependencies
├── sample.json                     # Sample configuration
├── .env.example                    # Environment template
│
├── auth/                           # Authentication module
│   ├── __init__.py
│   └── authentication.py           # Login logic, password hashing
│
├── config/                         # Configuration management
│   ├── __init__.py
│   ├── settings.py                 # Environment-based settings
│   └── providers.py                # LLM provider configurations
│
├── core/                           # Core business logic
│   ├── __init__.py
│   ├── red_team_engine.py          # Main orchestration engine
│   ├── llm_factory.py              # LLM instance factory
│   ├── attack_registry.py          # Vulnerability & attack registry
│   ├── attack_library.py           # Pre-built attack prompts
│   ├── jailbreak_strategies.py     # Jailbreak method implementations
│   └── custom_red_team_engine.py   # Custom attack execution
│
├── database/                       # Data persistence layer
│   ├── __init__.py
│   ├── db_manager.py               # Database operations & queries
│   └── models.py                   # SQLAlchemy ORM models
│
├── reports/                        # Report generation
│   ├── __init__.py
│   └── pdf_generator.py            # PDF report creation
│
├── ui/                             # Streamlit UI components
│   ├── __init__.py
│   ├── components/                 # Reusable UI widgets
│   │   ├── __init__.py
│   │   ├── charts.py               # Plotly visualization components
│   │   ├── model_selector.py       # LLM selection widget
│   │   └── sidebar.py              # Navigation sidebar
│   └── pages/                      # Application pages
│       ├── __init__.py
│       ├── dashboard.py            # Main dashboard view
│       ├── configure.py            # Provider configuration
│       ├── attack_lab.py           # Attack testing interface
│       ├── results.py              # Scan results display
│       └── reports_page.py         # Report management
│
├── templates/                      # Jinja2 HTML templates (FastAPI)
│   ├── base.html                   # Base template with layout
│   ├── index.html                  # Landing page
│   ├── dashboard.html              # Dashboard view
│   ├── config.html                 # Configuration page
│   ├── attack.html                 # Attack execution page
│   ├── custom_attack.html          # Custom attack builder
│   ├── results.html                # Results display
│   └── reports.html                # Reports page
│
├── static/                         # Static assets (CSS, JS, images)
│   ├── css/
│   ├── js/
│   └── images/
│
├── utils/                          # Utility functions
│   ├── __init__.py
│   ├── logger.py                   # Logging configuration
│   └── helpers.py                  # Helper functions
│
├── docs/                           # Documentation
│   ├── README.md                   # Documentation index
│   ├── ARCHITECTURE.md             # Architecture details
│   ├── CONTRIBUTING.md             # Contribution guidelines
│   ├── DEMO.md                     # Demo walkthrough
│   ├── REQUIREMENTS.md             # Detailed requirements
│   ├── SECURITY.md                 # Security policies
│   └── CHANGELOG.md                # Version history
│
├── logs/                           # Application logs (gitignored)
├── reports/                        # Generated PDF reports (gitignored)
└── __pycache__/                    # Python bytecode (gitignored)
```

### Key Directories

| Directory | Purpose |
|-----------|---------|
| `auth/` | Handles user authentication, password hashing, and session management |
| `config/` | Centralized configuration management and provider definitions |
| `core/` | Core red teaming logic, including the engine, factory, and attack registry |
| `database/` | SQLAlchemy models and database operation wrappers |
| `reports/` | PDF generation for security assessment reports |
| `ui/` | Streamlit-based user interface components and pages |
| `templates/` | Jinja2 templates for the FastAPI web interface |
| `static/` | CSS, JavaScript, and image assets for the web UI |
| `utils/` | Shared utility functions and helpers |
| `docs/` | Comprehensive project documentation |

## 🖥️ Local Setup Guide

### Prerequisites

Make sure the following are installed:

- **Python 3.10 or later** (Python 3.11 recommended)
- **pip** (Python package manager)
- **Git** (for cloning the repository)
- An **API key** for at least one LLM provider (see the provider list below)

### Environment Variables

Create a `.env` file in the project root:

```
cp .env.example .env
```

### Installation Steps

1. **Clone the repository:**

   ```
   git clone
   cd Red_Teaming
   ```

2. **Create a virtual environment:**

   ```
   python -m venv venv
   ```

3. **Activate the virtual environment:**

   **Windows:**

   ```
   .\venv\Scripts\activate
   ```

   **macOS/Linux:**

   ```
   source venv/bin/activate
   ```

4. **Install dependencies:**

   ```
   pip install --upgrade pip
   pip install -r requirements.txt
   ```

5. **Initialize the database:**

   ```
   python migrate_db.py
   ```

### Running the Application

#### Option 1: Streamlit UI (interactive dashboard)

```
streamlit run app.py
```

The application opens at `http://localhost:8501`.

#### Option 2: FastAPI Web App (REST API + HTML)

```
python web_app.py
```

Or run Uvicorn directly:

```
uvicorn web_app:app --reload --host 0.0.0.0 --port 8000
```

The application is available at `http://localhost:8000`.

### Running Tests

The project currently uses a manual testing workflow. To verify your setup:

1. **Test provider connectivity:**
   - Navigate to the configuration page
   - Click "Test Connection" for each configured provider
2. **Run a sample scan:**
   - Go to the Attack Lab
   - Select a model and vulnerability types
   - Run a small scan (5 attacks)
3. **Verify the results:**
   - Check the scan output on the Results page
   - Generate a PDF report

## 📸 Screenshots

### Dashboard

![Dashboard Overview](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/425caaac4f032526.png)
![Dashboard Analytics](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/9f5875d141032527.png)
![Dashboard](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/c992920385032529.png)

### Configuration

![Configuration Setup](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/a4d0846cca032530.png)
![Configuration Options](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/a28134983f032532.png)

### Attack Lab

![Attack Lab Setup](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/f9deef11a9032533.png)
![Attack Lab Execution](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/7c063ca9e8032535.png)

### Custom Attacks

![Custom Attack Configuration](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/7b743459f9032537.png)
![Custom Attack Results](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/ca659a8e56032538.png)

### PDF Report

![Report](https://static.pigsec.cn/wp-content/uploads/repos/2026/03/9832695da6032539.png)

## 📚 Documentation

Comprehensive documentation is available in the [`docs/`](docs/) folder:

- 📐 **[Architecture](docs/ARCHITECTURE.md)** - System design and technical details
- 🤝 **[Contributing](docs/CONTRIBUTING.md)** - How to contribute to the project
- 🔒 **[Security](docs/SECURITY.md)** - Security best practices and considerations
- 🎬 **[Demo Guide](docs/DEMO.md)** - Creating demos and videos
- 📋 **[Changelog](docs/CHANGELOG.md)** - Version history and release notes
- 📦 **[Requirements](docs/REQUIREMENTS.md)** - System and dependency requirements

### Supported Models

| Provider | Models | Notes |
|----------|--------|-------|
| **OpenAI** | GPT-4, GPT-4o, GPT-3.5-turbo | Best JSON support |
| **Groq** | Llama 3.x, Mixtral, Gemma | Fast, free tier available |
| **Anthropic** | Claude 3 Opus, Sonnet, Haiku | High-quality responses |
| **Azure OpenAI** | Same as OpenAI | Enterprise deployments |
| **Google** | Gemini Pro, Gemini Pro Vision | Multimodal support |
| **Ollama** | Any local model | Privacy-focused |

## 🙏 Acknowledgements

### Frameworks & Libraries

- **[DeepEval](https://github.com/confident-ai/deepeval)** - Red teaming framework
- **[LangChain](https://github.com/langchain-ai/langchain)** - LLM orchestration
- **[Streamlit](https://streamlit.io)** - Rapid UI development
- **[FastAPI]()** - Modern API framework

### Security Research

- **[OWASP LLM Top 10](https://owasp.org/www-project-top-10-for-large-language-model-applications/)** - Vulnerability classification
- **[NIST AI RMF](https://www.nist.gov/itl/ai-risk-management-framework)** - Risk management framework
- Security researchers and the open-source community

## ⭐ Star History

If you find this project useful, please consider giving it a star! ⭐

[![Star History Chart](https://api.star-history.com/svg?repos=YOUR_USERNAME/Red_Teaming&type=Date)](https://star-history.com/#YOUR_USERNAME/Red_Teaming&Date)

## 📈 Statistics

![GitHub repo size](https://img.shields.io/github/repo-size/YOUR_USERNAME/Red_Teaming)
![GitHub language count](https://img.shields.io/github/languages/count/YOUR_USERNAME/Red_Teaming)
![GitHub top language](https://img.shields.io/github/languages/top/YOUR_USERNAME/Red_Teaming)
![GitHub last commit](https://img.shields.io/github/last-commit/YOUR_USERNAME/Red_Teaming)
![GitHub issues](https://img.shields.io/github/issues/YOUR_USERNAME/Red_Teaming)
![GitHub pull requests](https://img.shields.io/github/issues-pr/YOUR_USERNAME/Red_Teaming)
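
## 🧪 Illustrative Sketch: Encoding Attacks

The feature list names ROT13, Base64, and Caesar cipher as encoding attacks. The sketch below shows, using only the Python standard library, what such prompt transformations might look like. It is not taken from the project's code: the function names `caesar_shift` and `encode_prompt`, the default shift of 3, and the sample prompt are all assumptions made for illustration.

```python
import base64
import codecs


def caesar_shift(text: str, shift: int = 3) -> str:
    """Shift ASCII letters by `shift` positions; leave other characters unchanged."""
    shifted = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shifted.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            shifted.append(ch)
    return "".join(shifted)


def encode_prompt(prompt: str, method: str) -> str:
    """Return `prompt` transformed by one of the hypothetical method names below."""
    if method == "rot13":
        # The standard library ships a str-to-str ROT13 codec.
        return codecs.encode(prompt, "rot_13")
    if method == "base64":
        return base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    if method == "caesar":
        return caesar_shift(prompt)
    raise ValueError(f"Unknown encoding method: {method}")


if __name__ == "__main__":
    sample = "Hello"  # placeholder text standing in for a test prompt
    for method in ("rot13", "base64", "caesar"):
        print(f"{method}: {encode_prompt(sample, method)}")
```

A red teaming run would presumably send each encoded variant to the target model and check whether the model decodes and acts on it; that orchestration is outside the scope of this sketch.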

**Built with ❤️ for AI security** [⬆ Back to top](#-llm-red-teaming-platform)
