TheAyushTandon/RectitudeAI-PromptGuard

GitHub: TheAyushTandon/RectitudeAI-PromptGuard

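A production-grade security gateway for LLM applications that uses multi-layer detection to guard against threats such as prompt injection, data exfiltration, and unauthorized tool calls.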

Stars: 2 | Forks: 1

# RectitudeAI - LLM Security Gateway

![Python](https://img.shields.io/badge/python-3.11+-blue.svg) ![FastAPI](https://img.shields.io/badge/FastAPI-0.109+-green.svg) ![License](https://img.shields.io/badge/license-MIT-blue.svg)

A multi-layer defense system for LLM applications, providing runtime security, prompt injection detection, and behavioral anomaly monitoring.

## 🎯 Project Overview

RectitudeAI is a production-grade security gateway for LLM applications. It targets critical vulnerabilities in autonomous AI systems, including:

- Prompt injection attacks
- Data exfiltration
- Unauthorized tool execution
- Multi-turn jailbreak attempts

### Current Status: Phase 2 in progress 🔄

**Implemented:**

- ✅ FastAPI backend with async support
- ✅ JWT authentication
- ✅ Redis-based rate limiting
- ✅ LLM integration (OpenAI/Anthropic/Ollama)
- ✅ Structured logging
- ✅ Health checks and monitoring
- ✅ Prompt injection and malicious intent classifier (`michellejieli/NSFW_text_classifier`)
- ✅ Perplexity-based anomaly detection
- ✅ Policy engine with risk aggregation
- ✅ Red Team engine
- ✅ Cryptographic tool signing
- ✅ Behavioral anomaly detection
- ✅ RL policy updater

## 🚀 Quick Start

### Prerequisites

- Python 3.11+
- Redis (for rate limiting)
- An OpenAI/Anthropic API key, or a local Ollama installation

### Installation

```
# 1. Clone the repository
git clone https://github.com/TheAyushTandon/RectitudeAI-PromptGuard.git
cd RectitudeAI-PromptGuard

# 2. Create a virtual environment
python3.11 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables
cp .env.example .env
# Edit .env with your API keys

# 5. Start Redis (if it is not already running)
# Option A: Docker
docker run -d -p 6379:6379 redis:alpine
# Option B: system service
redis-server

# 6. Run the application
uvicorn app.main:app --reload --port 8000
```

### First Request

```
# 1. Log in to obtain a JWT token
curl -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "demo_user",
    "password": "demo_password_123"
  }'

# The response will include: {"access_token": "eyJ...", ...}

# 2. Make an inference request
curl -X POST http://localhost:8000/v1/inference \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  -d '{
    "user_id": "user_123",
    "prompt": "Explain quantum computing in simple terms",
    "max_tokens": 500,
    "temperature": 0.7
  }'
```

## 📁 Project Structure

```
rectitude-ai/
├── backend/
│   ├── gateway/                    # Main entry, Config, Auth, LLM
│   ├── api/                        # Health, Audit routes
│   ├── layer1_intent_security/     # Intent & Injection classification
│   ├── layer2_crypto/              # ZKP & FHE Engine (Vartika)
│   ├── layer3_behavior_monitor/    # Agent Stability Index (ASI)
│   ├── layer4_red_teaming/         # Automated attack runners
│   └── layer5_orchestration/       # LangGraph governance
├── frontend/                       # Dashboard (MINE)
│   ├── app/
│   ├── components/
│   └── services/
└── datasets/                       # Security datasets (JailbreakBench)
```

## 🔧 Configuration

### Environment Variables

Key settings in `.env`:

```
# LLM Provider (choose one)
LLM_PROVIDER=openai  # or anthropic, ollama

# API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Security
SECRET_KEY=your-secret-key-here
JWT_EXPIRATION_MINUTES=60

# Rate Limiting
RATE_LIMIT_REQUESTS=100
RATE_LIMIT_WINDOW_SECONDS=60
```

### Supported LLM Providers

1. **OpenAI** (gpt-4, gpt-3.5-turbo)
2. **Anthropic** (claude-3-sonnet, claude-3-opus)
3. **Ollama** (llama2, mistral, mixtral; local)

## 📊 API Documentation

Interactive API documentation is available at:

- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc

### Key Endpoints

| Endpoint | Method | Description | Auth required |
|----------|--------|-------------|---------------|
| `/auth/login` | POST | User login | No |
| `/v1/inference` | POST | LLM inference | Yes |
| `/v1/models` | GET | List models | Yes |
| `/health/` | GET | Health check | No |

## 🧪 Testing

```
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=app --cov-report=html

# Run a specific test file
pytest tests/test_api/test_endpoints.py -v

# Run a specific test
pytest tests/test_api/test_endpoints.py::TestAuthEndpoints::test_login_success -v
```

**Current test coverage:** >80%

## 🏗️ Development Roadmap

### Phase 1: Core Gateway ✅ (Weeks 1-3)

- FastAPI server
- JWT authentication
- Rate limiting
- LLM integration

### Phase 2: Detection Layer 🔄 (Weeks 4-7)

- Injection classifier
- Perplexity detector
- Policy engine

### Phase 3: Cryptographic Layer (Weeks 8-10)

- Document hashing
- Tool-call signing
- Sandbox validation

### Phase 4: Red Team Engine (Weeks 11-12)

- Adversarial prompt generator
- Attack runner
- ASR metrics

### Phase 5: Behavioral Analysis (Weeks 13-14)

- Multi-turn detection
- Session profiling
- Anomaly scoring

## 🔒 Security Features

### Phases 1 and 2 (current)

- ✅ JWT-based authentication
- ✅ Token expiration and refresh
- ✅ Rate limiting (default: 100 requests/minute)
- ✅ Request logging and audit trail
- ✅ CORS protection
- ✅ Input validation
- ✅ Malicious intent and prompt injection detection (`michellejieli/NSFW_text_classifier`)

### Phase 3 and beyond (planned)

- 🔜 Statistical anomaly detection
- 🔜 Risk-based policy enforcement
- 🔜 HMAC tool-call signing
- 🔜 Continuous Red Team testing
- 🔜 Behavioral profiling

## 📈 Performance Metrics

| Metric | Target | Current |
|--------|--------|---------|
| Response time | <500ms | ~300ms |
| Throughput | >1000 req/s | ~800 req/s |
| Detection rate | >90% | N/A (Phase 2) |
| False positive rate | <5% | N/A (Phase 2) |

## 🤝 Contributing

Contributions are welcome! Please follow these guidelines:

1. Fork this repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

### Code Style

- Follow PEP 8
- Use type hints
- Add docstrings to all functions
- Run `black` for formatting
- Run `flake8` for linting

## 📝 License

This project is licensed under the MIT License; see the LICENSE file for details.

## 🙏 Acknowledgments

- OWASP Top 10 for LLM Applications
- The JailbreakBench dataset
- The FastAPI framework
- OpenAI & Anthropic for their LLM APIs

**Status**: Phase 2 in progress | **Next milestone**: Tool signing (Phase 3)
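
The README names tool signing as the next milestone and lists HMAC tool-call signing among the security features. As a rough illustration of that idea, and not the project's actual implementation, the sketch below signs a tool call with HMAC-SHA256 from the Python standard library and verifies it in constant time. The function names `sign_tool_call` and `verify_tool_call`, the payload layout, and the demo key are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_tool_call(secret: bytes, tool_name: str, arguments: dict) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical encoding of the call.

    Hypothetical helper: the payload layout here is an assumption.
    """
    # Canonical JSON (sorted keys, fixed separators) so the same call always
    # serializes to the same bytes regardless of dict insertion order.
    payload = json.dumps(
        {"tool": tool_name, "args": arguments},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_tool_call(secret: bytes, tool_name: str, arguments: dict, tag: str) -> bool:
    """Recompute the tag and compare it to the supplied one in constant time."""
    expected = sign_tool_call(secret, tool_name, arguments)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"demo-only-secret"  # placeholder key for illustration only
    call_args = {"to": "a@example.com", "body": "hi"}
    tag = sign_tool_call(key, "send_email", call_args)

    # An untouched call verifies; a tampered argument does not.
    print(verify_tool_call(key, "send_email", call_args, tag))
    print(verify_tool_call(key, "send_email", {"to": "evil@example.com", "body": "hi"}, tag))
```

In a gateway of this kind, the tag would typically be attached when a tool call is authorized and checked again just before execution, so that arguments altered in between (for example by an injected prompt) fail verification.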

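Tags: AI runtime security, AI risk mitigation, Anthropic, API security, AV evasion, CCTV/network interface discovery, CISA project, CIS benchmarks, reverse DNS resolution, FastAPI, JSON output, JWT authentication, Linux system monitoring, LLM security gateway, LLM application firewall, LLM evaluation, Naabu, Ollama, OpenAI, Python, Redis, AI security, memory evasion, content safety, compliance, large language model security, tool misuse protection, anomalous behavior detection, prompt injection detection, search engine queries, no backdoors, secrets management, model security, system prompt leakage, cybersecurity, automated attacks, automated red teaming, request interception, jailbreak defense, reverse engineering tools, privacy protection, risk aggregation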