nwaogwugwuchinaza6-blip/ai-threat-intelligence-agent

GitHub: nwaogwugwuchinaza6-blip/ai-threat-intelligence-agent

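An automated cybersecurity threat intelligence analysis platform that combines machine learning classification with a LangChain agent to extract and classify threats from unstructured data and generate structured threat reports.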


# AI Threat Intelligence Agent

![Python](https://img.shields.io/badge/Python-3.11-blue?logo=python&logoColor=white) ![FastAPI](https://img.shields.io/badge/FastAPI-0.115-green?logo=fastapi&logoColor=white) ![LangChain](https://img.shields.io/badge/LangChain-0.3-orange) ![Docker](https://img.shields.io/badge/Docker-Compose-blue?logo=docker&logoColor=white) ![PostgreSQL](https://img.shields.io/badge/PostgreSQL-15-blue?logo=postgresql&logoColor=white)

An autonomous multi-agent AI system for cybersecurity threat intelligence. The platform ingests unstructured threat data (such as news articles, CVE descriptions, and incident reports), classifies threats using NLP and machine learning, stores them in a vector database for semantic search, and serves the resulting intelligence through a secure REST API. A LangChain agent reasons about threats autonomously, queries the knowledge base, and generates structured threat reports.

Built for security operations teams that need to centralize, classify, and investigate threat data at scale. The system combines classical machine learning models (TF-IDF + logistic regression for classification, Isolation Forest for anomaly detection) with modern LLM capabilities (GPT-4o via LangChain) and ChromaDB-backed RAG to provide automated triage and in-depth analysis reports.

## Architecture

```
┌──────────┐     HTTPS/REST      ┌──────────────────────────────────────────────────────┐
│          │ ──────────────────► │                     FastAPI App                      │
│   User   │                     │  ┌──────────┐   ┌──────────┐   ┌──────────────────┐  │
│ (Client) │ ◄────────────────── │  │   Auth   │   │ Threats  │   │   Agent Router   │  │
│          │    JSON Response    │  │  (JWT)   │   │   CRUD   │   │ /analyze/report  │  │
└──────────┘                     │  └────┬─────┘   └────┬─────┘   └────────┬─────────┘  │
                                 │       │              │                  │            │
                                 │       ▼              ▼                  ▼            │
                                 │  ┌──────────┐   ┌──────────┐   ┌──────────────────┐  │
                                 │  │PostgreSQL│   │ ML Layer │   │ LangChain Agent  │  │
                                 │  │ (Users,  │   │Classifier│   │     (GPT-4o)     │  │
                                 │  │ Threats) │   │ Anomaly  │   │ + Tool Executor  │  │
                                 │  └──────────┘   └────┬─────┘   └────────┬─────────┘  │
                                 │                      │                  │            │
                                 │                      │         ┌────────┴────────┐     │
                                 │                      │         ▼               ▼     │
                                 │                      │  ┌────────────┐  ┌──────────┐ │
                                 │                      └──│  ChromaDB  │  │  OpenAI  │ │
                                 │                         │ (Vectors)  │  │   API    │ │
                                 │                         └────────────┘  └──────────┘ │
                                 └──────────────────────────────────────────────────────┘
```

## Prerequisites

- **Python 3.11+**
- **Docker** and **Docker Compose**
- An **OpenAI API key** with access to `gpt-4o` and `text-embedding-ada-002`
- **PostgreSQL 15** (provided via Docker Compose)

## Installation

### 1. Clone the repository

```
git clone https://github.com/your-org/ai-threat-intelligence-agent.git
cd ai-threat-intelligence-agent
```

### 2. Configure the environment

```
cp .env.example .env
```

Edit `.env` and set your `OPENAI_API_KEY` and a strong `SECRET_KEY`.

### 3. Start with Docker Compose

```
docker-compose up --build
```

The API is then available at `http://localhost:8000`. Interactive documentation is served at `http://localhost:8000/docs`.

### 4. Run database migrations

```
docker-compose exec app alembic upgrade head
```

### Local development (without Docker)

```
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
python -m nltk.downloader punkt punkt_tab stopwords
alembic upgrade head
uvicorn app.main:app --reload
```

## API Endpoints

| Method | Path | Description | Auth Required |
|--------|------|-------------|---------------|
| GET | `/health` | Health check | No |
| POST | `/auth/register` | Register a new user | No |
| POST | `/auth/login` | Log in and receive a JWT token | No |
| POST | `/threats` | Create a threat (auto-classified and vectorized) | Analyst or Admin |
| GET | `/threats` | List threats (paginated) | No |
| GET | `/threats/{id}` | Get a single threat | No |
| PUT | `/threats/{id}` | Update a threat | Admin |
| DELETE | `/threats/{id}` | Soft-delete a threat | Admin |
| GET | `/threats/search?q=` | Semantic search via RAG | No |
| POST | `/agent/analyze` | Full agent threat analysis | No |
| POST | `/agent/report` | Generate a multi-threat report | No |
| GET | `/agent/status` | Agent health and ChromaDB statistics | No |

## How the AI Agent Works

The system uses a **retrieval-augmented generation (RAG)** pipeline combined with **multi-tool agent orchestration**:

1. **Ingestion**: Threat text is classified by a scikit-learn TF-IDF + logistic regression model and checked for anomalies with Isolation Forest.
2. **Vectorization**: OpenAI's `text-embedding-ada-002` converts the threat text into vectors, which are stored in ChromaDB (the `threat_intelligence` collection).
3. **Agent reasoning**: A LangChain `AgentExecutor` powered by GPT-4o receives the threat description and decides on its own which tools to call:
   - `classify_threat`: ML-based category and severity prediction
   - `search_knowledge_base`: semantic similarity search over ChromaDB
   - `generate_threat_report`: structured, formal report generation
4. **Memory**: `ConversationBufferWindowMemory` (k=5) retains recent context to support multi-turn analysis.
5. **Report generation**: The agent synthesizes its findings into the following sections: executive summary, technical details, affected systems, recommended mitigations, and severity assessment.

## Example: Analyzing a Threat

```
curl -X POST http://localhost:8000/agent/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "text": "A new LockBit ransomware variant is spreading via phishing emails targeting healthcare organizations. The malware encrypts patient records and demands payment in Bitcoin within 72 hours."
  }'
```

**Response:**

```
{
  "success": true,
  "data": {
    "threat_classification": "Ransomware",
    "severity": "CRITICAL",
    "confidence": 0.92,
    "similar_threats": [
      {
        "title": "LockBit Ransomware Campaign",
        "description": "Ransomware encrypts enterprise files...",
        "severity": "CRITICAL",
        "distance": 0.18
      }
    ],
    "agent_report": "Executive Summary: A critical ransomware threat targeting healthcare has been identified..."
  },
  "message": "Threat analysis completed"
}
```

## Authentication and RBAC

| Role | Permissions |
|------|-------------|
| **admin** | Full CRUD on threats, user management |
| **analyst** | Create threats, run agent analysis |
| **viewer** | Read-only access |

Register and log in to obtain a JWT token (valid for 30 minutes):

```
# Register
curl -X POST http://localhost:8000/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "analyst@secops.com", "password": "securepass123", "role": "analyst"}'

# Log in
curl -X POST http://localhost:8000/auth/login \
  -d "username=analyst@secops.com&password=securepass123"

# Use the token (set TOKEN to the access token returned by the login call)
curl -X POST http://localhost:8000/threats \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"title": "Phishing Campaign", "description": "Spear phishing targeting executives..."}'
```

## Testing

```
pytest --cov=app --cov-report=term-missing -v
```

## Project Structure

```
ai-threat-intelligence-agent/
├── app/
│   ├── main.py          # FastAPI entry point
│   ├── config.py        # Pydantic settings
│   ├── database.py      # SQLAlchemy setup
│   ├── models/          # ORM models
│   ├── schemas/         # Pydantic schemas
│   ├── routers/         # API route handlers
│   ├── agents/          # LangChain agent + RAG
│   ├── ml/              # Classifier + anomaly detector
│   ├── security/        # JWT auth + user model
│   └── middleware/      # RBAC
├── alembic/             # DB migrations
├── tests/               # pytest suite
├── docker-compose.yml
├── Dockerfile
└── requirements.txt
```

## License

MIT
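
## Example: Checking Role Permissions

The three roles described above (admin, analyst, viewer) can be modeled as a simple role-to-permission map. The following is a minimal sketch in plain Python, not the project's actual middleware: the `Role` enum, the permission strings, and `is_allowed` are hypothetical names chosen for illustration, and granting every role `threat:read` is an assumption based on the read endpoints not requiring authentication.

```python
from enum import Enum


class Role(str, Enum):
    """Role names as they appear in the registration payload."""
    ADMIN = "admin"
    ANALYST = "analyst"
    VIEWER = "viewer"


# Hypothetical permission strings. The grouping mirrors the role table:
# admin has full CRUD on threats plus user management, analyst can create
# threats and run agent analysis, viewer is read-only.
PERMISSIONS = {
    Role.ADMIN: {
        "threat:read", "threat:create", "threat:update", "threat:delete",
        "user:manage",
    },
    Role.ANALYST: {"threat:read", "threat:create", "agent:analyze"},
    Role.VIEWER: {"threat:read"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Return True if the named role grants the given permission."""
    try:
        return permission in PERMISSIONS[Role(role)]
    except ValueError:
        # Unknown role names get no access at all.
        return False
```

In a FastAPI application a check like this would typically run inside a dependency that first decodes the JWT and extracts the role claim; that wiring is omitted here because the source does not show it.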

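## Example: Windowed Conversation Memory

The agent's memory is described as `ConversationBufferWindowMemory` with k=5, meaning only the most recent exchanges are carried into the next turn. The sketch below illustrates that windowing idea with a plain-Python stand-in; it is not LangChain's implementation, and `WindowMemory` with its methods is a hypothetical class written for this example.

```python
from collections import deque


class WindowMemory:
    """Keep only the k most recent (user, agent) exchanges."""

    def __init__(self, k: int = 5) -> None:
        # A bounded deque silently drops the oldest exchange once full.
        self._turns = deque(maxlen=k)

    def save(self, user_input: str, agent_output: str) -> None:
        """Record one completed exchange."""
        self._turns.append((user_input, agent_output))

    def as_prompt_context(self) -> str:
        """Render the retained exchanges as text for the next prompt."""
        lines = []
        for user_input, agent_output in self._turns:
            lines.append(f"Human: {user_input}")
            lines.append(f"AI: {agent_output}")
        return "\n".join(lines)


memory = WindowMemory(k=5)
for i in range(7):
    memory.save(f"question {i}", f"answer {i}")

# Only exchanges 2 through 6 remain; exchanges 0 and 1 were dropped.
context = memory.as_prompt_context()
```

A fixed window keeps prompt size bounded during long investigations, at the cost of forgetting anything older than the last k exchanges.
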
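Tags: AV evasion, DLL hijacking, FastAPI, Python, RAG, multi-agent, large language models, threat intelligence, developer tools, backdoor-free, test cases, request interception, reverse-engineering tools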