danielamissah/cyberguard-multi-agent-cybersecurity-research-assistant

GitHub: danielamissah/cyberguard-multi-agent-cybersecurity-research-assistant

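A five-agent cybersecurity research assistant built on LangGraph that answers security research questions by combining RAG retrieval over arXiv papers, real-time threat intelligence, and MITRE ATT&CK pattern analysis.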


# CyberGuard: Multi-Agent Cybersecurity Research Assistant

**Live demo → [huggingface.co/spaces/dkamissah/cyberguard-agent](https://huggingface.co/spaces/dkamissah/cyberguard-agent)**

A production-grade, 5-agent LangGraph system for cybersecurity research. It combines RAG over 818 indexed arXiv papers, real-time threat intelligence, malware pattern analysis, and AI-driven quality evaluation. End-to-end latency is about 7.6 seconds.

## Key Results

| Metric | Value |
| -------------------------- | ---------------------------------- |
| Knowledge base | 818 arXiv cybersecurity papers |
| Quality score | 85% (evaluated by the Critic agent) |
| RAG sources per query | 5 relevant papers |
| Live web sources per query | 5 threat intelligence sources |
| End-to-end latency | ~7.6 s |
| LLM | LLaMA 3.1-8B via Groq (~500 tok/s) |

## Agent Architecture

```
User Query
    │
    ▼
Supervisor Agent - routes query, orchestrates pipeline
    │
    ├──► RAG Agent    - ChromaDB retrieval (818 cybersecurity papers)
    ├──► Web Search   - live threat intelligence via Tavily API
    ├──► Code Analyst - MITRE ATT&CK mapping, malware pattern analysis
    ├──► Synthesiser  - structured response with executive summary
    └──► Critic Agent - quality score 0–1, retries if below 0.7 (max 3x)
```

### LangGraph State Machine

* Typed `AgentState` carrying the full conversation history
* Conditional routing: the Critic triggers a retry loop when quality < 0.7
* At most 3 iterations before falling back to the best available response
* All agent outputs for every query are logged to MLflow

## Tech Stack

* **Agent framework:** LangGraph, LangChain
* **Vector database:** ChromaDB + Sentence-Transformers (all-MiniLM-L6-v2)
* **LLM:** LLaMA 3.1-8B-Instant via the Groq API
* **Web search:** Tavily API
* **API:** FastAPI (9 endpoints)
* **Dashboard:** Streamlit (3 tabs: Research Assistant, Agent Trace, System Info)
* **Tracking:** MLflow
* **Infrastructure:** Docker, GitHub Actions CI/CD → HF Spaces

## Dashboard

| Tab | Contents |
| ---------------------------- | ------------------------------------------------------------------------------------- |
| **Research Assistant** | Query input, example queries, per-agent progress bars, structured results with sources |
| **Agent Trace** | Interactive pipeline diagram, agent role descriptions |
| **System Info** | Knowledge base statistics, graph structure, live API test |

## Running Locally

```bash
git clone https://github.com/danielamissah/cyberguard-agent.git
cd cyberguard-agent
pip install -r requirements.txt
cp .env.example .env   # edit .env with your API keys
make kb                # build knowledge base (~5 min)
make api               # FastAPI at localhost:8001
make dashboard         # Streamlit at localhost:8501
```

## Environment Variables

| Variable | Required | How to obtain |
| ----------------------- | -------- | ---------------------------------------------------------------------------- |
| `GROQ_API_KEY` | ✅ | [console.groq.com](https://console.groq.com) (free) |
| `HF_TOKEN` | ✅ | [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) (free) |
| `TAVILY_API_KEY` | ✅ | [tavily.com](https://tavily.com) (free, 1,000 requests/month) |
| `MLFLOW_TRACKING_URI` | ❌ | Optional; leave unset to disable tracking |

## API Endpoints

| Method | Endpoint | Description |
| ------ | --------------------------- | ------------------------------ |
| GET | `/health` | Health check |
| GET | `/stats` | Knowledge base and agent statistics |
| POST | `/query` | Run the full 5-agent pipeline |
| POST | `/query/batch` | Batch queries (up to 5) |
| GET | `/graph/info` | Agent graph structure |
| POST | `/knowledge-base/rebuild` | Rebuild ChromaDB from arXiv |

## Knowledge Base Topics

818 papers indexed across 10 cybersecurity domains: adversarial ML · ransomware detection · network intrusion detection · data poisoning · federated learning security · malware classification · phishing detection · anomaly detection · cyber threat intelligence · LLM security

## Project Structure

```
cyberguard-agent/
├── src/
│   ├── graph/agent_graph.py        # LangGraph StateGraph - all 5 agents
│   ├── tools/knowledge_base.py     # arXiv fetcher, chunker, ChromaDB indexer
│   └── api/main.py                 # FastAPI application
├── dashboard/app.py                # Streamlit dashboard (3 tabs)
├── configs/config.yaml             # All hyperparameters
├── tests/test_smoke.py             # Smoke tests (pytest)
├── .github/workflows/ci_cd.yml     # Lint → test → Docker → GHCR
├── .env.example                    # Environment variable template
├── Dockerfile
├── Makefile
└── requirements.txt
```
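
The README describes the Critic's retry behaviour only in prose: retry when the quality score is below 0.7, run at most 3 iterations, then fall back to the best available response. The sketch below illustrates that control flow in plain Python, without LangGraph. It is not the repository's code: `Attempt`, `run_with_critic`, and the `synthesise` / `critique` callables are hypothetical names introduced here, and reading "max 3x" as three attempts in total is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

QUALITY_THRESHOLD = 0.7  # from the README: retry when quality < 0.7
MAX_ITERATIONS = 3       # from the README; assumed to mean 3 attempts in total


@dataclass
class Attempt:
    """One synthesised answer together with the Critic's score (hypothetical type)."""
    answer: str
    quality: float


def run_with_critic(
    query: str,
    synthesise: Callable[[str, int], str],
    critique: Callable[[str, str], float],
    threshold: float = QUALITY_THRESHOLD,
    max_iterations: int = MAX_ITERATIONS,
) -> Attempt:
    """Retry synthesis until the score reaches the threshold or attempts run out."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    best: Optional[Attempt] = None
    for iteration in range(1, max_iterations + 1):
        answer = synthesise(query, iteration)
        attempt = Attempt(answer=answer, quality=critique(query, answer))

        # Remember the highest-scoring attempt for the fallback path.
        if best is None or attempt.quality > best.quality:
            best = attempt

        # Accept immediately once the Critic is satisfied.
        if attempt.quality >= threshold:
            return attempt

    # No attempt passed: return the best available response.
    assert best is not None
    return best
```

In a LangGraph implementation the same decision would typically live in a conditional edge leaving the Critic node, with the iteration counter and best attempt stored in the graph state rather than in local variables.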

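Tags: Cloudflare, DLL hijacking, Kubernetes, MITRE ATT&CK, artificial intelligence, multi-agent, large language models, threat intelligence, developer tools, retrieval-augmented generation, user-mode hook bypass, cybersecurity, request interception, reverse-engineering tools, privacy protection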