hemuhemu2014/Cyber-Threat-Hunting-Agent

GitHub: hemuhemu2014/Cyber-Threat-Hunting-Agent

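An AI-driven autonomous cyber threat hunting agent that detects 8 types of web attacks with a deterministic rule engine and uses a local LLM to generate human-readable SOC incident reports.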


# 🛡️ AI-Driven Autonomous Cyber Threat Hunting Agent

A real-time Security Operations Center (SOC) simulation built for the HackNocturn hackathon. The system monitors a live "victim" web application, applies deterministic rules to detect 8 distinct attack types, uses a local LLM to generate human-readable incident reports, and streams everything to an interactive Streamlit dashboard.

## Architecture

```
Attacker (Requestly browser extension)
        │
        ▼
Victim App ─── FastAPI (backend/api.py)
        │
        ▼
data/logs.sqlite  ◄── written by backend
        │
        ▼
Threat Hunting Agent (agent/main.py)
   ├── Rule Engine (agent/rules_engine.py)  ← 100 % deterministic
   └── LLM Client  (agent/llm_client.py)    ← Ollama llama3.1:8b
        │
        ▼
data/logs.sqlite  ◄── alerts table written by agent
        │
        ▼
Streamlit Dashboard (dashboard/app.py)  ← auto-refresh every 3 s
```

**Core design principle:** The AI makes *zero* security decisions. All detection, scoring, and mitigation logic lives in the deterministic rule engine. The LLM only turns the rule output into polished SOC report text.

## Detected Threat Types

| # | Threat Type | Trigger |
|---|-------------|---------|
| 1 | **Brute Force Attack** | > 5 failed logins per minute from the same IP |
| 2 | **Endpoint Reconnaissance** | > 3 restricted-path hits per minute |
| 3 | **Unauthorized Access Scan** | > 6 HTTP 401/403 responses per minute |
| 4 | **Credential Stuffing** | > 8 distinct usernames with failed logins per minute |
| 5 | **Account Takeover** | Failed logins followed by a successful login from the same IP within a 5-minute window |
| 6 | **Data Exfiltration** | > 20 successful GET requests to data endpoints per minute |
| 7 | **Path Traversal Attack** | Any `../` or URL-encoded traversal sequence in the request path |
| 8 | **DoS Rate Flood** | > 100 total requests per minute from a single IP |

## Project Structure

```
cyber-threat-hunting-agent/
│
├── backend/
│   ├── api.py            FastAPI victim-app (login, admin, data endpoints + catch-all)
│   ├── db.py             SQLite helpers (init_db, get_connection)
│   ├── logger.py         log_event(): writes every request to the DB
│   └── main.py           Entry point: init DB + start uvicorn
│
├── agent/
│   ├── main.py           Core loop: Observe → Hypothesize → Investigate → Decide → Explain
│   ├── rules_engine.py   All 8 detection rules + ThreatAlert dataclass + mitigations
│   └── llm_client.py     Ollama integration (generate_hypothesis, generate_incident_report)
│
├── dashboard/
│   └── app.py            Streamlit SOC watch view (auto-refreshes via @st.fragment)
│
├── data/
│   ├── logs.sqlite       Shared SQLite database (created at first run)
│   └── schema.sql        Table definitions for logs + alerts
│
├── mock_generator.py     Standalone traffic + alert simulator (no backend/agent needed)
├── pyproject.toml        Python dependencies (managed with uv)
└── README.md             This file
```

## Prerequisites

| Tool | Version | Notes |
|------|---------|-------|
| Python | 3.13+ | |
| [uv](https://docs.astral.sh/uv/) | Latest | Dependency management |
| [Ollama](https://ollama.com) | Latest | Local LLM runtime |
| llama3.1:8b | n/a | `ollama pull llama3.1:8b` |

## Setup

```
# 1. Clone the repository
git clone <repository-url>
cd cyber-threat-hunting-agent

# 2. Install dependencies with uv
uv sync

# 3. Pull the LLM model (one-time download, about 4.7 GB)
ollama pull llama3.1:8b

# 4. Make sure Ollama is running
ollama serve
```

## Running the Full Stack

Open **three separate terminals** from the project root.

### Terminal 1: Backend (Victim App)

```
uv run python -m backend.main
```

The FastAPI server starts at `http://localhost:8000`. Interactive API docs are available at `http://localhost:8000/docs`.

### Terminal 2: Threat Hunting Agent

```
uv run python agent/main.py
```

The agent polls the database every 5 seconds. Detections are printed to the console with colored banners and written back to `data/logs.sqlite`.

### Terminal 3: Dashboard

```
uv run streamlit run dashboard/app.py
```

The dashboard opens at `http://localhost:8501`. It auto-refreshes every 3 seconds via Streamlit's `@st.fragment`.

## Demo Workflow

1. Start all three components above.
2. Open the victim app (`http://localhost:8000`) and the dashboard (`http://localhost:8501`) side by side.
3. A **teammate / attacker** activates Requestly rules in their browser to simulate attacks:

   | Attack | Requestly Action |
   |--------|------------------|
   | Brute force | Fire `POST /login` in a loop with wrong passwords |
   | Reconnaissance | Hit `GET /admin`, `/config`, `/internal`, `/.env` at high frequency |
   | Data exfiltration | Fire `GET /api/users`, `/export`, `/download` in a loop |

4. Watch the dashboard detect each attack in real time, assign a risk level and confidence score, and display the LLM-generated incident report.

## Developing Without the Full Stack (Mock Generator)

`mock_generator.py` writes realistic attack traffic and pre-built alerts directly into `data/logs.sqlite`, so you can build and test the dashboard or the agent in isolation.

```
# Continuous mode: fires attacks on a schedule (press Ctrl-C to stop)
uv run python mock_generator.py

# Run all 8 attack scenarios once, then exit
uv run python mock_generator.py --once

# Wipe all data from the DB (use between judge demo runs)
uv run python mock_generator.py --reset
```

The "Reset demo data" button in the dashboard sidebar does the same thing as `--reset` with a single click.

## Configuration

### Agent Thresholds (`agent/rules_engine.py → BASELINE`)

```
BASELINE = {
    "max_failed_logins_per_min": 5,      # Brute Force trigger
    "max_restricted_hits_per_min": 3,    # Recon trigger
    "max_401_403_per_min": 6,            # Auth Scan trigger
    "max_distinct_users_per_min": 8,     # Credential Stuffing trigger
    "account_takeover_min_failures": 3,  # Account Takeover: failures before success counts
    "max_data_requests_per_min": 20,     # Data Exfiltration trigger
    "path_traversal_min_attempts": 1,    # Path Traversal: any single hit fires
    "max_requests_per_min": 100,         # DoS Flood trigger
    "alert_cooldown_seconds": 300,       # suppress re-alerts for 5 min per (IP, threat)
}
```

### Agent Polling Interval (`agent/main.py`)

```
POLL_INTERVAL_SECONDS = 5     # how often to query the DB
MIN_CONFIDENCE_TO_ALERT = 40  # drop alerts below this threshold
```

### LLM Model (`agent/llm_client.py`)

```
MODEL_NAME = "llama3.1:8b"  # change to any model available in your Ollama instance
```

## Answering "How Do You Trust the AI?"

## Dependencies

| Package | Purpose |
|---------|---------|
| `fastapi` | Victim web application |
| `uvicorn` | ASGI server for FastAPI |
| `streamlit` | SOC watch-view dashboard |
| `altair` | Timeline and attack-distribution charts |
| `pandas` | DataFrame queries for the dashboard |
| `requests` | HTTP calls from the agent to Ollama |
| `sqlite3` | Built in (standard library); shared data layer |

## Team Roles

| Role | Responsibility | Files |
|------|----------------|-------|
| Backend | Victim app + event logging | `backend/` |
| Agent | Rule engine + LLM integration | `agent/` |
| Dashboard | SOC watch view | `dashboard/` |

**Data layer contract:**

- `backend/` only **writes** to the `logs` table.
- `agent/` **reads** `logs` and **writes** `alerts`.
- `dashboard/` **reads** both and never writes.
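
One way to make the read-only side of the data layer contract hard to break by accident is to open the database in SQLite's read-only URI mode. The sketch below is only an illustration: `open_readonly` is a hypothetical helper that does not appear in the repository, and the real `dashboard/app.py` may connect differently.

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path):
    """Open a SQLite file so that any write attempt fails.

    Hypothetical helper (not part of the repository). It relies on
    SQLite's URI filename syntax: `mode=ro` opens the file read-only.
    """
    # as_uri() needs an absolute path and yields "file:///...".
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

A reader component could then call `open_readonly("data/logs.sqlite")` for its queries. Any `INSERT`, `UPDATE`, or `DELETE` on that connection raises `sqlite3.OperationalError`, so a write path such as a reset action would need its own separate, writable connection.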
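
To make the threshold settings concrete, here is a minimal sketch of how a deterministic rule could combine `max_failed_logins_per_min` and `alert_cooldown_seconds` from `BASELINE`. It is not the code in `agent/rules_engine.py`: the function name, the `(timestamp, ip, path, status)` row shape, and the choice to treat a 401 on `/login` as a failed login are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Values copied from the BASELINE dict in this README.
MAX_FAILED_LOGINS_PER_MIN = 5
ALERT_COOLDOWN_SECONDS = 300


def detect_brute_force(events, now, last_alerted=None):
    """Return the IPs exceeding the failed-login threshold in the last minute.

    `events` is an iterable of (timestamp, ip, path, status) tuples. This row
    shape is hypothetical and may not match the real `logs` table.
    `last_alerted` maps ip -> datetime of the previous alert and is updated
    in place so repeated calls honor the cooldown.
    """
    last_alerted = {} if last_alerted is None else last_alerted
    window_start = now - timedelta(minutes=1)

    # Count failed logins per IP inside the one-minute window.
    # Assumption: a failed login is a 401 response on /login.
    failures = defaultdict(int)
    for ts, ip, path, status in events:
        if window_start <= ts <= now and path == "/login" and status == 401:
            failures[ip] += 1

    flagged = []
    for ip, count in sorted(failures.items()):
        if count <= MAX_FAILED_LOGINS_PER_MIN:
            continue  # the rule fires only when the count is strictly above the limit
        previous = last_alerted.get(ip)
        if previous is not None and (now - previous).total_seconds() < ALERT_COOLDOWN_SECONDS:
            continue  # still inside the cooldown for this IP
        last_alerted[ip] = now
        flagged.append(ip)
    return flagged
```

Because the function depends only on its inputs and the cooldown map, the same events always produce the same alerts, which is the property the design principle above relies on.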

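Tags: AI risk mitigation, AV bypass, CISA projects, DLL hijacking, FastAPI, Kubernetes, Streamlit, web security, domain collection, large language models, security operations, scanning frameworks, red team operations, blue team analysis, access control, reverse engineering tools, misconfiguration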