sahnoun11/ThreatLens

GitHub: sahnoun11/ThreatLens

An open-source SOC assistant that uses AI to turn Windows and Linux logs into expert-level threat analysis.

Stars: 1 | Forks: 0

# 🔍 ThreatLens

### AI-Powered Log Analysis and Threat Hunting Assistant

[![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://python.org)
[![Streamlit](https://img.shields.io/badge/Streamlit-1.x-FF4B4B?style=for-the-badge&logo=streamlit&logoColor=white)](https://streamlit.io)
[![Groq](https://img.shields.io/badge/Groq-Free_API-F55036?style=for-the-badge)](https://groq.com)
[![Ollama](https://img.shields.io/badge/Ollama-Local_AI-000000?style=for-the-badge)](https://ollama.com)
[![PostgreSQL](https://img.shields.io/badge/PostgreSQL-pgvector-336791?style=for-the-badge&logo=postgresql&logoColor=white)](https://github.com/pgvector/pgvector)
[![License](https://img.shields.io/badge/License-MIT-green?style=for-the-badge)](LICENSE)

**Built by [Oussama Sahnoun](https://sahnoun11.github.io/) for the community 🌍**

*A 100% free AI assistant that hunts for threats in your logs like a senior SOC analyst*

## 🎥 Live Demo

▶️ Click to watch the full demo on YouTube

## 📌 Overview

**ThreatLens** is an open-source, AI-powered log analysis and threat hunting assistant built for SOC analysts and cybersecurity professionals. Upload your **Windows Event Logs** (`.evtx`) or **Linux logs** (`.txt`), ask questions in plain English, and get expert-level threat hunting answers in seconds, far faster than a manual review.

It is built on a completely **free stack**: the Groq API for fast LLM inference, Ollama for local, private embeddings, and PostgreSQL + pgvector for semantic search. **Embedding and indexing run on your machine.** Answers are generated through the Groq API, so the local-only guarantee applies to the embedding and storage steps.

## ⚡ Key Features

| Feature | Description |
|---|---|
| 📂 **Log ingestion** | Supports Windows EVTX and Linux TXT logs |
| 🌐 **URL ingestion** | Scrapes any web page and indexes it into the knowledge base |
| 🧠 **RAG pipeline** | Logs are chunked, embedded locally, and stored in pgvector |
| ⚡ **Groq LLM** | Fast inference with LLaMA 3.3 70B through the free Groq API |
| 🔒 **Private embeddings** | Computed entirely locally through Ollama; nothing is sent to the cloud for embedding |
| 💬 **Conversation memory** | Multi-turn chat with full history awareness |
| 🆓 **100% free** | Groq API + Ollama + Docker, at zero cost |

## 📥 Supported Log Formats

| Format | Extension | Status |
|---|---|---|
| Windows Event Log | `.evtx` | ✅ Supported |
| Plain text / Linux logs | `.txt` | ✅ Supported |
| JSON logs | `.json` | 🔜 Roadmap |
| CSV logs | `.csv` | 🔜 Roadmap |
| XML event logs | `.xml` | 🔜 Roadmap |

## 🏗️ Architecture

```
User uploads log (EVTX / TXT) or pastes a URL
        │
        ▼
Chunker (300 chars/chunk)
Safe for nomic-embed-text 512-token limit
        │
        ▼
Ollama (nomic-embed-text) ──► pgvector (PostgreSQL)
        │
        ▼
User asks a question
        │
        ▼
Semantic search in pgvector
        │
        ▼
Groq LLM (LLaMA 3.3 70B)
        │
        ▼
Expert threat hunting answer 🎯
```

## 🧰 Tech Stack

| Component | Technology |
|---|---|
| UI | Streamlit |
| LLM | Groq API, `llama-3.3-70b-versatile` (free) |
| Embeddings | Ollama, `nomic-embed-text` (local, private) |
| Vector database | PostgreSQL + pgvector (Docker) |
| RAG framework | phidata |
| EVTX parser | `python-evtx` |

## 🚀 Quick Start

### Prerequisites

- Python 3.10+
- [Docker](https://www.docker.com/)
- [Ollama](https://ollama.com/)
- A free [Groq API key](https://console.groq.com/keys) (no credit card required)

### 1. Clone the repository

```
git clone https://github.com/sahnoun11/threatlens.git
cd threatlens
```

### 2. Start the database

```
docker run -d --name threatlens-db --restart always \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -p 5532:5432 ankane/pgvector
```

### 3. Start Ollama and pull the embedding model

```
# Runs in the foreground; keep it open and use a second terminal for the pull
ollama serve
ollama pull nomic-embed-text
```

### 4. Set up the Python environment

```
python3 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

### 5. Set your free Groq API key

```
export GROQ_API_KEY="your_free_key_here"
```

Or create a `.env` file:

```
GROQ_API_KEY=your_free_key_here
```

### 6. Launch ThreatLens

```
streamlit run app.py
```

Open **http://localhost:8501** in your browser 🚀

## 📁 Project Structure

```
threatlens/
├── app.py              # Streamlit UI + file readers + chunking logic
├── assistant.py        # AI brain: Groq + RAG + SOC analyst prompts
├── requirements.txt    # Python dependencies
├── assets/             # Banner and media files
├── LICENSE             # MIT License
└── README.md
```

## 📦 requirements.txt

```
streamlit>=1.35.0
phidata>=2.4.0
groq>=0.9.0
ollama>=0.2.0
pgvector>=0.2.5
psycopg[binary]>=3.1.0
sqlalchemy>=2.0.0
python-evtx>=0.7.4
evtx>=0.8.2
requests>=2.31.0
beautifulsoup4>=4.12.0
openai>=1.0.0
```

## 🧪 Test Datasets

No logs to test with? Here are some good free sources:

**Windows EVTX:**

- [EVTX-ATTACK-SAMPLES](https://github.com/sbousseaden/EVTX-ATTACK-SAMPLES): real attack scenarios mapped to MITRE ATT&CK
- [OTRF Security Datasets](https://github.com/OTRF/Security-Datasets): simulated APT campaigns
- [evtx-baseline](https://github.com/NextronSystems/evtx-baseline): baseline and anomalous EVTX files

**Linux Logs (TXT):**

- [Loghub](https://github.com/logpai/loghub): SSH, syslog, and auth.log samples
- [SecRepo](http://www.secrepo.com/): Apache, DNS, and IDS logs

**From your own machine:**

```
# Reading these files may require sudo, depending on their permissions
cp /var/log/auth.log ~/test_auth.txt
cp /var/log/syslog ~/test_syslog.txt
dmesg > ~/test_dmesg.txt
```

## 🎯 Example Questions

After uploading a log file, try asking:

```
What failed login attempts are in this log?
Are there any signs of lateral movement?
Summarise all privilege escalation events.
List all unique source IPs and flag suspicious ones.
What happened between 2:00 AM and 3:00 AM?
Are there any indicators of compromise (IOCs)?
```

## 📄 License

MIT. See [LICENSE](LICENSE) for details.
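
## 🧩 Chunking Sketch

The architecture section describes a chunker that cuts logs into 300-character pieces so each piece stays within the `nomic-embed-text` input limit. The snippet below is a minimal sketch of that idea, not the project's actual code in `app.py`: the function name `chunk_text` and the fixed-width, no-overlap strategy are assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int = 300) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Hypothetical helper: ThreatLens may chunk differently (for example,
    on line boundaries or with overlap).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the text in fixed-width windows; the last one may be shorter.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


# Made-up auth.log style line, repeated to give the chunker something to split.
sample_line = (
    "Jan 01 02:13:45 host sshd[1234]: "
    "Failed password for root from 203.0.113.7 port 22 ssh2\n"
)
sample_log = sample_line * 10

chunks = chunk_text(sample_log)
print(f"{len(chunks)} chunks, longest is {max(len(c) for c in chunks)} characters")
```

Fixed-width slicing can cut a log line in half. When line boundaries matter, packing whole lines into each chunk up to the size limit is a common alternative.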

**👨‍💻 Built by [Oussama Sahnoun](https://sahnoun11.github.io/)**

*"Analyze faster. Hunt smarter. Stay ahead."* 🛡️

**⭐ If you find this project useful, please give it a star!**

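Tags: AI risk mitigation, Kubernetes, Linux logs, LLaMA, LLM evaluation, Ollama, pgvector, PostgreSQL, Streamlit, Windows logs, binary releases, artificial intelligence, free security tools, security operations, open-source tools, scanning frameworks, local deployment, test cases, user-mode hook bypass, cybersecurity, access control, semantic search, request interception, reverse-engineering tools, privacy protection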