diljotkaur05/SOC-THREAT-INTELLIGENCE-

GitHub: diljotkaur05/SOC-THREAT-INTELLIGENCE-

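A log anomaly detection tool that combines a rule engine with machine learning, with automatic parsing of multiple log formats, threat identification, and visualization.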

Stars: 0 | Forks: 0

# 🛡️ Log Anomaly Detector

A cybersecurity tool that automatically parses log files, detects threats using machine learning and rule-based analysis, and presents the results in an interactive dashboard.

## 📁 Project Structure

```
log-anomaly-detector/
├── app.py                       ← Streamlit dashboard (run this)
├── requirements.txt             ← Python dependencies
├── README.md                    ← This file
└── src/
    ├── parser/
    │   ├── log_parser.py        ← Parses 5 log formats using regex
    │   └── normalizer.py        ← Cleans timestamps, validates IPs
    ├── features/
    │   └── engineer.py          ← Builds numeric feature vectors for ML
    ├── models/
    │   ├── isolation_forest.py  ← ML: detects global anomalies
    │   ├── lof.py               ← ML: detects local anomalies
    │   └── rule_engine.py       ← 10 hardcoded security rules
    ├── scorer/
    │   └── risk_scorer.py       ← Combines all signals → final severity
    └── report/
        └── pdf_export.py        ← Generates PDF security report
```

## 🚀 Quick Start

### 1. Install dependencies

```
pip install -r requirements.txt
```

### 2. Run the dashboard

```
streamlit run app.py
```

### 3. Open in your browser

```
http://localhost:8501
```

### 4. Try the demo

Click **"Generate & Analyze Demo Logs"** in the sidebar to see how the tool detects simulated attacks.

## 🔍 Supported Log Formats

| Format | Example source | Auto-detected? |
|---|---|---|
| Apache / Nginx | `/var/log/apache2/access.log` | ✅ Yes |
| Syslog | `/var/log/syslog` | ✅ Yes |
| Auth.log | `/var/log/auth.log` | ✅ Yes |
| Firewall | iptables kernel logs | ✅ Yes |
| JSON | Structured application logs | ✅ Yes |

## 🚨 Detected Threats

| # | Threat | Method | Severity |
|---|---|---|---|
| 1 | Brute-force login | Rule engine | Critical |
| 2 | SQL injection | Rule engine | Critical |
| 3 | Port scan | Rule engine | High |
| 4 | Directory traversal | Rule engine | High |
| 5 | Scanning tools (Nikto, etc.) | Rule engine | High |
| 6 | Root login attempt | Rule engine | High |
| 7 | Firewall storm | Rule engine | High |
| 8 | Off-hours admin access | Rule engine | Medium |
| 9 | High error rate | Rule engine | Medium |
| 10 | Sensitive path access | Rule engine | Medium |
| 11 | Statistical anomaly | Isolation Forest | Variable |
| 12 | Local outlier behavior | LOF model | Variable |

## 🧪 Running Tests

```
# Test individual modules
python src/parser/log_parser.py
python src/parser/normalizer.py
python src/features/engineer.py
python src/models/isolation_forest.py
python src/models/lof.py
python src/models/rule_engine.py
python src/scorer/risk_scorer.py

# Run with PYTHONPATH set
PYTHONPATH=. python src/report/pdf_export.py
```

## 🛠️ Tech Stack

| Component | Technology |
|---|---|
| Language | Python 3.10+ |
| Dashboard | Streamlit |
| ML models | scikit-learn |
| Data processing | pandas, numpy |
| Charts | Plotly |
| PDF export | ReportLab |
| Log parsing | re (regex) |

## 📊 How the Pipeline Works

```
Log File (any format)
        ↓
log_parser.py    → structured dicts
        ↓
normalizer.py    → clean timestamps, validated IPs
        ↓
engineer.py      → numeric feature vectors
        ↓
┌─────────────────────────────────────┐
│ isolation_forest.py  (ML model 1)   │
│ lof.py               (ML model 2)   │
│ rule_engine.py       (10 rules)     │
└─────────────────────────────────────┘
        ↓
risk_scorer.py   → final severity score per IP
        ↓
app.py           → interactive Streamlit dashboard
        ↓
pdf_export.py    → downloadable PDF report
```

## ⚙️ Configuration

Tunable thresholds are defined at the top of each relevant file:

**rule_engine.py**

```
BRUTE_FORCE_THRESHOLD = 5      # failed logins before alert
BRUTE_FORCE_WINDOW_SEC = 60    # time window in seconds
PORT_SCAN_THRESHOLD = 20       # unique ports before alert
ERROR_RATE_THRESHOLD = 0.80    # 80% error rate threshold
```

**isolation_forest.py**

```
DEFAULT_CONTAMINATION = 0.05   # expected anomaly fraction (5%)
N_ESTIMATORS = 100             # number of isolation trees
```

**risk_scorer.py**

```
RULE_WEIGHTS = {"CRITICAL": 40, "HIGH": 30, "MEDIUM": 20}
ML_WEIGHTS = {"CRITICAL": 25, "HIGH": 20, "MEDIUM": 10}
SCORE_THRESHOLDS = {"CRITICAL": 76, "HIGH": 51, "MEDIUM": 26}
```

## 📄 License

Academic / educational project, Cybersecurity 2026
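
The `risk_scorer.py` constants listed under Configuration suggest an additive scheme in which each rule or ML finding contributes a weight and the total is mapped to a severity label. The following is a minimal sketch of one way those constants could combine, not the project's actual implementation. The function names `score_ip` and `severity_for`, the summing of weights, the cap at 100, and the `"LOW"` fallback label are all assumptions made for illustration.

```python
# Constants copied from the risk_scorer.py configuration in this README.
RULE_WEIGHTS = {"CRITICAL": 40, "HIGH": 30, "MEDIUM": 20}
ML_WEIGHTS = {"CRITICAL": 25, "HIGH": 20, "MEDIUM": 10}
SCORE_THRESHOLDS = {"CRITICAL": 76, "HIGH": 51, "MEDIUM": 26}


def score_ip(rule_hits, ml_hits):
    """Sum the weights of every finding for one IP.

    rule_hits / ml_hits are lists of severity labels (hypothetical input shape).
    The cap at 100 is an assumption, not something the README states.
    """
    total = sum(RULE_WEIGHTS.get(label, 0) for label in rule_hits)
    total += sum(ML_WEIGHTS.get(label, 0) for label in ml_hits)
    return min(total, 100)


def severity_for(score):
    """Return the highest severity whose threshold the score meets."""
    ordered = sorted(SCORE_THRESHOLDS.items(), key=lambda kv: kv[1], reverse=True)
    for label, threshold in ordered:
        if score >= threshold:
            return label
    return "LOW"  # assumed label for scores below every threshold
```

Under these assumptions, one critical rule hit plus one high ML hit yields 40 + 20 = 60, which falls in the `HIGH` band (51 to 75).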

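The `BRUTE_FORCE_THRESHOLD` and `BRUTE_FORCE_WINDOW_SEC` settings in `rule_engine.py` describe a count-within-a-time-window rule. A minimal sketch of such a sliding-window check is shown here, assuming failed-login events arrive as `(timestamp_seconds, ip)` tuples. The function name `find_brute_force_ips`, the input shape, and the choice to alert once the count reaches the threshold are hypothetical; the project's own rule may differ.

```python
from collections import defaultdict, deque

# Values copied from the rule_engine.py configuration in this README.
BRUTE_FORCE_THRESHOLD = 5     # failed logins before alert
BRUTE_FORCE_WINDOW_SEC = 60   # time window in seconds


def find_brute_force_ips(failed_logins):
    """Return IPs with at least THRESHOLD failed logins inside any WINDOW-second span.

    failed_logins: iterable of (timestamp_seconds, ip) tuples (assumed shape).
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in sorted(failed_logins):
        window = recent[ip]
        window.append(ts)
        # Discard timestamps that are WINDOW seconds or more older than this event.
        while ts - window[0] >= BRUTE_FORCE_WINDOW_SEC:
            window.popleft()
        if len(window) >= BRUTE_FORCE_THRESHOLD:
            flagged.add(ip)
    return flagged
```

A deque per IP keeps the check linear in the number of events after sorting, since each timestamp is appended and removed at most once.
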
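Tags: AMSI bypass, Apex, Kubernetes, Python, Streamlit, threat detection, anomaly detection, no backdoor, machine learning, cybersecurity, access control, reverse-engineering tools, privacy protection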