srini-cybersec/logsentry

GitHub: srini-cybersec/logsentry

LogSentry is a fully offline, pure-Python, lightweight SIEM detection engine that detects and scores security threats in logs using Sigma-style rules and stateful correlation.

Stars: 0 | Forks: 0

# LogSentry

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/e1381cb65e202028.svg)](https://github.com/srini-cybersec/logsentry/actions/workflows/ci.yml) [![Python](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![Coverage](https://img.shields.io/badge/coverage-95%25-brightgreen.svg)](#testing) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

LogSentry is a **pure-Python, fully offline** security monitoring engine. Point it at log files and it normalizes them into a unified taxonomy, evaluates **Sigma-style detection rules**, runs a **stateful correlation engine** (brute force, port scans, attack sequences), and emits scored, MITRE ATT&CK-tagged alerts as console / JSON / CSV / **SARIF 2.1.0** / HTML output. It is the detection engine at the heart of a SIEM, yet light enough to run in CI, on a forensic workstation, or inside an air-gapped network.

## Why LogSentry?

Most "log analysis" scripts are just a hard-coded list of `grep` patterns. Real detection engineering needs three things, and LogSentry provides them out of the box:

1. **Detection as code** - portable, reviewable YAML rules (a Sigma-compatible subset) decoupled from the engine, so detections can live in version control.
2. **Correlation, not just matching** - one failed login is noise; *five failures within sixty seconds followed by a success* is a security incident. LogSentry models this with sliding-window `event_count`, `value_count`, and `temporal_sequence` rules.
3. **Source-agnostic normalization** - whether an event comes from Windows Security, sshd, Sysmon, or a firewall, the same rule referencing `user` / `src_ip` fires once the event has passed through the field-alias taxonomy.

All of this is fully offline: **zero network calls, no telemetry, no API keys.**

## Features

- **5 ingestion formats, auto-detected**: JSONL, JSON arrays (including the `{"records":[]}` wrapper), CSV, RFC-3164 syslog, and `key=value` logfmt.
- **Field normalization** onto a vendor-neutral schema (`TargetUserName` / `rhost` / `clientip` -> `user` / `src_ip` ...), with robust timestamp parsing (ISO-8601, epoch milliseconds/seconds, Apache, syslog).
- **Sigma-style detection engine** supporting `contains`, `startswith`, `endswith`, `re`, `cidr`, numeric comparisons, `all`/`not` modifiers, and a real condition parser (`selection and not filter`, `1 of selection_*`, `all of them`).
- **Correlation engine**: `event_count`, `value_count` (distinct values), and `temporal_sequence` over configurable time windows and entities.
- **24 built-in rules** (19 detection + 5 correlation) covering Linux authentication, Windows Security, Sysmon process creation, and web server attacks, all tagged with MITRE ATT&CK.
- **Risk scoring**: a capped 0-100 score with a `CRITICAL` / `HIGH_RISK` / `MODERATE` / `LOW_RISK` / `CLEAN` verdict.
- **5 report formats**: Rich console, JSON, CSV, **SARIF 2.1.0** (GitHub Code Scanning), and a standalone HTML dashboard.
- **CI/CD ready**: a `--fail-on` threshold for pipeline gating, a hardened non-root Docker image, and a complete GitHub Actions workflow.

## Architecture

```
logs ──▶ Ingest ──▶ Normalize ──▶ Detection Engine ──┬─▶ Correlation Engine ──┐
(5 fmts) (sniff)    (taxonomy)     (Sigma rules)     │   (sliding windows)    │
                                                     └────────────┬───────────┘
                                                                  ▼
                                                              Aggregate ─▶ Score (0-100) ─▶ Reports
                                                               (dedup)       + verdict   (console/JSON/
                                                                                          CSV/SARIF/HTML)
```

See [docs/architecture.md](docs/architecture.md) for the full design and Mermaid diagrams.

## Installation

```
# Build from source (recommended while pre-release)
git clone https://github.com/srini-cybersec/logsentry
cd logsentry
pip install -e .

# Or build a wheel
pip install build && python -m build
```

Requires Python 3.11+. Runtime dependencies are minimal: `click`, `rich`, `PyYAML`.

## Quick start

```
# Scan a directory of logs with the built-in rules
logsentry scan examples/sample_logs

# Fail the CI job if anything HIGH or worse is found
logsentry scan /var/log/auth.log --fail-on high

# Produce SARIF for GitHub Code Scanning
logsentry scan ./logs --format sarif -o logsentry.sarif

# Produce an HTML dashboard
logsentry scan ./logs --format html -o report.html

# List every loaded rule (built-in + custom)
logsentry rules --rules ./my-rules/
```

### Sample output

```
╭───────────────────────────── LogSentry ─────────────────────────────╮
│ Verdict: CRITICAL    Risk score: 100/100                             │
│ Events: 26   Rules: 24   Alerts: 20                                  │
╰──────────────────────────────────────────────────────────────────────╯
 Info: 3  Low: 3  Medium: 3  High: 8  Critical: 3
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Sev      ┃ Rule                        ┃ Kind        ┃ Entity             ┃ MITRE       ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Critical │ corr-ssh-bruteforce-success │ correlation │ src_ip=203.0.113.66│ T1110.001   │
│ Critical │ sysmon-lsass-access         │ detection   │ command_line=...   │ T1003.001   │
│ Critical │ web-shell-access            │ detection   │ url=/uploads/...   │ T1505.003   │
│ High     │ corr-ssh-bruteforce         │ correlation │ src_ip=203.0.113.66│ T1110.001   │
│ High     │ corr-win-bruteforce         │ correlation │ user=administrator │ T1110       │
└──────────┴─────────────────────────────┴─────────────┴────────────────────┴─────────────┘
```

## Writing detection rules

Rules are Sigma-style YAML. Put them in a directory and pass `--rules ./dir`:

```
title: PowerShell encoded command
id: sysmon-powershell-encoded
level: high
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    command_line|contains:
      - '-enc'
      - '-EncodedCommand'
  powershell:
    process|endswith: ['\powershell.exe', '\pwsh.exe']
  condition: selection and powershell
fields: [command_line, host]
tags:
  - attack.execution
  - attack.t1059.001
remediation: Decode the command, inspect the payload, and isolate the host.
```

### Writing correlation rules

```
title: Successful SSH logon after brute-force
id: corr-ssh-bruteforce-success
type: correlation
level: critical
correlation:
  type: temporal_sequence   # event_count | value_count | temporal_sequence
  rules:                    # base detection rule ids, in order
    - linux-ssh-failed-password
    - linux-ssh-accepted-password
  group-by: [src_ip]        # the entity to correlate on
  timespan: 300s            # 60s | 5m | 1h ...
  ordered: true
tags: [attack.t1110.001]
remediation: Treat the host as compromised; rotate credentials.
```

## Configuration

Precedence: built-in defaults < `.logsentry.yml` < `LOGSENTRY_*` environment variables < CLI arguments. See [`.logsentry.yml.example`](.logsentry.yml.example).

| Environment variable | Purpose |
|---|---|
| `LOGSENTRY_MIN_SEVERITY` | Suppress alerts below this severity |
| `LOGSENTRY_FAIL_ON` | Exit non-zero when an alert reaches or exceeds this severity |
| `LOGSENTRY_USE_BUILTIN_RULES` | Set to `false` to disable the bundled rules |
| `LOGSENTRY_RULE_PATHS` | Extra rule paths, separated by `os.pathsep` |
| `LOGSENTRY_EXCLUDE_RULES` | Comma-separated rule IDs to silence |

## CLI reference

```
logsentry scan            Analyse log files/directories
  --format [console|json|csv|sarif|html]
  -o, --output PATH       Write report to a file
  -r, --rules PATH        Extra rule file(s)/dir(s) (repeatable)
  --no-builtin            Disable the bundled rule packs
  --min-severity LEVEL    Suppress alerts below LEVEL
  --fail-on LEVEL         Exit 1 if any alert >= LEVEL
  --exclude-rule ID       Silence a rule (repeatable)
  --input-format FMT      Force jsonl|json|csv|syslog|kv
  --config PATH           Path to a .logsentry.yml
logsentry rules           List loaded detection & correlation rules
logsentry version         Print version
```

## Docker

```
docker build -t logsentry .

# Hardened: non-root, read-only FS, no network, all caps dropped
docker run --rm --network none --read-only \
  -v "$PWD/logs:/data:ro" logsentry scan /data --format json
```

Or use `docker compose run --rm logsentry scan /data`.

## Use as a library

```
from pathlib import Path  # needed for the Path(...) argument below

from logsentry import Analyzer, Config

# Build an analyzer that loads the bundled rule packs
analyzer = Analyzer(Config(use_builtin_rules=True))
result = analyzer.analyze_paths([Path("auth.log")])

print(result.verdict, result.risk_score)
for alert in result.alerts:
    print(alert.rule_id, alert.entity, alert.mitre)
```

See [`examples/demo.py`](examples/demo.py) for a complete runnable example.

## Testing

```
pytest tests/ --cov=src/logsentry --cov-report=term-missing
```

135 tests, **95% coverage**, clean `black` / `ruff` / `mypy` checks, and **zero Bandit findings**.

## Security considerations

LogSentry is a **defensive, read-only, offline** tool. It never modifies input logs, makes no network calls, parses rules with `yaml.safe_load_all`, and ships a hardened non-root container. See [SECURITY.md](SECURITY.md) for the full threat model.

## Contributing

Contributions are welcome; see [CONTRIBUTING.md](CONTRIBUTING.md). Please run the quality gates and add tests for new rules or behavior.

## License

MIT © srini-cybersec - see [LICENSE](LICENSE).
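## Appendix: sliding-window correlation sketch

The README describes `event_count` correlation as counting events per entity inside a sliding time window (for example, five failed logins within sixty seconds). The following is a minimal sketch of that idea only, not LogSentry's actual implementation: the function name `event_count_alerts`, the `(timestamp, entity)` event shape, and the reset-after-alert behavior are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def event_count_alerts(events, threshold=5, window=timedelta(seconds=60)):
    """Yield (entity, timestamp) when an entity reaches `threshold` events
    inside a sliding `window`.

    `events` is an iterable of (timestamp, entity) pairs sorted by time.
    Hypothetical helper for illustration; not part of the logsentry package.
    """
    recent = defaultdict(deque)  # entity -> timestamps still inside the window
    for ts, entity in events:
        timestamps = recent[entity]
        timestamps.append(ts)
        # Evict timestamps that have fallen out of the window.
        while ts - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) >= threshold:
            yield entity, ts
            timestamps.clear()  # assumed policy: one alert per burst


base = datetime(2024, 1, 1, 12, 0, 0)
# Five failures from one address, 10 seconds apart, plus one unrelated failure.
failed_logins = [(base + timedelta(seconds=10 * i), "203.0.113.66") for i in range(5)]
failed_logins.append((base + timedelta(seconds=5), "198.51.100.7"))
failed_logins.sort()

alerts = list(event_count_alerts(failed_logins))
print(alerts)
```

Grouping by entity (here the source address) mirrors the `group-by` key in the correlation rule format, and the deque keeps the per-entity cost proportional to the number of events still inside the window.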

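## Appendix: field-alias normalization sketch

The README's claim that one rule referencing `user` / `src_ip` can fire on events from different sources rests on renaming vendor fields before matching. A rough sketch of that mechanism follows, under stated assumptions: the alias table holds only the three aliases the feature list names (their pairing with `user` and `src_ip` is inferred from that list), and `normalize` and `matches` are hypothetical helpers that cover only exact matching and a `|startswith` modifier, not the project's real engine.

```python
# Alias table limited to the aliases named in the feature list (assumed pairing).
FIELD_ALIASES = {
    "TargetUserName": "user",
    "rhost": "src_ip",
    "clientip": "src_ip",
}


def normalize(raw_event):
    """Return a copy of `raw_event` with vendor field names mapped to canonical ones."""
    return {FIELD_ALIASES.get(key, key): value for key, value in raw_event.items()}


def matches(event, selection):
    """Evaluate a tiny subset of Sigma-style selections: exact match and `|startswith`."""
    for key, expected in selection.items():
        field, _, modifier = key.partition("|")
        actual = str(event.get(field, ""))
        if modifier == "startswith":
            if not actual.startswith(expected):
                return False
        elif actual != expected:
            return False
    return True


windows_event = {"TargetUserName": "administrator", "EventID": 4625}
sshd_event = {"user": "administrator", "rhost": "203.0.113.66"}

user_rule = {"user": "administrator"}
subnet_rule = {"src_ip|startswith": "203.0.113."}

user_hits = [matches(normalize(e), user_rule) for e in (windows_event, sshd_event)]
subnet_hits = [matches(normalize(e), subnet_rule) for e in (windows_event, sshd_event)]
print(user_hits, subnet_hits)
```

Because matching runs only on canonical names, a rule author never needs to know which product emitted the event; adding a new source would mean extending the alias table rather than rewriting rules.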
Tags: AMSI bypass, Python, threat detection, malware classification, plugin system, no backdoor, request interception, reverse-engineering tools