urrra39/BehaveGuard

GitHub: urrra39/BehaveGuard

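BehaveGuard uses eBPF and machine learning to analyze Linux process behavior in real time, detecting zero-day malware and anomalous behavior by learning a baseline of normal operation.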


# BehaveGuard 🛡️

[![CI](https://img.shields.io/github/actions/workflow/status/urrra39/BehaveGuard/ci.yml?branch=develop&style=flat-square)](https://github.com/urrra39/BehaveGuard/actions)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg?style=flat-square)](https://www.python.org/)
[![Linux 5.15+](https://img.shields.io/badge/linux-5.15%2B-orange.svg?style=flat-square)](https://www.kernel.org/)
[![Stars](https://img.shields.io/github/stars/urrra39/BehaveGuard?style=flat-square)](https://github.com/urrra39/BehaveGuard/stargazers)

## Abstract

Signature-based defenses answer the question *"Does this code match known malware?"* and are therefore structurally blind to novel threats. **BehaveGuard** reframes detection as a behavioral-anomaly problem: it uses eBPF to instrument every process on a Linux host, learns a model of each process's *normal* behavior from an observation period, and flags statistically significant deviations in real time.

Every 30 seconds, each monitored process is summarized as a **427-dimensional feature vector** covering syscall frequencies and n-gram structure, the network connection graph, file access patterns, process-tree dynamics, and five purpose-built threat layers (process injection, container escape, LOLBin abuse, anti-forensics, and DNS tunneling). An ensemble of an LSTM sequence autoencoder and a variational autoencoder produces a calibrated 0-100 anomaly score; a SHAP-like explainer presents each verdict in plain language. The detector catches zero-day behaviors that signature engines miss (credential dumping, reverse shells, lateral movement) while remaining explainable and operator-tunable.

## Why behavioral detection?

A Python web server *normally* opens files under `/var/www/`, makes outbound HTTPS connections, and reads its configuration. If it suddenly reads `/etc/shadow`, connects to a foreign IP on port 4444, and spawns a shell, that is a breach, **even if the malware has never been seen before.** A signature cannot express "this process is doing something it has never done before." BehaveGuard can.

## Architecture

```
┌──────────────────────── Linux Kernel (eBPF) ────────────────────────┐
│  9 BCC programs on kprobes / tracepoints / LSM hooks                │
│  syscall · network · file · process                                 │
│  injection · container-escape · lolbin · anti-forensic · dns-tunnel │
└───────────────────────────────┬─────────────────────────────────────┘
                                │ ring buffers
┌───────────────────────────────▼─────────────────────────────────────┐
│  User space (Python)                                                │
│  Collector ─▶ Feature Extractor (427-dim) ─▶ Ensemble (LSTM + VAE)  │
│      │                                                 │            │
│      ▼                                                 ▼            │
│  Event Store (SQLite)                        Anomaly Scorer (0–100) │
│                                                        │            │
│                                 Explainer ◀────────────┤            │
│                                                        ▼            │
│  Dashboard (Dash) ◀── REST API + WS ◀──────── Alert Manager ──▶ Channels
│  :8050                :8888                   dedup/suppress    webhook/email/syslog
└─────────────────────────────────────────────────────────────────────┘
```

The feature, scoring, storage, and alerting layers are **pure Python and do not require torch, numpy, or BCC at import time** (heavy dependencies are imported lazily), so the codebase can be tested on any platform. See [`docs/architecture.md`](docs/architecture.md).

### Feature vector (427 dimensions)

| Block | Dimensions | Examples |
|---|---|---|
| Syscall frequency | 335 | Relative frequency of each individual syscall |
| Syscall bigrams | 50 | Hashed pairs of consecutive syscalls |
| Network | 10 | Unique IPs/ports, byte rates, Tor ports, RFC1918, **DNS size/rate/max** |
| File | 7 | Files in system directories, path entropy, **log deletion, timestamp tampering** |
| Process | 22 | Shell spawn, privilege escalation, **injection target, namespace changes, pivot_root, 15× LOLBin** |
| Temporal | 3 | Window duration, events/sec, activity duty cycle |

## Threat model and detection matrix

| Threat | eBPF layer / signal | Features | MITRE ATT&CK |
|---|---|---|---|
| Credential dumping | Reads of `/etc/shadow`, `/root/.ssh` files | `files_in_system_dirs` | T1003 |
| Reverse shell | Shell exec + connection to an unusual port | `is_shell_spawned`, network rates | T1059 / T1571 |
| Lateral movement | Connection to RFC1918 + `ssh` exec | `is_connecting_to_rfc1918` | T1021 |
| Data exfiltration | Heavy reads + heavy outbound traffic | `bytes_sent_per_second` | T1041 |
| Privilege escalation | `setuid`/`ptrace` + memory writes | `privilege_escalation_attempt`, `is_injection_target` | T1068 / T1055 |
| Process injection | `security_ptrace`, `process_vm_writev`, `/proc//mem` | `is_injection_target` | T1055 |
| Container escape | `setns` / `unshare` / `pivot_root` | `namespace_change_count`, `pivot_root_attempt` | T1611 |
| LOLBin abuse | Exec of a watchlisted binary (wget/curl/nc/…) | `lolbin_*` one-hot encoding | T1218 |
| Anti-forensics | `unlink`/`utimensat`/`truncate` on `/var/log` | `log_deletion_count`, `timestamp_modification_count` | T1070 |
| DNS tunneling | Oversized UDP/53 queries | `max_dns_payload_bytes`, `dns_query_rate` | T1048 / T1071.004 |

### Five advanced defense layers (mechanism · kernel hooks · mitigations)

| Pillar | Mechanism | Kernel hooks | Features | Mitigation vectors |
|---|---|---|---|---|
| **Process injection** | One process writes into another process's memory to run under a trusted identity, bypassing per-process baselines | `security_ptrace_access_check` (LSM, ATTACH only), `process_vm_writev`, `mem_write` (`/proc//mem`) | `is_injection_target` | Isolate the injector; `kernel.yama.ptrace_scope=2`; seccomp-deny `process_vm_writev` |
| **Container escape** | A container manipulates namespaces to break isolation and reach the host | `setns`, `unshare`, `pivot_root` | `namespace_change_count`, `pivot_root_attempt` | Drop `CAP_SYS_ADMIN`; seccomp-deny namespace syscalls; user namespaces; read-only rootfs |
| **LOLBin abuse** | Signed, legitimate binaries (wget/curl/nc/…) fetch and run the payload, so there is no malware to signature | `sched_process_exec` + 16-entry watchlist | `lolbin_execution_count`, 15× `lolbin_` | Execution allowlisting (fapolicyd/SELinux); remove unneeded interpreters; egress filtering |
| **Anti-forensics** | Log deletion/truncation and timestamp tampering to erase evidence and break the timeline | `security_inode_unlink`, `truncate`, `utimensat` (→ `/var/log`) | `log_deletion_count`, `timestamp_modification_count` | Append-only/immutable logs (`chattr +a`); remote syslog forwarding; auditd; FIM |
| **DNS tunneling** | C2/exfiltration encoded in oversized DNS queries to bypass firewalls that trust port 53 | `udp_sendmsg` → `:53`, payload > 100 B | `avg_dns_query_size`, `dns_query_rate`, `max_dns_payload_bytes` | Enforce controlled resolvers; block direct `:53` egress; query size/rate limits; DoH inspection |

Each layer filters out BehaveGuard's own PID (`-DOWN_PID`) and loads defensively: a kernel that lacks a given hook only disables that layer. Full design notes: [`docs/architecture.md` §5](docs/architecture.md). Each pillar has its own simulation and assertions in `tests/simulations/` and `tests/unit/test_features.py`.

## Quick start

```
# 1. Install (Linux, as root): pulls in BCC + the package
sudo bash scripts/install_deps.sh

# 2. Initialize, then learn "normal" for about 60 minutes
sudo behaveguard init
sudo behaveguard train --duration 60

# 3. Run (collector + scorer + alerts + API + dashboard)
sudo behaveguard run
# 🌐 Dashboard: http://localhost:8050   🔌 API: http://localhost:8888
```

## Requirements

- **Linux 5.15+** (eBPF ring buffers, LSM hooks)
- **Python 3.10+**
- **root privileges** (required to load eBPF programs; see [SECURITY.md](SECURITY.md) for why this is safe)

## CLI reference

| Command | Description |
|---|---|
| `behaveguard init` | Check eBPF support, initialize storage, generate the API token |
| `behaveguard train --duration 60 [--process nginx]` | Learn a baseline from live "normal" activity |
| `behaveguard run [--no-dashboard]` | Start monitoring (collector + API + dashboard) |
| `behaveguard status` | Show trained models and the number of unacknowledged alerts |
| `behaveguard alerts --last 1h [--severity HIGH]` | List recent alerts |
| `behaveguard explain --pid 1234` | Explain why a process looks suspicious |
| `behaveguard whitelist add --pid 1234 \| --process backup` | Suppress a known-good process |

## API reference

All `/api/v1` routes require a Bearer token (generated at `init`), except `/api/v1/health`, which is public. Rate limit: 100 requests/minute/IP. See [`docs/api_reference.md`](docs/api_reference.md).

```
TOKEN=$(cat ~/.behaveguard/api_token)

curl http://localhost:8888/api/v1/health
# {"status":"ok","version":"1.0.0","uptime_seconds": …}

curl -H "Authorization: Bearer $TOKEN" http://localhost:8888/api/v1/alerts
curl -H "Authorization: Bearer $TOKEN" http://localhost:8888/api/v1/processes

curl -H "Authorization: Bearer $TOKEN" -X POST \
  http://localhost:8888/api/v1/alerts/suppress \
  -d '{"process_name":"backup","reason":"noisy","max_score_suppress":60}'
```

Live alert stream (WebSocket): `ws://localhost:8888/ws/alerts?token=$TOKEN`.

## Training your own baseline

`behaveguard train` observes live processes for the given time window, assumes that period is benign, and fits a per-process bundle (LSTM + VAE + normalizer + threshold) saved under `~/.behaveguard/models//`. Use `--process nginx` to retrain a single process. See [`docs/ml_models.md`](docs/ml_models.md).

## Performance

- Feature extraction is pure Python and processes **thousands of windows per second** on a single core (`python scripts/benchmark.py`).
- The eBPF programs filter out the collector's own PID and use ring buffers for low-overhead kernel-to-user transfer; under backpressure, events are dropped and counted rather than blocking the kernel producer.
- Storage uses SQLite (WAL) with retention-based rotation; models are small (`<` a few MB per process).

## Comparison

| | BehaveGuard | Falco | OSSEC | Snort |
|---|---|---|---|---|
| Detection model | **Behavioral ML (per process)** | Rules | Log/HIDS rules | Network signatures |
| Catches zero-day behavior | ✅ | Partial (rules) | ❌ | ❌ |
| eBPF kernel instrumentation | ✅ | ✅ | ❌ | ❌ |
| Learned per-process baseline | ✅ | ❌ | ❌ | ❌ |
| Explainable verdicts | ✅ (SHAP-style) | Rule name | Rule ID | Signature ID |
| Container escape / injection / DNS tunneling layers | ✅ | Partial | ❌ | Partial |

## References

1. Gregg, B. *BPF Performance Tools*, Addison-Wesley, 2019.
2. Forrest, S. et al. "A Sense of Self for Unix Processes." *IEEE S&P*, 1996. (Syscall anomaly detection.)
3. Malhotra, P. et al. "LSTM-based Encoder-Decoder for Multi-sensor Anomaly Detection." *ICML Anomaly Detection Workshop*, 2016.
4. Kingma, D. P. & Welling, M. "Auto-Encoding Variational Bayes." *ICLR*, 2014.
5. Lundberg, S. & Lee, S. "A Unified Approach to Interpreting Model Predictions" (SHAP). *NeurIPS*, 2017.
6. MITRE ATT&CK®: https://attack.mitre.org/

## License

MIT © urrra39
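
As an illustration of the 427-dimension feature vector described in this README, the following is a minimal sketch of how the six blocks could be laid out in one flat vector and how the 50-dimension syscall-bigram block could be filled by hashing consecutive syscall pairs. The block sizes come from the feature-vector table; everything else is an assumption made for the example and is not taken from the BehaveGuard source: the block order, the names `BLOCKS`, `block_offsets`, and `hashed_bigrams`, and the use of CRC32 as the hash.

```python
import zlib
from collections import Counter

# Block sizes copied from the feature-vector table.
# The ordering of the blocks is an assumption made for this sketch.
BLOCKS = {
    "syscall_frequency": 335,
    "syscall_bigrams": 50,
    "network": 10,
    "file": 7,
    "process": 22,
    "temporal": 3,
}
TOTAL_DIMS = sum(BLOCKS.values())


def block_offsets(blocks):
    """Map each block name to its (start, end) slice in the flat vector."""
    offsets, start = {}, 0
    for name, size in blocks.items():
        offsets[name] = (start, start + size)
        start += size
    return offsets


def hashed_bigrams(syscalls, buckets=50):
    """Relative frequency of consecutive syscall pairs, hashed into buckets.

    CRC32 is a hypothetical stand-in; the hash the project uses is not
    stated in this README.
    """
    vec = [0.0] * buckets
    pairs = list(zip(syscalls, syscalls[1:]))
    if not pairs:
        return vec
    for (first, second), count in Counter(pairs).items():
        bucket = zlib.crc32(f"{first}>{second}".encode()) % buckets
        vec[bucket] += count / len(pairs)
    return vec


if __name__ == "__main__":
    print(TOTAL_DIMS)
    print(block_offsets(BLOCKS)["syscall_bigrams"])
    trace = ["openat", "read", "read", "close", "openat", "read"]
    print(sum(hashed_bigrams(trace)))
```

Hashing into a fixed number of buckets keeps the block size constant no matter how many distinct syscall pairs a process produces, at the cost of occasional collisions between unrelated pairs.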

Tags: Apex, reverse DNS lookup, Docker image, IP address batch processing, security, anomaly detection, machine learning, timeout handling, reverse-engineering tools