nadimpallipavan/secmon

GitHub: nadimpallipavan/secmon

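A local security monitoring platform built on ELK + FastAPI + Redis: an integrated solution for multi-source log collection, threat detection, and real-time alerting.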

# SecMon: Security Monitoring Platform

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/cas/ad/ad5834178f7599af9fdda11629d49cae07f2997beec49821b2920eff5bfd50e7.svg)](https://github.com/nadimpallipavan/secmon/actions/workflows/ci.yml)

A complete, runnable security monitoring platform that **collects, parses, analyzes, and visualizes** logs from multiple sources, **detects suspicious activity** with an editable rules engine, and sends **real-time alerts**. Everything is orchestrated locally with Docker, with **no external cloud credentials required**.

```
                         ┌─────────────┐
 syslog / auth / web ───▶│  Logstash   │──┐  (grok-parse → normalize → forward)
 (files, TCP/UDP 5514)   └─────────────┘  │
                                          ▼
 log generator / apps ───────────────▶ ┌──────────────────┐   bulk    ┌───────────────┐
        POST /ingest                   │  FastAPI (api)   │──────────▶│ Elasticsearch │
                                       │  normalize +     │  logs-*   │  logs-*       │
                                       │  detection engine│           │  alerts-*     │
                                       └────────┬─────────┘           └──────┬────────┘
                                          alert │ queue (RPUSH)              │
                                                ▼                            ▼
                                       ┌──────────────┐  BLPOP        ┌───────────────┐
                                       │    Redis     │◀────┐         │    Kibana     │
                                       │ queue+windows│     │         │  dashboards   │
                                       └──────────────┘ ┌───┴──────┐  └───────────────┘
                                                        │  worker  │  dedup/rate-limit
                                                        │ (alerts) │  → file / webhook / Pub/Sub
                                                        └──────────┘
```

## Why these design choices

- **FastAPI is the only writer to Elasticsearch.** Logstash parses multi-source logs and *forwards* them to `POST /ingest` instead of writing to ES itself. Every event, whatever its source, therefore flows through the same normalize → detect → store path, so detection never depends on where a log came from. There is no double indexing.
- **Redis sliding windows for rate-based rules.** Brute-force and traffic-spike detection use Redis sorted sets (score = timestamp): trim old entries, add the new one, read the cardinality. Inserts are O(log n), keys expire automatically, and the windows are shared across API workers.
- **Decoupled alert worker.** The API pushes fired alerts onto a Redis queue and returns immediately; a separate worker delivers them. A slow webhook never applies backpressure to log ingestion.
- **Local first, cloud optional.** Webhook and GCP Pub/Sub delivery are both optional and import-guarded. Alerts are *always* written to a local file, stored in ES, and queryable via `/alerts`.
- **In-memory window store for tests.** The rules engine accepts a pluggable `WindowStore`, so the full detection suite runs under `pytest` without Redis.

## Tech stack

| Component      | Role                                                          |
|----------------|---------------------------------------------------------------|
| FastAPI        | Ingestion API, normalization, detection engine, metrics       |
| Elasticsearch  | Storage and search (`logs-*`, `alerts-*`)                     |
| Logstash       | Multi-source ingestion + grok parsing                         |
| Kibana         | Dashboards and index patterns                                 |
| Redis          | Alert queue, sliding-window counters, rate limiting           |
| Worker         | Alert dispatch (file / webhook / Pub/Sub)                     |
| Docker Compose | Orchestration                                                 |

## Quick start

```
# from the project root
cp .env.example .env                 # (bin/demo.sh does this for you)
docker compose up -d --build         # boots ES, Kibana, Logstash, Redis, api, worker, setup

# first run: wait ~60-120 s (ES + Kibana startup), then seed data with embedded attacks:
python scripts/generate_logs.py --api http://localhost:8000 --duration 20 --rate 40
```

Or use a single command (Linux/macOS/Git-Bash/WSL):

```
bash bin/demo.sh        # or: make demo
```

Then explore:

- **Metrics:** `curl -s http://localhost:8000/status | python -m json.tool`
- **Alerts:** `curl -s http://localhost:8000/alerts | python -m json.tool`
- **API docs:** http://localhost:8000/docs
- **Kibana:** http://localhost:5601 → **Dashboard → “SecMon — Threat Overview”**

On startup, the `setup` container automatically installs the ES index templates and imports the Kibana data views and dashboard.

## Normalized event schema

Before storage and detection, every log line is coerced into this common format (`app/schema.py`). In Elasticsearch, `timestamp` is stored as `@timestamp`.

| Field         | Type    | Description                                         |
|---------------|---------|-----------------------------------------------------|
| `timestamp`   | date    | ISO-8601 UTC (`@timestamp` in ES)                   |
| `source`      | keyword | `syslog` \| `auth` \| `web` \| `app` \| …           |
| `host`        | keyword | Source host                                         |
| `event_type`  | keyword | `ssh_login_failed`, `http_request`, `sudo`, …       |
| `severity`    | keyword | `info` \| `low` \| `medium` \| `high` \| `critical` |
| `src_ip`      | ip      | Source IP (supports CIDR/range queries)             |
| `dest_ip`     | ip      | Destination IP                                      |
| `dest_port`   | integer | Destination port                                    |
| `user`        | keyword | Username                                            |
| `http_method` | keyword | HTTP method                                         |
| `url`         | keyword | Request path                                        |
| `status_code` | integer | HTTP status                                         |
| `bytes`       | long    | Response bytes                                      |
| `country`     | keyword | Geolocation (best effort)                           |
| `message`     | text    | Human-readable summary                              |
| `raw`         | text    | Original log line                                   |

### Alert schema (`alerts-*`)

`@timestamp`, `rule`, `rule_type`, `severity`, `description`, `src_ip`, `user`, `host`, `count`, `window_seconds`, `dedup_key`, `message`, `matched_events`.

## Ingestion

`POST /ingest` accepts any of the following formats:

```
// 1) batch wrapper (what the generator sends)
{ "events": [ { "source": "web", "src_ip": "1.2.3.4", "url": "/admin", "status_code": 403 } ] }

// 2) a bare array (what Logstash json_batch sends)
[ { "...": "..." } ]

// 3) raw line + format hint -> server-side parser expands it
{ "events": [ { "format": "auth", "raw": "Jan 10 12:34:56 h sshd[1]: Failed password for root from 9.9.9.9 port 22 ssh2" } ] }
```

Supported `format` hints: `syslog`, `auth`/`ssh`, `nginx`/`apache`/`web`, `json`/`app`.

### Logstash path

`logstash/pipeline/logstash.conf` ingests from files (`data/logs/{syslog,auth,access,app}.log`) and from TCP/UDP `5514`, parses each source into the normalized schema with grok, then POSTs batches to `/ingest`. To exercise it:

```
python scripts/generate_logs.py --write-files   # writes raw lines into data/logs/*.log
# Logstash tails them live, parses them, and forwards them to the API → detection runs as usual.
```

## Detection rules

Rules live in **`app/rules/rules.yaml`** and are reloaded when the API starts. Four rule types ship out of the box:

| Rule                    | Type                   | Fires when …                                                  |
|-------------------------|------------------------|---------------------------------------------------------------|
| `ssh_brute_force`       | `brute_force`          | ≥5 failed logins from the same `src_ip` within 60 s           |
| `unauthorized_access`   | `unauthorized_access`  | A restricted path is requested or a 401/403 occurs (200 ⇒ critical) |
| `anomalous_traffic`     | `traffic_spike`        | >60 requests/30 s **or** >15 distinct ports/30 s (port scan)  |
| `privilege_escalation`  | `privilege_escalation` | Suspicious sudo/su (`NOT in sudoers`, authentication failure, …) |

Every rule has a `cooldown_seconds` that collapses a single attack into one alert, so a burst does not flood the alerts index. The notification rate limit in the worker is a second layer of protection.

### Adding a new rule

Append to `app/rules/rules.yaml` and restart the API (`docker compose restart api`):

```
- name: my_rule
  type: brute_force            # brute_force | unauthorized_access | traffic_spike | privilege_escalation
  description: "What it detects"
  severity: high
  enabled: true
  match:                       # AND of field -> allowed values, applied to each event
    event_type: ["ssh_login_failed"]
  group_by: src_ip             # window key for rate rules
  threshold: 10
  window_seconds: 120
  cooldown_seconds: 120
```

For a brand-new *type*, add a `_rule_` handler in `app/rules_engine.py`.

## Real-time alerts

When a rule fires, the API writes the alert to `alerts-*` and pushes it onto a Redis queue. The **worker** (`worker/alert_worker.py`) consumes the queue and dispatches across several sinks:

1. **Local file + stdout:** always on (`data/alerts.log`); the reliable fallback.
2. **REST webhook:** if `ALERT_WEBHOOK_URL` is set.
3. **GCP Pub/Sub:** if `GCP_PROJECT` + `GCP_PUBSUB_TOPIC` are set and the client library is available.

**Dedup / rate limiting:** for each `dedup_key` (`rule:src_ip`), external delivery is suppressed for `ALERT_RATE_LIMIT_SECONDS`, implemented with a Redis `SET NX EX` lock.

**Severity routing:** only alerts at or above `ALERT_MIN_SEVERITY` are sent externally (the file sink still records everything).

## Dashboard

The saved objects are committed in `kibana/export.ndjson` and imported automatically by the `setup` container. They are generated by `scripts/build_kibana_export.py` (re-run it to regenerate them).

- **Data views:** `logs-*`, `alerts-*`
- **Dashboard “SecMon — Threat Overview”:** event volume over time, alerts by severity over time, top source IPs, failed vs. successful logins, alerts by severity.

If you need to re-import manually:

```
curl -s -X POST "http://localhost:5601/api/saved_objects/_import?overwrite=true" \
  -H "kbn-xsrf: true" --form file=@kibana/export.ndjson
```

## Environment variables (`.env`)

| Variable                   | Default                        | Purpose                                   |
|----------------------------|--------------------------------|-------------------------------------------|
| `ES_HOST`                  | `http://elasticsearch:9200`    | Elasticsearch URL                         |
| `LOGS_INDEX_PREFIX`        | `logs`                         | Prefix for daily log indices              |
| `ALERTS_INDEX_PREFIX`      | `alerts`                       | Prefix for daily alert indices            |
| `REDIS_URL`                | `redis://redis:6379/0`         | Redis connection                          |
| `ALERT_QUEUE`              | `alerts:queue`                 | Redis list consumed by the worker         |
| `RULES_PATH`               | `app/rules/rules.yaml`         | Detection rules file                      |
| `ALERT_WEBHOOK_URL`        | *(empty)*                      | Optional REST webhook for alerts          |
| `ALERT_RATE_LIMIT_SECONDS` | `120`                          | Dedup window per `rule:src_ip`            |
| `ALERT_MIN_SEVERITY`       | `medium`                       | Minimum severity for external delivery    |
| `ALERT_LOG_FILE`           | `data/alerts.log`              | Local fallback alert log                  |
| `GCP_PROJECT`              | *(empty)*                      | Optional GCP project for Pub/Sub          |
| `GCP_PUBSUB_TOPIC`         | *(empty)*                      | Optional Pub/Sub topic                    |

**Everything related to cloud/webhooks is optional.** The platform is fully functional locally with an empty `.env`.

### Optional: webhook and GCP

```
ALERT_WEBHOOK_URL=https://hooks.example.com/secmon     # e.g. a Slack/Discord/HTTP receiver
GCP_PROJECT=my-project
GCP_PUBSUB_TOPIC=secmon-alerts
# GOOGLE_APPLICATION_CREDENTIALS=/path/key.json (mounted into the worker container)
```

## API reference

| Method | Path       | Description                                                  |
|--------|------------|--------------------------------------------------------------|
| POST   | `/ingest`  | Ingest a batch; normalize, index, detect, enqueue alerts     |
| GET    | `/alerts`  | Recent alerts (`?severity=high&limit=50`)                    |
| GET    | `/status`  | Event/alert counts, alerts by severity and by rule           |
| GET    | `/health`  | Liveness + ES/Redis status                                   |
| GET    | `/docs`    | Swagger UI                                                   |

## Testing

```
pip install -r requirements.txt
pytest -q
```

`tests/test_parsers.py` covers every parser; `tests/test_rules.py` covers all four rule types using the in-memory window store (no Redis needed).

## Project layout

```
.
├── docker-compose.yml          # all services + one-shot `setup` provisioner
├── Dockerfile                  # api + worker + setup image
├── app/
│   ├── main.py                 # FastAPI: /ingest /alerts /status /health
│   ├── schema.py               # normalized Event schema
│   ├── normalize.py            # source parsers (mirror the grok patterns)
│   ├── rules_engine.py         # YAML-driven detection engine
│   ├── redis_client.py         # Redis + pluggable sliding-window store
│   ├── es_client.py            # ES connection, daily indices, bulk writes
│   ├── alerting.py             # dedup/rate-limit/route/dispatch
│   ├── gcp.py                  # optional Pub/Sub (guarded) + fallback
│   ├── config.py               # env-driven settings
│   └── rules/rules.yaml        # ← editable detection rules
├── worker/alert_worker.py      # consumes the alert queue
├── logstash/                   # pipeline + grok patterns + config
├── elasticsearch/templates/    # logs-* / alerts-* index templates
├── kibana/export.ndjson        # data views + dashboard (auto-imported)
├── scripts/
│   ├── generate_logs.py        # realistic traffic + embedded attacks
│   ├── provision.py            # install templates + import Kibana objects
│   └── build_kibana_export.py  # regenerate the saved objects
├── tests/                      # pytest: parsers + rules
├── bin/                        # setup.sh, seed.sh, demo.sh, wait_for_ready.sh
└── Makefile
```

## Troubleshooting

- **Elasticsearch exits / `vm.max_map_count`** (Linux): `sudo sysctl -w vm.max_map_count=262144`.
- **Slow first boot:** ES + Kibana take 60–120 s to come up. `bin/wait_for_ready.sh` polls for you.
- **No alerts?** Make sure you have seeded data (`make seed`) and that the `api`/`worker` containers are healthy: `docker compose ps`, `docker compose logs -f worker`.
- **Kibana dashboard is empty:** set the time picker to *Last 24 hours* and confirm via `/status` that data exists. If needed, re-import with the curl command above.
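
### Sketch: a sliding-window counter without Redis

The sliding-window technique described under the design choices (trim old entries, add the new one, read the cardinality) can be illustrated with a small in-memory store. This is only a sketch under stated assumptions, not the project's actual code: the class name `InMemoryWindowStore`, its `hit` method, and the helper `brute_force_fired` are hypothetical, and the real `WindowStore` interface in `app/redis_client.py` may look different. Against Redis, the three steps would presumably map to `ZREMRANGEBYSCORE`, `ZADD`, and `ZCARD` on a sorted set.

```python
from bisect import bisect_right, insort
from collections import defaultdict


class InMemoryWindowStore:
    """Hypothetical in-memory stand-in for a Redis sorted-set window."""

    def __init__(self) -> None:
        # key -> sorted list of event timestamps (seconds)
        self._hits: dict[str, list[float]] = defaultdict(list)

    def hit(self, key: str, now: float, window_seconds: float) -> int:
        """Record one event at `now` and return the count inside the window."""
        hits = self._hits[key]
        # Trim: drop timestamps at or before the window's lower bound.
        cutoff = now - window_seconds
        del hits[: bisect_right(hits, cutoff)]
        # Add the new event, keeping the list sorted.
        insort(hits, now)
        # The number of remaining entries is the in-window count.
        return len(hits)


def brute_force_fired(store, src_ip, now, threshold=5, window_seconds=60):
    """True once `src_ip` reaches `threshold` failures inside the window.

    The defaults mirror the documented `ssh_brute_force` rule (>=5 in 60 s);
    the key format is an assumption made for this example.
    """
    count = store.hit(f"ssh_brute_force:{src_ip}", now, window_seconds)
    return count >= threshold


store = InMemoryWindowStore()
# Five failed logins, ten seconds apart: the fifth crosses the threshold.
results = [brute_force_fired(store, "9.9.9.9", t) for t in (0, 10, 20, 30, 40)]
print(results)  # [False, False, False, False, True]
```

Because the store only needs a "record and count" operation, a test suite can pass explicit timestamps instead of reading the clock, which keeps rate-rule tests deterministic.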
