mstiri/trivy-epss-kev-exporter

GitHub: mstiri/trivy-epss-kev-exporter

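Correlates Trivy Operator vulnerability reports with EPSS scores and CISA KEV data, and exposes vulnerability prioritization data based on real-world exploitation risk as Prometheus metrics.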


# trivy-epss-kev-exporter

[![ci](https://static.pigsec.cn/wp-content/uploads/repos/2026/06/72be8ccb35182046.svg)](https://github.com/mstiri/trivy-epss-kev-exporter/actions/workflows/ci.yml)
[![release](https://img.shields.io/github/v/release/mstiri/trivy-epss-kev-exporter)](https://github.com/mstiri/trivy-epss-kev-exporter/releases/latest)
[![Go version](https://img.shields.io/github/go-mod/go-version/mstiri/trivy-epss-kev-exporter)](https://github.com/mstiri/trivy-epss-kev-exporter/blob/main/go.mod)
[![License](https://img.shields.io/github/license/mstiri/trivy-epss-kev-exporter?v=2)](https://github.com/mstiri/trivy-epss-kev-exporter/blob/main/LICENSE)
[![Go Report Card](https://goreportcard.com/badge/github.com/mstiri/trivy-epss-kev-exporter?v=2)](https://goreportcard.com/report/github.com/mstiri/trivy-epss-kev-exporter)

A read-only Prometheus exporter that turns the CVEs in [Trivy Operator](https://github.com/aquasecurity/trivy-operator) `VulnerabilityReport` resources into **exploit-aware** metrics. Each CVE is enriched with its **EPSS score** (FIRST/EPSS) and with whether it appears in **CISA KEV**, so you can alert on vulnerabilities that are *actually likely to be exploited* rather than on raw CVSS counts alone.

It **writes nothing to the cluster** and **sends no alerts**: it only exposes metrics. Alerting lives in your `PrometheusRule` / Alertmanager and is built on the metrics below.

## Why

A single cluster scan can surface thousands of CVEs. Severity alone is a poor triage queue: most CRITICAL vulnerabilities are never exploited, while some are being exploited *today*. EPSS gives each CVE a daily-updated probability of exploitation; CISA KEV lists the CVEs known to be exploited in the wild. Joining this data to your workloads lets you prioritize ruthlessly: "in KEV and on an internet-facing Deployment" outranks "critical, but unscored".

## How it works

```
Trivy Operator ──(VulnerabilityReport CRDs)──▶ informer ──▶ enrich(CVE) ──▶ /metrics
                                                             ▲   ▲
                      EPSS bulk CSV ─(daily)─────────────────┘   │
                      CISA KEV JSON ─(daily)─────────────────────┘
```

Both feeds are bulk-loaded and cached in memory; every lookup is local (no per-CVE API calls are made at scrape time). A change to either the reports or the feeds triggers re-enrichment. See [`CLAUDE.md`](CLAUDE.md) for the full architecture notes and the stable metric contract.

## Metrics

One time series per `(cve × workload × container × resource)` combination:

| Metric | Type | Meaning |
|---|---|---|
| `trivy_vuln_epss_score` | gauge | EPSS exploitation probability, `0.0–1.0` (`0` if the CVE is not in the feed) |
| `trivy_vuln_epss_percentile` | gauge | EPSS percentile, `0.0–1.0` |
| `trivy_vuln_kev` | gauge | `1` if the CVE is in the CISA KEV catalog, otherwise `0` |
| `trivy_vuln_kev_ransomware` | gauge | `1` if the KEV entry is tied to a known ransomware campaign |

Labels: `cve`, `namespace`, `workload`, `workload_kind`, `container`, `resource`, `severity`.

There are also operational self-metrics (`trivy_exporter_feed_last_success_timestamp_seconds`, `…_cache_synced`, `…_build_info`, and others).

### Alert examples (PromQL)

```
# A known-exploited (CISA KEV) CVE is present on a workload
trivy_vuln_kev == 1

# High predicted exploitation probability
trivy_vuln_epss_score > 0.5

# Feed is stale (no successful refresh in 36 hours)
time() - trivy_exporter_feed_last_success_timestamp_seconds > 36 * 3600
```

## Quick start

Requires Trivy Operator running in the cluster (it produces the `VulnerabilityReport` CRDs that this exporter reads).

```
# Run locally with your kubeconfig:
go run ./cmd/trivy-epss-kev-exporter --kubeconfig ~/.kube/config
curl -s localhost:8080/metrics | grep trivy_vuln_

# Or use the container image:
docker run --rm -p 8080:8080 -v ~/.kube/config:/kubeconfig:ro \
  ghcr.io/mstiri/trivy-epss-kev-exporter:latest --kubeconfig /kubeconfig
```

In-cluster, deploy it as a single-replica `Deployment` + `ServiceMonitor` with a **read-only** `ClusterRole` (`get/list/watch` on `vulnerabilityreports`, plus read access to `replicasets` if workload rollup is enabled). The exporter serves `/metrics`, `/healthz`, and `/readyz` on port `8080`.

## Configuration

| Flag (env var) | Default | Description |
|---|---|---|
| `--epss-feed-url` (`EPSS_FEED_URL`) | `https://epss.empiricalsecurity.com/epss_scores-current.csv.gz` | Gzip-compressed bulk CSV feed |
| `--kev-feed-url` (`KEV_FEED_URL`) | `https://raw.githubusercontent.com/cisagov/kev-data/refs/heads/develop/known_exploited_vulnerabilities.json` | GitHub mirror of the CISA KEV catalog |
| `--feed-refresh-interval` | `24h` | Both feeds are recomputed roughly once a day |
| `--feed-http-timeout` | `2m` | HTTP timeout for each feed download |
| `--namespaces` (`NAMESPACES`) | (empty) | Comma-separated allowlist; empty = all namespaces. The keyword `"all"` is **not** accepted |
| `--enable-rollup` | `true` | Roll ReplicaSet workloads up to their owning Deployment (requires read RBAC on `replicasets`) |
| `--enable-ransomware` | `true` | Emit the `trivy_vuln_kev_ransomware` gauge |
| `--metrics-port` / `--metrics-path` | `8080` / `/metrics` | `/healthz` and `/readyz` are served on the same port |
| `--log-level` | `info` | `info`, `debug`, or `trace` |
| `--workers` | `2` | Number of concurrent reconcile workers draining the work queue |
| `--resync-interval` | `0` (off) | Informer resync; this is not how feed changes are picked up, so leave it at `0` |
| `--kubeconfig` | (empty) | Path to a kubeconfig; empty = in-cluster config |
| `--user-agent` | `trivy-epss-kev-exporter/ (+repo URL)` | User-Agent header sent with feed requests |

## Deployment notes

### ServiceMonitor: `honorLabels: true` is required

The exporter sets a `namespace` label on every vulnerability metric to identify the **workload's** namespace (for example `litellm` or `kube-system`). Prometheus Operator also injects a `namespace` label taken from the scrape target's namespace (usually `trivy-system`, where the exporter runs). Without `honorLabels: true`, Prometheus replaces the workload namespace on every series with `trivy-system` (the exporter's value is moved to `exported_namespace`), which breaks per-namespace breakdowns.

## Data sources

- **EPSS**: the FIRST/EPSS daily bulk score feed.
- **CISA KEV**: the Known Exploited Vulnerabilities catalog.

Both are cached in memory and degrade gracefully (the last good data is kept when a refresh fails), and both expose freshness self-metrics.

## Contributing

Contributions are welcome! See [`CONTRIBUTING.md`](CONTRIBUTING.md). For a guided tour of the codebase, start with [`docs/architecture.md`](docs/architecture.md); the full metric reference is in [`docs/metrics.md`](docs/metrics.md); the full design rationale and the stable metric contract are in [`CLAUDE.md`](CLAUDE.md).

## License

Apache-2.0.
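
As an illustration of the in-memory enrichment step described above (feeds loaded in bulk, every CVE lookup local), here is a minimal Go sketch. It is not taken from this repository: the type and function names (`EPSSEntry`, `Enrichment`, `LoadEPSS`, `LoadKEV`, `Enrich`, `loadSamples`) are hypothetical, the CVE IDs and scores are made-up sample data, and it assumes the EPSS CSV has already been gunzipped and uses the `cve,epss,percentile` layout, and that the KEV JSON carries `cveID` and `knownRansomwareCampaignUse` fields.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// EPSSEntry is one row of the EPSS bulk CSV (hypothetical type name).
type EPSSEntry struct{ Score, Percentile float64 }

// Enrichment is the per-CVE result the four gauges could be set from.
type Enrichment struct {
	EPSSScore, EPSSPercentile float64
	KEV, Ransomware           bool
}

// LoadEPSS parses the decompressed EPSS CSV into a map keyed by CVE ID.
func LoadEPSS(r io.Reader) (map[string]EPSSEntry, error) {
	cr := csv.NewReader(r)
	cr.Comment = '#'       // skip the leading "#model_version:..." line
	cr.FieldsPerRecord = 3 // cve,epss,percentile
	out := make(map[string]EPSSEntry)
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		if rec[0] == "cve" {
			continue // header row
		}
		score, err := strconv.ParseFloat(rec[1], 64)
		if err != nil {
			return nil, err
		}
		pct, err := strconv.ParseFloat(rec[2], 64)
		if err != nil {
			return nil, err
		}
		out[rec[0]] = EPSSEntry{Score: score, Percentile: pct}
	}
}

// LoadKEV maps each KEV CVE ID to its ransomware flag; presence means "in KEV".
func LoadKEV(r io.Reader) (map[string]bool, error) {
	var doc struct {
		Vulnerabilities []struct {
			CveID      string `json:"cveID"`
			Ransomware string `json:"knownRansomwareCampaignUse"`
		} `json:"vulnerabilities"`
	}
	if err := json.NewDecoder(r).Decode(&doc); err != nil {
		return nil, err
	}
	out := make(map[string]bool, len(doc.Vulnerabilities))
	for _, v := range doc.Vulnerabilities {
		out[v.CveID] = v.Ransomware == "Known"
	}
	return out, nil
}

// Enrich is a purely local lookup; a CVE missing from EPSS gets score 0.
func Enrich(cve string, epss map[string]EPSSEntry, kev map[string]bool) Enrichment {
	e := epss[cve]
	ransomware, inKEV := kev[cve]
	return Enrichment{e.Score, e.Percentile, inKEV, ransomware}
}

// Made-up sample feeds; the CVE IDs and scores are not real data.
const sampleEPSS = "#model_version:example,score_date:example\n" +
	"cve,epss,percentile\nCVE-2099-0001,0.97,0.99\nCVE-2099-0002,0.01,0.20\n"
const sampleKEV = `{"vulnerabilities":[{"cveID":"CVE-2099-0001","knownRansomwareCampaignUse":"Known"}]}`

func loadSamples() (map[string]EPSSEntry, map[string]bool, error) {
	epss, err := LoadEPSS(strings.NewReader(sampleEPSS))
	if err != nil {
		return nil, nil, err
	}
	kev, err := LoadKEV(strings.NewReader(sampleKEV))
	return epss, kev, err
}

func main() {
	epss, kev, err := loadSamples()
	if err != nil {
		panic(err)
	}
	for _, cve := range []string{"CVE-2099-0001", "CVE-2099-0002", "CVE-2099-0003"} {
		fmt.Printf("%s %+v\n", cve, Enrich(cve, epss, kev))
	}
}
```

Keeping both feeds as plain maps means a scrape never waits on the network, and a failed refresh can leave the previous maps in place, which matches the graceful-degradation behavior the README describes.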
