markgraziano-twlo/vuln-digest

GitHub: markgraziano-twlo/vuln-digest

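A 72-hour rolling vulnerability intelligence aggregation platform that combines multi-source data fusion with AI validation, helping security teams quickly identify and triage emerging threats that are being actively exploited or already have a PoC, in the gaps between scanner updates.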


# Vuln Digest

**Operational vulnerability intelligence platform** that surfaces critical threats with evidence of active exploitation or known proof-of-concept exploits.

Designed to answer: *"What emerging vulnerability intelligence requires immediate attention before our security scanners complete their next assessment cycle?"*

**Owner:** mgraziano-twlo

**Status:** Production

**Last Major Update:** 2026-06-11

## Table of Contents

- [What This Does](#what-this-does)
- [Core Design Principles](#core-design-principles)
- [Architecture Overview](#architecture-overview)
- [Data Sources](#data-sources)
- [Key Features](#key-features)
- [Getting Started](#getting-started)
- [Configuration](#configuration)
- [How It Works](#how-it-works)
- [Troubleshooting](#troubleshooting)
- [Maintenance](#maintenance)
- [Cost & Performance](#cost--performance)
- [Data Quality & Limitations](#data-quality--limitations)
- [Deployment](#deployment)
- [Documentation Index](#documentation-index)

## What This Does

Vuln Digest is **not** a historical vulnerability database. It's a **72-hour rolling intelligence digest** that:

1. **Aggregates** vulnerability data from 10+ open sources (NVD, GitHub, CISA KEV, Hacker News, Reddit, Telegram, etc.)
2. **Classifies** threats into tiers based on evidence of exploitation (Tier 0 = actively exploited, Tier 1 = POC exists, Tier 2 = high-potential risk)
3. **Correlates** tier 0/1 findings with actual cloud infrastructure via Wiz Security Platform
4. **Filters** false positives from discussion feeds using AI threat intelligence extraction
5. **Presents** a triage queue sorted by newest material changes first

**Use Cases:**

- Daily security triage: "What emerged before our scanners detected it?"
- Incident response: "Is this CVE being actively exploited?"
- Patch prioritization: "Which CVEs threaten our infrastructure?"
- Threat hunting: "What 0-days are researchers discussing?"

## Core Design Principles

### 1. Evidence-Based, Never Fabricate

- Every claim includes source URLs, confidence levels, and timestamps
- `insufficient_data` / `Not Exploited` is a valid status; we never guess
- All tier classifications require evidence (keywords + source credibility)

### 2. Operational Focus (Not Archival)

- **Dashboard default:** 72-hour window
- **Database retention:** 7 days (auto-cleanup)
- **Sort order:** Newest material changes first (not by CVSS or tier)
- **Goal:** Show what's new/changed, not everything ever

### 3. Incremental Updates

- **First run:** 72-hour backfill across all sources
- **Subsequent runs:** Fetch only ~1 hour of changes since last run
- **Change detection:** SHA256 fingerprints on material fields (tier, status, CVSS, evidence)
- **Cost control:** AI analysis cached in database (~$1/month, not $64/month)

### 4. Graceful Degradation

- Works without optional API keys (OpenAI, Wiz)
- Source failures are logged but don't block other sources
- Missing data is shown as "pending", not as fake data

### 5. Data Completeness Over Speed

- Full pagination on NVD API (all pages fetched, not just first 100)
- No silent truncation of fetched data
- Validation alerts when bounds are hit (rate limits, API errors)

## Architecture Overview

    ┌──────────────────────── DATA SOURCES ────────────────────────┐
    │                                                              │
    │  CVE Metadata           Exploit Detection      Discussion    │
    │  └─ NVD API             └─ GitHub Code Search  └─ Hacker News│
    │  └─ GitHub Advisories   └─ Exploit-DB          └─ Reddit     │
    │  └─ CISA KEV            └─ Metasploit          └─ Telegram   │
    │  └─ Google OSV          └─ GitHub Gists        └─ RSS Feeds  │
    │                                                              │
    └──────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
    ┌────────────────────── INGESTION LAYER ───────────────────────┐
    │                                                              │
    │  Smart Fetch Strategy:                                       │
    │  • 72h backfill on first run                                 │
    │  • 1h incremental updates                                    │
    │  • Full pagination (no truncation)                           │
    │  • Metadata tracks last fetch time                           │
    │                                                              │
    └──────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
    ┌─────────────────────── AI VALIDATION ────────────────────────┐
    │                                                              │
    │  OpenAI GPT-5.4-mini (Optional, ~$1/month):                  │
    │  • Filters false positives from discussion feeds             │
    │  • Extracts threat intelligence (urgency, affected vendors)  │
    │  • Conservative auto-dismiss (high confidence only)          │
    │  • Cached in database (analyze once, use forever)            │
    │                                                              │
    └──────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
    ┌────────────────────── SQLITE DATABASE ───────────────────────┐
    │                                                              │
    │  observations (CVEs)           non_cve_observations          │
    │  • Fingerprinting              • AI analysis cached          │
    │  • Change detection            • Tier/status classification  │
    │  • 7-day rolling retention     • 7-day rolling retention     │
    │                                                              │
    │  changes (History)             analyst_states (Annotations)  │
    │                                                              │
    └──────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
    ┌────────────────────── PROCESSING LAYER ──────────────────────┐
    │                                                              │
    │  Prioritization Engine:                                      │
    │  • Tier 0: Active exploitation (CISA KEV, vendor confirm)    │
    │  • Tier 1: POC available (GitHub, Exploit-DB, Metasploit)    │
    │  • Tier 2: High-potential risk (EPSS + remote + high-value)  │
    │  • Status: confirmed/suspected/unconfirmed/insufficient_data │
    │                                                              │
    │  Scanner Correlation (Optional, requires Wiz):               │
    │  • Exact CVE match (high confidence)                         │
    │  • Technology match (medium confidence)                      │
    │  • Only tier 0/1 correlated (tier 2 too noisy)               │
    │                                                              │
    └──────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
    ┌────────────────────── FASTAPI SERVICE ───────────────────────┐
    │                                                              │
    │  /api/cves               - CVE feed with filtering           │
    │  /api/non-cve-exploits   - Non-CVE signals                   │
    │  /api/correlations       - Wiz resource matches              │
    │  /api/summary            - Statistics                        │
    │  /api/refresh            - Trigger data refresh              │
    │  /api/refresh/status     - Check refresh progress            │
    │  /                       - Dashboard                         │
    │  /docs                   - Interactive API docs              │
    │                                                              │
    └──────────────────────────────┬───────────────────────────────┘
                                   │
                                   ▼
    ┌───────────────────────── DASHBOARD ──────────────────────────┐
    │                                                              │
    │  Catalogued (CVEs)        Emerging (Non-CVEs)   Correlation  │
    │  • Tier 0/1/2 filtering   • AI-validated        • Resource   │
    │  • Status filtering       • Threat intel        • matches    │
    │  • Time window (1-72h)    • GitHub Gists        • Phase 2    │
    │  • Evidence display       • Discussion feeds                 │
    │                                                              │
    │  About Tab: Explains methodology, sources, trust score       │
    │                                                              │
    └──────────────────────────────────────────────────────────────┘

## Data Sources

### CVE Metadata & Enrichment

| Source | Update Frequency | Auth | What It Provides |
|--------|-----------------|------|------------------|
| **NVD API** | Incremental | None (key optional) | CVE metadata, CVSS scores, descriptions, affected products |
| **GitHub Security Advisories** | Incremental | PAT (free) | Vendor-confirmed vulnerabilities, affected package versions |
| **CISA KEV Catalog** | Daily | None | 1,607+ known exploited vulnerabilities (authoritative tier 0) |
| **Google OSV** | Incremental | None | Package ecosystem vulnerabilities (npm, PyPI, Maven, etc.) |

### Exploit & POC Detection

| Source | Update Frequency | Auth | What It Provides |
|--------|-----------------|------|------------------|
| **GitHub Code Search** | Per-CVE | PAT (free) | Searches for exploit repositories and POC code |
| **Exploit-DB** | Daily mirror | None | Public exploit database |
| **Metasploit Framework** | Check per-CVE | None | High-confidence exploit module detection |

### Curated Research Feeds (High Trust, No AI Needed)

| Source | Why Trusted | What It Provides |
|--------|-------------|------------------|
| **Google Project Zero** | Established research team | Cutting-edge vulnerability research |
| **GitHub Security Lab** | Official GitHub team | Open source security research |
| **Microsoft MSRC** | Vendor advisories | Microsoft security advisories |
| **Google TAG** | Tracks government-backed attackers | Threat Analysis Group intelligence |
| **Palo Alto Unit 42** | Enterprise security research | Threat research and analysis |
| **Snyk Security Research** | Application security focus | Library/framework vulnerabilities |
| **Wiz Research** | Cloud security focus | Cloud and container vulnerabilities |

These feeds are checked for:

1. **CVE mentions** → Enriches existing CVE records with researcher context
2. **Zero-day mentions** → Detects "CVE pending" or "actively exploited" + no CVE ID yet

**No AI validation needed** - Very low false positive rate from trusted sources.

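
The two checks applied to curated feeds can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `classify_feed_item`, the returned dictionary shape, and the exact phrase list are assumptions; only the two quoted phrases come from the description above.

```python
import re

# CVE IDs have the form CVE-<4-digit year>-<sequence of 4 or more digits>.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

# Assumed phrase list, taken from the two examples quoted in this README.
ZERO_DAY_PHRASES = ("cve pending", "actively exploited")


def classify_feed_item(text: str) -> dict:
    """Hypothetical helper: extract CVE mentions, else flag a possible zero-day."""
    # Normalize to upper case and de-duplicate so each CVE is reported once.
    cve_ids = sorted({match.upper() for match in CVE_PATTERN.findall(text)})
    lowered = text.lower()
    # A zero-day candidate needs a trigger phrase AND no CVE ID in the text.
    possible_zero_day = not cve_ids and any(p in lowered for p in ZERO_DAY_PHRASES)
    return {"cve_ids": cve_ids, "possible_zero_day": possible_zero_day}
```

An item that mentions a CVE ID is routed to enrichment of the existing record; an item with a trigger phrase but no CVE ID is treated as a candidate non-CVE signal.
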
### Breach Intelligence (AI-Validated)

| Source | What It Provides | AI Filter |
|--------|------------------|-----------|
| **Google News** | Vendor breach announcements, incident reports | ✅ GPT-5.4-mini |
| **Direct Security RSS** | BleepingComputer, Krebs, SecurityWeek, Dark Reading | ✅ GPT-5.4-mini |

**Tier 1 Vendor Focus:** Monitors 189 critical vendors from ServiceNow Third-Party Risk Management (TPRM)

- Top 20 vendors checked every refresh
- Random sample of 10 additional vendors per cycle
- Focus on confirmed breaches only (no rumors)
- Filters: confirmed_by_vendor OR confirmed_by_evidence OR high-confidence claimed_by_actor

**Hybrid AI Deduplication:** $0.66/month vs $54/month pure AI (99% cost reduction)

- Deterministic rules catch 95% (FREE, instant)
- AI validates borderline cases only (5%, ~$0.66/month)
- Company name fuzzy matching, threat actor correlation, temporal proximity

See `docs/ai/HYBRID_DEDUPLICATION.md` for technical details.

### Community Discussion Feeds (AI-Validated)

| Source | Why Noisy | AI Filter |
|--------|-----------|-----------|
| **Hacker News** | Tool announcements, Show HN posts, news articles | ✅ GPT-5.4-mini |
| **Reddit** (r/netsec, r/blueteam, r/cybersecurity) | Educational content, memes, off-topic | ✅ GPT-5.4-mini |
| **Telegram** (security channels) | Spam, ads, false rumors | ✅ GPT-5.4-mini |
| **GitHub Gists** | Random snippets, tutorials, non-exploit code | ✅ GPT-5.4-mini |

**AI validation extracts:**

- Signal type (active exploitation, POC available, tool announcement, false positive)
- Urgency level (critical/high/medium/low/informational)
- Affected vendors/products/versions
- Threat actor info (researcher name, credibility)
- Recommended actions
- Monitoring keywords

**Conservative auto-dismiss:** Only high-confidence false positives (tool announcements, news coverage) with informational urgency.

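
The hybrid deduplication summarized under Breach Intelligence (deterministic rules first, AI only for borderline pairs) might look roughly like the sketch below. It is an assumption-heavy illustration: it covers only the company-name fuzzy match, uses Python's `difflib` as a stand-in similarity measure, treats the 60-90% borderline band from the Key Features section as thresholds, and all function names (including the `ai_confirms_same` callback) are hypothetical. The real logic is documented in `docs/ai/HYBRID_DEDUPLICATION.md`.

```python
from difflib import SequenceMatcher
from typing import Callable

# Assumed thresholds: >= 0.90 merges outright, 0.60-0.90 is "borderline".
MERGE_THRESHOLD = 0.90
BORDERLINE_THRESHOLD = 0.60


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two company names (0.0-1.0)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def dedup_decision(company_a: str, company_b: str) -> str:
    """Phase 1: deterministic routing; no AI call is made here."""
    score = name_similarity(company_a, company_b)
    if score >= MERGE_THRESHOLD:
        return "merge"
    if score >= BORDERLINE_THRESHOLD:
        return "ask_ai"
    return "separate"


def should_merge(
    company_a: str,
    company_b: str,
    ai_confirms_same: Callable[[str, str], bool],
) -> bool:
    """Phase 2: consult the (hypothetical) AI callback only for borderline pairs."""
    decision = dedup_decision(company_a, company_b)
    if decision != "ask_ai":
        return decision == "merge"
    try:
        return bool(ai_confirms_same(company_a, company_b))
    except Exception:
        # Conservative fallback: keep the signals separate if the AI call fails.
        return False
```

Because most pairs resolve in the deterministic phase, the paid call runs only for the small borderline slice, which is where the cost reduction comes from. A production version would also need the threat actor and temporal proximity rules, which this sketch omits.
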
### Infrastructure Correlation (Optional)

| Source | Auth | What It Provides |
|--------|------|------------------|
| **Wiz Security Platform** | OAuth | Correlates tier 0/1 CVEs with actual cloud resources (AWS, Azure, GCP) |

## Key Features

### 1. Incremental Sync Strategy (Cost & Performance)

**Problem:** Fetching 72 hours of data every hour wastes API quota and time.

**Solution:**

    import math
    from datetime import datetime, timezone

    # First run: 72-hour backfill
    fetch_hours = 72

    # Subsequent runs: ~1 hour since last fetch
    # (assumes get_last_fetch_time returns a timezone-aware UTC datetime)
    now = datetime.now(timezone.utc)
    last_fetch = get_last_fetch_time("nvd")
    hours_since = (now - last_fetch).total_seconds() / 3600
    # 10% buffer, rounded up so a partial hour is never dropped from the window
    fetch_hours = max(1, math.ceil(hours_since * 1.1))
    update_last_fetch_time("nvd")

**Result:**

- First refresh: ~2-3 minutes (full backfill)
- Hourly refreshes: ~30 seconds (incremental)
- 95% reduction in API calls

### 2. AI Analysis Caching (97% Cost Reduction)

**Problem:** Re-analyzing the same signal every refresh costs $64/month.

**Solution:**

- AI analysis stored in database with `ai_analyzed` flag
- Only NEW signals analyzed (never seen before)
- Cached results used for subsequent fetches

**Result:**

- **Without caching:** 50 signals × 72 fetches = 3,600 analyses/month = $64/month
- **With caching:** 50 signals × 1 analysis = 50 analyses/month = $1/month
- **Savings:** 98.6% cost reduction

### 2b. Hybrid AI Deduplication (99% Cost Reduction)

**Problem:** Pure AI deduplication costs $54/month (253 pairwise comparisons).

**Solution:**

- Phase 1: Deterministic rules (company fuzzy match, threat actor, temporal) - FREE, catches 95%
- Phase 2: AI validation for borderline cases only (60-90% similarity) - ~$0.66/month
- Conservative merging: High confidence only, fallback to separate on error

**Result:**

- **Pure AI:** 253 comparisons/refresh × 24 refreshes = 6,072 calls/day = $54/month
- **Hybrid:** ~3 borderline cases/refresh × 24 refreshes = 72 calls/day = $0.66/month
- **Savings:** 99% cost reduction ($54 → $0.66)

**Example:** Oracle + Oracle PeopleSoft + Oracle PeopleSoft customers → 1 consolidated card (1 AI call = $0.0003)

### 3. Change Detection (SHA256 Fingerprinting)

**Problem:** How do we detect when a CVE's tier or evidence changes?

**Solution:**

    import hashlib
    import json

    # Only fields whose change matters to an analyst are fingerprinted
    material_fields = {
        "priority_tier": cve["priority_tier"],
        "status": cve["status"],
        "cvss_score": cve["cvss_score"],
        "exploit_available": cve["exploit_available"],
        "evidence_urls": sorted([e["source_url"] for e in evidence])
    }
    # sort_keys makes the serialization (and therefore the hash) deterministic
    fingerprint = hashlib.sha256(json.dumps(material_fields, sort_keys=True).encode()).hexdigest()
    if fingerprint != existing_fingerprint:
        cve["last_changed"] = now  # Update timestamp

**Result:**

- Dashboard sorts by `last_changed` (newest material changes first)
- Changes tracked in `changes` table for audit trail
- Immaterial changes (metadata updates) don't trigger false "new" status

### 4. Full Pagination (No Data Loss)

**Problem (Fixed 2026-05-30):** NVD returned 194 CVEs but the code only fetched the first 100.

**Solution:**

    import asyncio

    # Runs inside an async function (the loop uses await)
    all_results = []
    start_index = 0
    while True:
        response = await fetch(start_index=start_index)
        results = response.get("vulnerabilities", [])
        all_results.extend(results)
        total = response.get("totalResults", 0)
        # Stop once everything is fetched, or if a page comes back empty
        if len(all_results) >= total or not results:
            break
        start_index += len(results)
        await asyncio.sleep(1.0)  # Rate limit respect

**Result:**

- 100% data completeness (all pages fetched)
- No silent truncation
- Validation alerts if API returns unexpected total

### 5. Conservative AI Auto-Dismiss

**Problem:** We don't want to miss real threats, but we also don't want 80% noise.

**Solution:**

    def should_auto_dismiss(threat_intel):
        return (
            threat_intel.confidence == "high" and
            threat_intel.urgency == "informational" and
            threat_intel.signal_type in ["false_positive", "tool_announcement", "news_coverage"]
        )

**Result:**

- Only dismisses high-confidence false positives
- Real threats always kept for analyst review
- Ambiguous signals kept (better to show than miss)

## Getting Started

### Prerequisites

- **Python 3.11+**
- **SQLite** (included with Python)
- **Git** (for cloning)

### Local Development Setup

    # 1. Clone the repository
    git clone https://github.com/twilio-internal/vuln-digest.git
    cd vuln-digest

    # 2. Create virtual environment
    python3 -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate

    # 3. Install dependencies
    pip install -r requirements.txt

    # 4. Set environment variables (optional, see Configuration)
    export GITHUB_TOKEN="ghp_..."          # For higher rate limits
    export OPENAI_API_KEY="sk-proj-..."    # For AI validation
    export WIZ_CLIENT_ID="..."             # For scanner correlation
    export WIZ_CLIENT_SECRET="..."         # For scanner correlation

    # 5. Run the application
    uvicorn src.main:app --host 127.0.0.1 --port 8000 --reload

    # 6. Access the dashboard
    open http://127.0.0.1:8000

    # 7. Trigger initial data refresh (optional, auto-runs on startup)
    curl -X POST http://127.0.0.1:8000/api/refresh

### First Run Behavior

1. **Database created** at `data/vuln_digest.db` (auto-created if missing)
2. **72-hour backfill** runs automatically on startup (~2-3 minutes)
3. **Metadata file** created at `data/digest_metadata.json` (tracks last fetch times)
4. **Dashboard accessible** at http://127.0.0.1:8000 immediately (shows refreshing state)

### Verifying It Works

    # Check refresh status
    curl http://localhost:8000/api/refresh/status | jq '.state'
    # Should show: "idle" or "completed" after initial refresh

    # Check CVE count
    curl http://localhost:8000/api/summary | jq '.total_cves'
    # Should show: 20-50 CVEs (typical 72h window)

    # Check source health
    curl http://localhost:8000/api/summary | jq '.source_health'
    # Should show: "ok" for most sources

    # Check database
    sqlite3 data/vuln_digest.db "SELECT COUNT(*) FROM observations"
    # Should match or exceed CVE count

## Configuration

All configuration is via **environment variables**. No `.env` file is needed (but one is supported if present).

### Required (None - Works Out of Box)

The application runs with **no required configuration**. All API keys are optional.

### Optional: GitHub API (Recommended)

**Purpose:** Higher rate limits (5k/hr vs 60/hr) and POC search capability.

    export GITHUB_TOKEN="ghp_YOUR_TOKEN_HERE"
    # OR
    export GH_TOKEN="ghp_YOUR_TOKEN_HERE"
    # OR use `gh auth login` (GitHub CLI)

**How to get:**

1. Go to https://github.com/settings/tokens
2. Generate new token (classic)
3. Scopes needed: `public_repo` (read-only)

**Fallback:** Uses `gh auth token` or runs without auth (60 req/hr).

### Optional: OpenAI API (Recommended for Non-CVE Feeds)

**Purpose:** AI threat intelligence extraction for discussion feeds (Hacker News, Reddit, Telegram).

    export OPENAI_API_KEY="sk-proj-YOUR_KEY_HERE"

**How to get:**

1. Go to https://platform.openai.com/api-keys
2. Create new secret key
3. Copy and set as environment variable

**Cost:** ~$1/month for typical usage (50 signals/day)

**Fallback:** Non-CVE signals still appear, but without AI analysis (no auto-dismiss, no threat intelligence extraction).

### Optional: Wiz Scanner Integration

**Purpose:** Correlate tier 0/1 CVEs with actual cloud resources.

    export WIZ_CLIENT_ID="your-wiz-client-id"
    export WIZ_CLIENT_SECRET="your-wiz-client-secret"
    export WIZ_API_ENDPOINT="https://api.us1.app.wiz.io/graphql"  # Default

**How to get:**

1. Contact Wiz admin for service account
2. See `docs/integrations/WIZ_INTEGRATION.md` for full setup
3. See `docs/setup/OTK_SECRETS_SETUP.md` for deployment secrets

**Fallback:** Application works without Wiz (no scanner correlation tab).

### Optional: NVD API Key (Higher Rate Limits)

**Purpose:** 50 requests per 30 seconds (vs 5 without key).

    export NVD_API_KEY="your-nvd-api-key"

**How to get:**

1. Request key at https://nvd.nist.gov/developers/request-an-api-key
2. Free, usually approved in 1-2 business days

**Fallback:** Works fine without key for typical usage (<5 requests per 30s).

## How It Works

### Data Flow

    1. INGEST (src/ingestion/*.py)
       ├─ Fetch from all sources (NVD, GitHub, CISA, HN, etc.)
       ├─ Deduplicate by URL/CVE ID
       └─ Classify as CVE or non-CVE signal

    2. ENRICH (src/processing/prioritization.py)
       ├─ Extract evidence keywords (exploit, POC, weaponized, etc.)
       ├─ Assign tier (0/1/2/unclassified)
       ├─ Assign status (confirmed/suspected/insufficient_data)
       └─ Calculate confidence (high/medium/low)

    3. AI VALIDATE (src/ai/signal_validator.py) [Non-CVE only]
       ├─ Check if signal already analyzed (database lookup)
       ├─ If new: Call OpenAI GPT-5.4-mini with structured output
       ├─ Extract threat intelligence (urgency, affected vendors, etc.)
       ├─ Auto-dismiss high-confidence false positives
       └─ Cache result in database

    4. CORRELATE (src/processing/scanner_correlation.py) [Tier 0/1 only]
       ├─ Query Wiz for CVE exact match (high confidence)
       ├─ Query Wiz for package/version match (medium confidence)
       └─ Return affected cloud resources

    5. PERSIST (src/storage.py)
       ├─ Fingerprint material fields (SHA256)
       ├─ Compare to existing fingerprint
       ├─ Update last_changed if different
       ├─ Store in database (observations or non_cve_observations)
       └─ Track in changes table

    6. SERVE (src/main.py)
       ├─ Load from database on request
       ├─ Apply filters (tier, status, time window)
       ├─ Sort by last_changed DESC
       └─ Return JSON to frontend

    7. DISPLAY (dashboard/index.html)
       ├─ Render cards with tier badges
       ├─ Show AI threat intelligence section
       ├─ Display evidence with source URLs
       └─ Sort by newest material changes first

### Tier Classification Logic

**Location:** `src/processing/prioritization.py`

#### Tier 0: Actively Exploited

**Criteria:**

- CISA KEV listing (authoritative)
- Keywords: "actively exploited", "in the wild", "exploitation observed"
- Confidence: High (CISA KEV) or Medium (trusted researcher)

**Status:**

- `confirmed` if high confidence
- `suspected` if medium confidence

**Example:**

    CVE-2024-12345
    Tier: 0
    Status: confirmed
    Evidence: CISA KEV listing (2024-05-28), Vendor advisory confirms active exploitation

#### Tier 1: POC Available

**Criteria:**

- Public POC code (GitHub, Exploit-DB, Metasploit)
- Keywords: "proof of concept", "exploit code", "metasploit module", "poc available"
- Confidence: High (Metasploit) or Medium (unverified GitHub repo)

**Status:**

- `confirmed` if high confidence (Metasploit, Exploit-DB with verified badge)
- `suspected` if medium confidence (random GitHub repo)

**Example:**

    CVE-2024-67890
    Tier: 1
    Status: suspected
    Evidence: GitHub POC repository (medium confidence), No vendor confirmation yet

#### Tier 2: High-Potential Threat Risk

**Criteria (ALL must be true):**

1. **EPSS ≥ 0.10** (at least a 10% predicted probability of exploitation within the next 30 days)
2. **Remotely exploitable** (AV:N, PR:None/Low, UI:None)
3. **High-value weakness** (RCE, Auth Bypass, SSRF, Path Traversal)
4. **Critical infrastructure** (VPNs, Firewalls, Identity, Virtualization, Ubiquitous Libraries)

**Status:**

- `suspected` (EPSS is an ML prediction, not confirmed exploitation)

**Example:**

    CVE-2024-99999
    Tier: 2
    Status: suspected
    CVSS: 9.8 (AV:N/PR:N/UI:N)
    EPSS: 0.42 (top 1%)
    Product: Fortinet FortiOS SSL-VPN
    Weakness: Authentication Bypass (CWE-287)

#### Unclassified

**Criteria:**

- CVSS < 7.0
- No exploitation evidence
- Not critical infrastructure
- Low EPSS score

**Status:**

- `insufficient_data`

### AI Threat Intelligence Extraction

**Model:** OpenAI GPT-5.4-mini

**Cost:** ~$0.0006 per signal (~$1/month)

**Location:** `src/ai/signal_validator.py`

**Input (Normalized Signal):**

    {
      "signal_id": "NO-CVE-2026-05-29-HN-1E1D",
      "title": "Microsoft 0-day feud escalates...",
      "source": "Hacker News",
      "url": "https://...",
      "published_date": "2026-05-29T14:37:41Z",
      "full_text": "...",
      "matched_keywords": ["0-day", "exploit", "Windows"],
      "engagement_metrics": {"score": 342, "comments": 87}
    }

**Output (Threat Intelligence):**

    {
      "signal_type": "exploit_release_announced",
      "urgency": "high",
      "confidence": "high",
      "affected_vendors": ["Microsoft"],
      "affected_products": ["Windows"],
      "affected_versions": [],
      "threat_actor": "Nightmare Eclipse",
      "threat_actor_credibility": "established",
      "exploit_release_date": "July 14",
      "patch_available": false,
      "attack_vector": "network",
      "cve_pending": false,
      "poc_url": null,
      "vendor_advisory_url": "https://...",
      "recommended_actions": [
        "Audit Windows deployments and patch status",
        "Prepare incident response procedures for July 14",
        "Monitor Microsoft Security Response Center",
        "Heightened Windows endpoint monitoring starting July 13"
      ],
      "monitoring_keywords": ["Nightmare Eclipse", "Microsoft", "Windows", "July 14"],
      "key_findings": "Credible researcher (6 prior 0-days) threatens Windows exploit on July 14. High likelihood given track record and public feud context.",
      "reasoning": "HIGH urgency: Established researcher with proven track record (6 Windows 0-days) announces specific release date (July 14). Public feud increases likelihood of actual release."
    }

**Auto-Dismiss Logic:**

    # Only dismiss if ALL conditions met:
    if (
        threat_intel.confidence == "high" and
        threat_intel.urgency == "informational" and
        threat_intel.signal_type in ["false_positive", "tool_announcement", "news_coverage"]
    ):
        signal["analyst_status"] = "dismissed"
        signal["ai_auto_dismissed"] = True

## Troubleshooting

### Issue: No CVEs Appearing

**Symptoms:**

- Dashboard shows "No vulnerabilities match this filter"
- `/api/summary` returns `total_cves: 0`

**Diagnosis:**

    # Check if refresh completed
    curl http://localhost:8000/api/refresh/status | jq '.state'
    # Should be: "idle" or "completed"

    # Check source health
    curl http://localhost:8000/api/summary | jq '.source_health'
    # Look for "error" status on any source

    # Check database
    sqlite3 data/vuln_digest.db "SELECT COUNT(*) FROM observations"
    # Should be > 0

**Common Causes:**

1. **Refresh still running** - Wait 2-3 minutes for initial backfill
2. **NVD API rate limit** - Set `NVD_API_KEY` env var
3. **Network issues** - Check firewall/proxy settings
4. **Database locked** - Close any other SQLite connections

**Fix:**

    # Trigger manual refresh
    curl -X POST http://localhost:8000/api/refresh

    # Check logs
    tail -f /tmp/vuln_digest.log  # Or wherever you're logging

### Issue: AI Threat Intelligence Not Showing

**Symptoms:**

- Non-CVE cards missing "🎯 Threat Intel" section
- Product field shows broken data (dates, researcher names)
- Database shows `ai_analyzed=0`

**Diagnosis:**

    # Check if OpenAI key set
    env | grep OPENAI_API_KEY

    # Check API response (URL quoted so the shell does not treat "?" as a glob)
    curl "http://localhost:8000/api/non-cve-exploits?limit=1" | jq '.exploits[0] | has("ai_threat_intel")'
    # Should return: true

**Common Causes:**

1. **No OpenAI API key** - AI validation skipped
2. **Invalid API key** - Check key is correct
3. **Signal already in DB before AI added** - Needs re-analysis

**Fix:**

    # Set API key
    export OPENAI_API_KEY="sk-proj-..."

    # Restart server
    pkill -f uvicorn
    uvicorn src.main:app --host 127.0.0.1 --port 8000

    # Force re-analysis (delete signal from DB)
    sqlite3 data/vuln_digest.db "DELETE FROM non_cve_observations WHERE ai_analyzed = 0"

    # Trigger refresh
    curl -X POST http://localhost:8000/api/refresh

### Issue: Broken Product Extraction (Dates/Names)

**Example:** `PRODUCT: Microsoft • June • Childs • Windows`

**Root Cause:** Signal was never analyzed by AI (missing `ai_threat_intel` field).

**Fix:** See "AI Threat Intelligence Not Showing" above.

### Issue: Scanner Correlation Not Working

**Symptoms:**

- "Scanner Correlation" tab shows "No Wiz credentials configured"
- `/api/correlations` returns empty array

**Diagnosis:**

    # Check if Wiz creds set
    env | grep WIZ_CLIENT_ID
    env | grep WIZ_CLIENT_SECRET

    # Test Wiz auth
    curl -X POST "$WIZ_API_ENDPOINT" \
      -H "Content-Type: application/json" \
      -d '{"grant_type":"client_credentials","client_id":"'$WIZ_CLIENT_ID'","client_secret":"'$WIZ_CLIENT_SECRET'","audience":"wiz-api"}'
    # Should return: {"access_token": "..."}

**Common Causes:**

1. **No Wiz credentials** - Scanner correlation disabled
2. **Invalid credentials** - Check client ID/secret
3. **Wrong API endpoint** - Check region (us1, eu1, etc.)

**Fix:** # Set Wiz credentials export WIZ_CLIENT_ID="your-client-id" export WIZ_CLIENT_SECRET="your-client-secret" export WIZ_API_ENDPOINT="https://api.us1.app.wiz.io/graphql" # Or your region # Restart server pkill -f uvicorn uvicorn src.main:app --host 127.0.0.1 --port 8000 # Trigger correlation refresh curl -X POST http://localhost:8000/api/correlations/refresh ### Issue: High Refresh Times (>5 minutes) **Symptoms:** - Initial refresh takes >5 minutes - Hourly refreshes take >2 minutes **Diagnosis:** # Check NVD pagination # Look for logs like: "NVD: Fetching page 1/5 (100 results)" # Check GitHub rate limits curl -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/rate_limit | jq '.rate' # Check database size ls -lh data/vuln_digest.db **Common Causes:** 1. **No NVD API key** - Rate limited to 5 requests per 30s 2. **Full backfill every refresh** - Metadata file missing 3. **Large database** - Needs cleanup (>100MB) **Fix:** # Set NVD API key export NVD_API_KEY="your-nvd-key" # Check metadata file exists ls -la data/digest_metadata.json # If missing, first refresh will be slow (72h backfill) # Clean up old database (if >7 days old data) sqlite3 data/vuln_digest.db "DELETE FROM observations WHERE last_seen < datetime('now', '-7 days')" sqlite3 data/vuln_digest.db "VACUUM" ### Issue: Memory/CPU Usage High **Symptoms:** - Process using >1GB RAM - CPU at 100% during refresh **Diagnosis:** # Check process stats ps aux | grep uvicorn # Check database size du -sh data/vuln_digest.db # Check number of signals sqlite3 data/vuln_digest.db "SELECT COUNT(*) FROM observations" sqlite3 data/vuln_digest.db "SELECT COUNT(*) FROM non_cve_observations" **Common Causes:** 1. **Database not cleaning up** - Records >7 days old still present 2. **Too many AI analyses** - OpenAI API calls piling up 3. 
**Memory leak in refresh loop** - Restart server **Fix:** # Clean up old data sqlite3 data/vuln_digest.db "DELETE FROM observations WHERE last_seen < datetime('now', '-7 days')" sqlite3 data/vuln_digest.db "DELETE FROM non_cve_observations WHERE last_seen < datetime('now', '-7 days')" sqlite3 data/vuln_digest.db "VACUUM" # Restart server pkill -f uvicorn uvicorn src.main:app --host 127.0.0.1 --port 8000 ## Maintenance ### Daily Tasks **None required** - Application is fully automated. ### Weekly Tasks 1. **Check source health** (2 minutes) curl http://localhost:8000/api/summary | jq '.source_health' # Look for "error" status on any source 2. **Review auto-dismissed signals** (5 minutes) # Check if real threats were dismissed sqlite3 data/vuln_digest.db "SELECT signal_id, title FROM non_cve_observations WHERE ai_auto_dismissed = 1 ORDER BY last_seen DESC LIMIT 10" ### Monthly Tasks 1. **Review AI cost** (2 minutes) - Check OpenAI usage dashboard: https://platform.openai.com/usage - Should be ~$1/month for typical usage - If >$5/month, investigate (possible spam/loop) 2. **Review database size** (2 minutes) du -sh data/vuln_digest.db # Should be <50MB for 7-day rolling window # If >100MB, run VACUUM 3. **Check for stale data** (2 minutes) # Verify 7-day cleanup is working sqlite3 data/vuln_digest.db "SELECT COUNT(*), MIN(last_seen), MAX(last_seen) FROM observations" # MIN should be within 7 days ### Quarterly Tasks 1. **Review trust score** (30 minutes) - See `docs/fixes/TRUST_AUDIT_REPORT.md` - Spot-check 10-20 CVEs for accuracy - Verify tier classifications are correct 2. 
**Update dependencies** (30 minutes)

    pip list --outdated
    # Review and update requirements.txt
    pip install -r requirements.txt

### On-Demand Tasks

#### Backfill Historical Data

    # Delete metadata to force 72h backfill
    rm data/digest_metadata.json
    # Trigger refresh
    curl -X POST http://localhost:8000/api/refresh

#### Force Re-Analysis of Non-CVE Signals

    # Mark all signals as unanalyzed
    sqlite3 data/vuln_digest.db "UPDATE non_cve_observations SET ai_analyzed = 0"
    # Trigger refresh (will re-analyze all)
    curl -X POST http://localhost:8000/api/refresh

#### Reset Database (Nuclear Option)

    # Backup first
    cp data/vuln_digest.db data/vuln_digest.db.backup
    # Delete database
    rm data/vuln_digest.db
    rm data/digest_metadata.json
    # Restart server (will create fresh DB)
    pkill -f uvicorn
    uvicorn src.main:app --host 127.0.0.1 --port 8000

## Cost & Performance

### API Costs

| Service | Cost | Usage | Monthly Cost |
|---------|------|-------|--------------|
| **OpenAI GPT-5.4-mini** | $0.15/1M input + $0.60/1M output | ~50 signals/day | ~$1 |
| **GitHub API** | Free | 5k requests/hr (with token) | $0 |
| **NVD API** | Free | 50 requests/30s (with key) | $0 |
| **Wiz API** | Varies | Depends on Wiz contract | $0 (included) |

**Total:** ~$1/month (just OpenAI)

### Performance Metrics

| Operation | First Run | Incremental | Notes |
|-----------|-----------|-------------|-------|
| **Initial backfill** | ~2 minutes | N/A | 72-hour window, all sources |
| **Hourly refresh** | N/A | 30-60 seconds | 1-hour incremental |
| **Breach intel processing** | ~15 seconds | N/A | 23 signals (was 160s before parallelization) |
| **AI analysis** | ~2 seconds/signal | N/A | Cached after first analysis, max 15 concurrent |
| **Scanner correlation** | ~5 seconds | N/A | Only tier 0/1 CVEs |
| **Database size** | N/A | 20-50MB | 7-day rolling window |

**Recent Optimizations (2026-06-11):**

- **Phase 1:** Early exit guards (EPSS, Wiz) - skip empty operations
- **Phase 2:** Parallel breach processing - 160s → 15s (10x speedup)
- **Phase 3:** Concurrency limits - max 10 HTTP, 15 AI (prevents rate limiting)

### Token Usage (OpenAI)

- **Input:** ~2,000 tokens/signal (system prompt + signal text)
- **Output:** ~450 tokens/signal (structured threat intelligence)
- **Cost per signal:** ~$0.0006
- **Monthly (50 signals/day):** $0.90

## Data Quality & Limitations

### What Works Well

| Area | Current State | Notes |
|------|---------------|-------|
| **Data Completeness** | Pagination implemented | All CVE pages fetched, no truncation, validation on API mismatches |
| **Deduplication** | Multi-source merging | Evidence preserved across NVD/GitHub/CISA, hybrid AI for breach intel |
| **Classification** | Evidence-based tiers | Keywords from trusted sources, confidence levels tracked |
| **Change Tracking** | SHA256 fingerprinting | Detects material changes (tier/status/evidence), immaterial updates ignored |
| **Performance** | Incremental sync | 72h initial, ~1h updates, parallelized breach processing (15s vs 160s) |
| **Cost** | AI caching + hybrid dedup | ~$1/month OpenAI (97% caching + 99% dedup savings) |
| **Security** | SSL verification, rate limits | Certificate validation enabled, concurrency bounded (10 HTTP, 15 AI) |

### Known Limitations

1. **No alerting** - Tier 0 CVEs don't trigger notifications (manual review only)
2. **No Grafana dashboards** - Operational metrics not visualized
3. **Single-region Wiz** - Only US1 tested
4. **EPSS dependency** - Tier 2 relies on FIRST.org uptime

### Past Issues (Fixed)

- ✅ **2026-05-30:** Fixed arbitrary data truncation (50% CVE loss)
- ✅ **2026-05-30:** Fixed missing NVD pagination (94+ CVEs missed)
- ✅ **2026-05-31:** Fixed redundant AI analysis (97% cost waste)
- ✅ **2026-05-31:** Fixed wrong tier badges (display bug)
- ✅ **2026-06-11:** Fixed SSL certificate verification bypass (CRITICAL security issue)
- ✅ **2026-06-11:** Added concurrency limits (prevents rate limiting/blocking)
- ✅ **2026-06-11:** Implemented hybrid AI deduplication ($54/month → $0.66/month)
- ✅ **2026-06-11:** 10x breach processing speedup (160s → 15s)

See `docs/fixes/TRUST_AUDIT_REPORT.md` and `docs/security/SECURITY_REVIEW_2026-06-12.md` for full details.

## Deployment

This application deploys to **Twilio's One Twilio Kubernetes (OTK)** platform via GitOps.

### Repositories

| Repo | Purpose |
|------|---------|
| [vuln-digest](https://github.com/twilio-internal/vuln-digest) | Application code, Helm chart, Buildkite pipeline |
| [vuln-digest-deploy](https://github.com/twilio-internal/vuln-digest-deploy) | Rendered Kubernetes manifests (Argo CD watches this) |

### Deployment Flow

    Push code → Buildkite → Docker build → Helm render → PR to deploy repo → Auto-merge → Argo CD sync → K8s deploy

### Key Resources

- **Argo CD:** https://argo-cd.mgmt.otk.twilioinfra.com/applications
- **Buildkite:** https://www.buildkite.com/twilio/twilio-internal-vuln-digest
- **Grafana Logs:** https://g-61605d934d.grafana-workspace.us-east-1.amazonaws.com/d/QLMdzcw4k/view-app-logs
- **Datadog Dashboard:** https://app.datadoghq.com/dashboard/hjs-3ik-qhy

### Secrets Management

Use OTK sealed secrets:

    otk secret create vuln-digest-secrets \
      --from-literal=GITHUB_TOKEN="ghp_..." \
      --from-literal=OPENAI_API_KEY="sk-proj-..." \
      --from-literal=WIZ_CLIENT_ID="..." \
      --from-literal=WIZ_CLIENT_SECRET="..." \
      --namespace your-namespace

See `docs/setup/OTK_SECRETS_SETUP.md` for full setup.
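
A quick way to confirm that a deployment actually received the four secrets named in the `otk secret create` example is a pre-flight check like the sketch below. It assumes the secrets reach the application as environment variables with the same names and that all four are mandatory; neither assumption is confirmed here. `EXPECTED_SECRETS` and `missing_secrets` are hypothetical names, not part of the codebase.

```python
import os

# Names copied from the `otk secret create` example; treating all four as
# mandatory is an assumption made for this sketch.
EXPECTED_SECRETS = (
    "GITHUB_TOKEN",
    "OPENAI_API_KEY",
    "WIZ_CLIENT_ID",
    "WIZ_CLIENT_SECRET",
)


def missing_secrets(env=None):
    """Return the expected secret names that are unset or blank in `env`.

    `env` defaults to the process environment; passing a plain dict makes
    the check easy to exercise without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in EXPECTED_SECRETS if not env.get(name, "").strip()]
```

Calling `missing_secrets()` at startup and refusing to continue when the list is non-empty would surface a misnamed or absent secret immediately, rather than on the first failed API call.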
### Multi-Environment Setup

To add staging/production:

1. Create values file: `chart/values-{env}.yaml`
2. Add render-manifests step in `.buildkite/pipeline.yaml`
3. Deploy to different namespace per environment

See [OTK Multi-Environment Pipeline](https://internal-product-docs.twilio.com/docs/one-twilio-kubernetes-platform/tutorials/deployments/multi-environment-pipeline)

## Documentation Index

### Core Documentation

- **CLAUDE.md** - AI assistant guidance (design principles, conventions, past bugs)
- **docs/project/ARCHITECTURE.md** - System architecture deep-dive
- **docs/project/SCHEMA.md** - Database schema and data models
- **docs/project/PROJECT_PLAN.md** - Original project plan and milestones

### Setup Guides

- **docs/setup/ai-validation-setup.md** - OpenAI API setup
- **docs/setup/reddit-setup.md** - Reddit API setup
- **docs/setup/telegram-setup.md** - Telegram bot setup
- **docs/setup/OTK_SECRETS_SETUP.md** - Kubernetes secrets management

### Integration Guides

- **docs/integrations/WIZ_INTEGRATION.md** - Wiz Security Platform integration
- **docs/integrations/SCANNER_CORRELATION.md** - Scanner correlation architecture
- **docs/integrations/SOURCE_COVERAGE.md** - Data source inventory

### UI/UX Documentation

- **docs/ui/DASHBOARD_PRODUCT_SPEC.md** - Dashboard design specification
- **docs/ui/CARD_LAYOUT_REDESIGN.md** - CVE card layout design
- **docs/ui/NON_CVE_CARD_IMPROVEMENTS.md** - Non-CVE card fixes
- **docs/ui/UI_INSPIRATION.md** - Design inspiration and references

### Fixes & Audits

- **docs/fixes/TRUST_AUDIT_REPORT.md** - Comprehensive system audit (2026-05-30)
- **docs/fixes/NON_CVE_PERSISTENCE_FIX.md** - AI caching implementation
- **docs/fixes/AI_MODEL_UPDATE.md** - GPT-4o → GPT-5.4-mini migration
- **docs/fixes/AI_COST_ANALYSIS_2026.md** - Detailed cost breakdown
- **docs/security/SECURITY_REVIEW_2026-06-12.md** - Security & performance review (2026-06-11)
- **docs/ai/HYBRID_DEDUPLICATION.md** - Hybrid AI deduplication architecture (2026-06-11)

## Quick Reference

### Common Commands

    # Start local development server
    uvicorn src.main:app --host 127.0.0.1 --port 8000 --reload

    # Trigger manual refresh
    curl -X POST http://localhost:8000/api/refresh

    # Check refresh status
    curl http://localhost:8000/api/refresh/status | jq '.state'

    # View CVE summary
    curl http://localhost:8000/api/summary | jq '.'

    # Check database size
    du -sh data/vuln_digest.db

    # Clean up old data
    sqlite3 data/vuln_digest.db "DELETE FROM observations WHERE last_seen < datetime('now', '-7 days'); VACUUM;"

    # View source health
    curl http://localhost:8000/api/summary | jq '.source_health'

    # Check OpenAI cost
    # Visit: https://platform.openai.com/usage

### Key Files

- `src/main.py` - FastAPI application entry point
- `src/storage.py` - Database operations
- `src/ai/signal_validator.py` - AI threat intelligence extraction
- `src/processing/prioritization.py` - Tier classification logic
- `src/ingestion/` - Data source integrations
- `dashboard/index.html` - Frontend UI
- `data/vuln_digest.db` - SQLite database
- `data/digest_metadata.json` - Last fetch timestamps

### Support Contacts

- **Owner:** mgraziano-twlo
- **Team:** Twilio Security
- **Repo:** https://github.com/twilio-internal/vuln-digest
- **Issues:** https://github.com/twilio-internal/vuln-digest/issues

**Last Updated:** 2026-06-11
**Version:** 2.1
**Status:** Production

## Development Quickstart with Makefile

A Makefile is provided for common development tasks:

    # Show all available commands
    make help

    # Start development server (with auto-reload)
    make dev

    # Start production server (background)
    make server

    # Check server status
    make status

    # View logs
    make logs

    # Stop server
    make stop

    # Restart server
    make restart

    # Trigger data refresh
    make refresh

    # Database operations
    make db-backup    # Backup database
    make db-stats     # Show statistics
    make db-reset     # Reset database (WARNING: deletes all data)

    # Code quality
    make lint         # Run linters
    make format       # Format code
    make test         # Run tests

    # Deployment
    make ngrok        # Expose local server via ngrok
    make deploy-dev   # Deploy to dev (OTK)

    # Utilities
    make clean        # Clean cache and temp files

### Quick Start

    # First time setup
    make install

    # Start server
    make dev

    # In another terminal, check status
    make status

    # Access dashboard at http://localhost:8000
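
Once the local server is running, the two refresh endpoints from Common Commands (`POST /api/refresh` and `GET /api/refresh/status`) can also be driven from a script. The sketch below uses only the standard library. The `state` field comes from the `jq '.state'` example, but the value `"running"` used to mean "still busy" is an assumption, as are the helper names `trigger_refresh`, `fetch_state`, and `wait_until_done`; check the real status payload before relying on it.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # local server from the Quick Start


def trigger_refresh(base_url=BASE_URL):
    """POST /api/refresh (same call as the curl example); return the HTTP status."""
    req = urllib.request.Request(base_url + "/api/refresh", data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def fetch_state(base_url=BASE_URL):
    """GET /api/refresh/status and return the value of its `state` field."""
    with urllib.request.urlopen(base_url + "/api/refresh/status", timeout=10) as resp:
        return json.load(resp).get("state")


def wait_until_done(get_state=fetch_state, busy_states=("running",),
                    interval=2.0, max_polls=60, sleep=time.sleep):
    """Poll until the state leaves `busy_states`; return the last state seen.

    `busy_states` holds assumed values. `get_state` is called at most
    `max_polls` times, so the loop always terminates.
    """
    state = get_state()
    for _ in range(max_polls - 1):
        if state not in busy_states:
            break
        sleep(interval)
        state = get_state()
    return state
```

A typical sequence would be `trigger_refresh()` followed by `wait_until_done()`. The injectable `get_state` and `sleep` parameters exist so the polling loop can be tested without a running server.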