hasamba/DFIR-Companion

GitHub: hasamba/DFIR-Companion

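A locally run, AI-assisted digital-forensics triage tool that automatically turns investigation screenshots and artifacts into a forensic timeline, findings, IOCs, and incident reports.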

Stars: 4 | Forks: 1


# DFIR Companion

[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](LICENSE)

A localhost digital-forensics / incident-response companion. A browser extension captures screenshots of your investigation (Velociraptor, EDR/SIEM dashboards, Security Onion, Splunk4DFIR, VolWeb, VirusTotal, etc.) as evidence; a local server stores them, runs **windowed AI vision analysis** into an accumulating per-case investigation state, and serves a **live dashboard** plus exportable reports. Everything runs on your machine — the companion binds to `127.0.0.1` only, evidence stays on disk, and the AI provider is yours to choose.

## Screenshots

### Executive Summary & Recommended Next Steps

AI-generated case summary and AI-prioritized remediation actions (Critical → Medium), each with rationale and a pointer to the finding or artifact it came from.

*Screenshot: DFIR Companion — AI executive summary and prioritized remediation next steps*

### Forensic Timeline

31 corroborated events from Chainsaw · THOR · Suricata · CrowdStrike Falcon — severity filters, per-row triage tags (`initial-access`, `c2-comms`, `key-evidence`, …), import change tracking (+19 new events banner with expandable diff), and analyst star / bulk-action controls.

*Screenshot: DFIR Companion — forensic timeline with 31 events, severity filters, triage tags, and import tracking*

### Attack Path Narrative · MITRE ATT&CK Kill Chain · Findings

Full attacker-path write-up from initial access to ransomware attempt, an interactive kill chain (click a tactic to expand its events), and the top findings with confidence scores.

*Screenshot: DFIR Companion — attack path narrative, MITRE ATT&CK kill chain, and findings*

### Findings

8 AI-generated findings (2 Critical · 2 High · 2 Medium · 1 Low) — each with a confidence %, analyst triage tags, MITRE technique links, and a synthesis freshness diff (+8 new since last run).

*Screenshot: DFIR Companion — findings with confidence scores, analyst triage tags, and MITRE ATT&CK links*

### Evidence Chain Graph

Process trees + lateral movement across DC01, FS01, and WKSTN-JSMITH stitched into one causal attack graph. Derived deterministically from importer-populated fields — no AI, no cost, runs offline.

*Screenshot: DFIR Companion — evidence chain graph with process trees and lateral movement across hosts*

### IOCs with Threat-Intel Enrichments

15 indicators (IPs · domains · hashes · files · processes · URL) enriched against VirusTotal, AbuseIPDB, ThreatFox, URLhaus, and MalwareBazaar — verdict badges, detection scores, `NEW` import highlights, and analyst `confirmed-malicious` / `pivot-point` triage labels.

*Screenshot: DFIR Companion — IOCs with VirusTotal, AbuseIPDB, ThreatFox, URLhaus, and MalwareBazaar enrichments*

### Customer Exposure & Compromised Assets · IoC Graph

**Customer Exposure** (top): credential-leak check for the victim org's own domains and emails against HIBP / DeHashed / Shodan — breach names, exposed services, no raw passwords stored.

**Compromised Assets & IoC graph** (bottom): interactive graph linking victim hosts and accounts to the indicators that touched each — Host / Account toggles, fullscreen, drag-to-pin nodes.

*Screenshot: DFIR Companion — customer exposure panel and compromised assets IoC graph*

### Key Investigative Questions

8 standard DFIR questions auto-answered from the synthesized case (answered ✅ / partial 🟡 / unknown ❓), each with an evidence pointer or a "collect this next" directive.

*Screenshot: DFIR Companion — key investigative questions with answers and evidence pointers*

## What it produces

For each case the AI builds and keeps up to date:

- **Forensic timeline** — real incident events with their *true* timestamps read from the artifacts (process create, logon, network connection, file MAC times…), sorted chronologically. Distinct from the capture/analysis log.
- **Findings** — granular, per-technique analytic conclusions, each with severity and MITRE ATT&CK mapping.
- **IOCs**, **MITRE ATT&CK** coverage, and an **attacker-path** narrative (kill chain).
- **Attack phases** — the timeline grouped into temporal **bursts** (activity clustered by time gap), each labelled with its dominant ATT&CK tactic — the *when did each stage happen* view, complementary to the categorical kill chain. Deterministic, no AI call.
- **Beacon / C2 detection** — outbound connection channels (host → dest:port) whose inter-arrival intervals are too regular to be human traffic, the classic C2 callback signature. Derived from the network events; severity High for public destinations. A hunting lead, not a verdict. Deterministic, no AI call. Dashboard panel + report §4.9.
- **Log gap analysis** — suspiciously long **silent periods** in the timeline. A gap where *every* source went dark is the classic log-tampering signature (cleared Event Logs, a stopped collector/auditd, deleted logs) — flagged High and escalated to a finding; one tool going quiet while others keep logging is a partial coverage blindspot (Medium). Density-aware (won't flag normal quiet in a sparse timeline) with optional working-hours filtering. A lead, not proof. Deterministic, no AI call. Dashboard *Timeline Gaps* panel + report §3.3.
- **Adversary hints** — known **MITRE ATT&CK groups** ranked by how much their technique set overlaps the case's, as early hypothesis fuel. Offline (a bundled dataset, no AI/network); sub-technique-aware, so an **exact** sub-technique match (highlighted) outranks a base-technique-only one. Each card shows aliases, sectors/regions, the overlap ratio, the exact-match count, and the shared techniques. Statistical similarity, **not attribution**.
- **Compromised assets** — the victim hosts and user accounts, with an interactive **asset ↔ IoC graph** showing which indicators touched each.
- **Key investigative questions** — initial access, lateral movement, compromised users/hosts, exfiltration, dwell time… each with an answer and a pointer to where to find/confirm it (or what to collect next).
- **Investigation threads** — open leads and resolved ones.
- **Reports** — a full incident-report in **Markdown, HTML, and PDF** (one-click print-to-PDF), plus CSV and JSON exports.

## Features

### Capture & ingest

- **MV3 browser extension** — timer + event-driven capture (navigation / tab switch / click), `Ctrl+Shift+S` hotkey, offline queue + auto-sync, per-case Start/Stop. Attaches to an existing case from a dropdown — it never creates one.
- **Case management in the dashboard** — **+ New case** is the one place cases are born; captures to an unknown case are rejected. Five built-in **templates** pre-load incident-type investigation questions + import/hunt hints (save your own too).
- **Import screenshots** — multi-select PNG/JPEG/WebP from any tool, through the same ingest path as the extension.
- **One Import button** — drop any artifact file; the server auto-detects the format and routes it. Optional minimum-severity floor at the gate.
- **Import undo/redo** — an import that floods the dashboard can be rolled back to the exact pre-import state — findings, IOCs, timeline, MITRE, attacker path (and redone), restored verbatim with no AI call. Undo/Redo buttons sit next to the Import button; a per-case stack keeps multiple levels (`DFIR_IMPORT_UNDO_DEPTH`).
- **Evidence-first** — written to disk + append-only audit log before any analysis; exact-hash (SHA-256) duplicate detection (`DFIR_DEDUP=off` to disable).
- **Localhost only** — binds `127.0.0.1` (CORS + Private-Network-Access so the extension origin can reach it).

### Evidence importers

All importers are **deterministic (no AI call)**, read the artifact's own timestamps, and tag events with the real tool name for cross-source correlation.
The same file can be re-imported without duplicating the timeline.

| Format | Key sources | Severity derived from |
|---|---|---|
| **SIEM / EDR JSON** | Elastic, Kibana, Splunk, QRadar, any JSON/NDJSON export | Windows/Sysmon per-EID table |
| **Chainsaw** | EVTX hunt JSON/JSONL (`chainsaw hunt --json`) | Matched Sigma rule level |
| **Hayabusa** | `json-timeline` or `csv-timeline` | Matched Sigma rule level |
| **Velociraptor** | JSON array, JSONL, or artifact map | Sigma/YARA verdict or per-EID |
| **THOR (Nextron)** | JSON-Lines scan output | THOR alert level |
| **Suricata / Zeek** | `eve.json`, Zeek JSON logs; telemetry → IOCs only | Alert priority / notice severity |
| **Cyber Triage** | JSONL / JSON / CSV timeline | Cyber Triage item score |
| **M365 / Entra ID** | UAL, Entra sign-in + audit logs | BEC tradecraft table / Entra riskLevel |
| **AWS CloudTrail** | Records JSON, NDJSON, Athena | API action table (IAM/logging/S3/secrets) |
| **GCP / Azure** | Cloud Audit Logs, Azure Activity Log | Action table (IAM/logging/secrets) |
| **Plaso** | `psort` CSV (dynamic + l2tcsv) | — (Info events) |
| **Sandbox reports** | CAPEv2 `report.json`, Falcon Sandbox summary | Sample verdict + behavioural signatures |
| **Memory forensics** | Volatility 3 (`-r json`) + Rekall: pslist/pstree, netscan, malfind, cmdline, svcscan | malfind injected code → High (T1055); listings → Info/Low evidence |
| **Email** | `.eml` (RFC 2822), best-effort `.msg` | SPF/DKIM/DMARC fail → sender spoof heuristics (T1566 Phishing) |
| **Linux auditd** | raw `audit.log` / `ausearch` records, `aureport` tables | Record-type table (logins, account mgmt, sudo, SELinux, audit tampering) |
| **systemd journald** | `journalctl -o json` / `-o json-pretty` | syslog PRIORITY + tradecraft bumps (sshd, sudo, useradd) |
| **sysdig / Falco** | Falco alert JSON, sysdig `-j` event JSON | Falco rule priority; raw syscalls → Info telemetry |
| **CSV** | Velociraptor / EDR exports | — |
| **Generic logs** | Firewall, syslog, VPN; repetitive lines → counted patterns | AI-triaged |

### AI analysis

- **Two-phase** — cheap per-window vision **extraction** → forensic timeline; strong text-only **synthesis** → findings, IOCs, MITRE ATT&CK, attack path, narrative, key questions, next steps.
- **Providers** — OpenAI, OpenRouter, Ollama, local LiteLLM (or any OpenAI-compatible endpoint), Gemini. Optional **two-tier** (cheap extract + strong synth); context-window budgeting + bounded, truncation-tolerant output (no spurious OpenRouter 402 / context 400s).
- **EDR/XDR + SIEM consoles are evidence** — detections are extracted; analyst tool-navigation is filtered out, with an incident-signal allowlist so a real detection is never dropped.
- **Severity-aware findings** — a Critical/High row becomes a finding; a deterministic safety net auto-creates one (`AUTO` badge) for any high-severity event synthesis missed.
- **Efficient, grounded synthesis** — live debounced re-synthesis during capture; skip-if-unchanged; stratified event selection + a *compromised assets ← IoCs* grounding digest.
- **AI-input anonymization** — reversibly tokenizes internal IPs/users/hosts/domains/emails/paths and one-way-redacts secrets (adversary IOCs preserved). Entities auto-discover from the timeline **and screenshots**, each removable; default on.

### Correlation & deduplication

- **Cross-source correlation** — the same artifact seen by different tools collapses into one corroborated event (shared hash / same path in a time window / exact duplicate), tagged with the real tool names. Idempotent — re-importing never doubles the timeline.
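
The three merge rules named above (shared hash, same path within a time window, exact duplicate) can be sketched in TypeScript as follows. This is an illustrative sketch only, not the project's actual implementation: the `TimelineEvent` and `CorroboratedEvent` shapes, the `correlate` and `sameArtifact` functions, and the 60-second default window are all assumptions made for the example.

```typescript
// Hypothetical event shapes; the real case-state schema is not shown in this README.
interface TimelineEvent {
  timestamp: number;      // epoch milliseconds read from the artifact
  tool: string;           // e.g. "Chainsaw", "THOR"
  description: string;
  sha256?: string;
  path?: string;
}

interface CorroboratedEvent {
  timestamp: number;
  description: string;
  sha256: string | undefined;
  path: string | undefined;
  tools: string[];        // sorted, de-duplicated tool names
}

// Decide whether an incoming event describes the same artifact as a merged one.
function sameArtifact(a: CorroboratedEvent, b: TimelineEvent, windowMs: number): boolean {
  // Rule 1: shared hash (when both sides carry one, the hash decides).
  if (a.sha256 !== undefined && b.sha256 !== undefined) return a.sha256 === b.sha256;
  // Rule 2: same path inside the time window.
  if (a.path !== undefined && a.path === b.path) {
    return Math.abs(a.timestamp - b.timestamp) <= windowMs;
  }
  // Rule 3: exact duplicate (same instant, same description).
  return a.timestamp === b.timestamp && a.description === b.description;
}

// Collapse events from several tools into corroborated events.
function correlate(events: TimelineEvent[], windowMs: number = 60000): CorroboratedEvent[] {
  const sorted = events.slice().sort((x, y) => x.timestamp - y.timestamp);
  const merged: CorroboratedEvent[] = [];
  for (const ev of sorted) {
    let target: CorroboratedEvent | undefined;
    for (const candidate of merged) {
      if (sameArtifact(candidate, ev, windowMs)) {
        target = candidate;
        break;
      }
    }
    if (target === undefined) {
      merged.push({
        timestamp: ev.timestamp,
        description: ev.description,
        sha256: ev.sha256,
        path: ev.path,
        tools: [ev.tool],
      });
    } else if (target.tools.indexOf(ev.tool) === -1) {
      target.tools.push(ev.tool);
      target.tools.sort();
    }
  }
  return merged;
}
```

Because tool names are kept as a de-duplicated list and a matching event folds into the existing record, passing the same export through `correlate` twice gives the same result as passing it once, which is the idempotency property the bullet describes.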

### Investigation workflow

- **Ask the case** — free-form Q&A grounded in the full timeline; unknown answers direct you to what artifact to collect and where
- **Response Playbook** — recommended next steps + Critical/High findings become a trackable checklist (status, priority, assignee, due date, reorder, custom tasks); opt-in IR-templates expand findings into Contain → Investigate → Eradicate → Recover phases. Survives synthesis; renders into the report.
- **Triage tags & comments** — label any entity (`confirmed-malicious`, `false-positive`, …) and attach notes; synced live over WebSocket; survive synthesis.
- **Bulk actions** — multi-select timeline events or IOCs and star / tag / mark-legitimate / (IOCs) enrich or copy — each one batched write + a single re-synthesis.
- **IOC whitelist** (Settings) — persistent known-good patterns (CIDR / exact / regex) auto-mark matching IOCs legitimate on import; global, CSV/JSON import-export; opt-in.
- **NSRL known-good hashes** (Settings) — auto-marks matching forensic events + IOCs legitimate on import (reversible) to cut false positives. Either a flat hash set (paste an `NSRLFile.txt` / hashdeep CSV / hash list, or pre-load via `DFIR_NSRL_FILE`) or **direct query of the full NSRL RDS SQLite DB** (`DFIR_NSRL_DB` / connect in-UI — the real ~160 GB set, queried on demand, never loaded into RAM). Keys on sha256/md5; global, opt-in.
- **IOC corroboration** — a **⊕ N** badge per IOC for how many distinct tools observed it (panel, report, CSV).
- **IOC flagged-only filter** — one click hides everything except indicators a threat-intel engine rated malicious/suspicious.
- **Hunt-pivot generator** — one click on any event/IOC emits Velociraptor VQL, KQL, ES|QL, SPL, Sigma, YARA, and Suricata queries; offline, no AI.
- **Velociraptor** (opt-in, API config) — run a pivot as a fleet hunt; or **triage bundles** (Settings): browse artifacts → save bundles → run as a hunt (label/OS + min-severity) → auto-collect + import + synthesize, with per-artifact params/exclude filters.
- **AI-suggested fleet hunts** — the AI reads the findings and proposes proactive Velociraptor VQL hunts to sweep the whole fleet for the same tradecraft; review each hunt's VQL + rationale, then one-click deploy across all enrolled endpoints.
- **AI-suggested playbook hunts** — for each *endpoint-related* Response Playbook task, the AI proposes a Velociraptor hunt; a task tied to **one** host deploys as a single-endpoint **collection** (`collect_client`), anything broader as a **fleet hunt** — review the VQL, then one-click deploy from the Playbook panel.
- **Scope + legitimacy** — set a time window; mark findings/IOCs/events legitimate (reversible); all views re-project.
- **Freshness** — "last synthesized N ago" + what-changed diff; "last import N ago" + `NEW` row highlights.
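
To make the **IOC whitelist** item in this list more concrete, here is a minimal TypeScript sketch of matching an indicator against the three pattern kinds it mentions (CIDR / exact / regex). The `WhitelistPattern` type and the `isWhitelisted`, `inCidr`, and `ipv4ToInt` helpers are hypothetical names chosen for illustration; the README does not document the stored pattern format, and this sketch handles IPv4 CIDR ranges only.

```typescript
// Hypothetical pattern shape; the real whitelist storage format is an assumption.
type WhitelistPattern =
  | { kind: "exact"; value: string }
  | { kind: "cidr"; value: string } // IPv4 only in this sketch
  | { kind: "regex"; value: string };

// Parse a dotted-quad IPv4 address into an unsigned integer, or null if invalid.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let total = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    total = total * 256 + octet;
  }
  return total;
}

// True when `ip` falls inside the IPv4 range written as "base/prefixLength".
function inCidr(ip: string, cidr: string): boolean {
  const match = /^([^\/]+)\/(\d{1,2})$/.exec(cidr);
  if (match === null) return false;
  const bits = Number(match[2] ?? NaN);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(match[1] ?? "");
  if (ipInt === null || baseInt === null || !(bits >= 0 && bits <= 32)) return false;
  // Two addresses share a prefix when they land in the same block of 2^(32 - bits).
  const blockSize = Math.pow(2, 32 - bits);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// True when any known-good pattern matches the indicator value.
function isWhitelisted(ioc: string, patterns: WhitelistPattern[]): boolean {
  for (const p of patterns) {
    if (p.kind === "exact" && p.value === ioc) return true;
    if (p.kind === "cidr" && inCidr(ioc, p.value)) return true;
    if (p.kind === "regex" && new RegExp(p.value).test(ioc)) return true;
  }
  return false;
}
```

An importer built along these lines would call `isWhitelisted` once per incoming IOC and mark matches legitimate rather than dropping them, which keeps the action reversible as the list above describes.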

### Threat-intel enrichment (off by default — opt-in per case)

- **Sources** — VirusTotal, Hunting.ch (MalwareBazaar · ThreatFox · URLhaus · YARAify), CrowdStrike Falcon TI, AbuseIPDB, MISP, YETI, RockyRaccoon (Windows process prevalence + anomalous parent/child detection)
- **Local vs external** — MISP/YETI queries stay on-box; third-party sources require an explicit per-case opt-in; enabling a source re-checks every existing IOC against it
- **Reachability gate** — self-hosted instances are health-probed before sending indicators; auto-resumes when back online

### Customer exposure (separate from IOC enrichment)

- **Checks the victim org's own assets** — HIBP, LeakCheck, DeHashed (email breaches), Shodan (exposed hosts/ports/CVEs); per-provider opt-in
- **OPSEC boundary** — only analyst-entered customer domains are queried; adversary/IOC domains are never sent; raw passwords never persisted

### Dashboard & reports

- **Live dashboard** over WebSocket — collapsible, drag-to-reorder sections, scope bar, clickable evidence links, and severity/corroboration badges.
- **Dark / light theme** — header toggle (🌙/☀️); follows OS preference, remembers manual choice.
- **Forensic timeline rows** — affected host + clickable finding links (jump + flash); report timeline (§3.1) has a matching Host column.
- **Manual add** — record an event or IOC the AI missed (tagged `manual`, survives re-analysis).
- **MITRE techniques** link to [attack.mitre.org](https://attack.mitre.org/) everywhere.
- **Asset ↔ IoC graph** — which IoC touched which asset; interactive with Host/Account/Service toggles, zoom, fullscreen. Also a report section.
- **Evidence Chain graph** — process trees + lateral movement stitched into a cross-host attack graph, every edge auditable. Dashboard panel + report §4.8.
- **Timeline Swimlane** — severity/tactic × time chart; click-a-dot detail, Shift-select → mark-legitimate, PNG export; static SVG in the report.
- **Reports** — Markdown + HTML + one-click **PDF** + CSVs (findings, IOCs, timelines) + JSON state + **Word (.docx)** — all from the **Export** menu.
- **ATT&CK Navigator layer** — MITRE techniques coloured by worst severity, ready to upload into [ATT&CK Navigator](https://mitre-attack.github.io/attack-navigator/).
- **STIX 2.1 bundle** — portable bundle (IOC STIX patterns + ATT&CK attack-patterns + `indicates` relationships) for OpenCTI, MISP, Anomali, etc.
- **Investigation snapshot** — **Export → Investigation snapshot (JSON)** bundles the whole case; **Import snapshot…** restores it as a new case on another machine. No AI keys or machine config included.
- **Redacted case package** — **Export → Redacted case package (ZIP)**: IPs/hosts/users replaced with consistent tokens, PII blurred in screenshots, adversary indicators preserved.
- **AI executive summary** — ✨ management-facing summary (no ATT&CK ids/hashes/tool names), saved into the report.
- **Narrative Timeline** — prose story for non-technical stakeholders; generated in synthesis, editable, report §3.2.
- **Push to DFIR-IRIS** — one click (or `npm run iris:push`) maps assets/IOCs/timeline/tasks; idempotent. `DFIR_IRIS_URL` + `DFIR_IRIS_KEY`.
- **Timesketch push** — **Export → Timesketch JSONL** or one-click **Push** (find-or-creates the sketch). `DFIR_TIMESKETCH_*`.
- **Export to Notion** — push a case into a managed Notion page block; your own notes outside it are never touched. `DFIR_NOTION_TOKEN`.
- **Push to ClickUp** — export the Response Playbook as ClickUp tasks; re-push updates in place. `DFIR_CLICKUP_TOKEN`.
- **Notifications** — findings / playbook / milestones to **Slack** / **MS Teams** / **Telegram** / **SMTP**; per-channel threshold + toggles. Opt-in; managed in **Settings → Notifications**.
- **Report templates** — global branded layouts (accent colour, header/footer, section order). Built-ins editable in place; pick one per case. Managed in **Settings → Report Templates**.
- **Mobile companion** — read-only PWA at **`/mobile`**: findings, timeline, IOCs with threat-intel verdicts. Offline app-shell.

### Ops

- **Logging to file** — every line tees to the console + a global session log + a per-case audit trail; `DFIR_LOG_LEVEL` (+ live Settings toggle, `DFIR_LOG_DIR`). `debug` traces AI calls, captures, OCR, anonymization, enrichment
- **Portable Windows EXE** — zip attached to every GitHub Release; unzip + double-click, no Node install required
- **Docker / Docker Compose** — `docker compose up`; evidence on a host volume, no bundled AI backend
- **Customizable AI prompts** — override any of the 6 prompts via env var or file; edits apply without restart (`npm run prompts:eject` to dump defaults)
- **Demo case** — `npm run seed-demo` seeds a fully-populated GlobalTech Industries scenario for local exploration
- **CLI scripts** — `reanalyze`, `synthesize`, `coverage`, `verify:ai`, `clean-timeline` (see below)

## Repository layout

    52.43-DFIR-Companion/
    ├── companion/            Node/TS localhost server (the core). See companion/README.md.
    ├── extension/            Chrome/Comet MV3 capture extension. See extension/README.md.
    ├── public/
    │   └── dashboard.html    Live dashboard, served by the companion at /dashboard.
    ├── docs/
    │   └── superpowers/plans/  The original 4 implementation plans.
    ├── Dockerfile            Single-image build (server + dashboard + add-on); no Ollama/LiteLLM.
    ├── docker-compose.yml    Localhost-only Compose: ./cases volume, add-on → ./addon.
    └── cases/                Evidence + state output (gitignored). Location set by DFIR_CASES_ROOT.

## How the pieces fit

    Browser (Comet/Chrome)                 Localhost companion (127.0.0.1:4773)
    ┌─────────────────────┐   POST         ┌───────────────────────────────────────┐
    │ DFIR Capture (MV3)  │ /captures ──▶  │ ingest → evidence (screenshots+jsonl) │
    │ timer + events      │                │ │                                     │
    └─────────────────────┘                │ ▼ per-window AI extraction (cheap)    │
                                           │ forensic timeline ──▶ synthesis (strong)│
    Dashboard / Reports  ◀── WS /ws,       │ findings, IOCs, MITRE, attacker path, │
                      GET /cases/:id/state │ key questions, threads                │
    └─────────────────────┘                └───────────────────────────────────────┘

**Two-phase analysis:** a cheap vision model reads each screenshot into the forensic timeline; a stronger model does the single holistic synthesis call (findings, MITRE, attacker path, questions). Configure both via `.env` — see `companion/README.md`.

## Quick start

1. **Companion** (the server):

       git clone https://github.com/hasamba/DFIR-Companion.git
       cd DFIR-Companion/companion
       npm install
       cp .env.example .env   # set DFIR_AI_PROVIDER / MODEL / KEY (or leave AI off)
       npm run dev            # serves http://127.0.0.1:4773 (dashboard at /dashboard)

2. **Extension** (capture):

       cd DFIR-Companion/extension
       npm install
       npm run build          # then load extension/dist as an unpacked extension

   The popup only **attaches** to an existing case — you create cases in the dashboard.

3. Open `http://127.0.0.1:4773/dashboard`, click **+ New case** to create your case (it connects automatically). Then in the extension popup pick that case from the **Case** dropdown (**Refresh cases** if it isn't listed yet) and **Start**. Browse your evidence — the dashboard updates live.

Full configuration, HTTP endpoints, the case-folder layout, and the analysis model are documented in **[companion/README.md](companion/README.md)**.

## Docker / Docker Compose

Run the whole thing — companion server + dashboard + the browser add-on — in one container.
**No Ollama or LiteLLM are bundled**; for AI you point `DFIR_AI_*` at any OpenAI-compatible endpoint (a model you host, a remote provider, or an Ollama/LiteLLM you run separately). With AI left unset the container still does full capture and all the deterministic importers.

**Localhost-only by design:** the container binds `0.0.0.0` internally, but Compose publishes the port to `127.0.0.1` on your host — so the dashboard is never exposed on your network.

1. **Start it** (build from source):

       git clone https://github.com/hasamba/DFIR-Companion.git
       cd DFIR-Companion
       docker compose up -d --build   # → http://127.0.0.1:4773/dashboard

   Or pull the prebuilt image from GHCR instead of building:

       docker compose pull && docker compose up -d   # image: ghcr.io/hasamba/dfir-companion:latest

2. **Load the add-on** (capture). The container writes the pre-built, unpacked extension to `./addon` on first start. In Chrome/Comet open `chrome://extensions`, enable **Developer mode**, click **Load unpacked**, and select **`./addon/dist`** (a packaged `dfir-companion-extension.zip` is dropped there too).

3. Open `http://127.0.0.1:4773/dashboard`, click **+ New case**, then pick that case in the extension popup and **Start**.

**Data & config:**

- Evidence and case state persist in **`./cases`** on the host (mounted volume) — survives restarts and image rebuilds.
- Configure via the `environment:` block in [`docker-compose.yml`](docker-compose.yml), or uncomment `env_file: - .env` to use a `.env` file (copy `companion/.env.example`).
- To reach an AI endpoint running on the host, use `http://host.docker.internal:/v1` (on Linux without Docker Desktop, also uncomment the `extra_hosts` line in the compose file).

## Environment variables (`companion/.env`)

All companion behavior is configured via env vars (`companion/.env` or shell). Copy `companion/.env.example` to start — it has inline comments for every variable.
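
As a rough illustration of env-var driven configuration, the TypeScript sketch below reads the core variables and falls back to the defaults this README documents in its Core table (`./cases`, `4773`, `127.0.0.1`, `256`, `info`). The `CoreConfig` shape and the `loadCoreConfig` and `positiveInt` functions are hypothetical; how the companion actually parses and validates its environment is not shown here.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Hypothetical config shape covering only the core variables.
interface CoreConfig {
  casesRoot: string;
  port: number;
  host: string;
  maxBodyMb: number;
  logLevel: LogLevel;
}

const LOG_LEVELS: LogLevel[] = ["debug", "info", "warn", "error"];

// Parse a positive integer, using the fallback when the variable is unset or blank.
function positiveInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!isFinite(n) || Math.floor(n) !== n || n <= 0) {
    throw new Error(name + " must be a positive integer, got \"" + raw + "\"");
  }
  return n;
}

// Build the config from an env-like map (pass process.env in a Node program).
function loadCoreConfig(env: Record<string, string | undefined>): CoreConfig {
  const level = env["DFIR_LOG_LEVEL"] ?? "info";
  if (LOG_LEVELS.indexOf(level as LogLevel) === -1) {
    throw new Error("DFIR_LOG_LEVEL must be one of " + LOG_LEVELS.join("/"));
  }
  return {
    casesRoot: env["DFIR_CASES_ROOT"] ?? "./cases",
    port: positiveInt(env["DFIR_PORT"], 4773, "DFIR_PORT"),
    host: env["DFIR_HOST"] ?? "127.0.0.1",
    maxBodyMb: positiveInt(env["DFIR_MAX_BODY_MB"], 256, "DFIR_MAX_BODY_MB"),
    logLevel: level as LogLevel,
  };
}
```

Taking the environment as a plain map rather than reading `process.env` directly keeps the function easy to exercise in isolation, for example `loadCoreConfig({ DFIR_PORT: "8080" })`.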

### Core

| Variable | Default | Meaning |
|---|---|---|
| `DFIR_CASES_ROOT` | `./cases` | Case folder location; relative paths resolve against `companion/` |
| `DFIR_PORT` | `4773` | Server port (must match the extension and dashboard) |
| `DFIR_HOST` | `127.0.0.1` | Bind interface; Docker image sets `0.0.0.0`, Compose re-maps to localhost on the host |
| `DFIR_MAX_BODY_MB` | `256` | Max upload size in MB; raise if large SIEM/EDR exports fail with HTTP 413 |
| `DFIR_LOG_LEVEL` | `info` | Log verbosity (`debug`/`info`/`warn`/`error`). Tees to console + `logs/session-
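
Tags: AI risk mitigation, artificial intelligence, multimodal security, security reports, libraries, incident response, digital forensics, data visualization, user-mode hook bypass, automated analysis, automated attacks, automation scripts, cross-site scripting, reverse-engineering tools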