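A locally run, AI-assisted digital-forensics triage tool that automatically turns investigation screenshots and artifacts into a forensic timeline, findings, IOCs, and incident reports.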
# DFIR Companion
A localhost digital-forensics / incident-response companion. A browser extension
captures screenshots of your investigation (Velociraptor, EDR/SIEM dashboards, Security Onion, Splunk4DFIR, VolWeb, VirusTotal, etc.) as
evidence; a local server stores them, runs **windowed AI vision analysis** into an
accumulating per-case investigation state, and serves a **live dashboard** plus
exportable reports.
Everything runs on your machine — the companion binds to `127.0.0.1` only, evidence
stays on disk, and the AI provider is yours to choose.
## Screenshots
### Executive Summary & Recommended Next Steps
AI-generated case summary and AI-prioritized remediation actions (Critical → Medium), each with
rationale and a pointer to the finding or artifact it came from.
### Forensic Timeline
31 corroborated events from Chainsaw · THOR · Suricata · CrowdStrike Falcon — severity filters, per-row
triage tags (`initial-access`, `c2-comms`, `key-evidence`, …), import change tracking
(+19 new events banner with expandable diff), and analyst star / bulk-action controls.
### Attack Path Narrative · MITRE ATT&CK Kill Chain · Findings
Full attacker-path write-up from initial access to ransomware attempt, an interactive kill chain
(click a tactic to expand its events), and the top findings with confidence scores.
### Findings
8 AI-generated findings (2 Critical · 2 High · 2 Medium · 1 Low) — each with a confidence %,
analyst triage tags, MITRE technique links, and a synthesis freshness diff (+8 new since last run).
### Evidence Chain Graph
Process trees + lateral movement across DC01, FS01, and WKSTN-JSMITH stitched into one causal
attack graph. Derived deterministically from importer-populated fields — no AI, no cost, runs offline.
### IOCs with Threat-Intel Enrichments
15 indicators (IPs · domains · hashes · files · processes · URL) enriched against VirusTotal,
AbuseIPDB, ThreatFox, URLhaus, and MalwareBazaar — verdict badges, detection scores, `NEW` import
highlights, and analyst `confirmed-malicious` / `pivot-point` triage labels.
### Customer Exposure & Compromised Assets · IoC Graph
**Customer Exposure** (top): credential-leak check for the victim org's own domains and emails
against HIBP / DeHashed / Shodan — breach names, exposed services, no raw passwords stored.
**Compromised Assets & IoC graph** (bottom): interactive graph linking victim hosts and accounts
to the indicators that touched each — Host / Account toggles, fullscreen, drag-to-pin nodes.
### Key Investigative Questions
8 standard DFIR questions auto-answered from the synthesized case
(answered ✅ / partial 🟡 / unknown ❓), each with an evidence pointer or a "collect this next" directive.
## What it produces
For each case the AI builds and keeps up to date:
- **Forensic timeline** — real incident events with their *true* timestamps read from
the artifacts (process create, logon, network connection, file MAC times…), sorted
chronologically. Distinct from the capture/analysis log.
- **Findings** — granular, per-technique analytic conclusions, each with severity and
MITRE ATT&CK mapping.
- **IOCs**, **MITRE ATT&CK** coverage, and an **attacker-path** narrative (kill chain).
- **Attack phases** — the timeline grouped into temporal **bursts** (activity clustered
by time gap), each labelled with its dominant ATT&CK tactic — the *when did each stage
happen* view, complementary to the categorical kill chain. Deterministic, no AI call.
- **Beacon / C2 detection** — outbound connection channels (host → dest:port) whose
inter-arrival intervals are too regular to be human traffic, the classic C2 callback
signature. Derived from the network events; severity High for public destinations. A
hunting lead, not a verdict. Deterministic, no AI call. Dashboard panel + report §4.9.
- **Log gap analysis** — suspiciously long **silent periods** in the timeline. A gap where
*every* source went dark is the classic log-tampering signature (cleared Event Logs, a
stopped collector/auditd, deleted logs) — flagged High and escalated to a finding; one tool
going quiet while others keep logging is a partial coverage blindspot (Medium). Density-aware
(won't flag normal quiet in a sparse timeline) with optional working-hours filtering. A lead,
not proof. Deterministic, no AI call. Dashboard *Timeline Gaps* panel + report §3.3.
- **Adversary hints** — known **MITRE ATT&CK groups** ranked by how much their technique
set overlaps the case's, as early hypothesis fuel. Offline (a bundled dataset, no
AI/network); sub-technique-aware, so an **exact** sub-technique match (highlighted) outranks
a base-technique-only one. Each card shows aliases, sectors/regions, the overlap ratio, the
exact-match count, and the shared techniques. Statistical similarity, **not attribution**.
- **Compromised assets** — the victim hosts and user accounts, with an interactive
**asset ↔ IoC graph** showing which indicators touched each.
- **Key investigative questions** — initial access, lateral movement, compromised
users/hosts, exfiltration, dwell time… each with an answer and a pointer to where to
find/confirm it (or what to collect next).
- **Investigation threads** — open leads and resolved ones.
- **Reports** — a full incident-report in **Markdown, HTML, and PDF** (one-click print-to-PDF),
plus CSV and JSON exports.
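
The beacon check described above rests on one idea: per channel, the gaps between connections barely vary. The following is a minimal TypeScript sketch of that idea, not the project's implementation. The `Conn` shape, the `minEvents` and `maxCv` thresholds, and the choice of coefficient of variation as the regularity measure are all assumptions made for illustration.

```typescript
// Hypothetical connection record; the real importer fields may differ.
interface Conn {
  host: string;
  dest: string;
  port: number;
  ts: number; // epoch milliseconds
}

interface BeaconLead {
  channel: string;        // "host -> dest:port"
  count: number;          // connections seen on the channel
  meanIntervalMs: number; // average gap between consecutive connections
  cv: number;             // coefficient of variation of the gaps (stddev / mean)
}

// Flag channels whose inter-arrival gaps are nearly constant.
// minEvents and maxCv are illustrative thresholds, not the tool's defaults.
function findBeaconLeads(conns: Conn[], minEvents = 6, maxCv = 0.1): BeaconLead[] {
  const byChannel = new Map<string, number[]>();
  for (const c of conns) {
    const key = `${c.host} -> ${c.dest}:${c.port}`;
    const times = byChannel.get(key) ?? [];
    times.push(c.ts);
    byChannel.set(key, times);
  }

  const leads: BeaconLead[] = [];
  byChannel.forEach((times, channel) => {
    if (times.length < minEvents) return; // too few samples to judge regularity
    const sorted = [...times].sort((a, b) => a - b);
    const gaps = sorted.slice(1).map((t, i) => t - sorted[i]);
    const mean = gaps.reduce((sum, g) => sum + g, 0) / gaps.length;
    if (mean <= 0) return;
    const variance = gaps.reduce((sum, g) => sum + (g - mean) ** 2, 0) / gaps.length;
    const cv = Math.sqrt(variance) / mean;
    if (cv <= maxCv) leads.push({ channel, count: sorted.length, meanIntervalMs: mean, cv });
  });
  return leads.sort((a, b) => a.cv - b.cv); // most regular channels first
}
```

A low `cv` means machine-like regularity; human browsing to the same destination usually produces a `cv` well above 1. As the text says, the result is a hunting lead, and real beacons with deliberate jitter would need a looser threshold.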
## Features
### Capture & ingest
- **MV3 browser extension** — timer + event-driven capture (navigation / tab switch / click), `Ctrl+Shift+S` hotkey, offline queue + auto-sync, per-case Start/Stop. Attaches to an existing case from a dropdown — it never creates one.
- **Case management in the dashboard** — **+ New case** is the one place cases are born; captures to an unknown case are rejected. Five built-in **templates** pre-load incident-type investigation questions + import/hunt hints (save your own too).
- **Import screenshots** — multi-select PNG/JPEG/WebP from any tool, through the same ingest path as the extension.
- **One Import button** — drop any artifact file; the server auto-detects the format and routes it. Optional minimum-severity floor at the gate.
- **Import undo/redo** — an import that floods the dashboard can be rolled back to the exact pre-import state (findings, IOCs, timeline, MITRE, attacker path) and redone later; the state is restored verbatim with no AI call. Undo/Redo buttons sit next to the Import button; a per-case stack keeps multiple levels (`DFIR_IMPORT_UNDO_DEPTH`).
- **Evidence-first** — written to disk + append-only audit log before any analysis; exact-hash (SHA-256) duplicate detection (`DFIR_DEDUP=off` to disable).
- **Localhost only** — binds `127.0.0.1` (CORS + Private-Network-Access so the extension origin can reach it).
### Evidence importers
All importers are **deterministic (no AI call)**, read the artifact's own timestamps, and tag events with the real tool name for cross-source correlation. The same file can be re-imported without duplicating the timeline.
| Format | Key sources | Severity derived from |
|---|---|---|
| **SIEM / EDR JSON** | Elastic, Kibana, Splunk, QRadar, any JSON/NDJSON export | Windows/Sysmon per-EID table |
| **Chainsaw** | EVTX hunt JSON/JSONL (`chainsaw hunt --json`) | Matched Sigma rule level |
| **Hayabusa** | `json-timeline` or `csv-timeline` | Matched Sigma rule level |
| **Velociraptor** | JSON array, JSONL, or artifact map | Sigma/YARA verdict or per-EID |
| **THOR (Nextron)** | JSON-Lines scan output | THOR alert level |
| **Suricata / Zeek** | `eve.json`, Zeek JSON logs; telemetry → IOCs only | Alert priority / notice severity |
| **Cyber Triage** | JSONL / JSON / CSV timeline | Cyber Triage item score |
| **M365 / Entra ID** | UAL, Entra sign-in + audit logs | BEC tradecraft table / Entra riskLevel |
| **AWS CloudTrail** | Records JSON, NDJSON, Athena | API action table (IAM/logging/S3/secrets) |
| **GCP / Azure** | Cloud Audit Logs, Azure Activity Log | Action table (IAM/logging/secrets) |
| **Plaso** | `psort` CSV (dynamic + l2tcsv) | — (Info events) |
| **Sandbox reports** | CAPEv2 `report.json`, Falcon Sandbox summary | Sample verdict + behavioural signatures |
| **Memory forensics** | Volatility 3 (`-r json`) + Rekall: pslist/pstree, netscan, malfind, cmdline, svcscan | malfind injected code → High (T1055); listings → Info/Low evidence |
| **Email** | `.eml` (RFC 2822), best-effort `.msg` | SPF/DKIM/DMARC fail → sender spoof heuristics (T1566 Phishing) |
| **Linux auditd** | raw `audit.log` / `ausearch` records, `aureport` tables | Record-type table (logins, account mgmt, sudo, SELinux, audit tampering) |
| **systemd journald** | `journalctl -o json` / `-o json-pretty` | syslog PRIORITY + tradecraft bumps (sshd, sudo, useradd) |
| **sysdig / Falco** | Falco alert JSON, sysdig `-j` event JSON | Falco rule priority; raw syscalls → Info telemetry |
| **CSV** | Velociraptor / EDR exports | — |
| **Generic logs** | Firewall, syslog, VPN; repetitive lines → counted patterns | AI-triaged |
### AI analysis
- **Two-phase** — cheap per-window vision **extraction** → forensic timeline; strong text-only **synthesis** → findings, IOCs, MITRE ATT&CK, attack path, narrative, key questions, next steps.
- **Providers** — OpenAI, OpenRouter, Ollama, local LiteLLM (or any OpenAI-compatible endpoint), Gemini. Optional **two-tier** (cheap extract + strong synth); context-window budgeting + bounded, truncation-tolerant output (no spurious OpenRouter 402 / context 400s).
- **EDR/XDR + SIEM consoles are evidence** — detections are extracted; analyst tool-navigation is filtered out, with an incident-signal allowlist so a real detection is never dropped.
- **Severity-aware findings** — a Critical/High row becomes a finding; a deterministic safety net auto-creates one (`AUTO` badge) for any high-severity event synthesis missed.
- **Efficient, grounded synthesis** — live debounced re-synthesis during capture; skip-if-unchanged; stratified event selection + a *compromised assets ← IoCs* grounding digest.
- **AI-input anonymization** — reversibly tokenizes internal IPs/users/hosts/domains/emails/paths and one-way-redacts secrets (adversary IOCs preserved). Entities auto-discover from the timeline **and screenshots**, each removable; default on.
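
Reversible tokenization, as mentioned in the anonymization bullet above, can be sketched in a few lines. This is only an illustration under stated assumptions: the `EntityTokenizer` class, its method names, and the `[KIND_n]` token format are hypothetical, and one-way secret redaction is not covered.

```typescript
// Reversible tokenization sketch. The "[KIND_n]" token format and this class
// API are assumptions for illustration, not the project's actual scheme.
class EntityTokenizer {
  private tokenByValue = new Map<string, string>();
  private valueByToken = new Map<string, string>();
  private nextIndex = new Map<string, number>();

  // Register an internal entity (host, user, IP, ...) and return its stable token.
  register(kind: string, value: string): string {
    const known = this.tokenByValue.get(value);
    if (known !== undefined) return known;
    const n = (this.nextIndex.get(kind) ?? 0) + 1;
    this.nextIndex.set(kind, n);
    const token = `[${kind}_${n}]`;
    this.tokenByValue.set(value, token);
    this.valueByToken.set(token, value);
    return token;
  }

  // Replace every registered value before the text is sent to the AI provider.
  // Longest values go first so "10.0.0.10" is not clobbered by "10.0.0.1".
  anonymize(text: string): string {
    const values = Array.from(this.tokenByValue.keys()).sort((a, b) => b.length - a.length);
    let out = text;
    for (const v of values) out = out.split(v).join(this.tokenByValue.get(v)!);
    return out;
  }

  // Map tokens in the AI's answer back to the real values.
  restore(text: string): string {
    let out = text;
    this.valueByToken.forEach((value, token) => {
      out = out.split(token).join(value);
    });
    return out;
  }
}
```

Adversary indicators are preserved simply by never registering them. Plain substring replacement is a simplification; a production version would presumably match on token boundaries so that an unregistered `10.0.0.10` is not partly rewritten by a registered `10.0.0.1`.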
### Correlation & deduplication
- **Cross-source correlation** — the same artifact seen by different tools collapses into one corroborated event (shared hash / same path in a time window / exact duplicate), tagged with the real tool names. Idempotent — re-importing never doubles the timeline.
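
The three collapse rules named above (shared hash, same path in a time window, exact duplicate) can be illustrated with a short TypeScript sketch. The `ToolEvent` and `CorroboratedEvent` shapes, the `correlate` function, and the 60-second default window are assumptions for illustration only; the companion's real matching logic is not shown in this README.

```typescript
// Hypothetical event shapes; field names are assumptions.
interface ToolEvent {
  ts: number;       // epoch milliseconds, read from the artifact
  tool: string;     // e.g. "Chainsaw", "THOR"
  summary: string;
  path?: string;
  sha256?: string;
}

interface CorroboratedEvent {
  ts: number;       // earliest observation
  summary: string;
  tools: string[];  // distinct tool names that saw it
  path?: string;
  sha256?: string;
}

function correlate(events: ToolEvent[], windowMs = 60000): CorroboratedEvent[] {
  const merged: CorroboratedEvent[] = [];
  const ordered = [...events].sort((a, b) => a.ts - b.ts);
  for (const e of ordered) {
    const hit = merged.find((m) =>
      (e.sha256 !== undefined && m.sha256 === e.sha256) ||                       // shared hash
      (e.path !== undefined && m.path === e.path && e.ts - m.ts <= windowMs) ||  // same path, close in time
      (m.ts === e.ts && m.summary === e.summary));                               // exact duplicate
    if (hit === undefined) {
      merged.push({ ts: e.ts, summary: e.summary, tools: [e.tool], path: e.path, sha256: e.sha256 });
      continue;
    }
    // Corroborate: record the extra tool and fill in fields the first sighting lacked.
    if (hit.tools.indexOf(e.tool) === -1) hit.tools.push(e.tool);
    hit.path = hit.path ?? e.path;
    hit.sha256 = hit.sha256 ?? e.sha256;
  }
  return merged;
}
```

Because every rule matches an event against what is already merged, feeding the same import in twice yields the same output, which is the idempotency property the bullet describes.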
### Investigation workflow
- **Ask the case** — free-form Q&A grounded in the full timeline; when the answer is unknown, it points you to which artifact to collect and where.
- **Response Playbook** — recommended next steps + Critical/High findings become a trackable checklist (status, priority, assignee, due date, reorder, custom tasks); opt-in IR-templates expand findings into Contain → Investigate → Eradicate → Recover phases. Survives synthesis; renders into the report.
- **Triage tags & comments** — label any entity (`confirmed-malicious`, `false-positive`, …) and attach notes; synced live over WebSocket; survive synthesis.
- **Bulk actions** — multi-select timeline events or IOCs and star / tag / mark-legitimate / (IOCs) enrich or copy; each action is one batched write plus a single re-synthesis.
- **IOC whitelist** (Settings) — persistent known-good patterns (CIDR / exact / regex) auto-mark matching IOCs legitimate on import; global, CSV/JSON import-export; opt-in.
- **NSRL known-good hashes** (Settings) — auto-marks matching forensic events + IOCs legitimate on import (reversible) to cut false positives. Either a flat hash set (paste an `NSRLFile.txt` / hashdeep CSV / hash list, or pre-load via `DFIR_NSRL_FILE`) or **direct query of the full NSRL RDS SQLite DB** (`DFIR_NSRL_DB` / connect in-UI — the real ~160 GB set, queried on demand, never loaded into RAM). Keys on sha256/md5; global, opt-in.
- **IOC corroboration** — a **⊕ N** badge per IOC for how many distinct tools observed it (panel, report, CSV).
- **IOC flagged-only filter** — one click hides everything except indicators a threat-intel engine rated malicious/suspicious.
- **Hunt-pivot generator** — one click on any event/IOC emits Velociraptor VQL, KQL, ES|QL, SPL, Sigma, YARA, and Suricata queries; offline, no AI.
- **Velociraptor** (opt-in, API config) — run a pivot as a fleet hunt; or **triage bundles** (Settings): browse artifacts → save bundles → run as a hunt (label/OS + min-severity) → auto-collect + import + synthesize, with per-artifact params/exclude filters.
- **AI-suggested fleet hunts** — the AI reads the findings and proposes proactive Velociraptor VQL hunts to sweep the whole fleet for the same tradecraft; review each hunt's VQL + rationale, then one-click deploy across all enrolled endpoints.
- **AI-suggested playbook hunts** — for each *endpoint-related* Response Playbook task, the AI proposes a Velociraptor hunt; a task tied to **one** host deploys as a single-endpoint **collection** (`collect_client`), anything broader as a **fleet hunt** — review the VQL, then one-click deploy from the Playbook panel.
- **Scope + legitimacy** — set a time window; mark findings/IOCs/events legitimate (reversible); all views re-project.
- **Freshness** — "last synthesized N ago" + what-changed diff; "last import N ago" + `NEW` row highlights.
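
The IOC whitelist above accepts CIDR, exact, and regex patterns. Below is a minimal sketch of how such matching could work. The `WhitelistPattern` shape and the function names are hypothetical, the CIDR check handles IPv4 only, and case-insensitive comparison is an assumption.

```typescript
// Pattern kinds mirror the three named above; the object shape is hypothetical.
type WhitelistPattern = { kind: "exact" | "cidr" | "regex"; value: string };

// Parse dotted-quad IPv4 into a number, or null if it is not a valid address.
function ipv4ToNumber(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let n = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    n = n * 256 + octet;
  }
  return n;
}

// True when ip falls inside the IPv4 block written as "base/bits".
function inCidr(ip: string, cidr: string): boolean {
  const pieces = cidr.split("/");
  if (pieces.length !== 2 || !/^\d{1,2}$/.test(pieces[1])) return false;
  const bits = Number(pieces[1]);
  const ipNum = ipv4ToNumber(ip);
  const baseNum = ipv4ToNumber(pieces[0]);
  if (ipNum === null || baseNum === null || bits > 32) return false;
  // Plain arithmetic instead of bit shifts avoids 32-bit sign pitfalls.
  const blockSize = 2 ** (32 - bits);
  return Math.floor(ipNum / blockSize) === Math.floor(baseNum / blockSize);
}

function isKnownGood(ioc: string, patterns: WhitelistPattern[]): boolean {
  return patterns.some((p) => {
    if (p.kind === "exact") return ioc.toLowerCase() === p.value.toLowerCase();
    if (p.kind === "cidr") return inCidr(ioc, p.value);
    try {
      return new RegExp(p.value, "i").test(ioc);
    } catch {
      return false; // an invalid regex never matches
    }
  });
}
```

Treating a malformed pattern as a non-match is a deliberate choice in this sketch: a bad whitelist entry should never cause an indicator to be marked legitimate.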
### Threat-intel enrichment (off by default — opt-in per case)
- **Sources** — VirusTotal, Hunting.ch (MalwareBazaar · ThreatFox · URLhaus · YARAify), CrowdStrike Falcon TI, AbuseIPDB, MISP, YETI, RockyRaccoon (Windows process prevalence + anomalous parent/child detection)
- **Local vs external** — MISP/YETI queries stay on-box; third-party sources require an explicit per-case opt-in; enabling a source re-checks every existing IOC against it
- **Reachability gate** — self-hosted instances are health-probed before sending indicators; auto-resumes when back online
### Customer exposure (separate from IOC enrichment)
- **Checks the victim org's own assets** — HIBP, LeakCheck, DeHashed (email breaches), Shodan (exposed hosts/ports/CVEs); per-provider opt-in
- **OPSEC boundary** — only analyst-entered customer domains are queried; adversary/IOC domains are never sent; raw passwords never persisted
### Dashboard & reports
- **Live dashboard** over WebSocket — collapsible, drag-to-reorder sections, scope bar, clickable evidence links, and severity/corroboration badges.
- **Dark / light theme** — header toggle (🌙/☀️); follows OS preference, remembers manual choice.
- **Forensic timeline rows** — affected host + clickable finding links (jump + flash); report timeline (§3.1) has a matching Host column.
- **Manual add** — record an event or IOC the AI missed (tagged `manual`, survives re-analysis).
- **MITRE techniques** link to [attack.mitre.org](https://attack.mitre.org/) everywhere.
- **Asset ↔ IoC graph** — which IoC touched which asset; interactive with Host/Account/Service toggles, zoom, fullscreen. Also a report section.
- **Evidence Chain graph** — process trees + lateral movement stitched into a cross-host attack graph, every edge auditable. Dashboard panel + report §4.8.
- **Timeline Swimlane** — severity/tactic × time chart; click-a-dot detail, Shift-select → mark-legitimate, PNG export; static SVG in the report.
- **Reports** — Markdown + HTML + one-click **PDF** + CSVs (findings, IOCs, timelines) + JSON state + **Word (.docx)** — all from the **Export** menu.
- **ATT&CK Navigator layer** — MITRE techniques coloured by worst severity, ready to upload into [ATT&CK Navigator](https://mitre-attack.github.io/attack-navigator/).
- **STIX 2.1 bundle** — portable bundle (IOC STIX patterns + ATT&CK attack-patterns + `indicates` relationships) for OpenCTI, MISP, Anomali, etc.
- **Investigation snapshot** — **Export → Investigation snapshot (JSON)** bundles the whole case; **Import snapshot…** restores it as a new case on another machine. No AI keys or machine config included.
- **Redacted case package** — **Export → Redacted case package (ZIP)**: IPs/hosts/users replaced with consistent tokens, PII blurred in screenshots, adversary indicators preserved.
- **AI executive summary** — ✨ management-facing summary (no ATT&CK ids/hashes/tool names), saved into the report.
- **Narrative Timeline** — prose story for non-technical stakeholders; generated in synthesis, editable, report §3.2.
- **Push to DFIR-IRIS** — one click (or `npm run iris:push`) maps assets/IOCs/timeline/tasks; idempotent. `DFIR_IRIS_URL` + `DFIR_IRIS_KEY`.
- **Timesketch push** — **Export → Timesketch JSONL** or one-click **Push** (find-or-creates the sketch). `DFIR_TIMESKETCH_*`.
- **Export to Notion** — push a case into a managed Notion page block; your own notes outside it are never touched. `DFIR_NOTION_TOKEN`.
- **Push to ClickUp** — export the Response Playbook as ClickUp tasks; re-push updates in place. `DFIR_CLICKUP_TOKEN`.
- **Notifications** — findings / playbook / milestones to **Slack** / **MS Teams** / **Telegram** / **SMTP**; per-channel threshold + toggles. Opt-in; managed in **Settings → Notifications**.
- **Report templates** — global branded layouts (accent colour, header/footer, section order). Built-ins editable in place; pick one per case. Managed in **Settings → Report Templates**.
- **Mobile companion** — read-only PWA at **`/mobile`**: findings, timeline, IOCs with threat-intel verdicts. Offline app-shell.
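
For the STIX 2.1 bundle listed above, each IOC has to be expressed as a STIX pattern string. The sketch below shows what that mapping might look like. The `Ioc` type and its `type` labels are hypothetical; the pattern syntax itself follows the STIX 2.1 patterning rules for `ipv4-addr`, `domain-name`, `url`, and `file` hashes.

```typescript
// Hypothetical IOC shape; the type labels are assumptions, not the tool's own.
type IocType = "ipv4" | "domain" | "url" | "sha256" | "md5";
interface Ioc {
  type: IocType;
  value: string;
}

// STIX string literals are single-quoted; backslashes and single quotes are escaped.
function stixLiteral(value: string): string {
  return "'" + value.replace(/\\/g, "\\\\").replace(/'/g, "\\'") + "'";
}

// Build the comparison expression an Indicator object would carry in `pattern`.
function toStixPattern(ioc: Ioc): string {
  const v = stixLiteral(ioc.value);
  switch (ioc.type) {
    case "ipv4":
      return `[ipv4-addr:value = ${v}]`;
    case "domain":
      return `[domain-name:value = ${v}]`;
    case "url":
      return `[url:value = ${v}]`;
    case "sha256":
      return `[file:hashes.'SHA-256' = ${v}]`;
    case "md5":
      return `[file:hashes.MD5 = ${v}]`;
  }
}
```

A full Indicator object also needs `id`, `created`, `modified`, `valid_from`, and `pattern_type: "stix"`; those are left out here to keep the sketch focused on the pattern itself.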
### Ops
- **Logging to file** — every line tees to the console + a global session log + a per-case audit trail; `DFIR_LOG_LEVEL` (+ live Settings toggle, `DFIR_LOG_DIR`). `debug` traces AI calls, captures, OCR, anonymization, enrichment
- **Portable Windows EXE** — zip attached to every GitHub Release; unzip + double-click, no Node install required
- **Docker / Docker Compose** — `docker compose up`; evidence on a host volume, no bundled AI backend
- **Customizable AI prompts** — override any of the 6 prompts via env var or file; edits apply without restart (`npm run prompts:eject` to dump defaults)
- **Demo case** — `npm run seed-demo` seeds a fully-populated GlobalTech Industries scenario for local exploration
- **CLI scripts** — `reanalyze`, `synthesize`, `coverage`, `verify:ai`, `clean-timeline` (see below)
## Repository layout

    52.43-DFIR-Companion/
    ├── companion/              Node/TS localhost server (the core). See companion/README.md.
    ├── extension/              Chrome/Comet MV3 capture extension. See extension/README.md.
    ├── public/
    │   └── dashboard.html      Live dashboard, served by the companion at /dashboard.
    ├── docs/
    │   └── superpowers/plans/  The original 4 implementation plans.
    ├── Dockerfile              Single-image build (server + dashboard + add-on); no Ollama/LiteLLM.
    ├── docker-compose.yml      Localhost-only Compose: ./cases volume, add-on → ./addon.
    └── cases/                  Evidence + state output (gitignored). Location set by DFIR_CASES_ROOT.

## How the pieces fit

    Browser (Comet/Chrome)                           Localhost companion (127.0.0.1:4773)
    ┌─────────────────────┐  POST                    ┌──────────────────────────────────────────┐
    │ DFIR Capture (MV3)  │  /captures          ──▶  │ ingest → evidence (screenshots+jsonl)    │
    │  timer + events     │                          │    │                                     │
    └─────────────────────┘                          │    ▼ per-window AI extraction (cheap)    │
    ┌─────────────────────┐                          │ forensic timeline ──▶ synthesis (strong) │
    │ Dashboard / Reports │ ◀── WS /ws,              │   findings, IOCs, MITRE, attacker path,  │
    │                     │     GET /cases/:id/state │   key questions, threads                 │
    └─────────────────────┘                          └──────────────────────────────────────────┘

**Two-phase analysis:** a cheap vision model reads each screenshot into the forensic
timeline; a stronger model does the single holistic synthesis call (findings, MITRE,
attacker path, questions). Configure both via `.env` — see `companion/README.md`.
## Quick start
1. **Companion** (the server):

       git clone https://github.com/hasamba/DFIR-Companion.git
       cd DFIR-Companion/companion
       npm install
       cp .env.example .env   # set DFIR_AI_PROVIDER / MODEL / KEY (or leave AI off)
       npm run dev            # serves http://127.0.0.1:4773 (dashboard at /dashboard)

2. **Extension** (capture):

       cd DFIR-Companion/extension
       npm install
       npm run build          # then load extension/dist as an unpacked extension

   The popup only **attaches** to an existing case — you create cases in the dashboard.

3. Open `http://127.0.0.1:4773/dashboard`, click **+ New case** to create your case (it
connects automatically). Then in the extension popup pick that case from the **Case**
dropdown (**Refresh cases** if it isn't listed yet) and **Start**. Browse your evidence —
the dashboard updates live.
Full configuration, HTTP endpoints, the case-folder layout, and the analysis model
are documented in **[companion/README.md](companion/README.md)**.
## Docker / Docker Compose
Run the whole thing — companion server + dashboard + the browser add-on — in one container.
**Neither Ollama nor LiteLLM is bundled**; for AI, point `DFIR_AI_*` at any OpenAI-compatible
endpoint (a model you host, a remote provider, or an Ollama/LiteLLM you run separately). With AI
left unset, the container still does full capture and runs all the deterministic importers.
**Localhost-only by design:** the container binds `0.0.0.0` internally, but Compose publishes the
port to `127.0.0.1` on your host — so the dashboard is never exposed on your network.
1. **Start it** (build from source):

       git clone https://github.com/hasamba/DFIR-Companion.git
       cd DFIR-Companion
       docker compose up -d --build   # → http://127.0.0.1:4773/dashboard

   Or pull the prebuilt image from GHCR instead of building:

       docker compose pull && docker compose up -d
       # image: ghcr.io/hasamba/dfir-companion:latest

2. **Load the add-on** (capture). The container writes the pre-built, unpacked extension to
`./addon` on first start. In Chrome/Comet open `chrome://extensions`, enable **Developer
mode**, click **Load unpacked**, and select **`./addon/dist`** (a packaged
`dfir-companion-extension.zip` is dropped there too).
3. Open `http://127.0.0.1:4773/dashboard`, click **+ New case**, then pick that case in the
extension popup and **Start**.
**Data & config:**
- Evidence and case state persist in **`./cases`** on the host (mounted volume) — survives
restarts and image rebuilds.
- Configure via the `environment:` block in [`docker-compose.yml`](docker-compose.yml), or
uncomment `env_file: - .env` to use a `.env` file (copy `companion/.env.example`).
- To reach an AI endpoint running on the host, use `http://host.docker.internal:<port>/v1`
(on Linux without Docker Desktop, also uncomment the `extra_hosts` line in the compose file).
## Environment variables (`companion/.env`)
All companion behavior is configured via env vars (`companion/.env` or shell). Copy `companion/.env.example` to start — it has inline comments for every variable.
### Core
| Variable | Default | Meaning |
|---|---|---|
| `DFIR_CASES_ROOT` | `./cases` | Case folder location; relative paths resolve against `companion/` |
| `DFIR_PORT` | `4773` | Server port (must match the extension and dashboard) |
| `DFIR_HOST` | `127.0.0.1` | Bind interface; Docker image sets `0.0.0.0`, Compose re-maps to localhost on the host |
| `DFIR_MAX_BODY_MB` | `256` | Max upload size in MB; raise if large SIEM/EDR exports fail with HTTP 413 |
| `DFIR_LOG_LEVEL` | `info` | Log verbosity (`debug`/`info`/`warn`/`error`). Tees to console + `logs/session-