Lona44/find-evil-ir-agent


# find-evil-ir-agent

Autonomous incident-response agent for the [SANS FIND EVIL!](https://findevil.devpost.com/) hackathon (April 15 – June 15, 2026). Extends Protocol SIFT's autonomous incident response capability with a multi-agent LangGraph architecture and a built-in misalignment-evaluation layer that catches the agent's own hallucinations before they reach the analyst.

## Why this exists

Autonomous IR agents fail in two distinct ways:

1. They miss findings: false negatives, the classic problem.
2. They invent findings: false positives, fabricated artefact citations, confidently wrong narratives.

The second failure mode is the one that erodes analyst trust in agentic SOC tools. This project addresses it directly: every finding the agent produces is independently verified by a second agent that has read-only access to the underlying evidence, and any unverified claim is either flagged or stripped from the final report.

## Architecture

    ┌─────────────────────────────────────────────────┐
    │                    Evidence                     │
    │  disk images · memory captures · logs · pcaps   │
    │           · remote endpoints via MCP            │
    └────────────────────────┬────────────────────────┘
                             │ read-only access
                             │
    ┌────────────────────────▼────────────────────────┐
    │               Investigator Agent                │
    │  • Plans investigation                          │
    │  • Executes SIFT tools (Volatility, Plaso,      │
    │    Sleuthkit, …) via tool-use                   │
    │  • Produces candidate findings with citations   │
    └────────────────────────┬────────────────────────┘
                             │ candidate findings
                             │
    ┌────────────────────────▼────────────────────────┐
    │                 Validator Agent                 │
    │  • Re-checks each finding against the cited     │
    │    artefact (offset / line / hash)              │
    │  • Flags hallucinations and unsupported claims  │
    │  • Returns confirmed / inferred / rejected      │
    └────────────────────────┬────────────────────────┘
                             │ validated findings + flags
                             │
    ┌────────────────────────▼────────────────────────┐
    │                 Reporter Agent                  │
    │  • Composes structured investigative narrative  │
    │  • Distinguishes confirmed vs inferred          │
    │  • Cites every claim to a specific artefact     │
    └────────────────────────┬────────────────────────┘
                             │
    ┌────────────────────────▼────────────────────────┐
    │             Audit + Accuracy Report             │
    │  • Tool execution logs (timestamps, tokens)     │
    │  • Hallucination rate per case                  │
    │  • Citation coverage                            │
    └─────────────────────────────────────────────────┘

Full architecture description: [`docs/architecture.md`](docs/architecture.md).

## Mandatory FIND EVIL! capabilities

The competition rules require three capabilities. Each one is owned by a specific component in the architecture:

| Required capability | How it's implemented |
| --- | --- |
| **Self-correction**: the agent detects and resolves errors or inconsistencies in its own output without human intervention | The Validator agent re-checks every Investigator finding; a mismatch triggers a re-investigation loop that runs until the claim is either confirmed or marked rejected |
| **Accuracy validation**: all findings are traceable to specific artifacts, files, offsets, or log entries | Every finding emitted by the Investigator carries a citation tuple `(artefact_path, offset_or_line, content_hash)`; the Validator independently re-reads the artefact at that exact location |
| **Analytical reasoning**: output is presented as a structured investigative narrative, not a raw execution log | The Reporter agent composes a Markdown narrative grouped by phase (acquisition → analysis → conclusion), with confirmed and inferred findings rendered differently |

## Relationship to existing work

This project sits at the intersection of two prior pieces of work:

- **[Unified AI Misalignment Framework](https://github.com/Lona44/unified-ai-misalignment-framework)**: the eval methodology that underpins the Validator agent. The framework's hallucination-detection patterns are reused here to grade the Investigator's output before it reaches the analyst.
- **Agent Arena** ([procurement-intelligence](https://github.com/Lona44/procurement-intelligence)): the multi-agent LangGraph + human-in-the-loop voting pattern that this IR pipeline is built on.

## Tech stack

- **Agent framework:** LangGraph + Claude (via the Anthropic API). The rules also permit comparable agentic architectures.
- **Runtime:** Linux terminal, SANS SIFT Workstation environment.
- **SIFT tools wrapped:** Volatility 3, Plaso, Sleuthkit, log2timeline, Wireshark/tshark (initial set; extended during development).
- **Remote evidence:** MCP servers for endpoint queries.
- **Audit:** Structured execution logs (JSONL) with per-call timestamps, tool inputs, and token usage.

## Repository layout

    .
    ├── LICENSE                       Apache 2.0
    ├── README.md                     this file
    ├── pyproject.toml                Python package config
    ├── agents/                       LangGraph agent definitions
    │   ├── investigator.py           primary IR agent: analyses evidence
    │   ├── validator.py              self-correction agent: verifies findings
    │   ├── reporter.py               structured narrative composer
    │   └── prompts/                  system prompts (auditable, version-controlled)
    ├── tools/                        SIFT tool wrappers + MCP integration
    │   ├── sift_tools.py             wrappers for Volatility, Plaso, Sleuthkit
    │   ├── mcp_endpoints.py          remote-endpoint MCP server endpoints
    │   └── audit.py                  audit-trail and token-usage logger
    ├── evals/                        accuracy evaluation
    │   ├── hallucination_check.py    reuses Unified Framework methodology
    │   ├── citation_check.py         verifies every claim has an artefact citation
    │   └── scenarios/                test cases (synthetic evidence packages)
    ├── infra/                        deployment + execution environment
    │   ├── Dockerfile                SIFT-Workstation-compatible
    │   ├── compose.yml
    │   └── requirements.txt
    ├── docs/                         required submission artefacts
    │   ├── architecture.md           full architecture description
    │   ├── accuracy-report.md        self-assessment of false positives / hallucinations
    │   ├── evidence-dataset.md       what the agent was tested against
    │   └── execution-logs/           sample run logs
    ├── scripts/
    │   ├── run.sh                    local execution entry point
    │   └── seed_evidence.sh          set up test evidence
    └── tests/
        └── test_agents.py            smoke tests

## Setup

    git clone https://github.com/Lona44/find-evil-ir-agent.git
    cd find-evil-ir-agent

    # Python 3.12+
    python -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"

    # Configure
    cp .env.example .env
    # Set ANTHROPIC_API_KEY, optional MCP endpoints

    # Seed test evidence (synthetic; see docs/evidence-dataset.md for sources)
    ./scripts/seed_evidence.sh

    # Run against the seeded case
    ./scripts/run.sh --case demo

## Development plan

The submission window is 25 May → 15 June 2026. Tracked roughly as:

- **Phase 1: Scaffold + Investigator agent** (target: end of week 1)
  Plan-execute-observe loop over a single evidence type (memory captures). Volatility 3 wrapped as a tool. Basic citation tuples.
- **Phase 2: Validator + self-correction loop** (target: end of week 2)
  Validator re-checks Investigator findings. Loop triggers on mismatch. Hallucination-detection eval ported from the Unified AI Misalignment Framework.
- **Phase 3: Reporter + audit trail** (target: end of week 3)
  Structured Markdown narrative. JSONL audit log with timestamps and token usage. Every report claim tied to a specific tool execution.
- **Phase 4: Demo video, accuracy report, polish** (target: 15 June)
  Live terminal-execution demo (≤5 min, with at least one self-correction sequence per the rules). Self-assessment of false positives / missed artefacts / hallucinations.

## Author

Ma'alona Mafaufau, independent AI-safety researcher (Auckland, NZ). Site: [approxiomresearch.com](https://approxiomresearch.com).

## License

Apache License 2.0; see [LICENSE](LICENSE).
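
## Appendix: sketch of a citation check

The accuracy-validation row above describes a citation tuple `(artefact_path, offset_or_line, content_hash)` that the Validator re-reads at the cited location. The snippet below is a minimal sketch of how such a check might look, not the repository's actual implementation. The names `Citation`, `hash_span`, `make_citation`, `verify_citation` and `demo` are hypothetical, the extra `length` field is an assumption (the tuple in this README does not say how much content a citation covers), and SHA-256 is an assumed choice of hash. Only byte offsets are handled, and only the `confirmed` and `rejected` outcomes; `inferred` needs judgement that a hash comparison cannot supply.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record; field names are illustrative only."""
    artefact_path: str
    offset: int        # byte offset into the artefact
    length: int        # number of cited bytes (assumption, not in the README tuple)
    content_hash: str  # hex SHA-256 of the cited bytes (assumed hash algorithm)


def hash_span(path: str, offset: int, length: int) -> str:
    """Read `length` bytes at `offset` (read-only) and return their SHA-256 hex digest."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read(length)
    if len(data) != length:
        # The cited span runs past the end of the artefact.
        raise ValueError("cited span is out of range")
    return hashlib.sha256(data).hexdigest()


def make_citation(path: str, offset: int, length: int) -> Citation:
    """What an investigator-side helper might emit alongside a finding."""
    return Citation(path, offset, length, hash_span(path, offset, length))


def verify_citation(citation: Citation) -> str:
    """Re-read the artefact independently and compare hashes."""
    try:
        actual = hash_span(citation.artefact_path, citation.offset, citation.length)
    except (OSError, ValueError):
        # Missing artefact or impossible span: the claim cannot be supported.
        return "rejected"
    return "confirmed" if actual == citation.content_hash else "rejected"


def demo() -> list[str]:
    """Cite part of a synthetic log, then check a tampered and a dangling citation."""
    with tempfile.TemporaryDirectory() as tmp:
        artefact = Path(tmp) / "auth.log"
        artefact.write_bytes(b"Jan 01 00:00:01 host sshd[42]: Failed password for root\n")
        good = make_citation(str(artefact), offset=16, length=4)
        tampered = Citation(good.artefact_path, good.offset, good.length, "0" * 64)
        dangling = Citation(str(artefact) + ".gone", 0, 4, good.content_hash)
        return [verify_citation(c) for c in (good, tampered, dangling)]
```

Calling `demo()` returns `["confirmed", "rejected", "rejected"]`. Treating unreadable artefacts and out-of-range spans as `rejected` is a deliberately conservative choice in this sketch: a claim that cannot be re-read is handled the same way as one whose content does not match.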