Lona44/find-evil-ir-agent
# find-evil-ir-agent
Autonomous incident-response agent for the [SANS FIND EVIL!](https://findevil.devpost.com/) hackathon (April 15 – June 15, 2026). Extends Protocol SIFT's autonomous incident response capability with a multi-agent LangGraph architecture and a built-in misalignment-evaluation layer that catches the agent's own hallucinations before they reach the analyst.
## Why this exists
Autonomous IR agents fail in two distinct ways:
1. They miss findings — false negatives, the classic problem.
2. They invent findings — false positives, fabricated artefact citations, confidently wrong narratives.
The second failure mode is the one that erodes analyst trust in agentic SOC tools. This project addresses it directly: every finding the agent produces is independently verified by a second agent that has read-only access to the underlying evidence, and any unverified claim is either flagged or stripped from the final report.
## Architecture
    ┌─────────────────────────────────────────────────┐
    │ Evidence                                        │
    │ disk images · memory captures · logs · pcaps    │
    │ · remote endpoints via MCP                      │
    └────────────────────────┬────────────────────────┘
                             │
                     read-only access
                             │
    ┌────────────────────────▼────────────────────────┐
    │ Investigator Agent                              │
    │ • Plans investigation                           │
    │ • Executes SIFT tools (Volatility, Plaso,       │
    │   Sleuthkit, …) via tool-use                    │
    │ • Produces candidate findings with citations    │
    └────────────────────────┬────────────────────────┘
                             │
                    candidate findings
                             │
    ┌────────────────────────▼────────────────────────┐
    │ Validator Agent                                 │
    │ • Re-checks each finding against the cited      │
    │   artefact (offset / line / hash)               │
    │ • Flags hallucinations and unsupported claims   │
    │ • Returns confirmed / inferred / rejected       │
    └────────────────────────┬────────────────────────┘
                             │
                validated findings + flags
                             │
    ┌────────────────────────▼────────────────────────┐
    │ Reporter Agent                                  │
    │ • Composes structured investigative narrative   │
    │ • Distinguishes confirmed vs inferred           │
    │ • Cites every claim to a specific artefact      │
    └────────────────────────┬────────────────────────┘
                             │
    ┌────────────────────────▼────────────────────────┐
    │ Audit + Accuracy Report                         │
    │ • Tool execution logs (timestamps, tokens)      │
    │ • Hallucination rate per case                   │
    │ • Citation coverage                             │
    └─────────────────────────────────────────────────┘

Full architecture description: [`docs/architecture.md`](docs/architecture.md).
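
The two accuracy metrics named in the final box could be computed along these lines. This is a minimal sketch under stated assumptions, not the project's code: the `Finding` class, its field names, and both metric definitions (rejected findings over all candidate findings; cited findings over all findings) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Finding:
    """Hypothetical validated finding; field names are illustrative only."""
    claim: str
    status: str               # one of "confirmed", "inferred", "rejected"
    citation: Optional[str]   # free-form locator, or None if the claim is uncited


def hallucination_rate(findings: Sequence[Finding]) -> float:
    """Share of candidate findings the Validator rejected (assumed definition)."""
    if not findings:
        return 0.0
    return sum(f.status == "rejected" for f in findings) / len(findings)


def citation_coverage(findings: Sequence[Finding]) -> float:
    """Share of findings that carry any citation at all (assumed definition)."""
    if not findings:
        return 0.0
    return sum(f.citation is not None for f in findings) / len(findings)


# Made-up example case: four candidate findings from one investigation.
CASE = [
    Finding("suspicious process evil.exe", "confirmed", "memory.raw:line 12"),
    Finding("failed root login", "confirmed", "auth.log:line 2"),
    Finding("likely lateral movement", "inferred", None),
    Finding("beacon to external host", "rejected", "capture.pcap:frame 7"),
]

if __name__ == "__main__":
    print(f"hallucination rate: {hallucination_rate(CASE):.2f}")
    print(f"citation coverage:  {citation_coverage(CASE):.2f}")
```

Whether inferred findings should count against coverage, or rejected findings should be excluded from it, is a design choice this sketch does not settle.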
## Mandatory FIND EVIL! capabilities
The competition rules require three capabilities. Each one is owned by a specific component in the architecture:
| Required capability | How it's implemented |
| --- | --- |
| **Self-correction** — the agent detects and resolves errors or inconsistencies in its own output without human intervention | The Validator agent re-checks every Investigator finding; a mismatch triggers a re-investigation loop that runs until the claim is either confirmed or marked rejected |
| **Accuracy validation** — all findings are traceable to specific artifacts, files, offsets, or log entries | Every finding emitted by the Investigator carries a citation tuple `(artefact_path, offset_or_line, content_hash)`; the Validator independently re-reads at that exact location |
| **Analytical reasoning** — output is presented as a structured investigative narrative, not a raw execution log | The Reporter agent composes a Markdown narrative grouped by phase (acquisition → analysis → conclusion), with confirmed and inferred findings rendered differently |
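
The citation re-read described in the accuracy-validation row might look roughly like this. It is a sketch, not the project's implementation: the `Citation` class, `hash_line`, `verify_citation`, the use of SHA-256, and the convention that the hash covers exactly one cited line of a text artefact are all assumptions made for the example.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Citation:
    """Hypothetical form of (artefact_path, offset_or_line, content_hash)."""
    artefact_path: str
    line: int           # 1-based line number (assumed; could be a byte offset)
    content_hash: str   # SHA-256 hex digest of the cited line (assumed convention)


def hash_line(text: str) -> str:
    """Hash one line of evidence the same way on both sides of the check."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def verify_citation(citation: Citation) -> bool:
    """Re-read the cited line from disk and compare its hash with the citation."""
    path = Path(citation.artefact_path)
    if not path.is_file():
        return False  # cited artefact does not exist
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    if not 1 <= citation.line <= len(lines):
        return False  # cited location is outside the artefact
    return hash_line(lines[citation.line - 1]) == citation.content_hash


def demo() -> tuple[bool, bool]:
    """Check one genuine and one fabricated citation against a throwaway log."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "auth.log"
        log.write_text(
            "session opened for user alice\nfailed password for root\n",
            encoding="utf-8",
        )
        genuine = Citation(str(log), 2, hash_line("failed password for root"))
        fabricated = Citation(str(log), 2, hash_line("root login from 10.0.0.5"))
        return verify_citation(genuine), verify_citation(fabricated)


if __name__ == "__main__":
    print(demo())
```

The point of the hash is that the Validator never has to trust the Investigator's quoted text: it only trusts what it reads from the evidence itself.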
## Relationship to existing work
This project sits at the intersection of two prior pieces of work:
- **[Unified AI Misalignment Framework](https://github.com/Lona44/unified-ai-misalignment-framework)** — the eval methodology that underpins the Validator agent. The framework's hallucination-detection patterns are reused here to grade the Investigator's output before it reaches the analyst.
- **Agent Arena** ([procurement-intelligence](https://github.com/Lona44/procurement-intelligence)) — the multi-agent LangGraph + human-in-the-loop voting pattern that this IR pipeline is built on.
## Tech stack
- **Agent framework:** LangGraph + Claude (via the Anthropic API). The competition rules also permit comparable agentic architectures.
- **Runtime:** Linux terminal, SANS SIFT Workstation environment.
- **SIFT tools wrapped:** Volatility 3, Plaso (log2timeline), Sleuthkit, Wireshark/tshark (initial set; extended during development).
- **Remote evidence:** MCP servers for endpoint queries.
- **Audit:** Structured execution logs (JSONL) with per-call timestamps, tool inputs, and token usage.
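
A JSONL audit trail of the kind listed above could be written with nothing beyond the standard library. The sketch below is illustrative only: the function names, the record field names, and the example tool calls are assumptions, not the schema used in `tools/audit.py`.

```python
import io
import json
from datetime import datetime, timezone
from typing import Any, TextIO


def write_audit_record(
    stream: TextIO,
    tool: str,
    tool_input: dict[str, Any],
    input_tokens: int,
    output_tokens: int,
) -> dict[str, Any]:
    """Append one tool call to the log as a single JSON line (assumed schema)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "input": tool_input,
        "tokens": {"input": input_tokens, "output": output_tokens},
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_audit_log(stream: TextIO) -> list[dict[str, Any]]:
    """Parse a JSONL log back into a list of records, skipping blank lines."""
    return [json.loads(line) for line in stream if line.strip()]


def demo() -> list[dict[str, Any]]:
    """Write two made-up tool calls to an in-memory log and read them back."""
    log = io.StringIO()
    write_audit_record(log, "volatility", {"plugin": "windows.pslist"}, 812, 240)
    write_audit_record(log, "tshark", {"filter": "dns"}, 530, 95)
    log.seek(0)
    return read_audit_log(log)


if __name__ == "__main__":
    for entry in demo():
        print(entry["timestamp"], entry["tool"], entry["tokens"])
```

One record per line keeps the log append-only and easy to stream, which suits tying each report claim back to a specific tool execution.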
## Repository layout
    .
    ├── LICENSE                     Apache 2.0
    ├── README.md                   this file
    ├── pyproject.toml              Python package config
    ├── agents/                     LangGraph agent definitions
    │   ├── investigator.py         primary IR agent — analyses evidence
    │   ├── validator.py            self-correction agent — verifies findings
    │   ├── reporter.py             structured narrative composer
    │   └── prompts/                system prompts (auditable, version-controlled)
    ├── tools/                      SIFT tool wrappers + MCP integration
    │   ├── sift_tools.py           wrappers for Volatility, Plaso, Sleuthkit
    │   ├── mcp_endpoints.py        remote-endpoint MCP server endpoints
    │   └── audit.py                audit-trail and token-usage logger
    ├── evals/                      accuracy evaluation
    │   ├── hallucination_check.py  reuses Unified Framework methodology
    │   ├── citation_check.py       verifies every claim has an artefact citation
    │   └── scenarios/              test cases (synthetic evidence packages)
    ├── infra/                      deployment + execution environment
    │   ├── Dockerfile              SIFT-Workstation-compatible
    │   ├── compose.yml
    │   └── requirements.txt
    ├── docs/                       required submission artefacts
    │   ├── architecture.md         full architecture description
    │   ├── accuracy-report.md      self-assessment of false positives / hallucinations
    │   ├── evidence-dataset.md     what the agent was tested against
    │   └── execution-logs/         sample run logs
    ├── scripts/
    │   ├── run.sh                  local execution entry point
    │   └── seed_evidence.sh        set up test evidence
    └── tests/
        └── test_agents.py          smoke tests

## Setup
    git clone https://github.com/Lona44/find-evil-ir-agent.git
    cd find-evil-ir-agent

    # Python 3.12+
    python -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"

    # Configure
    cp .env.example .env
    # Set ANTHROPIC_API_KEY, optional MCP endpoints

    # Seed test evidence (synthetic — see docs/evidence-dataset.md for sources)
    ./scripts/seed_evidence.sh

    # Run against the seeded case
    ./scripts/run.sh --case demo

## Development plan
The submission window is 25 May → 15 June 2026. Tracked roughly as:
- **Phase 1 — Scaffold + Investigator agent** (target: end of week 1)
Plan-execute-observe loop over a single evidence type (memory captures). Volatility 3 wrapped as a tool. Basic citation tuples.
- **Phase 2 — Validator + self-correction loop** (target: end of week 2)
Validator re-checks Investigator findings. Loop triggers on mismatch. Hallucination-detection eval ported from Unified AI Misalignment Framework.
- **Phase 3 — Reporter + audit trail** (target: end of week 3)
Structured Markdown narrative. JSONL audit log with timestamps and token usage. Tie every report claim to a specific tool execution.
- **Phase 4 — Demo video, accuracy report, polish** (target: 15 June)
Live terminal-execution demo (≤5 min, with at least one self-correction sequence per the rules). Self-assessment of false positives / missed artefacts / hallucinations.
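
The Phase 2 self-correction loop can be sketched in plain Python, independent of LangGraph. Everything here is hypothetical: `self_correct`, the two callables standing in for the Investigator and Validator, the feedback string, and the `max_attempts` cap (added so the loop always terminates) are assumptions for illustration, not the project's design.

```python
from typing import Callable, Optional


def self_correct(
    investigate: Callable[[Optional[str]], str],
    validate: Callable[[str], bool],
    max_attempts: int = 3,
) -> tuple[str, Optional[str], int]:
    """Re-investigate until a finding validates or the attempt budget runs out.

    Returns (status, finding, attempts_used); status is "confirmed" or "rejected".
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        finding = investigate(feedback)
        if validate(finding):
            return "confirmed", finding, attempt
        # Tell the investigator what failed so the next attempt can differ.
        feedback = f"rejected: {finding}"
    return "rejected", None, max_attempts


# Stand-in evidence and agents for the demo (no model calls involved).
EVIDENCE = {"pid 4242: evil.exe"}


def flaky_investigator(feedback: Optional[str]) -> str:
    """First attempt cites a PID that is not in the evidence, then corrects it."""
    return "pid 1337: evil.exe" if feedback is None else "pid 4242: evil.exe"


def stubborn_investigator(feedback: Optional[str]) -> str:
    """Never corrects itself, so the claim ends up rejected."""
    return "pid 1337: evil.exe"


def in_evidence(finding: str) -> bool:
    """Validator stand-in: a claim holds only if it appears in the evidence."""
    return finding in EVIDENCE


if __name__ == "__main__":
    print(self_correct(flaky_investigator, in_evidence))
    print(self_correct(stubborn_investigator, in_evidence))
```

In a LangGraph version the same control flow would presumably be a conditional edge from the Validator back to the Investigator; the attempt cap matters either way, since an unbounded loop could burn tokens on a claim that can never be confirmed.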
## Author
Ma'alona Mafaufau — independent AI-safety researcher (Auckland, NZ).
Site: [approxiomresearch.com](https://approxiomresearch.com).
## License
Apache License 2.0 — see [LICENSE](LICENSE).