GitHub: arcuru/vuln-scanner
# vuln-scanner
LLM-driven vulnerability scanner that builds up an **investigation directory**
per scan target, accumulating history across runs as the target evolves and
models improve.
recon → hunt → validate → dedupe → consolidate (per run)
## Prerequisites
- Python 3.12+
- [`uv`](https://docs.astral.sh/uv/) for dependency and environment management
- `git` on `$PATH`
- An agent CLI for the chosen backend, authenticated:
  - [`claude`](https://github.com/anthropics/claude-code): default backend
  - [`pi`](https://github.com/can1357/oh-my-pi): alternative backend
  - Anything else you wire up via a custom `[agent.backends.*]` entry
## Install
# Run from a checkout without installing globally
uv run vuln-scanner --help
# Or install as a uv tool (puts `vuln-scanner` on $PATH)
uv tool install .
## Quick start
Create a folder for the investigation, scaffold it against a target, run a
scan, and check the status.
mkdir cool-project-scan && cd cool-project-scan
# Clone the target into ./target/, write vuln-scanner.toml + MANIFEST.toml
uv run vuln-scanner init https://github.com/user/cool-project
# Run a scan (uses target's current HEAD; pass --sha to pin a commit)
uv run vuln-scanner run -j 8
# Re-run later (target may have new commits, or a newer model is available);
# the next recon reads prior runs and proposes net-new investigations
uv run vuln-scanner run --sha SHA
# See the run history
uv run vuln-scanner status
The investigation folder is self-contained. Move it, archive it, commit it to
its own git repo — it stays consistent.
## Scan archive
This repo also archives scans via a git submodule at [`scans/`](scans/)
([`arcuru/vuln-scans`](https://github.com/arcuru/vuln-scans)). Each subfolder is one
investigation directory: config + manifest + per-run output. The cloned target
and ephemeral worktrees are gitignored, so what's committed is the audit trail.
Clone with `--recurse-submodules` to pull the scan archive alongside the tool.
| Target | Source |
|---|---|
| [`cmprss`](scans/cmprss/) | |
From the repo root, scan an existing target or add a new one:
# Run a scan against a target (anywhere in the tree)
vuln-scanner run -C scans/cmprss -j 8
# Add a new target
mkdir scans/ && cd scans/
vuln-scanner init https://github.com/owner/repo
vuln-scanner run -j 8
## CLI reference
$ vuln-scanner --help
usage: vuln-scanner [-h] {init,run,status} ...
Multi-phase LLM vulnerability scanner over a single target investigation.
positional arguments:
{init,run,status}
init Initialize an investigation directory in cwd (clones
target, writes config).
run Execute one scan run against target/ in cwd.
status List runs in this investigation.
$ vuln-scanner init --help
usage: vuln-scanner init [-h] [-c CONFIG] target_url
positional arguments:
target_url Git URL of the target repo to clone
options:
-c, --config CONFIG Path to a vuln-scanner.toml to copy in (default:
minimal built-in)
$ vuln-scanner run --help
usage: vuln-scanner run [-h] [-C DIR] [--sha SHA] [-j JOBS] [-v]
options:
-C, --dir DIR Investigation directory to operate on (default: cwd)
--sha SHA Target commit SHA to pin (default: keep current target
HEAD)
-j, --jobs JOBS Parallel workers (default: 4)
-v, --verbose
$ vuln-scanner status --help
usage: vuln-scanner status [-h] [-C DIR]
options:
-C, --dir DIR Investigation directory to operate on (default: cwd)
## Investigation directory
`init` scaffolds the folder and each `run` produces an immutable per-run
directory. The top-level `SUMMARY.md` always points at the latest run:
my-investigation/
    vuln-scanner.toml               # config (committed)
    MANIFEST.toml                   # target URL, latest-run pointer
    target/                         # cloned scan target (gitignored)
    worktrees/                      # ephemeral worktrees (gitignored)
    .vuln-scanner.lock              # concurrency guard
    runs/
        2026-05-20T14-30-abc1234/   # ISO timestamp + short target SHA
            manifest.toml           # tool version, target SHA, status, summary
            config.toml             # effective config snapshot for this run
            logs/.log               # agent stdout (or SDK event stream)
            transcripts/.jsonl      # full Claude transcript per task
            recon/
                HUNT_QUEUE.json
                task.toml           # backend, model, session_id, timings
            hunt//
                FINDING.md
                task.toml
            validate//
                VERIFICATION.md
                task.toml
            dedupe/FINDINGS.md
            consolidate/
                SUMMARY.md          # cumulative across all runs
                task.toml
    SUMMARY.md → runs//consolidate/SUMMARY.md
Each run directory is self-describing: `config.toml` is the resolved view of
`vuln-scanner.toml` at run time, and per-task `task.toml` records the exact
backend invocation (argv or SDK options), session UUID, model used, duration,
and cost. The matching `transcripts/.jsonl` is the full Claude
session log copied out of `~/.claude/projects/`.
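As a rough illustration of the naming scheme, the sketch below splits a run directory name into its timestamp and short SHA and picks the newest run. The minute-resolution format is inferred from the single example name shown in the tree above, and `RunId`, `parse_run_dir`, and `latest_run` are hypothetical helpers, not part of the tool.

```python
from datetime import datetime
from typing import NamedTuple


class RunId(NamedTuple):
    started: datetime  # timestamp parsed from the directory name
    short_sha: str     # abbreviated target commit


def parse_run_dir(name: str) -> RunId:
    """Split a name like '2026-05-20T14-30-abc1234' into timestamp and SHA.

    Assumes the format '<YYYY-MM-DD>T<HH>-<MM>-<short sha>'.
    """
    stamp, _, short_sha = name.rpartition("-")
    return RunId(datetime.strptime(stamp, "%Y-%m-%dT%H-%M"), short_sha)


def latest_run(names: list[str]) -> str:
    """Return the most recent run directory name by parsed timestamp."""
    return max(names, key=lambda n: parse_run_dir(n).started)
```

Because the timestamp comes first, plain lexicographic sorting of the names would give the same order; parsing is only needed when the date itself is used.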
The `.gitignore` written by `init` covers `target/`, `worktrees/`, and the
lockfile — so you can run `git init` in the investigation folder and track
the config + runs without dragging the target's full history with you.
## How it works
Each phase runs in its own git worktree off `target/`, isolating agent
artifacts per task:
1. **recon** — one task. Maps architecture and produces `HUNT_QUEUE.json`. On
continuation runs, also reads prior runs' `SUMMARY.md` and the git diff
since the prior target SHA, then produces a queue of net-new
investigations and worthwhile revisits.
2. **hunt** — fan-out from the queue. Each entry is one attack class in one
scope. Produces `FINDING.md` per task.
3. **validate** — fan-out from hunt tasks. Adversarial review of each finding.
Produces `VERIFICATION.md` per task.
4. **dedupe** — one task. Groups confirmed findings by root cause, records
rejected investigations so future recon can skip them, and records *failed*
investigations (tasks where the agent crashed or timed out before
producing output — outcome unknown, codepath NOT cleared) so future runs
re-attempt them. Produces `FINDINGS.md`.
5. **consolidate** — one task. Produces the cumulative `SUMMARY.md` with each
finding tagged **NEW** / **PERSISTS** / **FIXED** / **REGRESSED** relative
to prior runs, and a "Failed Investigations" section listing this run's
unknown-outcome tasks.
Each run resumes from `.done` sentinels — if interrupted, re-running `run`
(without `--sha`) picks up where it left off in the same run directory.
Concurrent `run` invocations in the same investigation folder are refused via
the lockfile.
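A minimal sketch of how sentinel-based resume and the lockfile refusal could fit together. The placement of `.done` at the phase level, the exclusive-create lock, and the names `run_phases` and `investigation_lock` are assumptions for illustration; the scanner may track sentinels per task and implement locking differently.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator

PHASES = ["recon", "hunt", "validate", "dedupe", "consolidate"]


@contextmanager
def investigation_lock(root: Path) -> Iterator[None]:
    """Refuse to start if another run already holds the lockfile."""
    lock = root / ".vuln-scanner.lock"
    try:
        # O_EXCL makes creation fail atomically if the file already exists.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another run holds {lock}") from None
    try:
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        lock.unlink()


def run_phases(run_dir: Path, work: Callable[[str, Path], None]) -> list[str]:
    """Run each phase once, skipping phases that already have a .done sentinel."""
    executed = []
    for phase in PHASES:
        phase_dir = run_dir / phase
        sentinel = phase_dir / ".done"
        if sentinel.exists():
            continue  # completed in an earlier, interrupted invocation
        phase_dir.mkdir(parents=True, exist_ok=True)
        work(phase, phase_dir)
        sentinel.touch()  # written only after the phase succeeds
        executed.append(phase)
    return executed
```

Writing the sentinel only after `work` returns means a crash mid-phase leaves no marker, so the next invocation repeats that phase. An exclusive-create lock like this one is left behind if the process is killed outright, so a real implementation would need some way to detect stale locks.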
If recon decides there's nothing new to investigate (continuation run on an
unchanged target), it writes an empty queue and the pipeline bails early.
## Configuration
Two layers: a **prompt profile** (Python module with prompt functions, plus
markdown bodies) and a **TOML config** (settings overlay).
### Prompt profile (Python + markdown)
The built-in profile is `vuln-scan` (in `src/vuln_scanner/configs/vuln_scan.py`).
The prompt bodies live alongside it in `src/vuln_scanner/configs/prompts/` —
one `.md` per phase, with `$variable` placeholders substituted at render time:
configs/
vuln_scan.py # settings + glue (loads + renders the .md files)
prompts/
_environment.md # shared snippet injected into every prompt
recon.md # uses $prior_runs_path for continuation runs
hunt.md # uses $attack_class, $scope, $entry_point, …
validate.md
dedupe.md
consolidate.md # uses $prior_runs_path
To tweak what the agents are told, edit the markdown — no Python changes
needed. Required prompt functions on the profile module:
- `recon_prompt(*, prior_runs_path: str = "") -> str`
- `hunt_prompt(*, attack_class, scope, function, entry_point, rationale, arch_summary) -> str`
- `validate_prompt() -> str`
Optional: `dedupe_prompt()`, `consolidate_prompt(output_dir, *, prior_runs_path="")`.
Write your own profile by copying `vuln_scan.py` (and the `prompts/` directory)
and pointing `vuln-scanner.toml` at it via `[scan] prompt_profile = "..."`.
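For orientation, here is a sketch of what a prompt function on a custom profile might look like. It assumes the `$variable` placeholders follow Python's `string.Template` syntax, which the source does not confirm; `HUNT_MD` is a stand-in for the contents of `prompts/hunt.md`, and `render` is a hypothetical helper.

```python
from string import Template

# Stand-in for the body that would normally be read from prompts/hunt.md.
HUNT_MD = "Hunt for $attack_class in $scope, starting at $entry_point.\n"


def render(body: str, **variables: str) -> str:
    """Substitute $placeholders; raises KeyError if one is left unfilled."""
    return Template(body).substitute(variables)


def hunt_prompt(
    *, attack_class, scope, function, entry_point, rationale, arch_summary
) -> str:
    # Extra variables the template does not mention are simply ignored.
    return render(
        HUNT_MD,
        attack_class=attack_class,
        scope=scope,
        function=function,
        entry_point=entry_point,
        rationale=rationale,
        arch_summary=arch_summary,
    )
```

Using `substitute` rather than `safe_substitute` makes a typo in a placeholder name fail loudly instead of sending a literal `$name` to the agent.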
### Settings (TOML)
`init` writes a minimal `vuln-scanner.toml` into the investigation folder
(unless you pass `-c CONFIG` to copy in your own):
[scan]
prompt_profile = "vuln-scan"
[agent]
backend = "claude"
# [agent.models]
# recon = "claude-sonnet-4-6"
# hunt = "claude-sonnet-4-6"
# validate = "claude-opus-4-7"
See [`vuln-scanner.example.toml`](vuln-scanner.example.toml) for the full set of options
with comments. Key sections:
| Section | Purpose |
|---|---|
| `attack_classes` (top-level) | Vulnerability categories to scan for |
| `[scan]` | Profile, branch prefix, parallelism, timeouts |
| `[scan.task_timeouts]` | Per-phase timeout overrides (seconds) |
| `[agent]` | Backend name and flags |
| `[agent.models]` | Per-phase model names |
| `[agent.backends.*]` | Define custom backends in config |
| `[output]` | Per-phase output filenames |
| `[files]` | File extensions and exclude directories |
### Backends
Built-in backends:
- `claude` — subprocesses the Claude Code CLI (`claude -p`); default.
- `claude-sdk` — uses the in-process [`claude-agent-sdk`](https://pypi.org/project/claude-agent-sdk/)
Python library. Same model and tools as `claude`, but streams structured
events (assistant turns, tool calls, the final `ResultMessage` with token
and cost info) into the per-task log file.
- `pi` — Oh My Pi agent CLI.
Set via `[agent] backend = "..."`.
Custom backends can be defined directly in TOML — no Python code needed:
[agent]
backend = "gemini"
[agent.backends.gemini]
executable = "gemini-cli"
model_flag = "--model"
prompt_flag = "--prompt"
extra_args = ["--yes"]
Fields: `executable` (required), `prompt_flag` (required), `model_flag`
(optional), `extra_args` (optional). The resulting command is:
gemini-cli --yes --model MODEL --prompt PROMPT
For backends that need custom logic beyond flags, implement the `Backend`
protocol in `src/vuln_scanner/claude.py` and add it to the `BACKENDS` registry.
### Timeouts
`task_timeout` sets the global default in seconds (0 = no timeout).
`[scan.task_timeouts]` overrides it per phase:
[scan]
task_timeout = 0 # global default: no timeout
[scan.task_timeouts]
hunt = 900 # 15 minutes per hunt task
validate = 600
When a timeout is hit, the agent subprocess receives SIGTERM, then SIGKILL
after 5 seconds.
## Development
# Install dev dependencies
uv sync --extra dev
# Run tests
uv run pytest tests/ -v
# Type check
uv run pyright src/
## License
AGPL-3.0-or-later. See [`LICENSE.txt`](LICENSE.txt).