DavidClawson/ripcord
# ripcord
**Opaque firmware binary in, queryable fact database out — one command.**
ripcord is a research pipeline for reverse engineering embedded firmware.
It takes a binary with no symbols, no source, and an undocumented hardware
peripheral, and expands it into a structured warehouse of facts —
functions, call graph, MMIO access patterns, decompiled C, behavioral
traces — that deterministic analyzers, formal methods, and LLM agents can
all query without ever re-reading the raw bytes.
The driving target is the **FNIRSI 2C53T oscilloscope** (AT32F403A MCU + an
opaque Gowin FPGA). The hardest, most valuable part of that firmware is the
FPGA acquisition path — timing-critical code talking to a chip with *no
public documentation and no source*. The only way to know what the FPGA
does is to watch the MCU talk to it. ripcord is built to capture that
conversation and turn it into an execution-verified protocol spec.
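As a rough illustration of what "capturing the conversation" could look like, the sketch below renders a list of MMIO events as a readable transcript. The `MmioEvent` fields, the `transcript` helper, and the register addresses are all assumptions made for this example; the addresses are placeholders, and a real run would resolve register names from the chip's SVD file.

```python
from typing import NamedTuple


class MmioEvent(NamedTuple):
    pc: int        # program counter of the accessing instruction
    kind: str      # "read" or "write"
    address: int   # MMIO address touched
    value: int     # value read or written


# Placeholder register map for illustration only (not taken from a real SVD).
REGISTERS = {0x40004400: "USART2_SR", 0x40004404: "USART2_DR"}


def transcript(events):
    """Render events in capture order, one line per MCU access."""
    lines = []
    for ev in events:
        # Fall back to the raw address when the register is unknown.
        name = REGISTERS.get(ev.address, f"0x{ev.address:08X}")
        arrow = "MCU -> " if ev.kind == "write" else "MCU <- "
        lines.append(f"pc=0x{ev.pc:08X} {arrow}{name} = 0x{ev.value:02X}")
    return lines


events = [
    MmioEvent(0x08004100, "write", 0x40004404, 0xA5),
    MmioEvent(0x08004108, "read", 0x40013000, 0x01),
]
for line in transcript(events):
    print(line)
```

Keeping the program counter on every line is what makes a transcript like this joinable back to the function that issued the access.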
## The idea in one picture
```
          firmware.bin
                │
                ▼
      ┌───────────────────┐   deterministic, runs in minutes, no human judgment
      │ IDENTIFY          │   ISA · load address · chip family (scripts/identify.py)
      ├───────────────────┤
      │ EXTRACT (Ghidra)  │   functions · calls · blocks · xrefs · strings
      │                   │   pcode · decompiled C (PyGhidra headless)
      ├───────────────────┤
      │ RECOVER           │   vector tables, func-ptr dispatch, veneers, registrars
      │                   │   → closes the call-graph reachability gap
      ├───────────────────┤
      │ CLASSIFY          │   SVD-resolved peripheral register access · fingerprint
      │                   │   match library code across compilers
      ├───────────────────┤
      │ TRACE (Renode)    │   boot the binary, capture MMIO transcript = ground truth
      └─────────┬─────────┘
                ▼
      ┌─────────────────────────────────────────┐
      │  THE WAREHOUSE                          │   per-target Parquet tables,
      │  build//tables/*.parquet                │   queried with DuckDB.
      │  (no database file; Parquet is truth)   │   THIS is the artifact.
      └─────────┬───────────────────────────────┘
                │
        ┌───────┴────────┬─────────────────┬──────────────────┐
        ▼                ▼                 ▼                  ▼
  scripts/query   LLM agent swarm   Unicorn / Renode     Claude Code
  (SQL / DuckDB)  (bulk labeling)   (VERIFY by           (skills + CLI:
                                     execution)           drives it all)
```
Two principles do the heavy lifting:
1. **Execution is the verification oracle — not the compiler.** A claim about
what a function does is confirmed by *running* it (Unicorn) or *tracing*
it (Renode) and diffing register/memory/MMIO deltas against the original.
Compilers catch type errors; execution catches logic errors. No claim
becomes canonical until execution backs it. This is the part most RE
tooling skips (see [Related work](#related-work)).
2. **The database is the artifact — not clean source code.** The deliverable
is a queryable set of facts about the binary. Rendered C, if it exists at
all, is a late-stage *view* over the database, never the goal. (Why:
[`notes/goal-and-approach.md`](./notes/goal-and-approach.md).)
LLM budget is spent only on the *residue* deterministic tools can't resolve.
Everything mechanical — Ghidra extraction, library identification, call
recovery, trace capture — runs unattended in minutes.
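The first principle can be sketched in a few lines. This is a minimal illustration, not ripcord's implementation: `Snapshot`, `state_delta`, and `claim_holds` are hypothetical names, and the snapshots are hand-written here where a real run would fill them from Unicorn or Renode.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Machine state captured before or after running one function."""
    registers: dict[str, int]
    memory: dict[int, int] = field(default_factory=dict)   # address -> value
    mmio_writes: tuple[tuple[int, int], ...] = ()          # ordered (address, value)


def state_delta(before: Snapshot, after: Snapshot) -> dict:
    """Keep only what the run changed: registers, memory cells, MMIO writes."""
    return {
        "registers": {r: v for r, v in after.registers.items()
                      if before.registers.get(r) != v},
        "memory": {a: v for a, v in after.memory.items()
                   if before.memory.get(a) != v},
        "mmio_writes": after.mmio_writes,
    }


def claim_holds(original_delta: dict, candidate_delta: dict) -> bool:
    """The claim is execution-backed only if both runs changed state identically."""
    return original_delta == candidate_delta


# Hand-written example states (assumed values, for illustration only).
before = Snapshot(registers={"r0": 0, "r1": 5})
original = Snapshot(registers={"r0": 10, "r1": 5}, mmio_writes=((0x40004404, 0x55),))
faithful = Snapshot(registers={"r0": 10, "r1": 5}, mmio_writes=((0x40004404, 0x55),))
wrong = Snapshot(registers={"r0": 10, "r1": 5}, mmio_writes=((0x40004404, 0x56),))

reference = state_delta(before, original)
print(claim_holds(reference, state_delta(before, faithful)))  # True
print(claim_holds(reference, state_delta(before, wrong)))     # False
```

The `wrong` candidate would pass any type check, since only one written byte differs. That is the kind of logic error the execution diff is meant to catch.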
## Quick start
```sh
# Identify ISA / load address / chip before committing to a full run
scripts/identify.py firmware.bin

# One command: identify → extract → ingest → recover calls → classify → summarize
scripts/ripcord.py firmware.elf                                          # ELF: flags inferred
scripts/ripcord.py firmware.bin --chip AT32F403A --base-addr 0x08004000  # raw binary

# Ask questions over the warehouse + decompiled C + an LLM
scripts/analyze --target stock_v120 "what writes to USART2_DR?"

# Full bottom-up comprehension: smoke-test every function, name them,
# decompose monsters, synthesize subsystem → architecture narratives
uv run python scripts/agents/deep_analysis.py --target stock_v120

# Render a self-contained HTML report
scripts/render/report.py stock_v120

# (optional) Expose the warehouse over MCP for a client without shell access.
# The primary path is Claude Code running the tools + skills above directly.
uv run python scripts/mcp_server.py --build-dir ./build
```
See [`SETUP.md`](./SETUP.md) for toolchain prerequisites (Ghidra 11.2+ with
PyGhidra, Python 3.11+, `uv`, Snakemake, DuckDB; optionally Renode and a
cross-toolchain to build the test corpus).
## Why the harness is the point
Most "LLM + Ghidra" tools feed a single decompiled function to a model and
ask "what does this do?" — a fragment with no surrounding context. That
starves the model exactly where embedded RE is hardest.
ripcord inverts it. The deterministic pipeline builds a rich, *queryable*
context first; then **Claude Code drives** — running the CLI tools and
skills (`.claude/skills/`) directly to pull precisely the tables, decompiled
bodies, peripheral maps, and execution traces it needs, iteratively, while
reasoning about the binary as a whole *and building new tools mid-task* when
a target demands them. Reusable procedures harden into skills
(`firmware-bringup`, `execution-verify`); execution-verified conclusions land
in the contract ledger (`scripts/contracts/ledger.py`), which is the durable
product. The single-shot API paths (`scripts/analyze`, the agent swarm) stay
for cheap, scoped, *measurable* sub-tasks — fingerprint matching, bulk
function labeling — where a fragment genuinely is enough. An
[MCP server](./scripts/mcp_server.py) remains as optional interop for a
client that can't run the shell; it isn't the primary surface, because
ripcord's data is local Parquet the driver already reads directly.
Comprehension lives in the harness, not the access protocol.
## What's in the warehouse
A `snakemake --cores 4 --resources ghidra=1` run produces typed Parquet
tables per target under `build//tables/`. Agent and validation
stages add more. Highlights:
| table | grain |
|------------------------|--------------------------------------------------------------|
| `functions` | one row per Ghidra-discovered function (incl. `body_hash`) |
| `calls` / `xrefs` | call sites; non-call references (reads, writes, jumps, data) |
| `basic_blocks` | one row per CodeBlock, with containing function |
| `strings` | defined strings in loaded memory |
| `decompiled` | Ghidra decompiled pseudo-C, one row per function |
| `pcode_features` | per-function P-Code opcode histogram + sequence hash |
| `recovered_calls` | recovered indirect call edges (vector table, func ptr, …) |
| `peripheral_xrefs` | SVD-resolved peripheral register accesses |
| `mmio_events` | MemoryIORead/Write from a Renode trace, joinable by PC |
| `unicorn_smoke` | per-function executability (catches code-vs-data misdecode) |
| `ground_truth_functions` | `nm -S` symbols, the regression signal |
All tables are auto-discovered as DuckDB views by `scripts/query`. The
[`notes/queries/`](./notes/queries/) directory holds committed SQL that
doubles as executable documentation and regression tests.
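One plausible way to auto-discover tables as views is sketched below. How `scripts/query` actually does it is not shown here, so `view_statements` is a hypothetical helper. It only builds the SQL text, using DuckDB's `read_parquet` table function, and leaves execution to whatever DuckDB connection the caller holds.

```python
from pathlib import Path


def view_statements(tables_dir: Path) -> list[str]:
    """Build one CREATE VIEW statement per Parquet file found in tables_dir."""
    statements = []
    for parquet in sorted(tables_dir.glob("*.parquet")):
        name = parquet.stem                  # functions.parquet -> functions
        if not name.isidentifier():
            continue                         # skip names unsafe to use as a view name
        # Escape single quotes so the path is a valid SQL string literal.
        path = parquet.as_posix().replace("'", "''")
        statements.append(
            f"CREATE OR REPLACE VIEW {name} AS SELECT * FROM read_parquet('{path}');"
        )
    return statements
```

Because the views are rebuilt from the files on every run, no database file needs to be kept in sync with the Parquet tables.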
## Current state (2026-05)
Phase 0 complete; Phase 1 library-ID validated end-to-end including blind
recovery on a stripped binary; Phase 3 agent swarm validated end-to-end.
Renode trace capture and Datalog (Soufflé) derivations are wired into the
Snakemake DAG. Deep hierarchical analysis, context enrichment, and Unicorn
execution-validation are built on top.
**Fifteen targets across four build ecosystems** live in the warehouse:
5 Raspberry Pi Pico (Cortex-M0+), 2 Zephyr (Cortex-M3), 1 stripped
blind-recovery target, 3 AT32F403A reference builds (GCC + LLVM, the cross-compiler
corpus), and 4 stock FNIRSI 2C53T firmware versions (V1.0.3–V1.2.0) — the
primary target *and* its own differential ground truth.
A few empirical results that fell out (full list and provenance in
[`CLAUDE.md`](./CLAUDE.md) → "Key empirical findings"):
- **Blind recovery on a stripped binary: 86.6% recall, 94.9% precision** —
171/197 functions re-identified with zero symbols.
- **Computed-call recovery closes the reachability gap from 70% unreachable
to 12%** via five recovery mechanisms at ~95% blended precision.
- **Constant-based fingerprinting: 100% precision, cross-compiler.**
- **Execution catches what static analysis can't** — the Unicorn smoke test
flags Ghidra decoding data as code, the #1 failure mode for raw imports.
- **The FNIRSI V1.0.3→V1.0.7 transition was a full architectural rewrite of
the FPGA acquisition path** (USART2-only → DMA/SPI3), confirmed by
byte-identical FreeRTOS port code against a GCC reference build.
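The reachability-gap figure above is a graph measurement. The sketch below shows the idea on a made-up call graph: count the functions that cannot be reached from the entry points, first with direct calls only, then with recovered indirect edges added. The function names and edges are invented for illustration, and `unreachable_fraction` is a hypothetical helper.

```python
from collections import deque


def unreachable_fraction(functions, edges, roots):
    """Fraction of functions not reachable from any root over the given call edges."""
    callees = {}
    for caller, callee in edges:
        callees.setdefault(caller, set()).add(callee)
    seen = {r for r in roots if r in functions}
    queue = deque(seen)
    while queue:                                 # breadth-first walk of the call graph
        fn = queue.popleft()
        for nxt in callees.get(fn, ()):
            if nxt in functions and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return 1 - len(seen) / len(functions)


# Toy graph (invented): six functions, entry point "reset".
functions = {"reset", "main", "helper", "task_a", "uart_isr", "dead_code"}
direct_calls = [("reset", "main"), ("main", "helper")]
recovered_calls = [("main", "task_a"), ("main", "uart_isr")]  # e.g. func-ptr dispatch

before = unreachable_fraction(functions, direct_calls, {"reset"})
after = unreachable_fraction(functions, direct_calls + recovered_calls, {"reset"})
print(f"unreachable before recovery: {before:.0%}, after: {after:.0%}")
```

Whatever stays unreachable after recovery is either genuinely dead code or an indirect call the recovery mechanisms still miss.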
## Related work
ripcord's individual ingredients all exist in the wild; the combination —
a structured fact warehouse **plus** an execution-as-verification oracle
**plus** a skills-driven Claude Code harness with a provenance-tracked
contract ledger, pointed at *comprehending* an opaque binary — is the part I
haven't found assembled elsewhere. Honest positioning:
- **LLM + disassembler tools** (Gepetto, G-3PO, aiDAPal, DeGPT) mostly send a
decompiled snippet to a model and write back a rename/comment. ripcord
builds queryable context *first*, so the model never reasons from a
context-free fragment.
- **Persistent structured state is no longer a differentiator.**
[GhidrAssist](https://github.com/jtang613/GhidrAssist) (open source, a
SQLite+graph knowledge DB with a 5-level hierarchy) and **Binary Ninja
Sidekick** (commercial, with provenance and a background validation agent)
both build it. What sets ripcord apart is that **their validation is static**
(re-analysis and cross-reference queries), whereas ripcord gates every
canonical claim on *execution*.
- **MCP-over-a-disassembler is table stakes — and not where ripcord's value
is.** [GhidraMCP](https://github.com/LaurieWired/GhidraMCP) (9k+ stars) and
[IDA Pro MCP](https://github.com/mrexodia/ida-pro-mcp) are mature; they
expose *live tool calls* over a protocol. ripcord keeps an MCP surface only
as optional interop — the driver (Claude Code) reads the local warehouse
directly via the CLI, so the access protocol is incidental. What's behind
the surface — a *warehouse of execution-verified facts* and the skills that
produce it — is the interesting part.
- **Binary-analysis-as-a-database predates ripcord** —
[ddisasm/GTIRB](https://github.com/GrammaTech/ddisasm) (which shares
ripcord's Soufflé/Datalog layer) and CodeQL. ripcord *uses* that technique;
it didn't invent it.
- **Firmware rehosting** (PRETENDER, P2IM, DICE, Fuzzware) already infers
MMIO peripheral models from traces — but the deliverable is "enough model
to fuzz," **not** a legible, falsifiable MCU↔peripheral protocol spec.
Same input class, different output. ripcord aims at the legible boundary
contract those tools leave on the table.
- **Matched-source decomp** (decomp.me, the N64/PSX projects) verifies by
byte-identical recompilation — a *stricter* oracle than ripcord's
behavioral execution diff, but aimed at perfect source recovery, which
ripcord explicitly is **not** trying to produce.
- **Closest precedent to the core thesis:** Patrick Hulin's
[SimTower reimplementation](https://phulin.me/blog/simtower) put an LLM in
a closed loop against a Unicorn emulator as ground truth — the same
execution-as-oracle idea, as a one-off project rather than a general
pipeline.
## Where to go deeper
- [`CLAUDE.md`](./CLAUDE.md) — the dense, authoritative project orientation:
every script, every table, every committed query, current findings.
- [`notes/`](./notes/) — the design log and the FNIRSI target dossier. Start
with [`notes/README.md`](./notes/README.md). Key files:
[`design-decisions.md`](./notes/design-decisions.md) (why each choice was
made), [`pipeline-architecture.md`](./notes/pipeline-architecture.md),
[`scope_acquisition_spec.md`](./notes/scope_acquisition_spec.md) (the
MCU↔FPGA protocol), and
[`renode-at32-bringup.md`](./notes/renode-at32-bringup.md) (the FPGA
emulation oracle in action).
## Scope, honesty, and the FPGA caveat
ripcord is deliberately **generic**. The scope firmware is the proving
ground, not a license to hard-code 2C53T specifics into the core pipeline —
target knowledge lives in `notes/` and in queries, never in the extractors.
The FPGA timing code has *no external ground truth*. ripcord tags every
claim with a provenance level and never presents inferred FPGA behavior as
established fact. Two examples: an internal dispatch/selector code is not a
wire-level hardware transaction, and a value the firmware *wrote* counts as
observed, whereas a reply a stub *invented* stays unverified until a hardware
trace confirms it.
That discipline is the whole reason the execution oracle exists.
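A provenance gate of this kind might look like the sketch below. The level names and their ordering are assumptions made for illustration, not the levels ripcord's contract ledger actually uses, and `is_canonical` is a hypothetical helper.

```python
from dataclasses import dataclass
from enum import IntEnum


class Provenance(IntEnum):
    """Hypothetical provenance levels, weakest to strongest."""
    STUB_INVENTED = 0     # a reply fabricated by an emulation stub
    INFERRED = 1          # deduced statically, never executed
    OBSERVED_WRITE = 2    # a value the firmware was seen writing during execution
    HARDWARE_TRACE = 3    # confirmed against a capture from real hardware


@dataclass(frozen=True)
class Claim:
    text: str
    provenance: Provenance


def is_canonical(claim: Claim) -> bool:
    """Only execution-backed or hardware-backed claims count as established."""
    return claim.provenance >= Provenance.OBSERVED_WRITE


claims = [
    Claim("firmware writes 0x55 to the FPGA command register", Provenance.OBSERVED_WRITE),
    Claim("FPGA replies 0x01 when ready", Provenance.STUB_INVENTED),
]
for claim in claims:
    status = "canonical" if is_canonical(claim) else "unverified"
    print(f"{status}: {claim.text}")
```

Ordering the levels as integers turns the gate into a single comparison, and promoting a claim requires stronger evidence, not a reworded statement.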
## License
[MIT](./LICENSE). Firmware binaries analyzed by the pipeline are **not**
included in this repository; their licensing belongs to their original
authors. The test corpus is built from open SDKs (Pico SDK, Zephyr, the
AT32 SDK) or supplied by the user.