# IronCurtain
**A secure\* runtime for autonomous AI agents, where security policy is derived from a human-readable constitution.**
_\*When someone writes "secure," you should immediately be skeptical. [What do we mean by secure?](https://ironcurtain.dev)_
## Demo
Among the available running modes are a Signal messaging transport for mobile approval and a daemon mode for scheduled cron jobs. The daemon has an optional [web UI](DAEMON.md#web-ui) (`--web-ui`) for browser-based monitoring and escalation handling. See [RUNNING_MODES.md](RUNNING_MODES.md) for details.
### Multi-agent workflows
IronCurtain orchestrates multiple AI agents through structured workflows. The bundled **vulnerability discovery** workflow hunts memory-safety and logic bugs in native code through a tiered harness pipeline (Tier 1 isolated function → Tier 2 multi-component → Tier 3 full build) with libFuzzer/AFL++ coverage gating, hypothesis-driven `discover`/`triage` states, and a final human report-review gate. The **design-and-code** workflow runs plan / design / implement / review cycles, also with human gates.

Each agent runs in its own Docker container with role-specific policy boundaries; the engine manages state transitions, artifact passing, and crash-resume checkpointing automatically.

IronCurtain is open source, runs entirely on your machine, enforces per-agent security policies via the constitution-based policy engine, and works with any Docker-containerized agent. It is comparable in scope to Amazon Kiro and Google Jules for coding tasks, but with first-class security and an extensible workflow definition format.
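To make the ideas of state transitions, artifact passing, human gates, and crash-resume checkpointing concrete, here is a minimal sketch. Every name in it (`WorkflowState`, `Checkpoint`, `runWorkflow`) is hypothetical and chosen for illustration; it is not IronCurtain's engine API, and the real engine runs agents in containers rather than calling plain functions.

```typescript
// All names are hypothetical; the real workflow engine's API is not shown here.
type Artifacts = Record<string, string>;

interface WorkflowState {
  name: string;
  run: (artifacts: Artifacts) => Artifacts; // returns artifacts produced by this state
  humanGate?: boolean;                      // require human review after this state
}

interface Checkpoint {
  nextIndex: number;    // index of the first state that has not completed
  artifacts: Artifacts; // everything produced by completed states
}

function runWorkflow(
  states: WorkflowState[],
  approveGate: (stateName: string) => boolean,
  start: Checkpoint = { nextIndex: 0, artifacts: {} },
): Checkpoint {
  let checkpoint = start;
  for (let i = start.nextIndex; i < states.length; i++) {
    const state = states[i];
    if (state === undefined) break;
    const produced = state.run(checkpoint.artifacts);
    if (state.humanGate === true && !approveGate(state.name)) {
      return checkpoint; // gate rejected: stop without advancing past this state
    }
    // Record progress so a paused or crashed run can resume from here.
    checkpoint = { nextIndex: i + 1, artifacts: { ...checkpoint.artifacts, ...produced } };
  }
  return checkpoint;
}
```

Passing a previously returned `Checkpoint` back in as `start` resumes the run at the first state that had not completed, which is the essence of crash-resume in this simplified model.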

**The web UI is the intended interface for workflow runs.** Start the daemon, open the printed URL, and drive runs from the Workflows page. There, the state-machine graph is live, the agent-message timeline streams with markdown rendering, gate reviews include a workspace and artifact browser, and past runs stay listed.
ironcurtain daemon --web-ui
CLI access is available for scripting, automation, and debugging:
ironcurtain workflow start vuln-discovery \
"Find memory-safety bugs in libical" --workspace ~/src/libical
ironcurtain workflow start design-and-code \
"Build a REST API with authentication"
See [WORKFLOWS.md](WORKFLOWS.md) for the full documentation.
## Customizing Your Policy
The default policy works well for general development, but you can tailor it to your workflow:
**1. Customize your constitution** (optional but recommended):
ironcurtain customize-policy
This starts an LLM-assisted conversation that generates a constitution tailored to your workflow and saves it to `~/.ironcurtain/constitution-user.md`. You can also edit this file directly.
**2. Compile the policy:**
ironcurtain compile-policy
This translates your constitution into deterministic rules, generates test scenarios, and verifies them. Compiled artifacts go to `~/.ironcurtain/generated/`.
### Personas
Personas are named policy profiles — each bundles a constitution, compiled policy, persistent workspace, and semantic memory. Use them to run agents with different roles or access levels.
ironcurtain persona create my-assistant # Create a persona
ironcurtain persona compile my-assistant # Compile its policy
ironcurtain start --persona my-assistant "Check my calendar"
In mux mode, `/new my-assistant` spawns a tab using that persona. Personas can also be assigned to cron jobs. See [DAEMON.md](DAEMON.md) for scheduled job configuration.
### Skills
Drop SKILL.md packages under `~/.ironcurtain/skills//` to make purpose-specific guidance (helper scripts, deterministic checks, domain knowledge) available to every Docker agent session. The merged set is staged into a per-bundle host directory and bind-mounted **read-only** into the container at the path the active agent's native discovery walks — Claude Code is pointed at the staging dir via `--add-dir`, Goose scans `~/.config/goose/skills//SKILL.md`. The agent discovers them automatically and decides when to read them based on each skill's frontmatter description. The SKILL.md _format_ is the open standard adopted by Claude Code, Goose, and Codex; only the _discovery path_ differs per agent. Workflows can ship per-state skills inside the workflow package — see [WORKFLOWS.md](WORKFLOWS.md#skills).
## Policy: Constitution → Enforcement
You write intent in plain English; IronCurtain compiles it into deterministic rules:
constitution.md → [Annotate] → [Compile] → [Resolve Lists] → [Generate Scenarios] → [Verify & Repair]

Each stage produces one artifact:

- Annotate → `tool-annotations.json`
- Compile → `compiled-policy.json`
- Resolve Lists → `dynamic-lists.json`
- Generate Scenarios → `test-scenarios.json`
- Verify & Repair → verified policy (or build failure)
1. **Annotate** — Classify each MCP tool's arguments by role (read-path, write-path, delete-path, none).
2. **Compile** — Translate the English constitution into deterministic if/then rules. Categorical references ("major news sites", "my contacts") are emitted as `@list-name` symbolic references.
3. **Resolve Lists** — Resolve symbolic lists to concrete values via LLM knowledge or MCP tool-use (e.g., querying a contacts database). Written to `dynamic-lists.json`, user-editable. Skipped when no lists are present.
4. **Generate Scenarios** — Create test scenarios from the constitution plus mandatory handwritten invariant tests.
5. **Verify & Repair** — Run scenarios against the real policy engine. An LLM judge analyzes failures and generates targeted repairs (up to 2 rounds). Build fails if the policy cannot be verified.
All artifacts are content-hash cached — only changed inputs trigger recompilation.
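The control flow of the Verify & Repair step can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline's actual code: `Scenario`, `Failure`, `RunScenarios`, `Repair`, and `verifyAndRepair` are hypothetical names, and the LLM judge is reduced to an injected `repair` function. The only details taken from the description above are the limit of two repair rounds and the build failure when verification does not succeed.

```typescript
// Hypothetical shapes; the real pipeline's types are not shown in this README.
interface Scenario {
  name: string;
  expected: "allow" | "deny" | "escalate";
}

interface Failure {
  scenario: Scenario;
  actual: string;
}

type RunScenarios<P> = (policy: P, scenarios: Scenario[]) => Failure[];
type Repair<P> = (policy: P, failures: Failure[]) => P;

const MAX_REPAIR_ROUNDS = 2; // "up to 2 rounds" per the step description

function verifyAndRepair<P>(
  policy: P,
  scenarios: Scenario[],
  run: RunScenarios<P>,
  repair: Repair<P>,
): P {
  let current = policy;
  for (let round = 0; round <= MAX_REPAIR_ROUNDS; round++) {
    const failures = run(current, scenarios);
    if (failures.length === 0) return current; // every scenario passes: verified
    if (round === MAX_REPAIR_ROUNDS) {
      // Repairs are used up and scenarios still fail: the build fails.
      throw new Error(`policy could not be verified: ${failures.length} scenario(s) failing`);
    }
    current = repair(current, failures);
  }
  throw new Error("unreachable");
}
```

Injecting `run` and `repair` keeps the loop deterministic and testable even though the repair step itself is LLM-driven in the real pipeline.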
### What compiled rules look like
Constitution clauses like:
- The agent may perform read-only git operations (status, diff, log) within the sandbox without approval.
- The agent must receive human approval before git push, pull, fetch, or any remote-contacting operation.
compile to rules such as:
[
{ "tool": "git_status", "decision": "allow", "condition": { "directory": { "within": "$SANDBOX" } } },
{ "tool": "git_diff", "decision": "allow", "condition": { "directory": { "within": "$SANDBOX" } } },
{ "tool": "git_push", "decision": "escalate", "reason": "Remote-contacting git operations require human approval" }
]
Any call that doesn't match an explicit `allow` or `escalate` rule is **denied by default**.
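A minimal sketch of how rules of this shape could be evaluated with a default-deny fallback is shown below. The `Rule` type mirrors only the fields visible in the JSON example; `evaluate`, `isWithin`, and the `directory` field on the call are hypothetical, first-match semantics are an assumption, and the containment check is purely lexical, whereas the real engine is described as symlink-aware.

```typescript
type Decision = "allow" | "escalate" | "deny";

interface Rule {
  tool: string;
  decision: "allow" | "escalate";
  condition?: { directory?: { within: string } };
  reason?: string;
}

// True when `child` is `parent` or lies beneath it. Lexical only: a real
// check would also have to resolve symlinks before comparing.
function isWithin(child: string, parent: string): boolean {
  const norm = (p: string): string[] => {
    const out: string[] = [];
    for (const seg of p.split("/")) {
      if (seg === "" || seg === ".") continue;
      if (seg === "..") out.pop();
      else out.push(seg);
    }
    return out;
  };
  const c = norm(child);
  const p = norm(parent);
  return p.length <= c.length && p.every((seg, i) => seg === c[i]);
}

function evaluate(
  rules: Rule[],
  call: { tool: string; directory?: string },
  sandbox: string,
): Decision {
  for (const rule of rules) {
    if (rule.tool !== call.tool) continue;
    const within = rule.condition?.directory?.within;
    if (within !== undefined) {
      const root = within === "$SANDBOX" ? sandbox : within;
      if (call.directory === undefined || !isWithin(call.directory, root)) continue;
    }
    return rule.decision; // first matching rule wins (an assumption)
  }
  return "deny"; // no explicit allow/escalate rule matched
}
```

Comparing path segments rather than raw string prefixes avoids treating a sibling such as `/home/u/sandbox-evil` as being inside `/home/u/sandbox`.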
ironcurtain annotate-tools --server filesystem # Annotate one server (merge with existing)
ironcurtain annotate-tools --all # Re-annotate all servers
ironcurtain compile-policy # Compile constitution into rules and verify
ironcurtain refresh-lists # Re-resolve dynamic lists without full recompilation
ironcurtain refresh-lists --list major-news # Refresh a single list
Review the generated `~/.ironcurtain/generated/compiled-policy.json` — these are the exact rules enforced at runtime.
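As a rough picture of what the `@list-name` references from the Compile and Resolve Lists steps amount to once lists are resolved, the sketch below expands symbolic references into concrete values. The `DynamicLists` shape (a map from list name to values) and the `expandListRefs` helper are assumptions made for illustration; the actual layout of `dynamic-lists.json` is not shown in this README, and the domains are placeholders.

```typescript
// Assumed shape: list name -> resolved values. Not the documented file format.
type DynamicLists = Record<string, string[]>;

function expandListRefs(values: string[], lists: DynamicLists): string[] {
  const out: string[] = [];
  for (const value of values) {
    if (!value.startsWith("@")) {
      out.push(value); // literal value, keep as-is
      continue;
    }
    const resolved = lists[value.slice(1)];
    if (resolved === undefined) {
      throw new Error(`unresolved list reference ${value}`);
    }
    out.push(...resolved);
  }
  return out;
}

// Placeholder data: `major-news` is the list name used in the commands above.
const exampleLists: DynamicLists = {
  "major-news": ["news.example.com", "press.example.org"],
};
const allowedDomains = expandListRefs(["@major-news", "docs.example.net"], exampleLists);
```

Failing loudly on an unresolved reference, rather than silently dropping it, keeps a missing list from quietly narrowing or widening a rule.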
## Configuration
IronCurtain stores configuration and session data in `~/.ironcurtain/`:
~/.ironcurtain/
├── config.json # User configuration
├── constitution.md # User-local base constitution (overrides package default)
├── constitution-user.md # Your policy customizations (generated by customize-policy)
├── generated/ # User-compiled policy artifacts (overrides package defaults)
├── personas/ # Persona directories (constitution, policy, workspace, memory)
├── skills/ # User-global SKILL.md packages, mounted into every Docker session
├── jobs/ # Cron job definitions, workspaces, and run records
├── sessions/
│ └── {sessionId}/
│ ├── sandbox/ # Per-session filesystem sandbox
│ ├── escalations/ # File-based IPC for human approval
│ ├── audit.jsonl # Per-session audit log
│ └── session.log # Diagnostics
└── workflow-runs/ # Shared-container workflow runs (see below)
Single-session runs (`ironcurtain start`, mux tabs, cron jobs) write under `sessions/`. Shared-container workflow runs write under `workflow-runs/` instead — see the next section.
### Workflow run layout
A workflow definition can opt in to a shared Docker container by setting `settings.sharedContainer: true` in its YAML. In that mode every agent state runs inside the same long-lived container and shares one policy engine instance; between states the orchestrator hot-swaps the active policy so each persona sees its own rules. All artifacts for the run land in a single tree:
~/.ironcurtain/workflow-runs//
├── audit.jsonl # Persona-tagged append-only audit
├── messages.jsonl # Orchestrator message log
├── workspace/ # Agent workspace (filesystem MCP root)
├── bundle/ # Shared container support (claude-state, orientation, sockets, escalations, system-prompt.txt)
├── states/
│ └── ./ # session.log + session-metadata.json per invocation
└── proxy-control.sock # Coordinator UDS for policy hot-swap
No per-session entries are created under `~/.ironcurtain/sessions/` for a shared-container workflow run. User-visible commands (`ironcurtain workflow start|resume|inspect|list`) are unchanged. See [WORKFLOWS.md](WORKFLOWS.md) for authoring workflow definitions and the full lifecycle.
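The hot-swap behaviour in shared-container mode can be pictured with the sketch below: one engine instance, whose active policy is replaced between states, and an audit log tagged with the current persona. `Policy`, `SharedPolicyEngine`, and the reduction of a policy to a set of allowed tool names are all simplifying assumptions, not the real implementation (which coordinates the swap over `proxy-control.sock`).

```typescript
// Hypothetical types: a policy is reduced to a set of allowed tool names.
interface Policy {
  persona: string;
  allowedTools: ReadonlySet<string>;
}

interface AuditEntry {
  persona: string;
  tool: string;
  decision: "allow" | "deny";
}

class SharedPolicyEngine {
  private active: Policy | undefined = undefined;
  readonly audit: AuditEntry[] = []; // append-only in this sketch

  // Called by the orchestrator between workflow states.
  swap(policy: Policy): void {
    this.active = policy;
  }

  check(tool: string): "allow" | "deny" {
    if (this.active === undefined) throw new Error("no policy loaded");
    const decision = this.active.allowedTools.has(tool) ? "allow" : "deny";
    // Tag each entry with the persona whose rules produced the decision.
    this.audit.push({ persona: this.active.persona, tool, decision });
    return decision;
  }
}

const engine = new SharedPolicyEngine();
engine.swap({ persona: "planner", allowedTools: new Set(["filesystem.read_file"]) });
const plannerWrite = engine.check("filesystem.write_file");
engine.swap({ persona: "implementer", allowedTools: new Set(["filesystem.read_file", "filesystem.write_file"]) });
const implementerWrite = engine.check("filesystem.write_file");
```

Because the engine is shared, the persona tag in each audit entry is what lets a single `audit.jsonl` be attributed to the right state afterwards.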
Edit configuration interactively:
ironcurtain config
Key configuration areas: models and API keys, resource budgets (token/step/time/cost limits), auto-approve escalations, web search provider, audit redaction, and memory server LLM settings. See [CONFIG.md](CONFIG.md) for the full reference.
To route LLM traffic through a gateway like LiteLLM or OpenRouter (in both Code Mode and Docker Agent Mode), see [MODEL_ROUTING.md](MODEL_ROUTING.md).
## Built-in Capabilities
IronCurtain ships with six pre-configured MCP servers. All tool calls (except memory) are governed by your compiled policy.
| Server | Tools | Key capabilities |
| -------------------- | ----- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Filesystem** | 14 | Read, write, edit, search files; directory tree; move; diff calculation |
| **Git** | 28 | Full git workflow: status, diff, log, commit, branch, push/pull/fetch, clone, stash, blame |
| **Fetch** | 2 | HTTP GET with HTML-to-markdown conversion; web search (Brave, Tavily, SerpAPI) |
| **GitHub** | 41 | Issues, PRs, code search, reviews via `ghcr.io/github/github-mcp-server`; requires a GitHub personal access token |
| **Google Workspace** | 128 | Gmail, Calendar, Drive, Docs, Sheets — requires OAuth setup via `ironcurtain auth` |
| **Memory** | 5 | Persistent semantic memory with hybrid vector+keyword search, LLM summarization, and automatic compaction. Enabled for persona and cron sessions. |
Read-only operations are allowed by default policy; mutations (writes, pushes, PR creation) escalate for human approval. Tools use `server.tool` naming (e.g., `filesystem.read_file`, `memory.recall`). See [ADDING_MCP_SERVERS.md](ADDING_MCP_SERVERS.md) to add your own.
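For scripts that post-process audit logs or policies, the `server.tool` convention can be split with a small helper like the one below. `parseToolName` and `isPolicyGoverned` are hypothetical helpers, and splitting at the first dot assumes server names never contain a dot, which this README does not state. The memory exemption follows the note above that all tool calls except memory are governed by the compiled policy.

```typescript
interface ToolRef {
  server: string;
  tool: string;
}

// Split "server.tool" at the first dot (assumes server names contain no dots).
function parseToolName(qualified: string): ToolRef {
  const dot = qualified.indexOf(".");
  if (dot <= 0 || dot === qualified.length - 1) {
    throw new Error(`expected "server.tool", got "${qualified}"`);
  }
  return { server: qualified.slice(0, dot), tool: qualified.slice(dot + 1) };
}

// Memory tool calls are described as exempt from the compiled policy.
function isPolicyGoverned(ref: ToolRef): boolean {
  return ref.server !== "memory";
}

const readFile = parseToolName("filesystem.read_file");
const recall = parseToolName("memory.recall");
```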
### Network Passthrough (Docker Agent Mode)
In Docker Agent Mode, the container has no network access — all traffic goes through IronCurtain's MITM proxy. By default, only LLM provider domains are reachable. The agent can request access to additional domains at runtime via the `proxy` virtual MCP server (`add_proxy_domain`). Each request requires human approval via the escalation flow.
Approved domains get a **raw passthrough tunnel** — HTTP, HTTPS, and WebSocket connections are forwarded without content inspection or credential injection. This gives the agent greater utility (calling third-party APIs, streaming data from external services) but means traffic to those domains is **unmediated**. See [SECURITY_CONCERNS.md](docs/SECURITY_CONCERNS.md) Section 2b-i for the threat model and [DEVELOPER_GUIDE.md](DEVELOPER_GUIDE.md) for usage details.
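The allowlist behaviour described above can be modelled with the following sketch. `DomainAllowlist`, its method names, and the `mediated` label are hypothetical; the human approval step is reduced to a synchronous callback to keep the example short, and skipping re-approval for an already approved domain is an assumption. The domains are placeholders, not IronCurtain's actual default list.

```typescript
type Route = "mediated" | "passthrough" | "blocked";

class DomainAllowlist {
  private readonly defaults: ReadonlySet<string>;
  private readonly approved = new Set<string>();
  private readonly askHuman: (domain: string) => boolean;

  constructor(defaults: string[], askHuman: (domain: string) => boolean) {
    this.defaults = new Set(defaults.map((d) => d.toLowerCase()));
    this.askHuman = askHuman;
  }

  // Mirrors the idea of `add_proxy_domain`: new domains need human approval.
  requestDomain(domain: string): boolean {
    const d = domain.toLowerCase();
    if (this.defaults.has(d) || this.approved.has(d)) return true;
    if (!this.askHuman(d)) return false;
    this.approved.add(d);
    return true;
  }

  // How the proxy would treat a connection to `domain` in this model.
  route(domain: string): Route {
    const d = domain.toLowerCase();
    if (this.defaults.has(d)) return "mediated";    // default LLM provider domain
    if (this.approved.has(d)) return "passthrough"; // raw tunnel, no inspection
    return "blocked";
  }
}

// Placeholder domains; the human approves only api.example.org.
const allowlist = new DomainAllowlist(["llm.example.com"], (d) => d === "api.example.org");
```

Keeping `mediated` and `passthrough` as distinct outcomes makes the trade-off explicit: an approved domain is reachable, but its traffic is no longer inspected.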
## Security Model
IronCurtain is designed around a specific threat model: **the LLM goes rogue.** This can happen through prompt injection (a malicious email or web page hijacks the agent) or through multi-turn drift (the agent gradually deviates from the user's intent over a long session).
### What IronCurtain enforces
- **Filesystem containment** — Symlink-aware path resolution prevents path traversal and symlink-escape attacks.
- **Per-tool policy** — Each MCP tool call is evaluated against compiled rules. The policy engine classifies tool arguments by role (read-path, write-path, delete-path) to make fine-grained decisions.
- **Structural invariants** — Certain protections are hardcoded and cannot be overridden by the constitution: the agent can never modify its own policy files, audit logs, or configuration.
- **Human escalation** — When policy says "escalate," the agent pauses and the user must explicitly approve or deny. Optionally, an LLM-based auto-approver handles unambiguous cases (see [CONFIG.md](CONFIG.md)).
- **Audit trail** — Every tool call and policy decision is logged to an append-only JSONL audit log.
- **Resource limits** — Token, step, time, and cost budgets prevent runaway sessions.
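
The resource-limit check in the last bullet can be sketched as follows. The field names (`tokens`, `steps`, `seconds`, `costUsd`) and the `exceededLimits` helper are hypothetical; see CONFIG.md for the real `resourceBudget` keys. The one behaviour taken from this README is that a limit set to `null` is disabled, as noted in the Troubleshooting table.

```typescript
// Field names are hypothetical; CONFIG.md documents the real keys.
interface Usage {
  tokens: number;
  steps: number;
  seconds: number;
  costUsd: number;
}

// Same keys as Usage; `null` means the limit is disabled.
type ResourceBudget = { [K in keyof Usage]: number | null };

// Returns the names of all limits that have been reached or exceeded.
function exceededLimits(usage: Usage, budget: ResourceBudget): (keyof Usage)[] {
  const keys: (keyof Usage)[] = ["tokens", "steps", "seconds", "costUsd"];
  return keys.filter((key) => {
    const limit = budget[key];
    return limit !== null && usage[key] >= limit;
  });
}

const budget: ResourceBudget = { tokens: 1000, steps: null, seconds: 60, costUsd: null };
const usage: Usage = { tokens: 1200, steps: 9999, seconds: 10, costUsd: 50 };
const exceeded = exceededLimits(usage, budget);
```

Here only the token limit is reported: the step and cost limits are disabled with `null`, and the time limit has not been reached.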
### Known limitations
This is a research prototype. Known gaps include:
- **Policy compilation fidelity** — The LLM-based compiler can misinterpret constitution intent. The verification pipeline catches many errors but is not exhaustive. Always review the compiled `compiled-policy.json`.
- **V8 isolate boundaries** — Code Mode uses V8 isolates, not OS-level virtualization. A V8 zero-day could allow escape.
- **No outbound content inspection** — An agent allowed to write files could encode sensitive data to bypass content-level controls. Planned: LLM-based intelligibility checks on outbound content.
- **Escalation fatigue** — Too many false-positive escalations can lead to habitual approval. Tune your constitution to minimize unnecessary prompts.
See [docs/SECURITY_CONCERNS.md](docs/SECURITY_CONCERNS.md) for a detailed threat analysis.
## Troubleshooting
| Issue | Guidance |
| --------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Missing API key** | Set the environment variable (`ANTHROPIC_API_KEY`, `GOOGLE_GENERATIVE_AI_API_KEY`, or `OPENAI_API_KEY`) or add the corresponding key to `~/.ironcurtain/config.json`. |
| **Sandbox unavailable** | OS-level sandboxing requires `bubblewrap` and `socat`. Install both, or set `"sandboxPolicy": "warn"` in your MCP server config for development. |
| **Budget exhausted** | Adjust limits in `~/.ironcurtain/config.json` under `resourceBudget`. Set any individual limit to `null` to disable it. |
| **Node version errors** | Node.js 22+ is required (`isolated-vm` needs `>=22.0.0`). Maximum supported is Node 25 (`<26`). |
| **Policy doesn't match intent** | Review `compiled-policy.json` to see the generated rules. Run `ironcurtain customize-policy` to refine your constitution, then `ironcurtain compile-policy` to recompile. Specific wording produces better rules — vague phrasing leads to vague policy. |
| **Auto-approve not triggering** | The auto-approver only approves when the user's message explicitly authorizes the action (e.g., "push to origin" for `git_push`). Vague messages always escalate to human review. Verify `autoApprove.enabled` is `true` in `config.json`. |
| **PTY/mux terminal garbled after exit** | Run `reset` in that terminal to restore normal mode. This is needed when the process is killed ungracefully and raw mode is not restored. |
| **Mux/listener: "already running"** | Only one mux or escalation-listener can run at a time. The lock at `~/.ironcurtain/escalation-listener.lock` is auto-cleared if the previous process is dead. If it persists, check the PID in the lock file. |
| **Signal bot not responding** | Verify the signal-cli container is running (`docker ps \| grep ironcurtain-signal`). Check that Signal is configured (`ironcurtain setup-signal`). See [TRANSPORT.md](TRANSPORT.md) for detailed troubleshooting. |
## Development
npm test # Run all tests
npm test -- test/policy-engine.test.ts # Run a single test file
npm test -- -t "denies delete_file" # Run a single test by name
npm run lint # Lint
npm run build # TypeScript compilation + asset copy
See [TESTING.md](TESTING.md) for the full testing guide, including integration test flags and conventions.
### Project Structure
src/
├── index.ts # Entry point
├── cli.ts # CLI command dispatcher
├── config/ # Configuration loading, constitution, MCP server definitions
├── session/ # Multi-turn session management, budgets, loop detection
├── sandbox/ # V8 isolated execution environment
├── trusted-process/ # Policy engine, MCP proxy, audit log, escalation handler
├── pipeline/ # Constitution → policy compilation pipeline
├── escalation/ # Escalation listener: session registry, TUI dashboard, state
├── mux/ # Terminal multiplexer: PTY bridge, renderer, trusted input
├── persona/ # Persona management (create, compile, resolve)
├── memory/ # Memory server integration (config, annotations, path resolution)
├── signal/ # Signal messaging transport (bot daemon, setup, formatting)
├── daemon/ # Unified daemon (Signal + cron scheduler, control socket)
├── cron/ # Cron job management (scheduler, job store, git sync, policy)
├── docker/ # Docker agent mode, PTY session, MITM proxy, registry proxy
├── workflow/ # Multi-agent workflow engine (orchestrator, state machine, gates)
├── web-ui/ # Web UI backend (JSON-RPC dispatch, event bus, workflow manager)
├── servers/ # Built-in MCP servers (fetch, web search providers)
└── types/ # Shared type definitions
packages/
└── memory-mcp-server/ # Standalone memory MCP server (publishable npm package)
## License
[Apache-2.0](LICENSE)