Abhi-mishra998/aegis


🛡️ A runtime firewall for AI agents that blocks dangerous actions before they run,
cryptographically proves what happened after, and now talks back when you ask it questions.




## 📖 Table of Contents

- [What changed (2026-06-01)](#-what-changed-2026-06-01)
- [What Aegis is](#%EF%B8%8F-what-aegis-is)
- [The problem, stated plainly](#-the-problem-stated-plainly)
- [How Aegis closes the gap](#%EF%B8%8F-how-aegis-closes-the-gap)
- [Architecture at a glance](#%EF%B8%8F-architecture-at-a-glance)
- [Voice Agent — in the navbar](#%EF%B8%8F-voice-agent--in-the-navbar)
- [The 10-stage request pipeline](#-the-10-stage-request-pipeline)
- [Cryptographic trust chain](#-cryptographic-trust-chain)
- [SDK integration](#-sdk-integration)
- [AWS deployment](#%EF%B8%8F-aws-deployment)
- [Quick start](#-quick-start)
- [Demo packs](#-demo-packs)
- [Repository layout](#-repository-layout)
- [Tech stack](#%EF%B8%8F-tech-stack)
- [Contributing](#-contributing)
- [License](#-license)
## 🆕 What changed (2026-06-01)

This is a non-trivial revision of Aegis. Major shifts you should know about before you read further:

### Voice Agent now ships inside the Aegis UI

A new **Voice Agent** button sits in the navbar of every page. Click it, allow the microphone, and you're talking to a docs-grounded cybersecurity advisor in real time.

Pipeline: Deepgram nova-3 STT → Groq llama-3.3-70b (Gemini fallback) → Cartesia sonic-3 TTS, with hybrid BM25 + dense + cross-encoder retrieval over the same GitBook you're reading. The worker runs on a sibling `t3.medium` EC2 and registers outbound with LiveKit Cloud, so there is no inbound port. p50 end-to-end is ~1.3 s in-region.

Three independent layers cap each session at 5 minutes (gateway JWT TTL + agent asyncio guard + UI countdown) so a forgotten tab can't burn free-tier quota. See **[Voice Guide](./docs/voice-guide/_index.md)** for the full architecture.

### Live deployment moved from prod to dev

The historical two-EC2 production stack at `aegisagent.in` was decommissioned on 2026-06-01 after a full `terraform destroy + apply` rebuild surfaced 11 non-obvious infra bugs ([catalogued here](./docs/operations/deployment.md#non-obvious-gotchas-catalogued-during-the-2026-06-01-dev-rebuild)). The single-EC2 dev environment at **`https://dev.aegisagent.in`** is the live deployment today, sized for ~10 concurrent reviewers at ~$22–45/mo.

### GitBook refreshed across four sprints

Every doc page was rewritten against the current code: single-EC2 footprint, Graviton sizing, the kill-switch endpoint `Path(...)` fix, the email-validator `.local` TLD relaxation, the C1/C2/C3 router + middleware extractions from the 47-commit audit playbook, and the new `voice-guide/` section. **[Start at `docs/README.md`](./docs/README.md)** for the canonical reference.

### Persona & anti-leak hardening on the voice agent

The voice agent's `Modelfile` was rewritten to sound like a peer engineer, not a brochure ("Yeah, kill switch's basically the emergency stop…" instead of "The kill switch in Aegis is designed to immediately halt…"). A streaming regex filter in `agent.py:tts_node` strips Groq's occasional `` tool-call leakage before it reaches Cartesia TTS.
## 🛡️ What Aegis is

**Aegis is a runtime control plane for AI agents.** It sits between an agent and the systems it acts on (databases, APIs, Kubernetes clusters, internal tools) and enforces one rule:

**📦 What's in the box**

- **13 application services** running across **22 containers** on a single Graviton EC2 (`m6g.medium`)
- **38 React UI pages**: every page wired to a live backend, zero mocked data
- **Voice Agent in the navbar**: Deepgram → Groq → Cartesia, hybrid RAG over the GitBook
- **3 end-to-end demo packs** (db_copilot, devops_agent, support_agent) that populate the UI in ~46 s
- **Cryptographically verifiable audit chain**: ed25519 per-row + 16-shard HMAC + daily Merkle root
- **Offline chain verifier**: `acp verify-chain` validates without trusting the running system
- **Per-service mesh JWT auth**: no single shared secret across services
- **Durable billing outbox**: atomic `audit + pending_usage_event` write + 60s drain worker
- **Forensics replay**: timeline, blast-radius, cross-source replay, signed PDF export
- **PDF compliance export**: EU AI Act / NIST AI RMF / SOC 2 evidence bundles
- **SIEM integration**: Splunk, Elastic, Sentinel, Chronicle
- **SDK adapters**: LangChain, Anthropic, OpenAI, native Python
- **Visual Policy Builder**: GUI → Rego with live traffic simulation
- **SSO / OIDC**: Google, Microsoft (Azure AD), Okta OAuth2
- **API key management**: `acp_` prefix long-lived keys with CRUD + 60s cache
- **Auto-remediation playbooks**: pre-built response chains for common incidents
- **SSE live feed**: per-tenant + per-agent Pub/Sub channels, sub-second propagation
- Python 3.11, FastAPI 0.115, PostgreSQL 15, Redis 7, OPA/Rego, React 18, Vite 5, Tailwind, LiveKit Agents 1.5
- End-to-end **p95 ≈ 70 ms** measured on the live deployment (`/system/health`)
- Full stack runs locally with **one `docker compose up`**

**🎯 What it isn't**

- Not an agent framework
- Not an LLM inference provider
- Not a general-purpose APM
- Not a wrapper around someone else's policy engine
- Not multi-region or HA (single EC2 today)

It sits between your agent code and the world. **One product, not a platform.**

## 🔥 The problem, stated plainly

Most AI agent deployments today share one weakness: **the security model assumes the agent will behave.** When it doesn't (through prompt injection, a model hallucination, a compromised credential, or just a bad output) the failure modes are real:

| What goes wrong | What happens | Why it's hard to detect |
|---|---|---|
| **Prompt injection** | Attacker feeds malicious instructions through user-controlled tool outputs (a file the agent reads, a webpage it scrapes, a Slack message it summarizes), and the agent's plan gets hijacked. | The "instruction" is data; treating it as instruction is the bug. Most stacks have no isolation between tool input and the planner. |
| **Tool-permission scope creep** | An agent granted "read GitHub" ends up writing because tool permissions weren't fine-grained. | Per-service granularity instead of per-action. No central decision before each call. |
| **Audit gaps under load** | Async tools fire-and-forget; failed billing writes get silently swallowed; audit rows go missing for billable events. | Fire-and-forget patterns are everywhere. No transactional outbox between the decision and the audit row. |
| **Cross-tenant data leakage** | Embeddings or vector stores share tenant indexes; an agent in tenant A retrieves tenant B's data. | Tenant isolation needs to be a primitive, not a code review. |
| **Runaway cost from infinite loops** | A cheap model calling itself recursively burns API quota in minutes. | No hard caps on tool-call depth, no rate limit on per-agent spend. |

Aegis is the single-purpose answer to all five.
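
The transactional-outbox idea named in the "Audit gaps under load" row can be sketched as follows. This is a minimal illustration using SQLite from the standard library, not the Aegis implementation: the table layout and the function names `make_db`, `record_decision`, and `drain_outbox` are assumptions made for the example. The point it shows is that the audit row and its pending usage event commit or roll back together, and that a failed publish leaves the event in place for the next drain pass.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE audit (
            id INTEGER PRIMARY KEY,
            tenant_id TEXT NOT NULL,
            action TEXT NOT NULL
        );
        CREATE TABLE pending_usage_event (
            id INTEGER PRIMARY KEY,
            audit_id INTEGER NOT NULL REFERENCES audit(id),
            units INTEGER NOT NULL
        );
        """
    )
    return conn


def record_decision(conn, tenant_id, action, units):
    """Write the audit row and its usage event in one transaction."""
    with conn:  # commits on success, rolls back if either insert raises
        cur = conn.execute(
            "INSERT INTO audit (tenant_id, action) VALUES (?, ?)",
            (tenant_id, action),
        )
        audit_id = cur.lastrowid
        conn.execute(
            "INSERT INTO pending_usage_event (audit_id, units) VALUES (?, ?)",
            (audit_id, units),
        )
    return audit_id


def drain_outbox(conn, publish):
    """One drain pass: publish each pending event, then delete it."""
    rows = conn.execute(
        "SELECT id, audit_id, units FROM pending_usage_event ORDER BY id"
    ).fetchall()
    for event_id, audit_id, units in rows:
        publish(audit_id, units)  # if this raises, the row stays for the next pass
        with conn:
            conn.execute(
                "DELETE FROM pending_usage_event WHERE id = ?", (event_id,)
            )
    return len(rows)
```

A drain worker would call `drain_outbox` on a timer; because the event is deleted only after `publish` returns, delivery is at-least-once and the consumer should tolerate duplicates.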
## ⚙️ How Aegis closes the gap

Every `POST /execute` traverses an **11-stage middleware pipeline** in the gateway (stages 0 through 10: the ten stages the docs name, plus the final audit emit). Stages 0 to 7 all run before the tool does. Each stage produces a signal; the Decision service combines them into one canonical action: `allow`, `monitor`, `throttle`, `escalate`, or `kill`. The decision and its inputs are written as a signed audit row in the same request lifecycle.

The full per-stage walkthrough is at **[docs/architecture/10-stage-pipeline.md](./docs/architecture/10-stage-pipeline.md)**. The short version:

| Stage | Owns | What it stops |
|---|---|---|
| **0 — Kill switch** | `acp:kill_switch:{tenant_id}` Redis flag | Tenant-wide halt; propagates in < 5 s; fail-closed |
| **1 — Auth** | JWT (HS256) + JTI revocation + 1 ms replay window | Stolen tokens, replayed requests, missing tenant context |
| **2 — Rate limit** | Per-tenant token bucket + per-agent USD cap | Runaway loops, agents abusing other agents' quotas |
| **3 — Inference proxy** | 17-pattern injection classifier + SQL injection regex | Prompt injection through user-controlled tool outputs, stacked SQL, RCE patterns |
| **4 — Policy** | OPA / Rego policy bundle | Static rules: deny dangerous tools, allowed-domain checks |
| **5 — Behavior** | Isolation Forest baseline per agent | Behavioural drift, anomalous tool sequences, blast-radius patterns |
| **6 — Decision** | Signal combiner with per-tenant weight tuning | The actual allow / monitor / throttle / escalate / kill verdict |
| **7 — Autonomy** | Per-agent contracts (e.g. `k8s.delete.*` requires approval) | Destructive operations that should never be unilateral |
| **8 — Execution** | The actual tool runs (or doesn't) | — |
| **9 — Output filter** | Bearer token + API key + PII pattern redaction | Secrets in responses, PII echo |
| **10 — Audit** | Append-only signed row + Redis stream XADD | Every decision becomes durable, even 403s and 504s |
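
To make the Stage 6 "signal combiner with per-tenant weight tuning" concrete, here is a minimal sketch. It assumes each stage reports a risk signal in the range 0 to 1 and that the combiner takes a weighted average and maps it onto the five actions. The signal names, weights, and threshold values are hypothetical placeholders, not the ones Aegis uses.

```python
# Hypothetical upper bounds on the combined risk score for each action.
# Anything at or above the last bound maps to "kill".
THRESHOLDS = [
    (0.2, "allow"),
    (0.4, "monitor"),
    (0.6, "throttle"),
    (0.8, "escalate"),
]


def combine(signals, weights):
    """Return (action, score) from per-stage risk signals in [0, 1].

    `signals` maps a signal name to its risk value; `weights` maps the same
    names to non-negative per-tenant weights. Both dicts are illustrative.
    """
    total_weight = sum(weights[name] for name in signals)
    if total_weight <= 0:
        raise ValueError("at least one signal needs a positive weight")
    score = sum(signals[name] * weights[name] for name in signals) / total_weight
    for upper_bound, action in THRESHOLDS:
        if score < upper_bound:
            return action, score
    return "kill", score


if __name__ == "__main__":
    tenant_weights = {"injection": 3.0, "behavior": 1.0}  # assumed tuning
    print(combine({"injection": 0.9, "behavior": 0.1}, tenant_weights))
```

A weighted average is only one possible combining rule; a real combiner might instead let a single high-severity signal force `kill` regardless of the others.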
## 🏗️ Architecture at a glance

```
            ┌────────────────────────────────┐
            │         Browser / SDK          │
            └───────────┬────────────────────┘
                        │ HTTPS
                        ▼
            ┌────────────────────────────────┐
            │ ALB ─ acp-dev-alb-1541605899   │
            │   → dev.aegisagent.in          │
            └───────────┬────────────────────┘
                        │
                        ▼
┌────────────────────────────────────────────────────────┐
│  EC2 i-0f720c100f904291a  m6g.medium  ap-south-1a      │
│  22 containers via docker compose                      │
│                                                        │
│  acp_ui ─ nginx + React SPA + Voice Agent button       │
│      │                                                 │
│      ▼                                                 │
│  acp_gateway ─ 11-stage middleware + 22 sub-routers    │
│      │                                                 │
│      ├──► identity · registry · policy · decision      │
│      ├──► behavior · audit · usage · autonomy          │
│      ├──► identity_graph · flight_recorder · forensics │
│      ├──► api (incidents / webhooks / SIEM / reports)  │
│      └──► insight (HTTP, audit aggregates)             │
│                                                        │
│  pgbouncer ─► acp-postgres-dev (RDS Single-AZ)         │
│  acp_redis ◄── ElastiCache acp-redis-dev               │
│  acp_opa + bundle_server                               │
│  acp_{prometheus,grafana,jaeger,alertmanager}          │
└────────────────────────────────────────────────────────┘

Sibling EC2 (separate Terraform, separate cadence):
┌────────────────────────────────────────────────────────┐
│  Voice Guide  t3.medium  ap-south-1a                   │
│  LiveKit Agents worker (outbound WebRTC, no inbound)   │
│  Deepgram nova-3 → Groq llama-3.3-70b → Cartesia s-3   │
│  ChromaDB 1,794 chunks + BM25 + cross-encoder rerank   │
└────────────────────────────────────────────────────────┘
```

For the deep version with every Mermaid diagram, code reference, and table, see **[docs/architecture/system-overview.md](./docs/architecture/system-overview.md)**.

### Service inventory (13 application services)

| Service | Port (internal) | Database | Owns |
|---|---|---|---|
| **gateway** | `:8000` | none (Redis only) | Public API surface, 11-stage middleware, 22 sub-routers including `/voice/*` |
| **identity** | `:8000` | `acp_identity` | JWT issuance, user CRUD, SSO config, agent credentials |
| **registry** | `:8000` | `acp_registry` | Agents and tool-permission registry |
| **policy** | `:8000` | OPA-local | OPA bundle host, Rego policy CRUD and simulation |
| **decision** | `:8000` | Redis only | 5-signal risk synthesis, kill-switch, per-tenant signal weights |
| **behavior** | `:8000` | `acp_behavior` | Behavioral firewall, per-agent baselines, degraded-mode policy |
| **audit** | `:8000` | `acp_audit` | Signed audit chain, transparency roots, analyst notes, aggregations |
| **usage** | `:8000` | `acp_usage` | Per-tenant usage records, billing outbox consumer |
| **api** | `:8000` | `acp_api` | Incidents, API keys, webhooks, SIEM, scheduled reports |
| **identity_graph** | `:8000` | `acp_identity_graph` | Typed graph nodes/edges, trust score, drift, compromise sim |
| **flight_recorder** | `:8000` | `acp_flight_recorder` | Per-execution timelines, steps, snapshots, artifacts |
| **autonomy** | `:8000` | `acp_autonomy` | Multi-agent contracts, playbooks, human override events |
| **forensics** | `:8000` | reads `acp_audit` | Investigation listing, replay, blast-radius, PDF export |
| **insight** | `:8000` | reads `acp_audit` | Audit aggregations exposed to UI dashboards |

`gateway`, `forensics`, and `insight` don't own their own databases; they read from sibling services under read-only DSNs.
## 🎙️ Voice Agent — in the navbar

Click the **Voice Agent** button (top-right of every page on `dev.aegisagent.in`), allow your microphone, and you're talking to the Aegis Voice Guide. It's a docs-grounded cybersecurity advisor that answers questions about the kill switch, the audit chain, the deploy flow, the demo packs, and anything else in the GitBook.

**Live state:**

| Layer | Component |
|---|---|
| Frontend | `ui/src/components/VoiceAgent/{VoiceAgentButton,VoiceAgentPanel,AnimatedOrb}.jsx`; lazy-loads ~152 KB gzipped of LiveKit JS only on click |
| Gateway bridge | `services/gateway/routers/voice.py`; mints 5-min LiveKit JWTs with `RoomAgentDispatch(agent_name="aegis-guide")` |
| Worker | Sibling `t3.medium` EC2 running LiveKit Agents 1.5; outbound-only WebRTC to LiveKit Cloud Build tier; no inbound port |
| Pipeline | Deepgram nova-3 STT → Groq `llama-3.3-70b-versatile` (Gemini Flash-Lite as `FallbackAdapter`) → Cartesia `sonic-3` TTS |
| Turn detection | Silero VAD + LiveKit `MultilingualModel` (ONNX, q8 quantized) |
| RAG | Hybrid BM25 + dense (`all-MiniLM-L6-v2`) + cross-encoder rerank (`ms-marco-MiniLM-L-6-v2`); **1,794 chunks from 103 GitBook docs** |
| Persona | `voice-agent/agent/persona/Modelfile` (Ollama syntax); senior engineer voice, no nanny disclaimers |

**Engineered cost control: three independent timeout layers**

1. Gateway JWT TTL: 5 min (`TOKEN_TTL_SECONDS=300`)
2. Agent asyncio guard: 5 min (`AEGIS_SESSION_MAX_SECONDS`, env-configurable)
3. UI countdown: visible 5:00 → 0:00 in the panel; amber under 60s, red under 30s, auto-closes at 0

**Observed latency** (production session 2026-06-01): p50 end-to-end ~1.3 s in `ap-south-1`. RAG adds ~120 ms; Groq TTFT ~400–540 ms dominates.

📚 Full design rationale at **[docs/voice-guide/_index.md](./docs/voice-guide/_index.md)**: 4 pages covering UI integration, RAG + LLM strategy, and AWS deployment.
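
The second timeout layer, the agent-side asyncio guard, can be sketched with `asyncio.wait_for`. This is an illustration rather than the code in `agent.py`: the function name `run_with_cap` and the return strings are made up for the example, and the fallback of 300 seconds when `AEGIS_SESSION_MAX_SECONDS` is unset is an assumption chosen to match the 5-minute cap described above.

```python
import asyncio
import os

# Env-configurable cap; the 300 s fallback is an assumption for this sketch.
SESSION_MAX_SECONDS = float(os.environ.get("AEGIS_SESSION_MAX_SECONDS", "300"))


async def run_with_cap(session, max_seconds=SESSION_MAX_SECONDS):
    """Await `session`, cancelling it if it outlives `max_seconds`.

    `session` is any awaitable standing in for a voice session. On timeout,
    asyncio.wait_for cancels the awaitable before raising, so the session
    cannot keep consuming quota in the background.
    """
    try:
        await asyncio.wait_for(session, timeout=max_seconds)
        return "completed"
    except asyncio.TimeoutError:
        return "capped"


if __name__ == "__main__":
    # A 0.01 s "session" finishes; a 5 s one is cut off by a 0.05 s cap.
    print(asyncio.run(run_with_cap(asyncio.sleep(0.01), max_seconds=1.0)))
    print(asyncio.run(run_with_cap(asyncio.sleep(5), max_seconds=0.05)))
```

Keeping this guard independent of the JWT TTL and the UI countdown means that a failure in any one layer still leaves two others to end the session.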
## 🚦 The 10-stage request pipeline

For a single `POST /execute`, the gateway executes ten stages plus a final audit emit. Each stage has its own latency histogram, its own deny counter, and its own Grafana panel.

```mermaid
flowchart LR
    A[POST /execute] --> S0[Stage 0: Kill switch]
    S0 --> S1[Stage 1: Auth + JTI replay]
    S1 --> S2[Stage 2: Rate limit + per-agent cap]
    S2 --> S3[Stage 3: Inference proxy<br/>17-pattern injection scan]
    S3 --> S4[Stage 4: Policy / OPA]
    S4 --> S5[Stage 5: Behavior baseline]
    S5 --> S6[Stage 6: Decision combiner]
    S6 --> S7[Stage 7: Autonomy contract]
    S7 --> S8[Stage 8: Execute]
    S8 --> S9[Stage 9: Output filter]
    S9 --> S10[Stage 10: Signed audit row]
    S10 --> Resp[Response]
```

Walked through on a real `DROP TABLE` attempt at **[docs/architecture/flow-of-a-decision.md](./docs/architecture/flow-of-a-decision.md)**.
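
The control flow of such a pipeline, where an early stage can deny the request but the audit emit still runs, can be sketched as below. The `Context` fields, the stage signature, and `run_pipeline` are hypothetical names for this illustration; the real gateway middleware is more involved.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Context:
    """Per-request state passed from stage to stage (illustrative fields)."""
    request: dict
    denied_by: Optional[str] = None
    trace: List[str] = field(default_factory=list)


# A stage returns True to continue, False to deny the request.
Stage = Callable[[Context], bool]


def run_pipeline(
    request: dict,
    stages: List[Tuple[str, Stage]],
    audit: Callable[[Context], None],
) -> Context:
    """Run stages in order, stop at the first denial, always emit audit."""
    ctx = Context(request=request)
    try:
        for name, stage in stages:
            ctx.trace.append(name)
            if not stage(ctx):
                ctx.denied_by = name
                break  # later stages, including execution, never run
    finally:
        audit(ctx)  # runs for allowed, denied, and failed requests alike
    return ctx
```

Putting the audit call in a `finally` block is what lets denied and errored requests leave a record as well, in the spirit of the "even 403s and 504s" guarantee described earlier.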
## 🔐 Cryptographic trust chain

Aegis's claim isn't "we logged it." It's "we logged it AND any tampering is mathematically detectable."

| Layer | Mechanism | Verifiable by |
|---|---|---|
| **Per-row signature** | ed25519 signs `(event_hash, prev_hash, content_hash)` for every audit row | `acp verify-chain` (offline CLI) |
| **HMAC chain across 16 shards** | Each row's `prev_hash` links to the previous row in its shard; write contention is bounded by sharding | Chain verifier walks every shard independently |
| **Daily Merkle root** | All rows for a UTC day → Merkle tree → root sealed at midnight, signed with the day's key, chained to the previous day's root | `acp verify-root` validates a root against archived signature |
| **Per-receipt inclusion proof** | Every receipt carries the Merkle path from its row to the day's root | Customer can prove inclusion using only the row's receipt + the archived root |

Any tampering (change one row, delete a shard, swap a signing key) breaks one of these layers and the chain verifier flags it. Detailed crypto at **[docs/security/crypto-audit-chain.md](./docs/security/crypto-audit-chain.md)**.
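
Two of these layers, the `prev_hash` chain and the Merkle inclusion proof, can be illustrated with the standard library alone. The sketch below is a simplification under stated assumptions: it uses a single shard, HMAC-SHA256 for the chain, and a Merkle tree that duplicates the last node on odd levels. It omits the ed25519 per-row signatures entirely, and the row layout and function names are hypothetical rather than those of `acp verify-chain`.

```python
import hashlib
import hmac

GENESIS = b"\x00" * 32  # assumed prev_hash for the first row in a shard


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def append_row(chain: list, key: bytes, content: bytes) -> None:
    """Append a row whose event_hash binds its content to the previous row."""
    prev = chain[-1]["event_hash"] if chain else GENESIS
    event_hash = hmac.new(key, prev + _h(content), hashlib.sha256).digest()
    chain.append({"content": content, "prev_hash": prev, "event_hash": event_hash})


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every link; any edited, reordered, or dropped row fails."""
    prev = GENESIS
    for row in chain:
        expected = hmac.new(key, prev + _h(row["content"]), hashlib.sha256).digest()
        if row["prev_hash"] != prev or not hmac.compare_digest(row["event_hash"], expected):
            return False
        prev = row["event_hash"]
    return True


def _next_level(level: list) -> list:
    if len(level) % 2:
        level = level + [level[-1]]  # duplicate the last node on odd levels
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list) -> bytes:
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list, index: int) -> list:
    """Return [(sibling_hash, sibling_is_left), ...] from leaf up to root."""
    level = [_h(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        sibling = index ^ 1
        path.append((padded[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return path


def verify_inclusion(leaf: bytes, path: list, root: bytes) -> bool:
    node = _h(leaf)
    for sibling, sibling_is_left in path:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return hmac.compare_digest(node, root)
```

With this shape, a holder of one row plus its path can check inclusion against an archived root without access to any other row, which is the property the per-receipt proof row describes.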
## 📦 SDK integration

Two integration styles: direct HTTP, or the SDK.

### Direct (any language)

```bash
# 1. Mint a token (supply your own email and password in the JSON body)
TOKEN=$(curl -sS -X POST https://dev.aegisagent.in/auth/token \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: 00000000-0000-0000-0000-000000000001" \
  -d '{"email":"","password":""}' \
  | jq -r '.data.access_token')

# 2. Execute a tool (supply your agent ID in the X-Agent-ID header)
curl -sS -X POST https://dev.aegisagent.in/execute \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Tenant-ID: 00000000-0000-0000-0000-000000000001" \
  -H "X-Agent-ID: " \
  -H "Content-Type: application/json" \
  -d '{"tool_name":"db.query","payload":{"query":"SELECT 1"}}'
```

The response carries the full decision envelope: `action`, `risk_score`, `findings`, the upstream tool result, and a `receipt_url`.

### Python SDK

```python
import os

from acp_client import AegisClient

# Placeholder: read the agent ID from wherever you keep it.
# The ACP_AGENT_ID variable name is illustrative, not defined by the SDK.
AGENT_ID = os.environ["ACP_AGENT_ID"]

client = AegisClient(
    base_url="https://dev.aegisagent.in",
    tenant_id="00000000-0000-0000-0000-000000000001",
    token=os.environ["ACP_TOKEN"],
)

# /execute returns a Decision object; the receipt is fetched lazily
decision = client.execute(agent_id=AGENT_ID, tool="db.query", payload={"query": "SELECT 1"})
print(decision.action, decision.risk_score, decision.findings)

# Verify the chain offline; this never trusts the running gateway
client.verify_chain()  # raises if any row's signature or prev_hash is wrong
```

LangChain and Anthropic adapters live in `sdk/integrations/`; the full reference is at **[docs/api/reference.md](./docs/api/reference.md)**.
## ☁️ AWS deployment

The live deployment is one EC2, one RDS, one ElastiCache, and three S3 buckets, Terraform-managed under `infra/terraform/environments/dev/`. Deploys go via tarball → S3 → SSM; there are no GitHub Actions on the EC2.

| Resource | Value | Cost (approx /mo) |
|---|---|---|
| EC2 | `i-0f720c100f904291a` · `m6g.medium` (1 vCPU / 4 GB Graviton) · 50 GB gp3 | ~$28 always-on / ~$11 stop-overnight |
| RDS | `acp-postgres-dev` · `db.t4g.micro` · Single-AZ · 20 GB gp3 | ~$15 |
| ElastiCache | `acp-redis-dev` · single `cache.t3.micro` | ~$9 |
| ALB | `acp-dev-alb-1541605899` · HTTPS:443 + HTTP:80→HTTPS | ~$18 |
| S3 | `acp-dev-backups-628478` (versioned, 30-day) + statuspage + ALB logs | ~$3 |
| Secrets Manager | 5 secrets under `acp-dev/*` | ~$2 |
| **Aegis core total** | | **~$70/mo** always-on |
| Voice Agent EC2 | `t3.medium` (separate Terraform) | ~$22–45/mo (stop-between-demos / always-on) |

**Sizing notes:**

- `m6g.medium` was chosen because `t4g.small` (2 GB) OOMed during the initial healthcheck race and `t4g.medium`/`large` returned `InsufficientInstanceCapacity` in `ap-south-1a` on the resize attempt. `m6g.medium` was immediately available.
- RDS is Single-AZ for cost; production-grade would step to Multi-AZ.
- No NAT: the EC2 is in a public subnet with IGW egress.

The 2026-06-01 full rebuild surfaced **11 non-obvious deploy bugs**, including ALB deletion-protection traps, S3 versioned-bucket destroy, two-phase RDS password apply, macOS AppleDouble null bytes in tar, pgbouncer prod-hostname-in-bundle, `.local` TLD validation rejection, the GRAFANA_ADMIN_PASSWORD compose validation gate, and EC2 OOM on t4g.small. All are catalogued at **[docs/operations/deployment.md](./docs/operations/deployment.md#non-obvious-gotchas-catalogued-during-the-2026-06-01-dev-rebuild)**.
## 🚀 Quick start

### Option A — Try the live system

Open the live deployment and sign in. The Login page exposes a **"Try Live Demo"** button that signs you in as the read-only `VIEWER` demo account, so interviewers can click straight through. Demo passwords live in the onboarding email and in `voice-agent/agent/.env.local` for self-hosted dev; they are deliberately not in this README. The Voice Agent button is top-right of the navbar on every page.

### Option B — Run locally (full stack)

```bash
# Clone
git clone https://github.com/Abhi-mishra998/aegis.git
cd aegis

# Boot the 22-container stack (first boot ~3–5 min for image pulls)
cd infra && docker compose up --build -d
sleep 90

# Verify
docker compose ps --format "table {{.Name}}\t{{.Status}}"

# Set up a venv + seed the admin
cd .. && python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,server]"
python scripts/utils/seed_admin.py

# Optional: run all 3 demo packs in dry-run (~10s, no live services hit)
ACP_DRY_RUN=1 python demos/run_all_demos.py

# Open the UI
open http://localhost:5173   # macOS
# Login with the credentials you set in seed_admin.py
```

### Option C — Run the Voice Agent locally

```bash
cd voice-agent/agent

# Install deps (uv recommended; first run downloads ~600 MB of model weights)
uv sync
uv run --module livekit.agents download-files

# Set credentials in .env.local: LIVEKIT, DEEPGRAM, CARTESIA, GROQ keys.
# See voice-agent/agent/.env.example for the shape (placeholders only).
cp .env.example .env.local && $EDITOR .env.local

# Build the RAG index from the docs (idempotent; re-run after doc edits)
uv run src/ingest.py

# Console mode (terminal voice loop)
uv run src/agent.py console

# Or dev mode (connects to LiveKit Cloud + Playground)
uv run src/agent.py dev
```

📚 Full ops runbook at **[voice-agent/README.md](./voice-agent/README.md)**.
## 🎬 Demo packs

Three deterministic scenario packs that populate the UI from empty to representative in ~46 seconds total. Each runs against the live gateway (or in dry-run mode for offline use).

| Pack | Scenarios | Duration | Tools |
|---|---|---|---|
| **`db_copilot`** | 5: safe SELECT, bulk SELECT *, PII exfil, DROP TABLE, kill switch | ~11 s | `db.query`, `db.execute`, `execute_agent` |
| **`devops_agent`** | 9: read/scale/delete/RBAC, blast-radius, autonomy contract enforcement, kill-switch persistence | ~14 s | 23 K8s tools (`k8s.delete_pod`, `k8s.scale`, `k8s.rbac.grant`, …) |
| **`support_agent`** | 7: ticket lookup, single-customer PII, cross-tenant block, bulk PII export, email exfil, runaway burst, crypto receipt | ~21 s | 9 CRM/ticketing tools |

After one run: 3 agents, ~200 audit rows, ~50 decisions, ~50 flight timelines, 1 signed Merkle root, and a varied risk distribution. Full guide at **[docs/introduction/demo-packs.md](./docs/introduction/demo-packs.md)**.
## 📂 Repository layout

```
acp/
├── README.md                      This file
├── pyproject.toml                 Python project + [server,dev] extras
├── ruff.toml                      Lint config (excludes voice-agent/)
├── conftest.py                    pytest fixtures shared across services
│
├── docs/                          GitBook, single source of truth for design
│   ├── README.md                  Entry index
│   ├── SUMMARY.md                 GitBook nav
│   ├── introduction/              What is, why, quickstart, 60s-tour, demo packs
│   ├── architecture/              System overview, 10-stage pipeline, data model, deployment topology, UI primitives, flow of a decision
│   ├── services/                  One page per backend service
│   ├── ui/                        UI map + per-page docs
│   ├── voice-guide/               ⭐ NEW: overview, UI integration, RAG+LLM, deployment
│   ├── security/                  Crypto audit chain, JWT, RBAC, kill switch, OPA, threat scenarios, secret management
│   ├── operations/                Deployment, backup-restore, key rotation, soak, observability + 3 runbooks
│   └── api/                       Reference + auth + error codes + examples
│
├── services/                      13 FastAPI microservices
│   ├── gateway/                   11-stage middleware + 22 sub-routers (incl. routers/voice.py)
│   ├── identity/                  JWT, SSO, agent credentials
│   ├── registry/                  Agents and tool permissions
│   ├── policy/                    OPA bundle host
│   ├── decision/                  Signal combiner + kill switch
│   ├── behavior/                  Behavioral firewall + Isolation Forest baselines
│   ├── audit/                     Signed chain + transparency roots
│   ├── usage/                     Outbox consumer + billing routes
│   ├── api/                       Incidents, API keys, webhooks, SIEM
│   ├── identity_graph/            Typed graph + blast-radius
│   ├── flight_recorder/           Per-execution timelines
│   ├── autonomy/                  Multi-agent contracts + playbooks
│   ├── forensics/                 Investigation + replay + PDF export
│   └── insight/                   Audit aggregations
│
├── ui/                            React + Vite + Tailwind UI
│   ├── src/components/VoiceAgent/ ⭐ NEW: Button, Panel, AnimatedOrb
│   ├── src/components/Common/     Button, Modal, ConfirmDialog, ConnectorPrimitives, …
│   ├── src/components/Layout/     Sidebar, Topbar (with Voice Agent button), AgentScopePicker
│   ├── src/pages/                 38 pages, all wired to live APIs
│   └── tests/                     Playwright e2e
│
├── voice-agent/                   ⭐ NEW: sibling project, separate venv + Terraform
│   ├── README.md                  Run-locally + run-on-EC2
│   ├── diagram.md                 Exhaustive 657-line architecture reference
│   ├── agent/
│   │   ├── AGENT_V2.md            Build contract, the locked decisions
│   │   ├── src/{agent,rag,ingest,modelfile}.py
│   │   ├── persona/Modelfile      System prompt + generation params (Ollama syntax)
│   │   └── tests/test_rag.py      7-test pytest suite over the hybrid retriever
│   └── infrastructure/            Terraform: VPC, SG, IAM, secrets, EC2, EIP, monitoring, autostop
│
├── sdk/                           Python + JS SDKs
│   ├── acp_client/                Python: execute, verify-chain, verify-root
│   ├── acp-js/                    TypeScript SDK
│   ├── integrations/              LangChain + Anthropic + OpenAI adapters
│   └── intelligence/              Cross-agent intelligence engine (used by behavior service)
│
├── demos/                         3 scenario packs (db_copilot, devops_agent, support_agent) + autonomous agent
│
├── infra/
│   ├── docker-compose.yml         Base + docker-compose.aws.yml override
│   ├── terraform/                 environments/{dev} + modules/* (network, alb, rds, elasticache, secrets, …)
│   ├── grafana-dashboards/        platform-slo, trust-layers, tenant-activity, queues
│   ├── prometheus-rules.yml       Alertmanager routes
│   ├── helm/                      Self-host Helm chart
│   ├── statuspage/                Public status JSON publisher
│   └── cloudwatch/                CloudWatch agent config
│
├── scripts/
│   ├── ops/                       backup, restore_drill, export_tenant, redact_tenant_pii, reconcile, rollback, smoke_test
│   ├── maintenance/               backfill_audit_chain, rotate_transparency_key, prune_audit_outbox
│   └── utils/                     seed_admin, dev-only helpers
│
├── tests/                         Unit + integration suite (Playwright is under ui/tests/)
│
└── .github/workflows/             CI: test.yml (ruff + pytest + Playwright + helm-lint), terraform.yml (fmt + validate), scheduled-{backup,prune}, weekly-restore-drill
```
## 🛠️ Tech stack

| Layer | Choice | Why |
|---|---|---|
| Language | Python 3.11 (services) · TypeScript (UI) | Async story, FastAPI fit, mature OPA bindings |
| Web framework | FastAPI 0.115 | First-class async, OpenAPI generation, Pydantic v2 |
| DB | PostgreSQL 15 (RDS) | JSON-B + transactional outbox + audit chain |
| Cache | Redis 7 (ElastiCache) | Pub/Sub for SSE, kill-switch flag, rate-limit, decision cache |
| Policy engine | OPA / Rego | Decoupled policy from code; hot-reloadable |
| Crypto | ed25519 + SHA-256 + Merkle | Per-row signature, chained prev_hash, daily transparency root |
| UI | React 18 + Vite 5 + Tailwind | Fast HMR, small bundle, no shadcn dependency |
| Voice | LiveKit Agents 1.5 + Deepgram + Cartesia + Groq | Hosted LLM + STT + TTS; outbound WebRTC; free-tier cost envelope |
| Infrastructure | Terraform 1.5+ | Reproducible AWS provisioning, S3 state |
| CI | GitHub Actions | ruff + pytest + Playwright + helm-lint + `terraform fmt/validate`. **No auto-deploy**; deploys go via SSM. |
## 👥 Who this is for

### 👷 Engineers building agents

You want a clear seam between "the agent decides what to do" and "the action runs". Aegis is that seam. Drop the SDK in, declare your agent + permissions, point your tool calls at `/execute`, and you get auth + policy + behavior + decision + audit for free. Every call comes back with a signed receipt you can show to a customer or an auditor.

### 🏢 Security architects

You're being asked "what stops an AI agent going rogue?" Aegis is the runtime control plane that gives you a defensible answer: a tenant-isolation primitive, per-agent rate limits + cost caps, a kill switch that propagates in under 5 seconds, and a tamper-evident audit chain that survives DB compromise. Threat scenarios are documented at [docs/security/threat-scenarios.md](./docs/security/threat-scenarios.md).

### 🎯 Reviewers evaluating the project

Skim **[docs/README.md](./docs/README.md)** for the 1-hour reading order. Click the **Voice Agent** button on `dev.aegisagent.in` and ask it about the kill switch; the answer comes from this exact docs tree via hybrid RAG. The 5-minute video link at the top of this README walks the same scenarios live.
## 📬 Get in touch

- **GitHub Issues**: bug reports, feature requests
- **Live demo**: sign in, then click the Voice Agent button
- **Engineering write-up**: [Hashnode: I built a runtime firewall for AI agents](https://projectsphere.hashnode.dev/i-built-a-runtime-firewall-for-ai-agents)
- **5-minute video**: [Google Drive demo link](https://drive.google.com/file/d/1Eojid76NcrRLC1Gp302i113pNgrH1hso/view)

## 📜 License

Apache 2.0 for the code. CC-BY-4.0 for the documentation. See [LICENSE](./LICENSE).