# packet-gen
[**Live demo & management UI → packet-gen.ximg.app**](https://packet-gen.ximg.app)
Synthetic office network traffic generator. Runs simulated users ("personas") that produce real protocol traffic against a paired sink server, so that the resulting packet flows — when piped through an encrypted tunnel — look like a real office to anyone analyzing the tunnel.
The content of the traffic is throwaway. What matters is the packet-level characteristics: SNI values, packet sizes, inter-arrival times, session durations, fan-out across destinations.
```mermaid
flowchart LR
  subgraph synthnet["docker network: synthnet (172.28.0.0/24)"]
    direction LR
    content["content<br/>ollama<br/>(.5)"]
    corpus["corpus-builder<br/>one-shot"]
    originator["originator<br/>FastAPI + asyncio<br/>per-persona tasks<br/>(.20)"]
    terminator["terminator<br/>protocol sinks<br/>SNI cert minting<br/>fake DNS<br/>(.10)"]
    corpora[("shared corpora<br/>paths + pages<br/>+ emails")]
    certs[("shared CA<br/>+ minted certs")]
    content -- generate --> corpus
    corpus -- write JSONL --> corpora
    originator -- sample --> corpora
    terminator -- sample --> corpora
    terminator -- write --> certs
    originator -- trust --> certs
    originator == "DNS / TLS / SIP-RTP / NTP / SMTP / IMAP" ==> terminator
  end
  client["browser / analyst"]
  nginx["nginx<br/>packet-gen.ximg.app"]
  static["static UI<br/>(index.html)"]
  client --> nginx
  nginx -- "/" --> static
  nginx -- "/api/*" --> originator
  classDef svc fill:#0e1117,stroke:#06b6d4,color:#cbd5e1;
  classDef vol fill:#1a1c25,stroke:#64748b,color:#94a3b8;
  class content,corpus,originator,terminator,nginx,static svc;
  class corpora,certs vol;
```

## Protocols

| Activity | Wire | TLS? | What it produces |
| --- | --- | --- | --- |
| `https_browse` | HTTPS / TCP 443 | yes (SNI cert minted per host) | multi-page sessions, real paths + HTML bodies sampled from corpus |
| `smtp_send` | SMTP / TCP 587 | STARTTLS | EHLO, MAIL FROM, RCPT TO, DATA with corpus-sourced subject + body |
| `imap_poll` | IMAP / TCP 993 | implicit TLS | LOGIN, SELECT INBOX, NOOP, SEARCH UNSEEN, LOGOUT |
| `voip_call` | SIP / UDP 5060 + RTP / UDP ephemeral | no | INVITE / 200 / ACK / ~50 pps RTP for call duration / BYE; terminator echoes RTP back |
| `ntp_query` | NTP / UDP 123 | no | 48-byte client request, valid 48-byte server response |
| `https_beacon` | HTTPS / TCP 443 | yes | small POST/GET to a fixed URL; models EDR / telemetry / heartbeat traffic |

## Architecture

```
              ┌──────────────────────┐
              │  content (ollama)    │
              │  llama3.2:3b         │
              └─────────┬────────────┘
                        │  (corpus build, one-shot)
                        ▼
              ┌──────────────────────┐
              │  corpus-builder      │
              │  paths/ + pages/     │ ──► shared corpora volume
              └──────────────────────┘                    │
                                                          │ (read at runtime)
┌────────────────────────┐      DNS + TLS       ┌─────────▼──────────────┐
│ originator             │ ───────────────────► │ terminator             │
│ FastAPI control plane  │                      │ on-demand cert minting │
│ asyncio personas       │ ◄─────────────────── │ protocol sinks         │
│ httpx with corpus      │   real HTML bodies   │ fake DNS resolver      │
└────────────────────────┘                      └────────────────────────┘
       172.28.0.20                                     172.28.0.10
```

- **Originator**: FastAPI app. Holds office state, spawns one asyncio task per persona, each task runs the persona's activities on Poisson schedules. Trusts the internal CA so TLS works.
- **Terminator**: TLS sink that mints per-SNI certificates on the fly from an internal CA (like mitmproxy). Also runs a wildcard DNS resolver that returns its own IP for every A query, so any hostname a persona asks for resolves to the sink.
- **Content service**: a local ollama instance running a small LLM. Generates a per-persona, per-domain corpus of URL paths and HTML pages so that traffic has realistic *size and structure distribution* (real pages are bimodal in length, not uniform-random).
- **Corpus builder**: a one-shot job that reads `personas.yaml`, asks the content service for paths + pages per domain, writes JSONL to a shared volume. Originator and terminator sample from it at runtime, so the LLM is never on the hot path.
- **Shared CA volume**: terminator creates `ca.crt` / `ca.key` on first run, originator reads `ca.crt` to validate the chain.

## Run

```bash
docker compose up --build
```

**First-run timing**: the content service has to pull `llama3.2:3b` (~2 GB) and the corpus builder then generates paths + pages for every domain. Expect 15–40 minutes on CPU; minutes on GPU. Subsequent runs reuse the cached model and skip corpus regeneration unless you set `REBUILD_CORPUS=1` on the `corpus-builder` service.

To skip the corpus entirely (faster startup, lower realism: traffic falls back to random bytes), comment out the `corpus-builder` service and the `condition: service_completed_successfully` lines that depend on it.

Then:

```bash
# inspect
curl localhost:8000/personas
curl localhost:8000/offices

# start the default 'hq' office
curl -X POST localhost:8000/offices/hq/start

# watch events stream in
watch -n 1 'curl -s localhost:8000/events?n=20 | jq .'

# stop
curl -X POST localhost:8000/offices/hq/stop
```

## API

| Method | Path | Purpose |
| --- | --- | --- |
| GET | `/healthz` | liveness |
| GET | `/personas` | list loaded persona templates |
| GET | `/offices` | list offices |
| POST | `/offices` | create/update an office |
| GET | `/offices/{id}` | office detail + last 50 events |
| POST | `/offices/{id}/start` | spawn the persona runners |
| POST | `/offices/{id}/stop` | tear them down |
| GET | `/events?n=50` | global recent event log |

## Personas

Defined in `originator/app/config/personas.yaml`. Each persona is a list of activities; each activity has a protocol, a Poisson schedule, and protocol-specific parameters (domain pool, page count, payload size, etc.).

Offices are groups of personas, defined in `originator/app/config/offices.yaml` or via `POST /offices`.

## Realism notes

- **Organic-but-fast start**: when an office starts, each activity samples its first-fire delay from the steady-state distribution rather than waiting a full interval. Within ~one mean-interval the office looks like it's been running all day.
- **SNI diversity**: every persona has its own domain pool. The terminator mints a valid cert (signed by the internal CA) matching whatever SNI shows up, so wire traffic includes real cert-chain bytes per domain.
- **TLS fingerprint caveat**: all clients currently share Python's TLS stack, so the ClientHello fingerprint is identical across personas. Per-persona fingerprint shaping is a future task.
- **No tunnel yet**: scaffold runs over the Docker bridge. WireGuard sidecars (the actual subject of analysis) are a future task.
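
The organic-but-fast start can be illustrated with a short sketch. This is not the code in `schedule.py`; it assumes the gaps between fires are exponential (a Poisson process), in which case the steady-state wait until the next event has the same exponential distribution, so the first-fire delay is simply one more draw. The names `next_gap`, `first_fire_delay`, and `run_activity`, and their parameters, are hypothetical.

```python
import asyncio
import random
from typing import Awaitable, Callable


def next_gap(mean_s: float, rng: random.Random) -> float:
    """Draw one exponential inter-fire gap with the given mean, in seconds."""
    return rng.expovariate(1.0 / mean_s)


def first_fire_delay(mean_s: float, rng: random.Random) -> float:
    """Delay before the first fire after an office starts.

    Assumption: exponential gaps are memoryless, so the residual wait seen
    at a random observation time is again exponential with the same mean.
    A non-exponential schedule would need its own residual distribution.
    """
    return next_gap(mean_s, rng)


async def run_activity(
    mean_s: float,
    fire: Callable[[], Awaitable[None]],
    rng: random.Random,
    max_fires: int,
) -> None:
    """Call `fire()` on a Poisson schedule, stopping after `max_fires` calls."""
    await asyncio.sleep(first_fire_delay(mean_s, rng))
    for i in range(max_fires):
        await fire()
        if i + 1 < max_fires:
            await asyncio.sleep(next_gap(mean_s, rng))


if __name__ == "__main__":
    async def demo_fire() -> None:
        print("fired")

    # Mean gap of 50 ms so the demo finishes quickly.
    asyncio.run(run_activity(0.05, demo_fire, random.Random(42), max_fires=5))
```

Passing a seeded `random.Random` per activity keeps runs reproducible, which helps when comparing captures across configurations.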
## Layout

```
originator/
  app/
    main.py         FastAPI app + lifespan
    engine.py       Office/Persona runtime
    schedule.py     Poisson + initial-delay logic
    models.py       Pydantic data model
    corpus.py       Lookup paths from shared corpus
    activities/     One module per protocol handler
    config/         personas.yaml, offices.yaml
terminator/
  app/
    main.py         Supervisor
    ca.py           Internal CA load/create
    cert_mint.py    Per-SNI cert generation
    corpus.py       Lookup pages from shared corpus
    sinks/          https, dns, ... one per protocol
corpus_builder/
  build.py          One-shot job: persona → ollama → paths/pages JSONL
shared/certs/       CA + minted certs (gitignored)
                    corpora are in a named docker volume, not on disk
```

## Corpus tuning

The corpus builder is configured via env vars on the `corpus-builder` service in `docker-compose.yml`:

| Var | Default | Purpose |
| --- | --- | --- |
| `CORPUS_MODEL` | `llama3.2:3b` | ollama tag. `qwen2.5:1.5b` is much faster, slightly lower quality. |
| `PATHS_PER_DOMAIN` | 40 | URL paths generated per domain (one batched LLM call). |
| `PAGES_PER_DOMAIN` | 5 | HTML page bodies per domain (one call per page). |
| `CONCURRENCY` | 2 | Parallel page generations (ollama supports up to `OLLAMA_NUM_PARALLEL`). |
| `REBUILD_CORPUS` | unset | Set to `1` to force regeneration even if corpus already exists. |
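
As a rough illustration of how these variables and their defaults fit together, the sketch below parses them into a typed config object. How `build.py` actually reads its environment is not shown in this README; the `CorpusConfig` class, its field names, and `from_env` are hypothetical, and only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CorpusConfig:
    """Hypothetical typed view of the corpus-builder env vars."""

    model: str = "llama3.2:3b"
    paths_per_domain: int = 40
    pages_per_domain: int = 5
    concurrency: int = 2
    rebuild: bool = False

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "CorpusConfig":
        """Build a config from a mapping, using the table defaults for unset keys."""
        return cls(
            model=env.get("CORPUS_MODEL", cls.model),
            paths_per_domain=int(env.get("PATHS_PER_DOMAIN", cls.paths_per_domain)),
            pages_per_domain=int(env.get("PAGES_PER_DOMAIN", cls.pages_per_domain)),
            concurrency=int(env.get("CONCURRENCY", cls.concurrency)),
            # Assumption: only the literal value "1" forces a rebuild.
            rebuild=env.get("REBUILD_CORPUS") == "1",
        )


if __name__ == "__main__":
    print(CorpusConfig.from_env())
```

Taking the environment as a plain mapping makes the parsing easy to exercise without touching the real process environment.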