JustinwkWan/NetSentinel

# NetSentinel

An AI agent for network security that ingests PCAP files (or live traffic), detects anomalous network flows, and autonomously investigates each flagged flow using a ReAct-style agent loop backed by CVE and MITRE ATT&CK threat intelligence. The output is a set of structured threat reports, viewable in a web dashboard or from the CLI.

## Architecture

```
PCAP file / live capture
        |
        v
  [ Ingestion ]  --->  FlowRecords (aggregated network flows)
        |
        v
  [ Detection ]  --->  FlaggedFlows (anomalous flows with scores)
        |
        v
  [ Agent (LangGraph ReAct loop) ]
        |--- cve_lookup tool ---------> [ RAG Store (ChromaDB) ]
        |--- attack_technique_lookup -> [ RAG Store (ChromaDB) ]
        |
        v
  [ Structured Threat Reports ]
        |
        +--> CLI output
        +--> FastAPI backend ---> React dashboard
```

**4 core layers:**

- **Ingestion** (`netsentinel/ingestion/`) - reads packets from a PCAP file or live capture and aggregates them into FlowRecords.
- **Detection** (`netsentinel/detection/`) - scores flows and emits FlaggedFlows for the anomalous ones.
- **RAG** (`netsentinel/rag/`) - ChromaDB store of CVE and ATT&CK threat intelligence queried by the agent's tools.
- **Agent** (`netsentinel/agent/`) - LangGraph ReAct loop that investigates each flagged flow and produces a structured threat report.

Plus two peer layers built on top:

- **Evaluation** (`eval/`) - LLM-as-judge harness that scores agent reports against a labeled dataset with bias mitigations (rubric scoring, anti-verbosity, swappable judge model).
- **Web app** (`api/` + `web/`) - FastAPI backend and React dashboard for running the pipeline, browsing local pcaps, and driving live capture from the browser.
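As a rough illustration of the Ingestion -> Detection hand-off in the diagram above, here is a minimal sketch. The field names on `FlowRecord` and `FlaggedFlow`, the `detect` method, the `PortRuleDetector` class, and the `run_detection` helper are all assumptions made for this example; the real definitions live in `netsentinel/ingestion/flows.py` and `netsentinel/detection/` and may look different.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Protocol


@dataclass(frozen=True)
class FlowRecord:
    """One aggregated network flow (hypothetical fields)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int
    bytes_total: int


@dataclass(frozen=True)
class FlaggedFlow:
    """A flow the detector considers anomalous, with a score."""
    flow: FlowRecord
    score: float
    reason: str


class Detector(Protocol):
    """Anything with a matching detect() can be plugged into the pipeline."""
    def detect(self, flows: Iterable[FlowRecord]) -> List[FlaggedFlow]: ...


class PortRuleDetector:
    """Toy rule-based detector: flags flows to watch-listed ports.

    The default port is an arbitrary example, not the project's rule set.
    """

    def __init__(self, suspicious_ports: FrozenSet[int] = frozenset({4444})):
        self.suspicious_ports = suspicious_ports

    def detect(self, flows: Iterable[FlowRecord]) -> List[FlaggedFlow]:
        return [
            FlaggedFlow(f, 1.0, f"destination port {f.dst_port} is on the watch list")
            for f in flows
            if f.dst_port in self.suspicious_ports
        ]


def run_detection(flows: Iterable[FlowRecord], detector: Detector) -> List[FlaggedFlow]:
    """Run a detector and return flagged flows, highest score first."""
    return sorted(detector.detect(flows), key=lambda ff: ff.score, reverse=True)


flows = [
    FlowRecord("10.0.0.5", "93.184.216.34", 443, 12, 4096),
    FlowRecord("10.0.0.5", "203.0.113.7", 4444, 40, 9000),
]
flagged = run_detection(flows, PortRuleDetector())
```

Typing the detector as a `Protocol` is one way to let a rule-based stub and a learned detector be swapped without changing the rest of the pipeline, which matches the `--detector` switch described under Usage.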
## Tech Stack

- **Language:** Python (core) + TypeScript (frontend)
- **Packet handling:** scapy
- **Anomaly detection:** PyTorch (LSTM autoencoder)
- **Vector store:** ChromaDB
- **Agent orchestration:** LangGraph (ReAct loop)
- **LLM:** Anthropic Claude API (DeepSeek supported via the Anthropic-compatible endpoint)
- **Backend API:** FastAPI + Uvicorn
- **Frontend:** React + Vite + Tailwind CSS
- **Live capture:** dumpcap (Wireshark) ring buffer
- **CLI entry point:** `netsentinel/cli.py`

## Setup

### Prerequisites

- Python 3.10+
- An [Anthropic API key](https://console.anthropic.com/) (or a DeepSeek key; see Configuration)
- Node.js 18+ and npm (only for the web dashboard)
- Wireshark / `dumpcap` (only for live capture)

### Installation

```bash
# Clone the repository
git clone https://github.com/JustinwkWan/NetSentinel.git
cd NetSentinel

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### Configuration

Create a `.env` file in the project root with your API key:

```bash
cp .env.example .env
# Edit .env and add your API key
```

```
ANTHROPIC_API_KEY=your-api-key-here
```

**Using DeepSeek (or another Anthropic-compatible provider):** set the model, base URL, and key instead. NetSentinel talks to any Anthropic-compatible endpoint via `langchain-anthropic`.

```
LLM_MODEL=deepseek-v4-pro
LLM_BASE_URL=https://api.deepseek.com/anthropic
DEEPSEEK_API_KEY=your-deepseek-key
```

All tunable settings (model names, retrieval `k`, agent iteration cap, detector thresholds, paths) live in `config.py`, so there are no magic numbers scattered through the code.

### Build the RAG Store

Before running the pipeline, build the ChromaDB vector store with threat intelligence data:

```bash
python scripts/build_rag_store.py
```

This creates a ChromaDB collection at `data/chroma/` with CVE entries (and ATT&CK techniques in Phase 2).
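Returning to the provider settings from the Configuration section, the sketch below shows one way such environment variables could be resolved into a single settings object. It is not taken from `config.py`: the `LLMSettings` class, the `resolve_llm_settings` function, the `default_model` parameter, and the rule "use `DEEPSEEK_API_KEY` whenever `LLM_BASE_URL` is set" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Resolved provider settings (hypothetical container)."""
    model: str
    base_url: Optional[str]  # None means the default Anthropic endpoint
    api_key: str


def resolve_llm_settings(env: Mapping[str, str], default_model: str) -> LLMSettings:
    """Pick model, endpoint, and key from environment-style variables.

    Assumption: a custom base URL implies the DeepSeek key; otherwise the
    Anthropic key is used. The real precedence in config.py may differ.
    """
    base_url = env.get("LLM_BASE_URL") or None
    key_var = "DEEPSEEK_API_KEY" if base_url else "ANTHROPIC_API_KEY"
    api_key = env.get(key_var, "")
    if not api_key:
        raise ValueError(f"{key_var} is not set")
    return LLMSettings(
        model=env.get("LLM_MODEL", default_model),
        base_url=base_url,
        api_key=api_key,
    )


# Usage with explicit mappings; in practice os.environ could be passed instead.
anthropic_settings = resolve_llm_settings(
    {"ANTHROPIC_API_KEY": "your-api-key-here"},
    default_model="placeholder-model",  # placeholder, not a real model name
)
deepseek_settings = resolve_llm_settings(
    {
        "LLM_MODEL": "deepseek-v4-pro",
        "LLM_BASE_URL": "https://api.deepseek.com/anthropic",
        "DEEPSEEK_API_KEY": "your-deepseek-key",
    },
    default_model="placeholder-model",
)
```

Taking the environment as a plain mapping keeps the function easy to test without touching the real process environment.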
## Usage

### Run the pipeline on a PCAP file

```bash
python -m netsentinel.cli <path-to-pcap>

# Use the LSTM detector instead of the default stub
python -m netsentinel.cli <path-to-pcap> --detector lstm
```

Example with the included sample:

```bash
python -m netsentinel.cli data/pcaps/sample_suspicious.pcap
python -m netsentinel.cli data/pcaps/sample_suspicious.pcap --detector lstm
```

The pipeline will:

1. Parse the PCAP and assemble network flows
2. Flag suspicious flows (`stub`: rule-based, `lstm`: reconstruction error)
3. Investigate each flagged flow using the AI agent
4. Print structured threat reports

### Train the LSTM detector

```bash
# Generate normal traffic for training
python scripts/generate_training_pcap.py

# Train the model
python scripts/train_lstm.py
```

### Run the demo

```bash
python scripts/run_demo.py
```

### Web app (dashboard)

The web app provides a browser UI for running the pipeline and viewing reports without parsing terminal output. It has two parts: a FastAPI backend and a React/Vite frontend.

```bash
# Terminal 1: backend (http://127.0.0.1:8765)
source venv/bin/activate
uvicorn api.main:app --host 127.0.0.1 --port 8765 --reload

# Terminal 2: frontend (http://localhost:5173)
cd web
npm install   # first time only
npm run dev
```

Open http://localhost:5173. From the dashboard you can:

- **Run from `data/pcaps/`** - pick a bundled pcap from the dropdown, or upload one.
- **Browse a local folder** - point the backend at any folder on your machine and run a pcap in place (no copy).
- **Live capture** - start/stop a rolling capture; each completed window is auto-analyzed.
- **View reports** - severity, threat type, summary, CVEs, ATT&CK techniques, and remediation per flagged flow, plus a run history.

The frontend proxies `/api/*` to the backend (see `web/vite.config.ts`).

### Live capture

Capture live traffic into a rolling pcapng ring buffer and analyze each window as it completes. Two ways to drive it:

**From the dashboard** - use the "Live capture" panel (Start/Stop, interface, detector).
The backend manages `dumpcap` and auto-runs the pipeline on each closed rotation, cleaning up the capture files on stop.

**From the CLI** - two standalone scripts:

```bash
# Terminal A: capture 60s windows, 10-file rolling buffer (~10 min)
sudo ./scripts/live_capture.sh

# Terminal B: run the pipeline on each completed window
DETECTOR=lstm ./scripts/watch_and_run.sh
```

### Evaluation harness

Score the agent's reports against a labeled dataset using an LLM-as-judge with bias mitigations (rubric-based scoring, anti-verbosity instruction, swappable judge model via `EVAL_JUDGE_MODEL`).

```bash
# Run the full eval set
python -m eval.harness --save

# Run specific cases
python -m eval.harness --cases reverse_shell_4444 ssh_brute_force
```

Results print a per-case breakdown plus aggregate scores, and `--save` writes raw results to `data/eval/eval_results.json`.

### Run tests

```bash
pytest tests/
```

## Project Structure

```
NetSentinel/
├── config.py                      # Central config (model, k values, thresholds)
├── data/
│   ├── pcaps/                     # Sample + live-capture PCAP files
│   ├── threat_intel/              # Raw CVE/ATT&CK data
│   ├── chroma/                    # ChromaDB store (gitignored, built by script)
│   └── eval/                      # Eval results (gitignored)
├── netsentinel/
│   ├── cli.py                     # CLI entry point
│   ├── ingestion/
│   │   ├── sources.py             # PacketSource interface, PcapFileSource
│   │   └── flows.py               # FlowRecord dataclass
│   ├── detection/
│   │   ├── base.py                # Detector Protocol, FlaggedFlow
│   │   ├── stub.py                # StubDetector (rule-based)
│   │   └── lstm.py                # LstmDetector (LSTM autoencoder)
│   ├── rag/
│   │   ├── store.py               # ChromaDB query interface
│   │   ├── chunking.py            # Natural-boundary chunkers
│   │   └── build_store.py         # Download + chunk + embed CVE/ATT&CK
│   └── agent/
│       ├── graph.py               # LangGraph ReAct graph
│       ├── state.py               # Agent state definition
│       ├── tools.py               # cve_lookup, attack_technique_lookup
│       ├── prompts.py             # System and investigation prompts
│       └── report.py              # ThreatReport dataclass
├── eval/                          # Evaluation harness (LLM-as-judge)
│   ├── dataset.py                 # Labeled eval cases
│   ├── judge.py                   # Rubric-based judge with bias mitigations
│   ├── harness.py                 # Runs agent + judge over the dataset
│   └── report.py                  # Aggregate scoring + summary
├── api/                           # FastAPI backend
│   ├── main.py                    # Routes: pcaps, runs, capture, local browse
│   ├── jobs.py                    # Background job store + pipeline orchestrator
│   ├── capture.py                 # Live-capture manager (dumpcap ring buffer)
│   └── models.py                  # Pydantic schemas
├── web/                           # React + Vite + Tailwind dashboard
│   └── src/
│       ├── api.ts                 # Typed API client
│       └── components/            # PcapSelector, RunControls, LiveCapturePanel, …
├── scripts/
│   ├── build_rag_store.py         # Build the RAG store
│   ├── generate_training_pcap.py  # Generate normal traffic for LSTM training
│   ├── train_lstm.py              # Train the LSTM detector
│   ├── run_demo.py                # Run the demo
│   ├── live_capture.sh            # dumpcap rolling-window capture
│   └── watch_and_run.sh           # Auto-run pipeline on each captured window
└── tests/
```

## Build Phases

- [x] **Phase 1** - Skeleton end-to-end agent (PCAP -> StubDetector -> minimal RAG -> LangGraph loop -> report)
- [x] **Phase 2** - Full RAG layer (real CVE/ATT&CK data, natural chunking, second tool)
- [x] **Phase 3** - LSTM autoencoder detector (trained on normal traffic, flags anomalous flows by reconstruction error)
- [x] **Phase 4** - Evaluation harness with LLM-as-judge (rubric scoring + bias mitigations)
- [x] **Phase 5** - Web dashboard (FastAPI + React), live capture, and local-folder browsing

## Design Decisions

See [Design.md](Design.md) for the full technical design document.