NIHAR-SARKAR/CHAI

GitHub: NIHAR-SARKAR/CHAI

Stars: 9 | Forks: 7

# CHAI Cyber Host Artificial Intelligence (C.H.A.I)

## Architecture Overview

```
External Client (CHAI / any MCP tool)
        │  MCP stdio/SSE
        ▼
┌─────────────────────────────────────────┐
│           MCP Security Server           │
│                                         │
│  run_autonomous_scan()                  │
│       │                                 │
│  ┌────▼────────────────────┐            │
│  │  execution_loop.py      │            │
│  │  (local, no LLM here)   │            │
│  │  tool1 → tool2 → tool3  │            │
│  └────┬────────────────────┘            │
│       │ at phase boundaries only        │
│  ┌────▼────────────────────┐            │
│  │  ai_planner.py          │            │
│  │  plan / evaluate /      │◄───────────┼── llm/provider_factory.py
│  │  summarize              │            │   (Azure / OpenAI / Claude /
│  └─────────────────────────┘            │    Bedrock / OpenRouter / HF)
│                                         │
│  All tools, safety, sandbox unchanged   │
└─────────────────────────────────────────┘
```

**Design Philosophy: THIN BRAIN, THICK LOOP**

- The internal LLM fires only at **decision boundaries**, not per-step
- A local `execution_loop` handles tool chaining deterministically between LLM calls
- Keeps token usage low (~6-10 calls per full pentest) and latency acceptable on a Pi 4

## Features

### AI Decision Engine

- **plan()**: Decides what to test next based on findings
- **evaluate()**: Decides whether to continue or stop
- **summarize_for_report()**: Generates executive summary and remediation priorities

### Security & Sandboxing

- **firejail** profiles with rlimit restrictions
- **Linux cgroups** for resource limiting
- **Restricted user** (`pentester`) execution
- **Tiered safety policy** (Tier 1/2/3)
- **Immutable audit logging** of all commands and AI decisions

### Plugin System

- Auto-discovers plugins from `plugins/bundled/` and `plugins/external/`
- Drop-in plugin architecture: no core changes needed
- Bundled plugins: Feroxbuster, Metasploit, Burp Suite API

### Database

- **SQLite ONLY**: no Neo4j, Redis, or Postgres required
- WAL mode for better concurrency
- Knowledge graph with 50+ attack techniques and recursive CTE chain queries

## Project Structure

```
CHAI/
├── main.py                     # FastMCP server entry point
├── config.py                   # Configuration loader
├── config.yaml                 # Main configuration (no secrets)
├── .security.yml               # API keys (git-ignored)
├── requirements.txt            # Python dependencies
├── app_context.py              # Application context singleton
│
├── llm/                        # Multi-provider LLM adapter layer
│   ├── base_provider.py        # Abstract base class
│   ├── provider_factory.py     # Provider selection with fallback
│   ├── prompt_templates.py     # All LLM prompts (versioned)
│   └── providers/
│       ├── azure_openai.py     # Azure OpenAI
│       ├── openai_direct.py    # Direct OpenAI
│       ├── anthropic_claude.py # Claude
│       ├── amazon_bedrock.py   # AWS Bedrock
│       ├── openrouter.py       # OpenRouter
│       └── huggingface.py      # HuggingFace
│
├── core/                       # Core engine
│   ├── session_manager.py      # SQLite session CRUD + state machine
│   ├── safety_policy.py        # Command validation, tier system
│   ├── process_controller.py   # firejail/cgroups/chroot wrapper
│   ├── audit_logger.py         # Immutable audit logging
│   ├── ai_planner.py           # LLM decision engine (3 call types)
│   └── execution_loop.py       # Local chain runner
│
├── kb/                         # Knowledge Base
│   ├── graph_db.py             # Attack graph with recursive CTE
│   ├── playbook_loader.py      # Playbook section extraction
│   └── vector_search.py        # Vector/BM25 search
│
├── tools/                      # Security testing tools
│   ├── base.py                 # Base tool class
│   ├── recon.py                # Reconnaissance
│   ├── scan.py                 # Vulnerability scanning
│   ├── injection.py            # Injection testing
│   ├── auth.py                 # Authentication testing
│   ├── network.py              # Network testing
│   ├── poc.py                  # PoC generation
│   ├── exec.py                 # Custom command execution
│   ├── analyze.py              # Findings analysis
│   ├── report.py               # Report generation
│   └── autonomous.py           # Autonomous scan orchestrator
│
├── plugins/                    # Plugin system
│   ├── plugin_base.py          # Base class
│   ├── plugin_loader.py        # Auto-discovery loader
│   └── bundled/
│       ├── feroxbuster_plugin.py   # Directory bruteforcer
│       ├── metasploit_plugin.py    # Metasploit Framework
│       └── burp_api_plugin.py      # Burp Suite Pro API
│
├── models/                     # Data models
│   ├── session.py              # Session and Finding models
│   └── schemas.py              # Pydantic schemas
│
├── utils/                      # Utilities
│   ├── command_parser.py       # Command parsing
│   ├── output_parser.py        # Tool output parsing
│   └── cvss_calculator.py      # CVSS v3.1 calculator
│
└── data/                       # Database schemas & profiles
    ├── init_sessions.sql       # Session DB schema + AI decisions table
    ├── init_graph.sql          # Knowledge graph (50+ nodes)
    └── firejail/
        └── pentest.profile     # Firejail sandbox profile
```

## Installation

### Prerequisites

- Any Linux machine, or a Raspberry Pi 4/5 running Kali Linux ARM64 (bare metal, NO Docker)
- Python 3.11+
- firejail installed
- Kali Linux pentest tools (nmap, sqlmap, nuclei, ffuf, etc.)

### Setup

```bash
# Clone the repository
git clone https://github.com/NIHAR-SARKAR/CHAI.git
cd CHAI

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate        # Linux
# .venv\Scripts\activate         # Windows

# Install dependencies
pip install -r requirements.txt

# Configure secrets
cp .security.yml.example .security.yml
chmod 600 .security.yml
# Edit .security.yml with your API keys

# Create required directories (Linux)
sudo mkdir -p /opt/sessions /opt/logs /opt/kb /opt/mcp-security-server/plugins/external
sudo chown -R $(whoami) /opt/sessions /opt/logs /opt/kb
```

On Windows, create the same directories from PowerShell:

```powershell
New-Item -ItemType Directory -Force -Path "C:\opt\sessions"
New-Item -ItemType Directory -Force -Path "C:\opt\logs"
New-Item -ItemType Directory -Force -Path "C:\opt\kb"
New-Item -ItemType Directory -Force -Path "C:\opt\mcp-security-server\plugins\external"
# Grant the current user full permissions, recursively
icacls "C:\opt" /grant "${env:USERNAME}:(OI)(CI)F" /T
```

Then install the sandbox profile and start the server:

```bash
# Install firejail profile
sudo cp data/firejail/pentest.profile /etc/firejail/

# Run the server
python main.py --transport streamable-http
```
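
Before starting the server, it can help to confirm that the prerequisites listed above are actually present. The following is a minimal sketch, not part of the CHAI codebase: the function names and the exact list of binaries are assumptions drawn from the Prerequisites list, and the tool list should be adjusted to whatever your scans use.

```python
import shutil
import sys

# Assumed list, based on the tools named under Prerequisites; adjust as needed.
REQUIRED_BINARIES = ("firejail", "nmap", "sqlmap", "nuclei", "ffuf")
MIN_PYTHON = (3, 11)


def python_ok(version_info=None, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def missing_binaries(names=REQUIRED_BINARIES, which=shutil.which):
    """Return the binaries from `names` that are not found on PATH."""
    return [name for name in names if which(name) is None]


def preflight_report(names=REQUIRED_BINARIES, which=shutil.which, version_info=None):
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if not python_ok(version_info):
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
    for name in missing_binaries(names, which):
        problems.append(f"'{name}' was not found on PATH")
    return problems


if __name__ == "__main__":
    # Print one line per problem; prints nothing when everything is in place.
    for problem in preflight_report():
        print("MISSING:", problem)
```

The `which` and `version_info` parameters exist only so the checks can be exercised without touching the real system, for example in a unit test.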

## Configuration

### config.yaml (Main Config)

Edit `config.yaml` to configure:

- Server transport (stdio or SSE)
- Sandbox limits (RAM, CPU, timeout)
- LLM provider selection
- Plugin enable/disable

Key sections:

```yaml
llm:
  active_provider: "azure_openai"   # Change to your preferred provider
  fallback_provider: "openrouter"   # Optional fallback

ai_planner:
  max_phases: 4                     # Max autonomous phases
  stop_on_critical: true            # Stop on critical findings

plugins:
  bundled:
    feroxbuster: true
    metasploit: false               # Disabled by default (Tier 3)
    burp_api: false                 # Needs Burp Pro API key
```

### .security.yml (Secrets)

```yaml
# NEVER commit this file
azure_openai:
  api_key: "your-azure-key"
openai:
  api_key: "your-openai-key"
anthropic:
  api_key: "your-anthropic-key"
# ... etc for each provider
```

### CHAI Integration

Add to your CHAI `config.json`:

**stdio transport:**

```json
{
  "tools": {
    "mcp": {
      "servers": {
        "chai-security": {
          "transport": "stdio",
          "command": "python",
          "args": ["main.py"],
          "cwd": "/opt/mcp-security-server",
          "env": { "PYTHONPATH": "/opt/mcp-security-server" },
          "discovery": "deferred"
        }
      }
    }
  }
}
```

**SSE transport (for remote Pi access):**

```json
{
  "tools": {
    "mcp": {
      "servers": {
        "chai-security": {
          "transport": "sse",
          "url": "http://raspberrypi.local:9010/sse"
        }
      }
    }
  }
}
```

## Usage

### Initialize a Session

```python
initialize_session(
    target="https://target.example.com",
    test_type="web_app",
    scope=["target.example.com"]
)
# Returns: {"session_id": "sess-abc-123", ...}
```

### Run Autonomous Scan (One Call, Complete Test)

```python
run_autonomous_scan(
    session_id="sess-abc-123",
    max_phases=4,
    stop_on_critical=True,
    generate_report=True,
    provider_override=None  # Uses config.yaml active_provider
)
# Internally: plan → [recon → scan → inject] → evaluate → plan → [...] → report
# Returns after ~15-30 min:
# {
#   "phases_completed": 3,
#   "total_findings": 12,
#   "critical_count": 1,
#   "high_count": 4,
#   "report_path": "/opt/sessions/reports/sess-abc-123.md",
#   "status": "complete"
# }
```

### Manual Tool Calls

```python
# Reconnaissance
run_recon(session_id="sess-abc-123", target="target.example.com", recon_type="passive")

# Vulnerability scanning
scan_vulnerabilities(session_id="sess-abc-123", target="target.example.com", scanner="nuclei")

# Injection testing
test_injection(session_id="sess-abc-123", target="target.example.com", injection_type="sqli")

# Authentication testing
test_authentication(session_id="sess-abc-123", target="target.example.com", test_type="bypass")

# Network testing
test_network(session_id="sess-abc-123", target="target.example.com", test_type="ssl")

# Custom command
execute_command(session_id="sess-abc-123", command="nmap -sV target.example.com")

# Run plugin
run_plugin(session_id="sess-abc-123", plugin_name="feroxbuster", target="https://target.example.com")

# Generate report
generate_report(session_id="sess-abc-123", format="markdown")

# Check status
get_session_status(session_id="sess-abc-123")

# Emergency stop
emergency_stop(session_id="sess-abc-123")
```

## Adding a New LLM Provider

**Step 1**: Create `llm/providers/gemini.py`:

```python
from llm.base_provider import BaseLLMProvider, LLMResponse


class GeminiProvider(BaseLLMProvider):
    def __init__(self, config): ...

    @property
    def provider_name(self):
        return "gemini"

    # Mirror the signature of BaseLLMProvider.complete here
    async def complete(self, *args, **kwargs): ...

    async def health_check(self): ...
```

**Step 2**: Add one `case` to `llm/provider_factory.py`:

```python
case "gemini":
    from llm.providers.gemini import GeminiProvider
    return GeminiProvider(config)
```

**Step 3**: Add a config block to `config.yaml`:

```yaml
llm:
  gemini:
    enabled: true
    model: "gemini-2.5-pro"
    api_base: "https://generativelanguage.googleapis.com/v1beta/openai"
```

**Step 4**: Add the key to `.security.yml`:

```yaml
gemini:
  api_key: ""
```

**Step 5**: Set `active_provider: "gemini"` in `config.yaml`.
**That's it. No other files change.**

## Adding a New Pentest Plugin

**Step 1**: Create `plugins/external/gospider_plugin.py`:

```python
from plugins.plugin_base import PentestPlugin, PluginMetadata, PluginResult


class GospiderPlugin(PentestPlugin):
    @property
    def metadata(self):
        return PluginMetadata(
            name="gospider",
            display_name="GoSpider Web Crawler",
            version="1.1.6",
            description="Fast web spider",
            tier="tier1",
            requires_binary="gospider",
            tags=["web", "recon", "crawler"],
        )

    async def run(self, session_id, target, args, process_controller,
                  safety_policy, session_manager):
        # Build command, validate through safety_policy, execute via process_controller
        ...
```

**That's it. No changes to core application.**

## LLM Call Budget

For a 4-phase autonomous scan:

- Phase 1: plan() + evaluate() = 2 calls
- Phase 2: plan() + evaluate() = 2 calls
- Phase 3: plan() + evaluate() = 2 calls
- Phase 4: plan() + evaluate() = 2 calls
- Report: summarize_for_report() = 1 call
- **Total: ~9 LLM calls per full pentest**

This keeps token usage low and latency acceptable on a Raspberry Pi 4.

## Safety & Compliance

- **Command denylist**: Dangerous commands (rm -rf /, fork bombs, etc.) are blocked
- **Tier system**: Tools classified by risk (Tier 1/2/3)
- **Scope checking**: Commands validated against defined scope
- **Rate limiting**: Per-tier concurrent execution limits
- **Sandboxing**: All commands run through firejail with resource limits
- **Audit trail**: Every command and AI decision is logged immutably

## License

MIT License. See the LICENSE file for details.

![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)
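
As a worked illustration of the LLM Call Budget section and the "thin brain, thick loop" design, the sketch below simulates an autonomous scan with a planner that only counts calls. Everything here is hypothetical: `CountingPlanner`, `fake_tool`, and `simulate_autonomous_scan` are stand-ins invented for this example, no LLM or real tool is invoked, and the three-tool phase mirrors the `recon → scan → inject` chain shown under Usage.

```python
from dataclasses import dataclass, field


@dataclass
class CountingPlanner:
    """Hypothetical stand-in for ai_planner.py that records calls instead of using an LLM."""

    calls: list = field(default_factory=list)

    def plan(self, findings):
        self.calls.append("plan")
        return ["recon", "scan", "inject"]

    def evaluate(self, findings, stop_on_critical):
        """Return True to continue, False to stop."""
        self.calls.append("evaluate")
        has_critical = any(f["severity"] == "critical" for f in findings)
        return not (stop_on_critical and has_critical)

    def summarize_for_report(self, findings):
        self.calls.append("summarize_for_report")
        return f"{len(findings)} findings"


def fake_tool(name, phase):
    """Placeholder for a sandboxed tool run; returns one low-severity finding."""
    return {"tool": name, "phase": phase, "severity": "low"}


def simulate_autonomous_scan(planner, max_phases=4, stop_on_critical=True, run_tool=fake_tool):
    findings = []
    phases_completed = 0
    for phase in range(1, max_phases + 1):
        steps = planner.plan(findings)            # LLM call: phase start
        for step in steps:                        # local chaining, no LLM involved
            findings.append(run_tool(step, phase))
        phases_completed = phase
        if not planner.evaluate(findings, stop_on_critical):  # LLM call: phase end
            break
    summary = planner.summarize_for_report(findings)          # LLM call: report
    return {
        "phases_completed": phases_completed,
        "total_findings": len(findings),
        "llm_calls": len(planner.calls),
        "summary": summary,
    }


if __name__ == "__main__":
    result = simulate_autonomous_scan(CountingPlanner())
    print(result["llm_calls"])  # prints 9: two calls per phase over 4 phases, plus the report
```

The count grows as `2 * phases + 1`, so a scan that stops early on a critical finding uses fewer calls than the 9 budgeted for four full phases.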