tmon3ygrc-sentinel/darksword

# ⚔️ Project DARKSWORD — GRC Intelligence Platform

**Threat Intelligence → CMMC 2.0 Gap Analysis — Automated**

A multi-source intelligence pipeline that ingests daily security content from show notes, threat feeds, and YouTube, classifies it using Claude AI, maps it against **CMMC 2.0 / NIST 800-171** controls, links records to a live GRC learning plan, and pushes structured records into a Notion GRC repository.

## Architecture

    Simply Cyber (show notes)   AlienVault OTX (feed)          Barricade Cyber (YouTube)
            ↓                           ↓                                 ↓
    get_show_notes()            get_otx_pulses()               get_transcript()
            ↓                           ↓                                 ↓
    analyze_with_claude()       analyze_with_claude_prompt()   analyze_with_claude()
            ↓                           ↓                                 ↓
            └───────────────────────────┘                      barricade_input.txt
                          ↓                                               ↓
                governance_input.txt                           threat_ingest.py
                          ↓                                    [Barricade Engine]
                notion_logger_v7.py                                       ↓
                [DARKSWORD Engine]                         Strategy & Architecture DB
                          ↓                                       ↙
                   CPE Tracker DB
                              ↘                             ↙
                        Master Frameworks DB (CMMC 2.0) [Source of Truth]
                                              ↓
                        GRC Learning Plan DB [Auto-linked by content]

### Databases (Notion)

| Database | Script | Source | Purpose |
|---|---|---|---|
| CPE Tracker | `notion_logger_v7.py` | Simply Cyber, AlienVault OTX | Tactical threat intel |
| STAR Strategy | `threat_ingest.py` | Barricade Cyber | Strategic architecture |
| Master Frameworks | shared | CMMC 2.0 | Control mapping (source of truth) |
| GRC Learning Plan | shared | Internal | Auto-linked from control domains |
| Cybernews Intel | *(planned)* | Cybernews | Threat actor profiles |

## Workspace Structure

This project spans two VS Code workspaces:

    STAR_PROJECT (GRC-OCEG)
    └── darksword/
        ├── notion_logger_v7.py      ← DARKSWORD core engine (V7)
        ├── threat_ingest.py         ← Barricade engine
        ├── governance_input.txt     ← Simply Cyber working file (gitignored)
        ├── barricade_input.txt      ← Barricade working file (gitignored)
        ├── failed_records.txt       ← Failed push log
        ├── prompts/                 ← Analyst prompt library
        ├── archive/                 ← Legacy scripts
        ├── GRC-Playground/          ← Experimental work
        ├── GovSCH/                  ← Governance scheduler
        ├── .env                     ← API keys (gitignored)
        ├── requirements.txt
        ├── README.md
        └── script_walkthrough_.md   ← Full code walkthrough (V7.0)

    PHOENIX_LAB_INFRA (AdminOps)
    └── scripts/python/
        └── ...                      ← Lab automation scripts

## Pipeline Modes

### DARKSWORD (`notion_logger_v7.py`)

    cpe   # launches via alias

| Option | Description |
|---|---|
| `1. Autonomous Pipeline` | Show Notes → Claude → Notion (Simply Cyber via `cyberthreatbrief.simplycyber.io`) |
| `2. Manual Pipeline` | `governance_input.txt` → Notion |
| `3. Test Pipeline` | Mock data → Notion (`$0.00`, `--test` flag) |
| `4. OTX Pipeline` | AlienVault OTX → Claude → Notion (requires `OTX_API_KEY`) |

### Autonomous Pipeline Workflow (Simply Cyber)

1. Run `cpe` → select **1. Autonomous Pipeline**
2. Enter date (`YYYY-MM-DD`) or press Enter for today
3. Script fetches show notes from `cyberthreatbrief.simplycyber.io`, sends to Claude, writes records
4. Records push automatically to Notion with CMMC linking and learning plan mapping

### Manual Pipeline Workflow (other YouTube sources)

1. Go to YouTube video → open transcript → toggle timestamps off → copy all text
2. Paste transcript into Claude chat using `prompts/cpe_prompt_claude.txt`
3. Claude generates `===INTEL_RECORD_START===` formatted records
4. Copy records into `governance_input.txt`
5. Run `cpe` → select **2. Manual Pipeline** → enter source URL
6. Records push to Notion with CMMC linking and auto learning plan mapping

### Barricade Engine (`threat_ingest.py`)

    python threat_ingest.py

## Quick Start

    git clone https://github.com/tmon3ygrc-sentinel/darksword.git
    cd darksword
    python -m venv .venv
    source .venv/Scripts/activate   # Windows Git Bash
    pip install -r requirements.txt
    cp .env.example .env

Fill in `.env`:

    # Core
    NOTION_TOKEN=secret_...
    DATABASE_ID=...            # CPE Tracker
    CMMC_DATABASE_ID=...       # Master Frameworks
    ANTHROPIC_API_KEY=sk-ant-...
    OTX_API_KEY=...            # AlienVault OTX (for Choice 4)

    # Learning Plan Weeks
    LEARNING_WEEK_1=...
    LEARNING_WEEK_2=...
    LEARNING_WEEK_3=...
    LEARNING_WEEK_5=...
    LEARNING_WEEK_6=...
    LEARNING_WEEK_7=...
    LEARNING_WEEK_8=...
    LEARNING_WEEK_10=...
    LEARNING_WEEK_11=...
    LEARNING_WEEK_12=...
    LEARNING_WEEK_13=...
    LEARNING_WEEK_14=...
    LEARNING_WEEK_15=...
    LEARNING_WEEK_17=...
    LEARNING_WEEK_18=...
    LEARNING_WEEK_19=...
    LEARNING_WEEK_20=...
    LEARNING_WEEK_21=...
    LEARNING_WEEK_23=...
    LEARNING_WEEK_24=...
    LEARNING_WEEK_25=...
    LEARNING_WEEK_26=...
    LEARNING_WEEK_27=...
    LEARNING_WEEK_28=...
    LEARNING_WEEK_29=...
    LEARNING_WEEK_30=...
    LEARNING_WEEK_33=...
    LEARNING_WEEK_35=...
    LEARNING_WEEK_36=...

Set the `cpe` alias in `~/.bashrc`:

    alias cpe='cd /c/Work/GRC-OCEG/darksword && /c/Work/GRC-OCEG/.venv/Scripts/python.exe notion_logger_v7.py'

## Intelligence Sources

| Source | Channel | Focus | Status |
|---|---|---|---|
| Simply Cyber | Show Notes | Daily tactical threat briefs | ✅ Live (Autonomous + Manual) |
| AlienVault OTX | Threat Feed | IOC feeds, pulse intelligence | ✅ Live (OTX Pipeline) |
| Barricade Cyber | YouTube | DFIR, MSP/enterprise ops | ✅ Live |
| Cybernews | YouTube | Threat actor profiles, geopolitical | 📋 Planned |

## CMMC Cache

The script queries the Master Frameworks database at launch and builds an in-memory cache of all CMMC 2.0 controls. Currently loaded: **128 controls**.

Notable controls added during development:

- `SR.L2-3.15.2` — Supply Chain Risk Management: Notification of Supply Chain Compromise

## Learning Plan Auto-Mapping

Every intel record is automatically linked to relevant GRC learning plan weeks based on its `control_domains` and `intel_category` — no manual input required.

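As a rough illustration of the auto-mapping idea, the sketch below resolves a record's learning weeks from a handful of the domain and category rows listed in this section. The names `DOMAIN_WEEKS`, `CATEGORY_WEEKS`, and `weeks_for_record` are hypothetical, and keying domains by their two-letter abbreviation is an assumption; the actual logic in `notion_logger_v7.py` may differ.

```python
# Hypothetical sketch: these names are illustrative and are not taken
# from notion_logger_v7.py. Only a subset of the mapping rows is shown.

# Control domain abbreviation -> learning plan weeks (assumed key format).
DOMAIN_WEEKS = {
    "IR": [23],
    "SR": [19, 29],
    "RA": [18, 20],
    "AC": [13],
    "IA": [13],
}

# Intel category -> learning plan weeks.
CATEGORY_WEEKS = {
    "supply-chain": [19, 29],
    "vulnerability": [20, 21],
    "breach": [28],
}


def weeks_for_record(control_domains, intel_category):
    """Return the sorted, de-duplicated week numbers for one intel record."""
    weeks = set()
    for domain in control_domains:
        # Unknown domains contribute no weeks instead of raising.
        weeks.update(DOMAIN_WEEKS.get(domain.strip().upper(), []))
    weeks.update(CATEGORY_WEEKS.get(intel_category.strip().lower(), []))
    return sorted(weeks)


print(weeks_for_record(["IR", "AC"], "vulnerability"))  # [13, 20, 21, 23]
```

In the real pipeline, each week number would presumably be resolved to the Notion page ID stored in the matching `LEARNING_WEEK_<n>` variable from `.env` before the relation is written.
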

**Domain → Week mapping:**

| Control Domain | Learning Weeks |
|---|---|
| Incident Response (IR) | Week 23 |
| Supply Chain Risk Management (SR) | Week 19, Week 29 |
| Risk Assessment (RA) | Week 18, Week 20 |
| Access Control (AC) | Week 13 |
| Identification and Authentication (IA) | Week 13 |
| Configuration Management (CM) | Week 12 |
| System Integrity (SI) | Week 17 |
| System and Communications Protection (SC) | Week 17 |
| Security Awareness and Training (AT) | Week 5 |
| Audit and Accountability (AU) | Week 27, Week 28 |

**Category → Week mapping:**

| Intel Category | Learning Weeks |
|---|---|
| regulatory | Week 25 |
| advisory | Week 26 |
| supply-chain | Week 19, Week 29 |
| incident / ransomware / phishing | Week 23 |
| vulnerability | Week 20, Week 21 |
| malware | Week 19 |
| breach | Week 28 |
| law-enforcement | Week 25 |
| ai-risk | Week 17 |
| identity-intelligence | Week 13 |

## Roadmap

- [x] DARKSWORD v6 — Claude-powered tactical intel pipeline
- [x] Manual Pipeline — standard workflow for Simply Cyber content
- [x] CMMC relation mapping (128 controls)
- [x] `SR.L2-3.15.2` added to Master Frameworks
- [x] `impacted_identity_provider` field mapping fixed
- [x] Learning plan auto-detection from `control_domains` and `intel_category`
- [x] Learning plan expanded from 3 weeks to 29 weeks
- [x] Barricade engine (`threat_ingest.py`) active
- [x] DARKSWORD v7 — `get_show_notes()` replaces YouTube scraping for Simply Cyber
- [x] Autonomous Pipeline (Choice 1) live for Simply Cyber via show notes
- [x] OTX Pipeline (Choice 4) — AlienVault threat feed integration with three-gate filter
- [x] `analyze_with_claude_prompt()` — per-source prompt tuning
- [x] `OTX_ANALYST_PROMPT` — `content_type`, `content_category`, `impacted_identity_provider` fixed
- [x] CMMC cache retry loop (3 attempts, rate-limit resilient)
- [x] `max_tokens` increased to 16000
- [ ] Windows Task Scheduler automation
- [ ] Cybernews threat actor database + relations
- [ ] Claude-powered Barricade pipeline (replace hardcoded items)
- [ ] Phoenix Lab VM environment (attack surface testing)

## Known Limitations

**`get_transcript()` still blocked for Simply Cyber** — yt-dlp is blocked at the network/IP level for Simply Cyber specifically. This is no longer a pipeline limitation: V7's Choice 1 uses `get_show_notes()` to fetch from `cyberthreatbrief.simplycyber.io` directly, bypassing YouTube entirely. `get_transcript()` is retained for other sources (Barricade, Cybernews) that are not network-restricted.

**`unknown` threat actor shows empty in Notion** — the script skips placeholder values (`none`, `unknown`, `empty`, `n/a`) to prevent noise in the database. This is intentional behavior.

## License

MIT — Open source. Use it, fork it, build on it.

*Built with HardOPS discipline. Manual mastery before automation. Eat your own cooking.* ⚔️💎🦅