# SOC Workflow Automation — Alert Enrichment
A Python automation tool that ingests simulated SIEM alerts, enriches every indicator of compromise (IP addresses, domains, file hashes) against the free **VirusTotal** and **AbuseIPDB** APIs, calculates a risk score, and generates structured incident reports in three formats (JSON, plain text, and HTML).
Built as a SOC analyst portfolio project demonstrating threat-intelligence automation, API integration, and incident reporting.
## What It Does
[SIEM Alert JSON]
│
▼
[Extract IOCs]
│
├── IP Addresses ──► VirusTotal + AbuseIPDB (malicious votes, country, ASN, ISP)
├── Domains ───────► VirusTotal (malicious votes, registrar, category)
├── File Hashes ───► VirusTotal (AV detections, file type, SHA-256)
│
▼
[Risk Scoring: 0–100]
│
├── reports/*.json (machine-readable)
├── reports/*.txt (analyst review)
└── reports/*.html (interactive web)
Private/internal IPs (RFC 1918, loopback, link-local) and syntactically invalid IP strings are automatically skipped — no wasted API quota.
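The filtering described above can be sketched with the standard-library `ipaddress` module. The function name `is_valid_public_ip()` and the `is_global` check come from the bug-fix notes later in this README; the body is a minimal illustration, not necessarily the script's exact code.

```python
import ipaddress


def is_valid_public_ip(value: str) -> bool:
    """Return True only for syntactically valid, globally routable IPs."""
    try:
        ip = ipaddress.ip_address(value.strip())
    except ValueError:
        # Not an IP at all (e.g. "not.an.ip.addr"), so never send it to an API.
        return False
    # is_global is False for RFC 1918, loopback, link-local and other reserved ranges.
    return ip.is_global


for candidate in ("192.168.1.42", "127.0.0.1", "169.254.10.5", "not.an.ip.addr", "8.8.8.8"):
    print(f"{candidate:>16} -> {is_valid_public_ip(candidate)}")
```

Only the last candidate passes, so it is the only one that would consume API quota.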
## Project Structure
soc-workflow-automation/
│
├── src/
│ └── soc_enrichment.py ← Main script (run this)
│
├── data/
│ └── sample_alerts.json ← Simulated SIEM alerts (input)
│
├── reports/
│ ├── example_report.txt ← Pre-generated plain-text sample
│ └── example_report.html ← Pre-generated HTML sample
│
├── .env ← Your API keys (gitignored — never committed)
├── .env.example ← API key template (safe to commit)
├── .gitignore
├── requirements.txt
└── README.md
## Quick Start
### 1 — Clone the repository
git clone https://github.com/aditya777-dev/soc-workflow-automation.git
cd soc-workflow-automation
### 2 — Install dependencies
pip install -r requirements.txt
Installs: `requests` (HTTP API calls) and `python-dotenv` (API key loading).
### 3 — Configure API keys
cp .env.example .env
Edit `.env`:
VIRUSTOTAL_API_KEY=your_key_here
ABUSEIPDB_API_KEY=your_key_here
**Get free API keys:**
| Service | URL | Free Tier |
|---------|-----|-----------|
| VirusTotal | https://www.virustotal.com/gui/sign-in | 500 lookups/day, 4/minute |
| AbuseIPDB | https://www.abuseipdb.com/register | 1,000 checks/day |
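Once `.env` is filled in, a startup check along these lines can catch a missing or placeholder key before any API call is made. This is a sketch under assumptions: `load_api_keys` and `REQUIRED_KEYS` are hypothetical names, and the function reads from a plain mapping so it runs without third-party packages. In the real script, `python-dotenv`'s `load_dotenv()` would typically populate `os.environ` first.

```python
import os
from typing import Dict, Mapping

# Variable names match the .env template above.
REQUIRED_KEYS = ("VIRUSTOTAL_API_KEY", "ABUSEIPDB_API_KEY")
PLACEHOLDER = "your_key_here"


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required keys, or raise if any is absent or still a placeholder."""
    keys = {name: (env.get(name) or "").strip() for name in REQUIRED_KEYS}
    missing = [name for name, value in keys.items() if not value or value == PLACEHOLDER]
    if missing:
        raise RuntimeError("Missing API key(s) in .env: " + ", ".join(missing))
    return keys


# Demonstration with an in-memory mapping instead of the real environment.
try:
    load_api_keys({"VIRUSTOTAL_API_KEY": "your_key_here"})
except RuntimeError as exc:
    print(exc)
```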
### 4 — Run
python src/soc_enrichment.py
Reports are saved to `reports/` with a timestamp in the filename.
**Custom paths:**
python src/soc_enrichment.py data/my_alerts.json --output-dir /tmp/reports
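A command line of this shape (one optional positional path plus `--output-dir`) could be parsed roughly as follows. The `parse_args` helper and the argument name `alerts_file` are hypothetical, and the defaults are inferred from the Quick Start, which reads `data/sample_alerts.json` and writes to `reports/` when no arguments are given.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the alerts path and output directory (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Enrich SIEM alerts with threat intelligence.")
    parser.add_argument(
        "alerts_file",
        nargs="?",  # optional positional: falls back to the bundled sample
        type=Path,
        default=Path("data/sample_alerts.json"),
        help="JSON file containing an array of alerts",
    )
    parser.add_argument(
        "--output-dir",
        type=Path,
        default=Path("reports"),
        help="directory where the three report files are written",
    )
    return parser.parse_args(argv)


args = parse_args(["data/my_alerts.json", "--output-dir", "/tmp/reports"])
print(args.alerts_file, args.output_dir)
```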
## Sample Alert Format
The script accepts a JSON array. Every IOC field is optional — only present fields are enriched.
[
{
"alert_id": "ALT-2026-001",
"timestamp": "2026-05-30T08:15:00Z",
"severity": "CRITICAL",
"alert_type": "Malware C2 Communication",
"source_host": "WORKSTATION-042",
"source_ip": "192.168.1.42",
"destination_ip": "185.220.101.45",
"destination_domain": "update.microsoft-cdn.net",
"file_hash_md5": "44d88612fea8a8f36de82e1278abb02f",
"file_name": "system_update.exe",
"process": "svchost.exe",
"description": "Suspicious outbound connection to known Tor exit node",
"rule_triggered": "TOR_EXIT_NODE_COMMUNICATION"
}
]
**Supported IOC fields:**
| Field | Enriched via |
|-------|-------------|
| `source_ip` | VirusTotal + AbuseIPDB |
| `destination_ip` | VirusTotal + AbuseIPDB |
| `destination_domain` | VirusTotal |
| `file_hash_md5` | VirusTotal |
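Because every IOC field is optional, extraction amounts to checking which of the four supported fields are present. The sketch below illustrates that idea; `IOC_FIELDS` and `extract_iocs` are hypothetical names, and the type labels (`ip`, `domain`, `hash`) are chosen for the example only.

```python
from typing import Dict, List, Tuple

# Field names taken from the table above, mapped to an illustrative IOC type label.
IOC_FIELDS: Dict[str, str] = {
    "source_ip": "ip",
    "destination_ip": "ip",
    "destination_domain": "domain",
    "file_hash_md5": "hash",
}


def extract_iocs(alert: dict) -> List[Tuple[str, str, str]]:
    """Return (field, ioc_type, value) for every IOC field that is present and non-empty."""
    found = []
    for field, ioc_type in IOC_FIELDS.items():
        value = alert.get(field)
        if isinstance(value, str) and value.strip():
            found.append((field, ioc_type, value.strip()))
    return found


alert = {
    "alert_id": "ALT-2026-001",
    "source_ip": "192.168.1.42",
    "destination_domain": "update.microsoft-cdn.net",
}
for field, ioc_type, value in extract_iocs(alert):
    print(field, ioc_type, value)
```

Fields that are absent, empty, or not strings are simply skipped, so a sparse alert produces fewer lookups rather than an error.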
## Output Reports
Three files are written for every run, named `incident_report_YYYYMMDD_HHMMSS.*`:
### `*.json` — Machine-readable
Full structured output with raw API fields and verdict scores. Suitable for SIEM ingestion, ticketing-system import, or further scripting.
### `*.txt` — Plain-text analyst report
Box-formatted report with per-alert sections, IOC enrichment blocks, and risk factors. Printable and easy to attach to tickets.
### `*.html` — Interactive web report
Dark-themed HTML report with colour-coded severity/verdict badges, risk-score progress bars, and collapsible IOC detail panels. Open in any browser — no internet connection required (fully self-contained).
See [`reports/example_report.html`](reports/example_report.html) and [`reports/example_report.txt`](reports/example_report.txt) for live samples generated against the real APIs.
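The `incident_report_YYYYMMDD_HHMMSS.*` naming scheme can be produced with `strftime`. This is a small sketch; `report_paths` is a hypothetical helper, not a function confirmed to exist in the script.

```python
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional


def report_paths(output_dir: str, now: Optional[datetime] = None) -> Dict[str, Path]:
    """Build the three report paths that share one timestamped stem."""
    now = now or datetime.now()
    stem = f"incident_report_{now.strftime('%Y%m%d_%H%M%S')}"
    return {ext: Path(output_dir) / f"{stem}.{ext}" for ext in ("json", "txt", "html")}


for ext, path in report_paths("reports", datetime(2026, 5, 30, 20, 5, 22)).items():
    print(ext, path.name)
```

Computing the timestamp once and reusing it keeps the three files of a run grouped under the same name.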
## Risk Scoring
Each alert's score is the sum of the points for every signal that fires, capped at 100:
| Signal | Points |
|--------|--------|
| IP: ≥1 VirusTotal malicious vote | +40 |
| File hash: ≥1 VirusTotal malicious vote | +40 |
| IP: AbuseIPDB confidence ≥ 75% | +30 |
| Domain: ≥1 VirusTotal malicious vote | +25 |
| IP: AbuseIPDB confidence 25–74% | +15 |

The final score maps to a verdict:

| Final Score | Verdict |
|-------------|---------|
| 0–9 | CLEAN |
| 10–39 | POTENTIALLY_SUSPICIOUS |
| 40–69 | SUSPICIOUS |
| 70–100 | MALICIOUS |
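The two tables translate fairly directly into code. The sketch below assumes each signal is counted at most once per alert, which the tables do not state explicitly; `risk_score` and `verdict` are hypothetical names.

```python
def risk_score(ip_vt_malicious: int = 0, hash_vt_malicious: int = 0,
               domain_vt_malicious: int = 0, abuse_confidence: int = 0) -> int:
    """Sum the points for each signal that fires and cap the total at 100."""
    score = 0
    if ip_vt_malicious >= 1:
        score += 40
    if hash_vt_malicious >= 1:
        score += 40
    if domain_vt_malicious >= 1:
        score += 25
    # The two AbuseIPDB bands are mutually exclusive.
    if abuse_confidence >= 75:
        score += 30
    elif abuse_confidence >= 25:
        score += 15
    return min(score, 100)


def verdict(score: int) -> str:
    """Map a 0-100 score to the verdict bands in the table above."""
    if score >= 70:
        return "MALICIOUS"
    if score >= 40:
        return "SUSPICIOUS"
    if score >= 10:
        return "POTENTIALLY_SUSPICIOUS"
    return "CLEAN"


# IP flagged by VirusTotal and by AbuseIPDB at 80% confidence: 40 + 30 = 70.
score = risk_score(ip_vt_malicious=2, abuse_confidence=80)
print(score, verdict(score))
```

When every signal fires the raw sum is 40 + 40 + 30 + 25 = 135, which the cap reduces to 100.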
## Terminal Output
20:03:55 [INFO ] Loaded 3 alert(s) from data\sample_alerts.json
20:03:55 [INFO ] Note: VirusTotal free tier = 4 req/min — expect ~16 s between lookups.
20:03:55 [INFO ] ── Processing alert 1/3: ALT-2026-001 [CRITICAL] ──
20:03:55 [INFO ] Skipping non-public IP '192.168.1.42' (source_ip)
20:03:55 [INFO ] [VT] Enriching IP: 185.220.101.45
20:03:56 [INFO ] [AbuseIPDB] Checking IP: 185.220.101.45
...
==============================================================
ENRICHMENT COMPLETE
==============================================================
Alerts processed : 3
JSON report : reports\incident_report_20260530_200522.json
Text report : reports\incident_report_20260530_200522.txt
HTML report : reports\incident_report_20260530_200522.html
==============================================================
Alert ID Verdict Risk Score
------------------ ------------------------ ----------
ALT-2026-001 MALICIOUS 100/100
ALT-2026-002 SUSPICIOUS 40/100
ALT-2026-003 MALICIOUS 70/100
## Rate Limits & Timing
| Service | Free Limit | How this script handles it |
|---------|------------|---------------------------|
| VirusTotal | 4 req/min, 500/day | 16 s enforced delay between VT calls (just over the 15 s minimum that 4 req/min implies) |
| AbuseIPDB | 1,000/day | No per-minute cap — called immediately |
The same IOC value across multiple alerts is queried **once** and cached for the run duration.
**Estimated runtime for the 3 sample alerts:** roughly 1.5 to 2 minutes (6 VT calls × 16 s ≈ 96 s of throttling at most, plus request time).
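The combination of a fixed delay and a per-run cache could look something like the sketch below. `ThrottledCachedLookup` is a hypothetical class, and the actual HTTP call is replaced by an injected `fetch` callable so the example runs offline. The clock and sleep functions are injectable as well, which makes the throttle easy to test.

```python
import time
from typing import Callable, Dict, Optional

VT_DELAY_SECONDS = 16.0  # delay taken from the table above


class ThrottledCachedLookup:
    """Call fetch() at most once per IOC and no more often than min_interval."""

    def __init__(self, fetch: Callable[[str], dict],
                 min_interval: float = VT_DELAY_SECONDS,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None
        self._cache: Dict[str, dict] = {}

    def lookup(self, ioc: str) -> dict:
        if ioc in self._cache:
            return self._cache[ioc]  # cache hit: no API call, no delay
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        result = self._fetch(ioc)
        self._cache[ioc] = result
        return result


# Demo with a stub fetch and a short interval so it finishes quickly.
vt = ThrottledCachedLookup(fetch=lambda ioc: {"ioc": ioc}, min_interval=0.05)
for ioc in ("185.220.101.45", "185.220.101.45", "update.microsoft-cdn.net"):
    print(vt.lookup(ioc))
```

In the demo the second lookup is served from the cache, so only two stub calls are made and only one delay is incurred.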
## Bugs Fixed During Development
| # | Bug | Fix |
|---|-----|-----|
| 1 | Invalid IP strings (`not.an.ip.addr`) bypassed the private-IP guard and were sent to APIs (HTTP 400/422) | `is_valid_public_ip()` now validates with `ipaddress` and requires `is_global` |
| 2 | 429 rate-limit retry was recursive, risking a `RecursionError` under sustained throttling | Replaced with an iterative retry loop (`VT_MAX_RETRIES = 3`) |
| 3 | Log lines appeared out-of-order (logging → stderr, print → stdout) | `logging.basicConfig(stream=sys.stdout)` + `sys.stdout.reconfigure(encoding='utf-8')` |
| 4 | Non-ASCII characters (`→`, `──`) in log messages raised `UnicodeEncodeError` on the Windows cp1252 console | `sys.stdout.reconfigure(encoding='utf-8', errors='replace')` before logging setup |
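Fix 2 replaces recursion with a bounded loop. The sketch below shows the general pattern rather than the script's actual code: `call_with_retry` and `backoff_seconds` are hypothetical, `send` stands in for the real HTTP request, and `VT_MAX_RETRIES` is interpreted here as the total number of attempts.

```python
import time
from typing import Callable, Optional, Tuple

VT_MAX_RETRIES = 3  # name and value from the table above


def call_with_retry(send: Callable[[], Tuple[int, Optional[dict]]],
                    max_retries: int = VT_MAX_RETRIES,
                    backoff_seconds: float = 60.0,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Call send() until it stops returning HTTP 429 or the attempts run out."""
    for attempt in range(1, max_retries + 1):
        status, payload = send()
        if status != 429:
            # Any other status ends the loop; only 200 yields a payload.
            return payload if status == 200 else None
        if attempt < max_retries:
            sleep(backoff_seconds)  # wait before the next attempt
    return None  # still throttled after the final attempt


# Stubbed responses: throttled twice, then successful.
responses = iter([(429, None), (429, None), (200, {"malicious": 3})])
result = call_with_retry(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

Because the loop has a fixed upper bound, sustained throttling ends in a `None` result instead of ever-deeper call frames.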
## Technologies Used
| Tool | Purpose |
|------|---------|
| Python 3.8+ | Core scripting |
| `requests` | HTTP calls to threat-intel APIs |
| `python-dotenv` | Secure API key loading |
| VirusTotal API v3 | Multi-vendor malware scanning (IPs, domains, hashes) |
| AbuseIPDB API v2 | Crowd-sourced IP abuse reputation |
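For orientation, the two IP lookups can be sketched as follows. The project itself uses `requests`; this example swaps in the standard-library `urllib` so that it runs without extra packages, and it only builds the requests and parses sample bodies without contacting either service. The helper names and the `max_age_days` default of 90 are assumptions; the endpoints, header names, and response fields follow the public VirusTotal v3 and AbuseIPDB v2 documentation.

```python
import urllib.parse
import urllib.request


def build_vt_ip_request(ip: str, api_key: str) -> urllib.request.Request:
    """VirusTotal v3 IP report; the key goes in the x-apikey header."""
    url = f"https://www.virustotal.com/api/v3/ip_addresses/{ip}"
    return urllib.request.Request(url, headers={"x-apikey": api_key, "Accept": "application/json"})


def build_abuseipdb_request(ip: str, api_key: str, max_age_days: int = 90) -> urllib.request.Request:
    """AbuseIPDB v2 check; the key goes in the Key header."""
    query = urllib.parse.urlencode({"ipAddress": ip, "maxAgeInDays": max_age_days})
    url = f"https://api.abuseipdb.com/api/v2/check?{query}"
    return urllib.request.Request(url, headers={"Key": api_key, "Accept": "application/json"})


def vt_malicious_votes(body: dict) -> int:
    """Read data.attributes.last_analysis_stats.malicious, defaulting to 0."""
    stats = body.get("data", {}).get("attributes", {}).get("last_analysis_stats", {})
    return int(stats.get("malicious", 0))


def abuse_confidence(body: dict) -> int:
    """Read data.abuseConfidenceScore (0-100), defaulting to 0."""
    return int(body.get("data", {}).get("abuseConfidenceScore", 0))


print(build_vt_ip_request("185.220.101.45", "demo-key").full_url)
print(build_abuseipdb_request("185.220.101.45", "demo-key").full_url)
# Hand-written sample bodies, not real API responses.
print(vt_malicious_votes({"data": {"attributes": {"last_analysis_stats": {"malicious": 7}}}}))
print(abuse_confidence({"data": {"abuseConfidenceScore": 100}}))
```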
## Author
Built as a portfolio project for a SOC Analyst role.
**Demonstrated skills:**
- REST API integration (authentication, rate limiting, error handling, retries)
- IOC extraction and enrichment automation
- Risk scoring and verdict classification
- Multi-format report generation (JSON, TXT, HTML)
- Python best practices: logging, type hints, modular OOP design, caching
- Security-aware input validation (invalid/private IP filtering)