PratikT33/Threat-Intelligence-Aggregator
GitHub: unknownspy333/Threat-Intelligence-Aggregator
# Threat Intelligence Aggregator
A Python-based security automation toolkit that collects, parses, normalizes,
and correlates Indicators of Compromise (IOCs) from multiple threat feeds —
without AI or machine learning.
## Project Structure
    ti_aggregator/
    ├── main.py                  # Pipeline orchestrator; run this
    ├── feed_parser.py           # IOC extraction from all feed formats
    ├── normalizer.py            # Schema normalization & deduplication
    ├── correlator.py            # Cross-feed correlation & severity rating
    ├── blocklist_generator.py   # Blocklist export (TXT / CSV / JSON)
    ├── reporter.py              # Threat report generation
    ├── requirements.txt         # Python dependencies
    ├── sample_feeds/
    │   ├── feed1.csv            # Sample CSV feed
    │   ├── feed2.json           # Sample JSON feed
    │   ├── feed3.txt            # Sample plain-text feed
    │   └── feed4_stix.json      # Sample STIX 2.x bundle
    └── output/
        ├── blocklists/          # Generated blocklist files
        └── reports/             # Generated threat reports
## Quick Start
### 1. Install dependencies
    pip install -r requirements.txt
### 2. Run the pipeline
    python main.py
### 3. Check output
    output/
    ├── blocklists/
    │   ├── ip_blocklist.txt       <- paste into firewall
    │   ├── ip_blocklist.csv
    │   ├── ip_blocklist.json
    │   ├── domain_blocklist.txt   <- paste into DNS sinkhole
    │   ├── url_blocklist.txt      <- import into web filter
    │   ├── hash_blocklist.txt     <- import into EDR / AV
    │   ├── email_blocklist.txt
    │   └── firewall_ipset.txt     <- Linux ipset / iptables
    └── reports/
        ├── ti_report_YYYYMMDD_HHMMSS.json
        └── ti_report_YYYYMMDD_HHMMSS.csv
## Adding Feeds
Edit the `FEEDS` list in `main.py`:
    FEEDS = [
        # Local CSV file
        {"name": "AbuseIPDB_Export", "type": "csv", "path": "feeds/abuseipdb.csv"},
        # Local JSON file
        {"name": "Custom_Feed", "type": "json", "path": "feeds/custom.json"},
        # Plain text file
        {"name": "Internal_IOCs", "type": "txt", "path": "feeds/internal.txt"},
        # STIX 2.x bundle
        {"name": "MISP_Feed", "type": "stix", "path": "feeds/misp_bundle.json"},
        # Remote URL (fetched at runtime)
        {"name": "URLhaus", "type": "url",
         "path": "https://urlhaus.abuse.ch/downloads/text/"},
    ]
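
Before running the pipeline on a modified list, it can help to sanity-check each entry. The sketch below is illustrative only: `validate_feed` and `SUPPORTED_TYPES` are hypothetical names that are not part of the project, and the checks assume nothing beyond the `name` / `type` / `path` keys shown above.

```python
# Hypothetical helper, not part of ti_aggregator; it only checks the
# three keys and five feed types that this README documents.
SUPPORTED_TYPES = {"csv", "json", "txt", "stix", "url"}


def validate_feed(entry):
    """Return a list of problems found in one FEEDS entry (empty if none)."""
    problems = []
    for key in ("name", "type", "path"):
        if not entry.get(key):
            problems.append(f"missing key: {key}")
    feed_type = entry.get("type")
    if feed_type and feed_type not in SUPPORTED_TYPES:
        problems.append(f"unsupported type: {feed_type}")
    path = entry.get("path", "")
    # Remote feeds are described as HTTP/HTTPS, so require that scheme.
    if feed_type == "url" and not path.startswith(("http://", "https://")):
        problems.append("url feeds need an http:// or https:// path")
    return problems


if __name__ == "__main__":
    feed = {"name": "URLhaus", "type": "url",
            "path": "https://urlhaus.abuse.ch/downloads/text/"}
    print(validate_feed(feed))
```

An empty list means the entry has the expected shape; it does not confirm that the file exists or that the URL is reachable.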
## Supported IOC Types
| Type | Category | Example |
|---------|----------|------------------------------------------------------------------|
| ip | network | 185.220.101.45 |
| domain | network | malware.example.com |
| url | network | https://evil.ru/payload.exe |
| md5 | file | 44d88612fea8a8f36de82e1278abb02f |
| sha1 | file | da39a3ee5e6b4b0d3255bfef95601890afd80709 |
| sha256 | file | aabbccdd...11223344 (64 hex chars) |
| email | identity | phish@attacker-domain.xyz |
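
As a rough illustration of how a single token could be mapped to one of the types above, here is a minimal sketch. The patterns and the `classify_ioc` name are assumptions made for this example; the expressions actually compiled in `feed_parser.py` may differ.

```python
import re

# Illustrative patterns only; the project's real expressions may differ.
# Order matters: more specific types are tried before the generic domain.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = [
    ("url", re.compile(r"^https?://\S+$", re.IGNORECASE)),
    ("email", re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")),
    ("ip", re.compile(rf"^(?:{_OCTET}\.){{3}}{_OCTET}$")),
    ("sha256", re.compile(r"^[A-Fa-f0-9]{64}$")),
    ("sha1", re.compile(r"^[A-Fa-f0-9]{40}$")),
    ("md5", re.compile(r"^[A-Fa-f0-9]{32}$")),
    ("domain", re.compile(r"^(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")),
]


def classify_ioc(value):
    """Return the IOC type name for a single token, or None if nothing matches."""
    value = value.strip()
    for ioc_type, pattern in PATTERNS:
        if pattern.match(value):
            return ioc_type
    return None


if __name__ == "__main__":
    for sample in ("185.220.101.45", "malware.example.com",
                   "44d88612fea8a8f36de82e1278abb02f"):
        print(sample, "->", classify_ioc(sample))
```

Hashes are told apart purely by length (32, 40, or 64 hex characters), which is why the example anchors every pattern at both ends.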
## Severity Ratings
| Severity | Condition | Recommended Action |
|----------|--------------------------------|--------------------------------------------|
| High | Seen in 3+ independent feeds | Immediate blocking + SOC escalation |
| Medium | Seen in 2 feeds | Blocklist + analyst review |
| Low | Seen in 1 feed | Monitor / log only |
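
The thresholds in this table reduce to a count of distinct feeds per indicator. A minimal sketch follows, assuming observations arrive as `(value, feed_name)` pairs; the `rate_severity` and `correlate` names are hypothetical and do not necessarily match `correlator.py`.

```python
from collections import defaultdict


def rate_severity(feed_count):
    """Map a distinct-feed count to the severity labels in the table above."""
    if feed_count >= 3:
        return "High"
    if feed_count == 2:
        return "Medium"
    return "Low"


def correlate(observations):
    """Rate each indicator given an iterable of (ioc_value, feed_name) pairs."""
    feeds_by_ioc = defaultdict(set)
    for value, feed in observations:
        # A set ensures a feed that repeats an indicator is counted once.
        feeds_by_ioc[value].add(feed)
    return {value: rate_severity(len(feeds))
            for value, feeds in feeds_by_ioc.items()}


if __name__ == "__main__":
    seen = [
        ("185.220.101.45", "A"), ("185.220.101.45", "B"),
        ("185.220.101.45", "C"), ("malware.example.com", "A"),
    ]
    print(correlate(seen))
```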
## Supported Feed Formats
| Format | Description |
|--------|----------------------------------------------------------|
| TXT | Plain text, one indicator per line |
| CSV | Columnar; all columns scanned (schema-agnostic) |
| JSON | Any JSON structure; full document is searched |
| STIX | STIX 2.x bundles; indicator pattern fields are parsed |
| URL | Any remote HTTP/HTTPS feed returning text |
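
To make the STIX row concrete: STIX 2.x indicators carry a `pattern` string such as `[ipv4-addr:value = '185.220.101.45']`. The sketch below pulls the quoted values out of simple comparison patterns. It is an assumption about one workable approach, not a description of `feed_parser.py`, and `extract_from_stix` is a hypothetical name.

```python
import json
import re

# Captures the quoted right-hand side of simple comparisons, e.g.
#   [ipv4-addr:value = '185.220.101.45']
# It does not attempt to cover the full STIX patterning grammar.
PATTERN_VALUE = re.compile(r"=\s*'([^']+)'")


def extract_from_stix(bundle_text):
    """Return quoted pattern values from every indicator object in a bundle."""
    bundle = json.loads(bundle_text)
    values = []
    for obj in bundle.get("objects", []):
        if obj.get("type") == "indicator":
            values.extend(PATTERN_VALUE.findall(obj.get("pattern", "")))
    return values


if __name__ == "__main__":
    sample = json.dumps({
        "type": "bundle",
        "objects": [
            {"type": "indicator",
             "pattern": "[ipv4-addr:value = '185.220.101.45']"},
        ],
    })
    print(extract_from_stix(sample))
```

Non-indicator objects (malware, relationships, and so on) are skipped, so only values that appear in a `pattern` field are returned.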
## Pipeline Architecture
    [Feed Files / URLs]
            |
            v
    feed_parser.py      <- Load + extract IOCs (regex)
            |
            v
    normalizer.py       <- Validate, deduplicate, add metadata
            |
            v
    correlator.py       <- Cross-feed grouping + severity rating
            |
            +---------> blocklist_generator.py -> output/blocklists/
            |
            +---------> reporter.py            -> output/reports/
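
The normalization stage in the diagram (deduplicate, add metadata) could look roughly like the sketch below. The record layout (`type`, `value`, `feed`), the `collected_at` field, and the `normalize` name are all assumptions made for illustration; `normalizer.py` may use a different schema.

```python
from datetime import datetime, timezone


def normalize(raw_iocs):
    """Merge duplicate indicators and attach a UTC collection timestamp.

    raw_iocs: iterable of dicts with 'type', 'value', and 'feed' keys
    (a hypothetical record shape chosen for this example).
    """
    merged = {}
    now = datetime.now(timezone.utc).isoformat()
    for raw in raw_iocs:
        value = raw["value"].strip()
        # URL paths can be case-sensitive, so only other types are lowercased.
        if raw["type"] != "url":
            value = value.lower()
        key = (raw["type"], value)
        record = merged.setdefault(key, {
            "type": raw["type"],
            "value": value,
            "feeds": set(),
            "collected_at": now,
        })
        record["feeds"].add(raw["feed"])
    return list(merged.values())


if __name__ == "__main__":
    rows = [
        {"type": "domain", "value": "Malware.Example.com", "feed": "A"},
        {"type": "domain", "value": "malware.example.com ", "feed": "B"},
    ]
    for record in normalize(rows):
        print(record["value"], sorted(record["feeds"]))
```

Keeping the set of contributing feeds on each merged record is what lets a later correlation step count distinct feeds per indicator.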
## Libraries Used
Everything comes from the Python standard library except `requests`:
- `re` — IOC extraction via compiled regex patterns
- `json` / `csv` — Feed parsing and structured output
- `requests` — Remote feed fetching
- `ipaddress` — IPv4 validation (filters private/reserved ranges)
- `collections` — Counter and defaultdict for correlation
- `datetime` — ISO 8601 UTC timestamping
- `os` / `sys` — File I/O and path management
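
As an example of the `ipaddress` usage listed above, the following sketch rejects malformed, private, and reserved IPv4 addresses. The `is_public_ipv4` name and the exact set of excluded ranges are assumptions; the project may filter a different set.

```python
import ipaddress


def is_public_ipv4(value):
    """Return True only for well-formed IPv4 addresses outside special ranges."""
    try:
        addr = ipaddress.IPv4Address(value)
    except ValueError:
        # AddressValueError subclasses ValueError, so malformed input lands here.
        return False
    return not (
        addr.is_private
        or addr.is_reserved
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_unspecified
    )


if __name__ == "__main__":
    for candidate in ("185.220.101.45", "10.0.0.1", "not-an-ip"):
        print(candidate, is_public_ipv4(candidate))
```

Filtering at this point keeps internal addresses that happen to appear in a feed out of the generated blocklists.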
## Example Console Output
    ==============================================================
      THREAT INTELLIGENCE AGGREGATOR — REPORT SUMMARY
    ==============================================================
    Generated : 2026-05-27 10:30:00 UTC
    Feeds     : 4
      - SampleFeed_CSV
      - SampleFeed_JSON
      - SampleFeed_TXT
      - SampleFeed_STIX
    --------------------------------------------------------------
    Total unique IOCs : 47
    High severity     : 8
    Medium severity   : 14
    Low severity      : 25
    --------------------------------------------------------------
    IOC Type Breakdown:
      ip     : 12 ############
      domain :  9 #########
      url    :  8 ########
      sha256 :  6 ######
      md5    :  4 ####
      sha1   :  3 ###
      email  :  5 #####
    --------------------------------------------------------------
    Top High-Risk Indicators (showing up to 15):
      [ip    ] 185.220.101.45        feeds=4
      [domain] malware.example.com   feeds=4
      [sha256] aabbccdd11223344...   feeds=3
      [ip    ] 91.108.4.0            feeds=3
    ==============================================================