# Cybersecurity Threat Detection & Network Anomaly Intelligence Platform
## PyTorch Autoencoder + Graph Analysis + Real-Time Attack Injection | 10 Attack Types
## Overview
Every organization's network generates thousands of events per second. Most are normal. A few are attacks. The challenge is telling them apart automatically, in real time, before damage is done.
This project builds a production-grade network security analytics platform that detects anomalous behaviour with three components: a PyTorch autoencoder trained exclusively on normal traffic, which flags events it cannot reconstruct well; an Isolation Forest whose score is combined with the autoencoder's in a weighted ensemble; and NetworkX graph analysis for attack path detection. A live attack injector simulates 10 distinct attack types, each with its own network signature, so detection can be observed in real time through Grafana dashboards and Neo4j graph visualization.
## Live Demo Attack Types
python3 inject_attack.py --type brute_force # SSH/RDP repeated failed logins → success
python3 inject_attack.py --type ddos # 20 sources flooding single target
python3 inject_attack.py --type port_scan # Sequential port probing (recon)
python3 inject_attack.py --type lateral_movement # Compromised host → 8 internal hops
python3 inject_attack.py --type data_exfiltration # Large outbound transfers (~38MB)
python3 inject_attack.py --type ransomware # SMB spread + file encryption
python3 inject_attack.py --type credential_stuffing # 15 rotating IPs, 60 accounts
python3 inject_attack.py --type sql_injection # Web → DB anomalous query patterns
python3 inject_attack.py --type privilege_escalation # user_012 → admin → root progression
python3 inject_attack.py --type c2_beacon # Periodic callbacks (beaconing pattern)
python3 inject_attack.py --type all # All 10 sequentially
## Architecture
Network Event Simulator (src/network_simulator.py)
↓ Realistic baseline traffic (logins, transfers, DNS, HTTP)
↓ SQLite event store (data/events.db)
Attack Injector (inject_attack.py)
↓ 10 attack types with distinct network signatures
↓ Injected into same event stream
Anomaly Detection Engine (src/detector.py)
↓ Feature extraction: 8 numerical features per event
↓ PyTorch Autoencoder: trained on normal traffic only
↓ Isolation Forest: ensemble anomaly scoring
↓ Weighted ensemble: 60% autoencoder + 40% IsoForest
↓ Per-event anomaly score [0-1]
Graph Analysis (NetworkX + Neo4j)
↓ Directed graph: IP nodes, connection edges
↓ Attack pattern detection: high out-degree, centrality, attack edges
↓ Visual path exploration in Neo4j browser
Dashboards
↓ Grafana: real-time anomaly timeline, attack type breakdown, alerts
↓ Neo4j: interactive network graph with attack path highlighting
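
The shared event store at the top of this pipeline can be pictured with a minimal sketch. The table name `events`, its columns, and the helper names `open_store` and `insert_event` are assumptions, inferred from the feature names used later in this README and from the `is_attack` flag in the Neo4j query; the real schema of `data/events.db` may differ.

```python
import sqlite3

# Hypothetical schema: column names are inferred from the features this
# README describes, not taken from the project's real database.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    ts          REAL    NOT NULL,   -- event time, seconds since epoch
    src_ip      TEXT    NOT NULL,
    dst_ip      TEXT    NOT NULL,
    src_port    INTEGER NOT NULL,
    dst_port    INTEGER NOT NULL,
    bytes_sent  INTEGER NOT NULL,
    bytes_recv  INTEGER NOT NULL,
    duration_ms REAL    NOT NULL,
    status      TEXT    NOT NULL,   -- e.g. "success" or "failed"
    is_attack   INTEGER NOT NULL DEFAULT 0
)
"""

def open_store(path=":memory:"):
    """Open (or create) the event store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def insert_event(conn, event):
    """Insert one event dict; keys are trusted column names in this sketch."""
    cols = ", ".join(event)
    placeholders = ", ".join(f":{k}" for k in event)
    conn.execute(f"INSERT INTO events ({cols}) VALUES ({placeholders})", event)
    conn.commit()

conn = open_store()
insert_event(conn, {
    "ts": 0.0, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.9",
    "src_port": 50000, "dst_port": 22, "bytes_sent": 120,
    "bytes_recv": 80, "duration_ms": 35.0, "status": "failed",
})
```

Because baseline traffic and injected attacks would go through the same insert path, the detector reads one stream and never needs to know which producer wrote a row.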
## Technical Implementation
### 1. PyTorch Autoencoder — Unsupervised Anomaly Detection
The autoencoder is trained exclusively on normal traffic, so it learns to reconstruct normal events with low error. Attack events do not fit what it has learned and reconstruct poorly; that high reconstruction error is used as the anomaly score.
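import torch
import torch.nn as nn

class NetworkAutoencoder(nn.Module):
    def __init__(self, input_dim=8):
        super().__init__()
        # Encoder: input_dim -> 32 -> 16 -> 8-dimensional encoding
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 32), nn.ReLU(), nn.Dropout(0.1),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 8), nn.ReLU(),
        )
        # Decoder: mirrors the encoder back to input_dim
        self.decoder = nn.Sequential(
            nn.Linear(8, 16), nn.ReLU(),
            nn.Linear(16, 32), nn.ReLU(), nn.Dropout(0.1),
            nn.Linear(32, input_dim),
        )

    def forward(self, x):
        # Encode, then decode
        return self.decoder(self.encoder(x))

    def reconstruction_error(self, x):
        # Score in eval mode (dropout off) and without gradient tracking;
        # calling .numpy() on a tensor that requires grad raises an error
        self.eval()
        with torch.no_grad():
            recon = self.forward(x)
            errors = torch.mean((x - recon) ** 2, dim=1)  # per-event MSE
        return errors.numpy()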
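
Training on normal traffic only can be sketched as below. This is an illustration under stated assumptions, not the project's training script: the data is synthetic, the model is a small stand-in so the block is self-contained, and the Adam optimizer, learning rate, and full-batch updates are choices made for the example. Only the 100-epoch count and the mean-squared reconstruction error come from this README.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-in for normal-traffic features: 256 events x 8 features,
# generated from 3 latent factors so there is structure to learn.
latent = rng.normal(size=(256, 3))
mixing = rng.normal(size=(3, 8))
X_normal = torch.tensor(latent @ mixing, dtype=torch.float32)

# Small stand-in autoencoder (assumption: simplified from the README's model).
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 8), nn.ReLU(),
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 8),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)  # assumed optimizer
loss_fn = nn.MSELoss()

losses = []
model.train()
for epoch in range(100):                 # 100 epochs, as listed in Key Metrics
    optimizer.zero_grad()
    recon = model(X_normal)              # reconstruct the input
    loss = loss_fn(recon, X_normal)      # the target is the input itself
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Per-event reconstruction error on the training data; a threshold can be
# derived from this distribution afterwards.
model.eval()
with torch.no_grad():
    train_errors = torch.mean((X_normal - model(X_normal)) ** 2, dim=1).numpy()
```

No attack samples appear anywhere in the loop, which is what makes the approach unsupervised: the model only ever sees what normal looks like.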
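**Threshold:** the 95th percentile of reconstruction errors on the training (normal-traffic) set. Events whose error exceeds the threshold are flagged as anomalous; by construction, about 5% of the normal training events exceed it as well.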
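
A minimal sketch of the percentile threshold, using NumPy. The function names `fit_threshold` and `flag_anomalies` and the sample numbers are illustrative assumptions, not project code or project data.

```python
import numpy as np

def fit_threshold(train_errors, percentile=95.0):
    """Return the reconstruction-error cutoff from normal-traffic errors."""
    return float(np.percentile(train_errors, percentile))

def flag_anomalies(errors, threshold):
    """Boolean mask: True where an event's error exceeds the threshold."""
    return np.asarray(errors) > threshold

# Illustrative numbers only, not project data.
train_errors = np.array([0.10, 0.12, 0.11, 0.13, 0.09, 0.14, 0.10, 0.12, 0.11, 0.50])
threshold = fit_threshold(train_errors)
flags = flag_anomalies([0.11, 0.90], threshold)
```

Raising the percentile trades fewer false alarms on normal traffic for a higher chance of missing subtle attacks.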
### 2. Feature Engineering — 8 Network Features
features["bytes_sent_log"] = np.log1p(df["bytes_sent"]) # Volume
features["bytes_recv_log"] = np.log1p(df["bytes_recv"]) # Response size
features["duration_log"] = np.log1p(df["duration_ms"]) # Connection time
features["dst_port_norm"] = df["dst_port"] / 65535.0 # Destination port
features["src_port_norm"] = df["src_port"] / 65535.0 # Source port
features["is_external_src"] = ... # External source flag
features["is_external_dst"] = ... # External dest flag
features["is_failure"] = (df["status"] == "failed").astype(int) # Auth failure (1 = failed)
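
For a single event, the eight features could be computed as in the sketch below, which uses only the standard library instead of pandas. The README does not show how the two external flags are derived, so the `is_external` helper here is purely an assumption (an address counts as external when it is outside the private ranges); the event field names are likewise assumed from the column names above.

```python
import ipaddress
import math

FEATURE_NAMES = [
    "bytes_sent_log", "bytes_recv_log", "duration_log",
    "dst_port_norm", "src_port_norm",
    "is_external_src", "is_external_dst", "is_failure",
]

def is_external(ip):
    """Assumption: an address is 'external' when it is not in a private range."""
    return 0.0 if ipaddress.ip_address(ip).is_private else 1.0

def extract_features(event):
    """Map one raw event dict to the 8-element feature vector."""
    return [
        math.log1p(event["bytes_sent"]),      # log scale tames large transfers
        math.log1p(event["bytes_recv"]),
        math.log1p(event["duration_ms"]),
        event["dst_port"] / 65535.0,          # ports scaled to [0, 1]
        event["src_port"] / 65535.0,
        is_external(event["src_ip"]),
        is_external(event["dst_ip"]),
        1.0 if event["status"] == "failed" else 0.0,
    ]

event = {
    "src_ip": "10.0.0.5", "dst_ip": "8.8.8.8",
    "src_port": 50000, "dst_port": 443,
    "bytes_sent": 0, "bytes_recv": 1500, "duration_ms": 20,
    "status": "failed",
}
vector = extract_features(event)
```

`log1p` is used rather than `log` so that zero-byte events map to 0 instead of raising an error.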
### 3. Ensemble Scoring
ae_scores = autoencoder.reconstruction_error(X) / threshold # ratio to threshold; values > 1 exceed it
iso_scores = -isolation_forest.score_samples(X) # negated so higher = more anomalous; normalized
final_scores = 0.6 * ae_scores + 0.4 * iso_scores # weighted ensemble
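
One way to keep the combined score inside the [0, 1] range is sketched below. The clipping and min-max scaling steps are assumptions added for illustration; the README only states the 60/40 weighting, and the project may normalize differently. The function takes precomputed scores, so it needs neither the trained autoencoder nor the fitted Isolation Forest.

```python
import numpy as np

def ensemble_scores(ae_errors, iso_raw, threshold, w_ae=0.6, w_iso=0.4):
    """Combine autoencoder and Isolation Forest scores into one value in [0, 1].

    ae_errors: per-event reconstruction errors
    iso_raw:   -IsolationForest.score_samples(X), higher = more anomalous
    """
    ae_errors = np.asarray(ae_errors, dtype=float)
    iso_raw = np.asarray(iso_raw, dtype=float)

    # Assumption: cap the error/threshold ratio at 1 so the autoencoder
    # component stays within [0, 1].
    ae_scores = np.clip(ae_errors / threshold, 0.0, 1.0)

    # Assumption: min-max scale the Isolation Forest scores within the batch.
    span = iso_raw.max() - iso_raw.min()
    if span > 0:
        iso_scores = (iso_raw - iso_raw.min()) / span
    else:
        iso_scores = np.zeros_like(iso_raw)

    return w_ae * ae_scores + w_iso * iso_scores

scores = ensemble_scores(
    ae_errors=[0.05, 0.10, 0.40],
    iso_raw=[0.40, 0.45, 0.70],
    threshold=0.20,
)
```

Capping at 1 means every event above the threshold receives the same autoencoder component, so ranking among flagged events then depends on the Isolation Forest term; dividing by a larger constant instead would preserve that ordering.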
### 4. Graph Attack Pattern Detection
# High out-degree = port scanner or C2 master
for node in G.nodes():
    if G.out_degree(node) > 10:
        findings.append({"type": "high_out_degree", "node": node, "severity": "HIGH"})

# Betweenness centrality = pivot/relay node
centrality = nx.betweenness_centrality(G)
high_pivots = [(n, c) for n, c in centrality.items() if c > 0.3]
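
The two checks above can be tried end to end on toy graphs, as in the sketch below. The function name `find_graph_patterns`, the `centrality_pivot` label, and the toy edges are assumptions for illustration; the thresholds (out-degree above 10, centrality above 0.3) are the ones quoted in this section.

```python
import networkx as nx

def find_graph_patterns(edges, out_degree_limit=10, centrality_limit=0.3):
    """Flag fan-out sources and pivot nodes in a directed connection graph.

    edges: iterable of (src_ip, dst_ip) pairs
    """
    G = nx.DiGraph()
    G.add_edges_from(edges)

    findings = []
    # A node that opens connections to many distinct peers.
    for node in G.nodes():
        if G.out_degree(node) > out_degree_limit:
            findings.append({"type": "high_out_degree", "node": node, "severity": "HIGH"})

    # A node that sits on a large share of shortest paths between other nodes.
    centrality = nx.betweenness_centrality(G)  # normalized by default
    for node, score in centrality.items():
        if score > centrality_limit:
            findings.append({"type": "centrality_pivot", "node": node, "score": score})
    return findings

# Toy graph 1: one source contacting 12 distinct hosts.
scan_findings = find_graph_patterns([("scanner", f"host{i}") for i in range(12)])

# Toy graph 2: traffic relayed a -> pivot -> b.
relay_findings = find_graph_patterns([("a", "pivot"), ("pivot", "b")])
```

Normalized betweenness tends to shrink as the graph grows, so the 0.3 cutoff is best treated as a tunable value rather than a fixed rule.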
### 5. Attack Signatures — What Makes Each Attack Detectable
| Attack | Signature | Key Feature |
|---|---|---|
| Brute Force | 29 failures + 1 success | is_failure spike from single src_ip |
| DDoS | 100 packets, 20 sources, tiny duration | bytes_recv ≈ 0, high volume |
| Port Scan | Sequential ports, tiny bytes, fast | dst_port variety, small payload |
| Lateral Movement | Internal→internal, multiple services | is_external_src=0 and is_external_dst=0 |
| Data Exfiltration | 500KB-5MB outbound, external dst | bytes_sent_log very high |
| Ransomware | SMB dst_port=445, high write volume | service=smb, high bytes |
| Credential Stuffing | Many users, rotating IPs, auth failures | is_failure=1, src_ip variety |
| SQL Injection | DB port 3306, large recv (data dump) | dst_port=3306, bytes_recv spike |
| Privilege Escalation | user→admin→root progression | username pattern |
| C2 Beacon | Consistent timing, small payload | duration_ms consistent |
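
To make one row of the table concrete, the sketch below builds the brute-force pattern (29 failures followed by 1 success) and counts failures per source. It is a rule-based illustration of why the signature stands out, not the autoencoder path the project uses; the function names, the addresses, and the `min_failures` cutoff are assumptions.

```python
from collections import Counter

def brute_force_events(src_ip="203.0.113.7", dst_ip="10.0.0.10", failures=29):
    """Build the brute-force pattern from the table: N failed logins, then one success."""
    base = {"src_ip": src_ip, "dst_ip": dst_ip, "dst_port": 22}
    events = [dict(base, status="failed") for _ in range(failures)]
    events.append(dict(base, status="success"))
    return events

def failure_counts(events):
    """Count failed events per source IP (the 'is_failure spike' signal)."""
    return Counter(e["src_ip"] for e in events if e["status"] == "failed")

def suspected_brute_force(events, min_failures=10):
    """Sources with many failures that also logged in successfully."""
    fails = failure_counts(events)
    succeeded = {e["src_ip"] for e in events if e["status"] == "success"}
    return sorted(ip for ip, n in fails.items() if n >= min_failures and ip in succeeded)

events = brute_force_events()
# One unrelated failed login from another host should not be flagged.
events.append({"src_ip": "10.0.0.21", "dst_ip": "10.0.0.10", "dst_port": 22, "status": "failed"})
suspects = suspected_brute_force(events)
```

A single failed login is ordinary; it is the concentration of failures on one source, followed by a success, that separates the attack from a mistyped password.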
## Key Metrics
- **655 total events** processed (300 baseline + 355 attack)
- **10 attack types** with distinct network signatures
- **54.2% attack rate** after full injection demo
- **PyTorch Autoencoder**: 8-dimensional encoding, 100 epochs, final loss 0.4679
- **Ensemble scoring**: 60% autoencoder + 40% Isolation Forest
- **3 graph patterns** detected: high out-degree, attack edges, centrality pivots
- **Real-time**: events scored in batches, dashboards refresh every 5s
- **Zero paid APIs**: fully local, no cloud dependency
## How to Run
# 1. Setup
git clone https://github.com/rajapalagummi/Cybersecurity-Threat-Detection.git
cd Cybersecurity-Threat-Detection
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# 2. Run full pipeline (generates baseline + trains model + scores + report)
python3 main.py
# 3. Start dashboards
docker-compose up -d
# Grafana: http://localhost:3001 (admin / cybersec123)
# Neo4j: http://localhost:7474 (neo4j / cybersec123)
# 4. Live demo — Terminal 1: continuous traffic
python3 -c "from src.network_simulator import run_simulator; run_simulator(3.0)"
# 5. Live demo — Terminal 2: inject attacks
python3 inject_attack.py --type brute_force
python3 inject_attack.py --type ddos
python3 inject_attack.py --type ransomware
python3 inject_attack.py --type all
# 6. Load Neo4j graph
python3 src/neo4j_loader.py
# Query in Neo4j browser: MATCH p=(a)-[r:CONNECTED_TO]->(b) WHERE r.is_attack=true RETURN p LIMIT 50
## Push to GitHub
git init
git add .
git commit -m "feat: Cybersecurity Threat Detection — PyTorch autoencoder, 10 attack types, Neo4j graph"
git remote add origin https://github.com/rajapalagummi/Cybersecurity-Threat-Detection.git
git branch -M main
git push -u origin main
*Built by Raja Palagummi | rajapalagummi.com | github.com/rajapalagummi*