aditya-620/CipherEye-Deep-Packet-Inspection
GitHub: aditya-620/CipherEye-Deep-Packet-Inspection
A pure-Java deep packet inspection engine for analyzing network traffic and enforcing domain-blocking rules.
# Java Deep Packet Inspection (DPI) Engine
A multi-threaded, high-performance Deep Packet Inspection engine written in **100% Java**.
This project analyzes raw network traffic from PCAP files, identifies applications (even when the traffic is encrypted with TLS/HTTPS), and can enforce blocking rules by leaving dropped packets out of the output capture. It simulates how a corporate firewall or internet service provider analyzes your traffic, without requiring any native C/C++ libraries (no libpcap/WinPcap needed).
## Table of Contents
1. [What is DPI?](#1-what-is-dpi)
2. [Networking Background](#2-networking-background)
3. [Project Overview](#3-project-overview)
4. [File Structure](#4-file-structure)
5. [The Journey of a Packet (Simple Version)](#5-the-journey-of-a-packet-simple-version)
6. [The Journey of a Packet (Multi-threaded Version)](#6-the-journey-of-a-packet-multi-threaded-version)
7. [Deep Dive: Each Component](#7-deep-dive-each-component)
8. [How SNI Extraction Works](#8-how-sni-extraction-works)
9. [How Blocking Works](#9-how-blocking-works)
10. [Building and Running](#10-building-and-running)
11. [Understanding the Output](#11-understanding-the-output)
## 1. What is DPI?
**Deep Packet Inspection (DPI)** is a technology used to examine the contents of network packets as they pass through a checkpoint. Unlike simple firewalls that only look at packet headers (such as source/destination IP addresses and ports), DPI looks *inside* the packet payload.
### Real-World Uses:
- **ISPs**: Throttle or block certain applications (e.g., BitTorrent)
- **Enterprises**: Block social media on office networks
- **Parental Controls**: Block inappropriate websites
- **Security**: Detect malware or intrusion attempts
### What Our DPI Engine Does:
User Traffic (PCAP) → [DPI Engine] → Filtered Traffic (PCAP)
↓
- Identifies apps (YouTube, Facebook, etc.)
- Blocks based on rules
- Generates reports
- Multi-threaded processing
## 2. Networking Background
### The Network Stack (Layers)
When you visit a website, data travels through multiple "layers":
┌─────────────────────────────────────────────────────────┐
│ Layer 7: Application │ HTTP, TLS, DNS │
├─────────────────────────────────────────────────────────┤
│ Layer 4: Transport │ TCP (reliable), UDP (fast) │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Network │ IP addresses (routing) │
├─────────────────────────────────────────────────────────┤
│ Layer 2: Data Link │ MAC addresses (local network)│
└─────────────────────────────────────────────────────────┘
### A Packet's Structure
Every network packet is like a **Russian nesting doll** - headers wrapped inside headers:
┌──────────────────────────────────────────────────────────────────┐
│ Ethernet Header (14 bytes) │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ IP Header (20 bytes) │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ TCP Header (20 bytes) │ │ │
│ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Payload (Application Data) │ │ │ │
│ │ │ │ e.g., TLS Client Hello with SNI │ │ │ │
│ │ │ └──────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
### The Five-Tuple
A **connection** (or "flow") is uniquely identified by 5 values:
| Field | Example | Purpose |
|-------|---------|---------|
| Source IP | 192.168.1.100 | Who is sending |
| Destination IP | 172.217.14.206 | Where it's going |
| Source Port | 54321 | Sender's application identifier |
| Destination Port | 443 | Service being accessed (443 = HTTPS) |
| Protocol | TCP (6) | TCP or UDP |
**Why is this important?**
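- All packets with the same 5-tuple belong to the same connection (packets travelling in the reverse direction carry the same values with source and destination swapped)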
- If we block one packet of a connection, we should block all of them
- This is how we "track" conversations between computers
In our Java code, this is represented by the `FiveTuple` class:
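import java.util.Objects;
public class FiveTuple {
public long srcIp;
public long dstIp;
public int srcPort;
public int dstPort;
public int protocol;
// equals and hashCode must agree: a HashMap finds a flow only if two
// tuples with the same field values are equal AND hash identically.
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof FiveTuple)) return false;
FiveTuple t = (FiveTuple) o;
return srcIp == t.srcIp && dstIp == t.dstIp && srcPort == t.srcPort
&& dstPort == t.dstPort && protocol == t.protocol;
}
// Packets with the same five values always get the same hash value.
@Override
public int hashCode() {
return Objects.hash(srcIp, dstIp, srcPort, dstPort, protocol);
}
}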
### What is SNI?
**Server Name Indication (SNI)** is part of the TLS/HTTPS handshake. When you visit `https://www.youtube.com`:
1. Your browser sends a "Client Hello" packet before the encryption begins
2. This message includes the domain name in **plaintext** (not encrypted yet!)
3. The server uses this to know which certificate to send
TLS Client Hello:
├── Version: TLS 1.2
├── Random: [32 bytes]
├── Cipher Suites: [list]
└── Extensions:
└── SNI Extension:
└── Server Name: "www.youtube.com" ← We extract THIS!
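**This is the key to DPI**: Even though HTTPS is encrypted, the domain name is visible in the Client Hello, the first TLS message the client sends once the TCP handshake completes (unless the client uses Encrypted Client Hello, which hides it).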
## 3. Project Overview
### What This Project Does
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Wireshark │ │ DPI Engine │ │ Output │
│ Capture │ ──► │ │ ──► │ PCAP │
│ (input.pcap)│ │ - Parse │ │ (filtered) │
└─────────────┘ │ - Classify │ └─────────────┘
│ - Block │
│ - Report │
└─────────────┘
### Three Main Versions
| Version | File | Use Case |
|---------|------|----------|
| Working (Single-threaded) | `MainWorking.java` | Learning, small captures |
| Simple (Single-threaded) | `MainSimple.java` | Testing and debugging |
| DPI Multi-threaded | `MainDpi.java` | Production, large captures |
## 4. File Structure
src/main/java/dpi/
├── parser/ # Network protocol parsing
│ ├── PacketParser.java # Extract headers from raw bytes
│ ├── ParsedPacket.java # Parsed packet representation
│ ├── PcapReader.java # PCAP file reading
│ ├── PcapGlobalHeader.java # PCAP global header structure
│ ├── PcapPacketHeader.java # PCAP packet header structure
│ ├── RawPacket.java # Raw packet bytes container
│ └── PortableNet.java # Network utilities (IP parsing, etc.)
│
├── extractor/ # Application detection
│ ├── DNSExtractor.java # DNS domain extraction
│ ├── SNIExtractor.java # TLS SNI extraction ★ KEY COMPONENT
│ ├── HTTPHostExtractor.java # HTTP Host header extraction
│ └── QUICSNIExtractor.java # QUIC SNI extraction
│
├── engine/ # Multi-threaded processing
│ ├── DPIEngine.java # Main orchestrator
│ ├── FastPathProcessor.java # Worker thread for DPI processing
│ ├── FPManager.java # Manages FastPath threads
│ ├── LoadBalancer.java # Distributes packets to workers
│ ├── LBManager.java # Manages LoadBalancer threads
│ ├── ConnectionTracker.java # Tracks connection state
│ ├── RuleManager.java # Manages blocking rules
│ ├── GlobalConnectionTable.java # Thread-safe connection storage
│ └── ThreadSafeQueue.java # Thread-safe packet queue
│
├── types/ # Data structures
│ ├── AppType.java # Application type enum
│ ├── Connection.java # Connection state tracking
│ ├── ConnectionState.java # State machine for flows
│ ├── DPIStats.java # Statistics collection
│ ├── FiveTuple.java # Connection identifier
│ ├── PacketAction.java # Action enum (FORWARD/DROP)
│ └── PacketJob.java # Work unit (packet + metadata)
│
└── main/ # Entry points
├── DpiMt.java # Multi-threaded version
├── Main.java # Old main (deprecated)
├── MainDpi.java # ★ RECOMMENDED: Production version
├── MainSimple.java # Learning: Single-threaded
└── MainWorking.java # Working: Single-threaded
## 5. The Journey of a Packet (Simple Version)
Let's trace a single packet through `MainSimple.java`:
### Step 1: Read PCAP File
PcapReader reader = new PcapReader("test_dpi.pcap");
**What happens:**
1. Open the file in binary mode
2. Read the 24-byte global header (magic number, version, etc.)
3. Verify it's a valid PCAP file
**PCAP File Format:**
┌────────────────────────────┐
│ Global Header (24 bytes) │ ← Read once at start
│ (magic, version, snaplen) │
├────────────────────────────┤
│ Packet Header (16 bytes) │ ← Timestamp, length
│ Packet Data (variable) │ ← Actual network bytes
├────────────────────────────┤
│ Packet Header (16 bytes) │
│ Packet Data (variable) │
├────────────────────────────┤
│ ... more packets ... │
└────────────────────────────┘
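As a minimal sketch of the header check described above, the snippet below decodes the 24-byte global header with `ByteBuffer`. `PcapHeaderSketch` and its field names are hypothetical and are not taken from the project's `PcapReader`; the sketch only handles the classic microsecond magic number `0xa1b2c3d4` and assumes the file may be in either byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not a class from this project): decodes the 24-byte PCAP global header. */
final class PcapHeaderSketch {
    static final int MAGIC = 0xA1B2C3D4; // classic PCAP, microsecond timestamps

    final int versionMajor;
    final int versionMinor;
    final int snapLen;
    final int linkType;
    final ByteOrder order; // byte order to use for every later packet header

    private PcapHeaderSketch(int major, int minor, int snapLen, int linkType, ByteOrder order) {
        this.versionMajor = major;
        this.versionMinor = minor;
        this.snapLen = snapLen;
        this.linkType = linkType;
        this.order = order;
    }

    static PcapHeaderSketch parse(byte[] header) {
        if (header.length < 24) {
            throw new IllegalArgumentException("global header needs 24 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.BIG_ENDIAN);
        if (buf.getInt(0) != MAGIC) {
            // The file may have been written in the opposite byte order: retry little-endian.
            buf.order(ByteOrder.LITTLE_ENDIAN);
            if (buf.getInt(0) != MAGIC) {
                throw new IllegalArgumentException("not a classic PCAP file");
            }
        }
        int major = buf.getShort(4) & 0xFFFF;
        int minor = buf.getShort(6) & 0xFFFF;
        // Bytes 8-15 (thiszone, sigfigs) are skipped in this sketch.
        int snapLen = buf.getInt(16);
        int linkType = buf.getInt(20);
        return new PcapHeaderSketch(major, minor, snapLen, linkType, buf.order());
    }

    public static void main(String[] args) {
        byte[] sample = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(MAGIC).putShort((short) 2).putShort((short) 4)
                .putInt(0).putInt(0).putInt(65535).putInt(1).array();
        PcapHeaderSketch h = parse(sample);
        System.out.println("Version: " + h.versionMajor + "." + h.versionMinor);
        System.out.println("Snaplen: " + h.snapLen + ", link type: " + h.linkType);
    }
}
```

The detected byte order matters beyond the global header: every 16-byte packet header that follows is written in the same order.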
### Step 2: Read Each Packet
RawPacket raw;
while ((raw = reader.readPacket()) != null) {
// raw.data contains the packet bytes
// raw.header contains timestamp and length
}
**What happens:**
1. Read 16-byte packet header
2. Read N bytes of packet data (N = `header.inclLen`)
3. Return null when no more packets
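The three steps above can be sketched as a loop over an in-memory copy of the file. This is an illustration under assumptions, not the project's `PcapReader`: `PcapRecordSketch` is a hypothetical name, the byte order is passed in by the caller, and a truncated final record simply ends the loop.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: walks the packet records that follow the 24-byte global header. */
final class PcapRecordSketch {
    static List<byte[]> readPackets(byte[] file, ByteOrder order) {
        List<byte[]> packets = new ArrayList<>();
        if (file.length < 24) {
            return packets; // no room for a global header, so no packets
        }
        ByteBuffer buf = ByteBuffer.wrap(file).order(order);
        buf.position(24); // skip the global header
        while (buf.remaining() >= 16) {
            buf.getInt();                              // tsSec
            buf.getInt();                              // tsUsec
            long inclLen = buf.getInt() & 0xFFFFFFFFL; // bytes stored in the file
            buf.getInt();                              // origLen (size on the wire)
            if (inclLen > buf.remaining()) {
                break; // truncated capture: stop instead of reading past the end
            }
            byte[] data = new byte[(int) inclLen];
            buf.get(data);
            packets.add(data);
        }
        return packets;
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(24 + 16 + 3).order(ByteOrder.LITTLE_ENDIAN);
        file.position(24);
        file.putInt(1).putInt(0).putInt(3).putInt(3).put(new byte[] {1, 2, 3});
        System.out.println("packets read: " + readPackets(file.array(), ByteOrder.LITTLE_ENDIAN).size());
    }
}
```

Reading `inclLen` as an unsigned value and checking it against the remaining bytes keeps a corrupt length field from causing a huge allocation or a buffer underflow.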
### Step 3: Parse Protocol Headers
ParsedPacket parsed = PacketParser.parse(raw);
**What happens (in PacketParser.java):**
raw.data bytes (offsets assume a 20-byte IP header and a 20-byte TCP header, i.e. no options):
[0-13] Ethernet Header
[14-33] IP Header
[34-53] TCP Header
[54+] Payload
After parsing:
parsed.srcMac = "00:11:22:33:44:55"
parsed.destMac = "aa:bb:cc:dd:ee:ff"
parsed.srcIp = "192.168.1.100"
parsed.destIp = "172.217.14.206"
parsed.srcPort = 54321
parsed.destPort = 443
parsed.protocol = 6 (TCP)
parsed.hasTcp = true
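The fixed offsets above hold only when neither header carries options. A hedged sketch of locating the payload from the length fields instead is shown below; `HeaderOffsetSketch` is a hypothetical helper, it handles only untagged Ethernet + IPv4 + TCP frames, and it is not the project's `PacketParser`.

```java
/** Hypothetical helper: finds where the TCP payload starts in an Ethernet/IPv4/TCP frame. */
final class HeaderOffsetSketch {
    private static final int ETH_LEN = 14;

    /** Returns the payload offset, or -1 if the frame is not a well-formed IPv4/TCP frame. */
    static int tcpPayloadOffset(byte[] frame) {
        if (frame.length < ETH_LEN + 20) return -1;
        int etherType = ((frame[12] & 0xFF) << 8) | (frame[13] & 0xFF);
        if (etherType != 0x0800) return -1;               // not IPv4 (VLAN tags are not handled)
        if (((frame[ETH_LEN] & 0xF0) >> 4) != 4) return -1; // IP version must be 4
        int ihl = (frame[ETH_LEN] & 0x0F) * 4;              // IPv4 header length in bytes
        if (ihl < 20) return -1;
        if ((frame[ETH_LEN + 9] & 0xFF) != 6) return -1;    // protocol field: 6 = TCP
        int tcpStart = ETH_LEN + ihl;
        if (frame.length < tcpStart + 20) return -1;
        int dataOffset = ((frame[tcpStart + 12] & 0xF0) >> 4) * 4; // TCP header length in bytes
        if (dataOffset < 20) return -1;
        int payload = tcpStart + dataOffset;
        return payload <= frame.length ? payload : -1;
    }

    public static void main(String[] args) {
        byte[] frame = new byte[54];
        frame[12] = 0x08;          // EtherType 0x0800
        frame[14] = 0x45;          // IPv4, IHL = 5 words
        frame[23] = 6;             // TCP
        frame[46] = 0x50;          // TCP data offset = 5 words
        System.out.println("payload starts at byte " + tcpPayloadOffset(frame));
    }
}
```

With no options in either header the result matches the `[54+]` offset shown above; with options the payload simply starts later.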
### Step 4: Create Five-Tuple and Look Up Flow
FiveTuple tuple = new FiveTuple(
parsed.srcIp,
parsed.destIp,
parsed.srcPort,
parsed.destPort,
parsed.protocol
);
Connection flow = connections.computeIfAbsent(tuple,
k -> new Connection(k));
**What happens:**
- The flow table is a `ConcurrentHashMap`: `FiveTuple → Connection`
- If this 5-tuple exists, we get the existing flow
- If not, a new flow is created
- All packets with the same 5-tuple share the same flow
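To see why "same 5-tuple, same flow" depends on the key type, here is a small self-contained sketch. `FlowKey` and `Flow` are stand-ins invented for this example (they are not the project's `FiveTuple` and `Connection`); the point is only that the key needs value-based `equals` and `hashCode` for `computeIfAbsent` to find an existing entry.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical flow table: one Flow object per distinct key. */
final class FlowTableSketch {
    static final class FlowKey {
        final long srcIp, dstIp;
        final int srcPort, dstPort, protocol;

        FlowKey(long srcIp, long dstIp, int srcPort, int dstPort, int protocol) {
            this.srcIp = srcIp; this.dstIp = dstIp;
            this.srcPort = srcPort; this.dstPort = dstPort; this.protocol = protocol;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof FlowKey)) return false;
            FlowKey k = (FlowKey) o;
            return srcIp == k.srcIp && dstIp == k.dstIp && srcPort == k.srcPort
                    && dstPort == k.dstPort && protocol == k.protocol;
        }

        @Override public int hashCode() {
            return Objects.hash(srcIp, dstIp, srcPort, dstPort, protocol);
        }
    }

    static final class Flow {
        int packets; // packets seen on this flow so far
    }

    private final Map<FlowKey, Flow> flows = new ConcurrentHashMap<>();

    /** Looks up (or creates) the flow for this key and counts the packet. */
    Flow onPacket(FlowKey key) {
        Flow flow = flows.computeIfAbsent(key, k -> new Flow());
        flow.packets++;
        return flow;
    }

    int size() {
        return flows.size();
    }

    public static void main(String[] args) {
        FlowTableSketch table = new FlowTableSketch();
        table.onPacket(new FlowKey(1L, 2L, 54321, 443, 6));
        Flow f = table.onPacket(new FlowKey(1L, 2L, 54321, 443, 6)); // a new but equal key
        System.out.println("flows=" + table.size() + ", packets on flow=" + f.packets);
    }
}
```

Without the `equals` override, the second packet would create a second entry even though its five values are identical.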
### Step 5: Extract SNI (Deep Packet Inspection)
// For HTTPS traffic (port 443)
if (parsed.destPort == 443 && parsed.payloadLength > 5) {
String sni = SNIExtractor.extractSNI(payload, parsed.payloadLength);
if (sni != null) {
flow.setSni(sni); // "www.youtube.com"
flow.setAppType(ClassificationEngine.sniToAppType(sni)); // YOUTUBE
}
}
### Step 6: Check Blocking Rules
if (ruleManager.isBlocked(tuple.srcIp, flow.getAppType(), flow.getSni())) {
flow.setBlocked(true);
}
### Step 7: Forward or Drop
if (flow.isBlocked()) {
stats.incrementDropped();
// Don't write to output
} else {
stats.incrementForwarded();
// Write packet to output file
writer.writePacket(raw);
}
### Step 8: Generate Report
After processing all packets:
// Count apps and print statistics
DPIStats.printReport(stats);
## 6. The Journey of a Packet (Multi-threaded Version)
The multi-threaded version (`MainDpi.java`) adds **parallelism** for high performance:
### Architecture Overview
┌─────────────────┐
│ Reader Thread │
│ (reads PCAP) │
└────────┬────────┘
│
┌──────────────┴──────────────┐
│ hash(5-tuple) % 2 │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ LB0 Thread │ │ LB1 Thread │
│ (LoadBalancer) │ │ (LoadBalancer) │
└────────┬────────┘ └────────┬────────┘
│ │
┌──────┴──────┐ ┌──────┴──────┐
│hash % 2 │ │hash % 2 │
▼ ▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│FP0 │ │FP1 │ │FP2 │ │FP3 │
│Thread │ │Thread │ │Thread │ │Thread │
└─────┬────┘ └─────┬────┘ └─────┬────┘ └─────┬────┘
│ │ │ │
└────────────┴──────────────┴────────────┘
│
▼
┌───────────────────────┐
│ Output Queue │
└───────────┬───────────┘
│
▼
┌───────────────────────┐
│ Output Writer Thread │
│ (writes to PCAP) │
└───────────────────────┘
### Why This Design?
1. **Load Balancers (LBs):** Distribute work across FPs using `LinkedBlockingQueue`
2. **Fast Paths (FPs):** Do the actual DPI processing
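3. **Flow-affinity hashing:** The same 5-tuple always maps to the same FP (`hash % N`)
**Why flow affinity matters:**
Connection: 192.168.1.100:54321 → 142.250.185.206:443
Packet 1 (SYN): hash → FP2
Packet 2 (ACK): hash → FP2 (same FP!)
Packet 3 (Client Hello): hash → FP2 (same FP!)
Packet 4 (Data): hash → FP2 (same FP!)
All client-to-server packets of this connection go to FP2.
FP2 can track the flow state correctly.
Replies such as the SYN-ACK travel server → client, so their 5-tuple has source and destination swapped. With the `Objects.hash(srcIp, dstIp, srcPort, dstPort, protocol)` shown earlier they form a separate flow and may land on a different FP unless the tuple is normalized first.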
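One caveat worth illustrating: packets in the reverse direction carry the 5-tuple with source and destination swapped, so an order-sensitive hash may send them to another worker. The sketch below shows one possible normalization, ordering the two endpoints before hashing. `SymmetricFlowHash` is a hypothetical helper and is not something the project is known to contain.

```java
import java.util.Objects;

/** Hypothetical helper: a flow hash that is the same for A -> B and B -> A. */
final class SymmetricFlowHash {
    static int hash(long ipA, int portA, long ipB, int portB, int protocol) {
        // Put the "smaller" endpoint first so both directions hash the same fields in the same order.
        boolean aFirst = ipA < ipB || (ipA == ipB && portA <= portB);
        long loIp = aFirst ? ipA : ipB;
        int loPort = aFirst ? portA : portB;
        long hiIp = aFirst ? ipB : ipA;
        int hiPort = aFirst ? portB : portA;
        return Objects.hash(loIp, loPort, hiIp, hiPort, protocol);
    }

    /** Maps a hash to a worker index; floorMod never returns a negative index. */
    static int workerFor(int hash, int workers) {
        return Math.floorMod(hash, workers);
    }

    public static void main(String[] args) {
        long client = 0xC0A80164L; // 192.168.1.100
        long server = 0x8EFAB9CEL; // 142.250.185.206
        int forward = workerFor(hash(client, 54321, server, 443, 6), 4);
        int reverse = workerFor(hash(server, 443, client, 54321, 6), 4);
        System.out.println("same worker for both directions: " + (forward == reverse));
    }
}
```

The trade-off is that a normalized key no longer records which side is the client, so a flow table keyed this way would need to store direction separately if the rules depend on it.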
### Detailed Flow
#### Step 1: Reader Thread (Main Thread)
PcapReader reader = new PcapReader("test_dpi.pcap");
RawPacket raw;
while ((raw = reader.readPacket()) != null) {
PacketJob job = new PacketJob(raw);
// Hash to select Load Balancer (floorMod stays non-negative even for Integer.MIN_VALUE)
int lbIdx = Math.floorMod(job.getFiveTuple().hashCode(), numLBs);
// Push to LB's queue
lbQueues[lbIdx].put(job); // Blocks only if the queue is bounded and full
}
#### Step 2: Load Balancer Thread
// In LoadBalancer.java
public void run() {
while (running) {
try {
PacketJob job = inputQueue.take(); // Blocks until available
// Hash to select Fast Path
int fpIdx = Math.floorMod(job.getFiveTuple().hashCode(), numFPs);
// Push to FP's queue
fpQueues[fpIdx].put(job);
stats.incrementDispatched();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
break; // Leave the loop; otherwise take() would throw again immediately
}
}
}
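A detail of two-level dispatch is worth checking with a few numbers. If the reader picks the LB with `hash % 2` and the LB then picks its FP with the same `hash % 2`, the second index always equals the first, so only FP0 and FP3 ever receive work. That would be consistent with the sample output in section 11, where FP1 and FP2 process 0 packets. The sketch below compares that scheme with one possible alternative that derives the second index from the quotient; `TwoLevelDispatchSketch` and both method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical comparison of two ways to pick a global FP index in an LB -> FP hierarchy. */
final class TwoLevelDispatchSketch {
    /** Both levels reuse the same remainder, so the LB index and FP index are correlated. */
    static int naiveWorker(int hash, int numLBs, int fpsPerLB) {
        int lb = Math.floorMod(hash, numLBs);
        int fp = Math.floorMod(hash, fpsPerLB);
        return lb * fpsPerLB + fp;
    }

    /** The FP index comes from the quotient, which is independent of the remainder used for the LB. */
    static int spreadWorker(int hash, int numLBs, int fpsPerLB) {
        int lb = Math.floorMod(hash, numLBs);
        int fp = Math.floorMod(Math.floorDiv(hash, numLBs), fpsPerLB);
        return lb * fpsPerLB + fp;
    }

    public static void main(String[] args) {
        Set<Integer> naive = new TreeSet<>();
        Set<Integer> spread = new TreeSet<>();
        for (int hash = 0; hash < 1000; hash++) {
            naive.add(naiveWorker(hash, 2, 2));
            spread.add(spreadWorker(hash, 2, 2));
        }
        System.out.println("naive scheme uses FPs:  " + naive);
        System.out.println("spread scheme uses FPs: " + spread);
    }
}
```

Both variants keep flow affinity, since the result still depends only on the hash; they differ only in how evenly flows can spread across the four workers.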
#### Step 3: Fast Path Thread
// In FastPathProcessor.java
public void run() {
while (running) {
try {
PacketJob job = inputQueue.take();
// Look up flow (each FP has its own flow table)
FiveTuple tuple = job.getFiveTuple();
Connection flow = localFlows.computeIfAbsent(tuple,
k -> new Connection(k));
// Classify (SNI extraction)
classifyFlow(job, flow);
// Check rules
if (ruleManager.isBlocked(tuple.srcIp, flow.getAppType(), flow.getSni())) {
stats.incrementDropped();
} else {
// Forward: push to output queue
outputQueue.put(job);
stats.incrementForwarded();
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
break; // Stop the worker instead of spinning on a set interrupt flag
}
}
}
#### Step 4: Output Writer Thread
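// In DPIEngine.java outputThread()
while (running || !outputQueue.isEmpty()) {
try {
// poll with a timeout (java.util.concurrent.TimeUnit) so the loop can
// notice running == false; take() could block forever on an empty queue
PacketJob job = outputQueue.poll(100, TimeUnit.MILLISECONDS);
if (job == null) continue;
// Write to output file
pcapWriter.writePacket(job.getRawPacket());
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
break;
}
}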
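Shutting down a consumer that blocks on a queue is a common stumbling block. One widely used alternative to a `running` flag is a sentinel ("poison pill") job that tells the consumer nothing more will arrive. The sketch below uses plain strings as jobs; `PoisonPillSketch` is hypothetical, and the project may stop its threads differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical producer/consumer pair that shuts down with a sentinel object. */
final class PoisonPillSketch {
    // A distinct object compared by identity, so no real job can be mistaken for it.
    private static final String STOP = new String("STOP");

    /** Sends every job through a queue to a writer thread and returns what the writer saw. */
    static List<String> runPipeline(List<String> jobs) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> written = new ArrayList<>();
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String job = queue.take();
                    if (job == STOP) {
                        return; // sentinel reached: everything before it has been handled
                    }
                    written.add(job);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and exit
            }
        });
        writer.start();
        try {
            for (String job : jobs) {
                queue.put(job);
            }
            queue.put(STOP);
            writer.join(); // join() makes the writer's list safely visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(Arrays.asList("pkt1", "pkt2", "pkt3")));
    }
}
```

Because the sentinel travels through the same FIFO queue as the jobs, the writer is guaranteed to have drained everything queued before it when it exits.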
## 7. Deep Dive: Each Component
### PcapReader.java / RawPacket.java
**Purpose:** Read network captures saved by Wireshark
**Key classes:**
public class PcapGlobalHeader {
public int magicNumber; // 0xa1b2c3d4 identifies PCAP
public int versionMajor; // Usually 2
public int versionMinor; // Usually 4
public int snapLen; // Max packet size captured
public int network; // 1 = Ethernet
}
public class PcapPacketHeader {
public int tsSec; // Timestamp (seconds)
public int tsUsec; // Timestamp (microseconds)
public int inclLen; // Bytes saved in file
public int origLen; // Original packet size
}
public class RawPacket {
public byte[] data; // Raw packet bytes
public PcapPacketHeader header;
}
### PacketParser.java / ParsedPacket.java
**Purpose:** Extract protocol fields from raw bytes
**Key method:**
public static ParsedPacket parse(RawPacket raw) {
ParsedPacket parsed = new ParsedPacket();
parseEthernet(raw.data, parsed); // Extract MACs, EtherType
parseIPv4(raw.data, parsed); // Extract IPs, protocol, TTL
if (parsed.protocol == 6) { // TCP
parseTCP(raw.data, parsed); // Extract ports, flags, seq
} else if (parsed.protocol == 17) { // UDP
parseUDP(raw.data, parsed); // Extract ports
}
return parsed;
}
### SNIExtractor.java / HTTPHostExtractor.java / QUICSNIExtractor.java
**Purpose:** Extract domain names from TLS, HTTP, and QUIC
**For TLS (HTTPS):**
public class SNIExtractor {
public static String extractSNI(byte[] payload, int length) {
// 1. Verify TLS record header (0x16 = Handshake)
// 2. Verify Client Hello (0x01)
// 3. Skip to extensions
// 4. Find SNI extension (type 0x0000)
// 5. Extract hostname string
}
}
**For HTTP:**
public class HTTPHostExtractor {
public static String extractHost(byte[] payload, int length) {
// 1. Verify HTTP request (GET, POST, etc.)
// 2. Search for "Host: " header
// 3. Extract value until newline
}
}
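The three HTTP steps listed above could look roughly like this in practice. This is a hedged sketch rather than the project's `HTTPHostExtractor`: `HttpHostSketch` is a hypothetical name, it only accepts HTTP/1.0 and HTTP/1.1 request lines, and it assumes the whole header block arrives in one payload.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: pulls the Host header out of a plaintext HTTP/1.x request. */
final class HttpHostSketch {
    static String extractHost(byte[] payload, int length) {
        int usable = Math.min(length, payload.length);
        String text = new String(payload, 0, usable, StandardCharsets.ISO_8859_1);
        int headerEnd = text.indexOf("\r\n\r\n");
        if (headerEnd >= 0) {
            text = text.substring(0, headerEnd); // ignore the request body
        }
        String[] lines = text.split("\r\n");
        // 1. The first line must look like "METHOD target HTTP/1.x".
        if (!lines[0].matches("[A-Z]+ \\S+ HTTP/1\\.[01]")) {
            return null;
        }
        // 2. Header names are case-insensitive, so compare without regard to case.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0 && lines[i].substring(0, colon).trim().equalsIgnoreCase("Host")) {
                // 3. The value runs to the end of the line; trim optional whitespace.
                return lines[i].substring(colon + 1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] request = "GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(extractHost(request, request.length));
    }
}
```

Checking the request line first keeps binary payloads (such as TLS records on a non-standard port) from being scanned for a `Host` string.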
### Connection.java / ConnectionState.java
**Purpose:** Track state of a connection/flow
public class Connection {
private FiveTuple tuple;
private AppType appType = AppType.UNKNOWN;
private String sni;
private ConnectionState state = ConnectionState.NEW;
private boolean blocked = false;
private long firstSeen;
private long lastSeen;
private int packetCount = 0;
// Getters and setters...
}
public enum ConnectionState {
NEW, // Connection just created
ESTABLISHED, // SYN-ACK received
CLASSIFIED, // SNI or Host extracted
BLOCKED, // Blocked by rules
CLOSED // FIN received
}
### RuleManager.java
**Purpose:** Manage blocking rules and check if traffic should be blocked
public class RuleManager {
private Set<Long> blockedIPs = new HashSet<>();
private Set<AppType> blockedApps = new HashSet<>();
private Set<String> blockedDomains = new HashSet<>();
public boolean isBlocked(long srcIp, AppType appType, String sni) {
// Check IP blacklist
if (blockedIPs.contains(srcIp)) return true;
// Check app blacklist
if (blockedApps.contains(appType)) return true;
// Check domain blacklist (substring match)
if (sni != null) {
for (String domain : blockedDomains) {
if (sni.contains(domain)) return true;
}
}
return false;
}
public void blockApp(String appName) { /* ... */ }
public void blockDomain(String domain) { /* ... */ }
public void blockIP(String ip) { /* ... */ }
}
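The substring match used for domains is simple, but it also matches unrelated hosts that merely contain the rule text. As a point of comparison, here is a sketch of a stricter label-boundary suffix match. `DomainMatchSketch` is hypothetical, and it assumes rules are written as full domains such as `tiktok.com` rather than bare keywords like `tiktok`.

```java
import java.util.Locale;

/** Hypothetical alternative to substring matching: exact host or subdomain only. */
final class DomainMatchSketch {
    /** True if host equals rule, or ends with "." + rule (i.e. is a subdomain of it). */
    static boolean matches(String host, String rule) {
        String h = host.toLowerCase(Locale.ROOT);
        String r = rule.toLowerCase(Locale.ROOT);
        return h.equals(r) || h.endsWith("." + r);
    }

    public static void main(String[] args) {
        String[] hosts = {"www.tiktok.com", "tiktok.com", "nottiktok.com"};
        for (String host : hosts) {
            System.out.println(host
                    + "  substring=" + host.contains("tiktok")
                    + "  suffix=" + matches(host, "tiktok.com"));
        }
    }
}
```

Which behavior is preferable depends on the goal: keyword matching catches every related domain (including CDNs with the name embedded) at the cost of false positives, while suffix matching is precise but needs one rule per registered domain.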
### FastPathProcessor.java / FPManager.java
public class FastPathProcessor extends Thread {
private int id;
private LinkedBlockingQueue<PacketJob> inputQueue;
private LinkedBlockingQueue<PacketJob> outputQueue;
private Map<FiveTuple, Connection> localFlows;
private RuleManager ruleManager;
private volatile boolean running = true;
@Override
public void run() {
while (running) {
try {
PacketJob job = inputQueue.take();
Connection flow = localFlows.computeIfAbsent(
job.getFiveTuple(),
k -> new Connection(k)
);
// DPI processing
processPacket(job, flow);
// Decision: forward or drop
if (!flow.isBlocked()) {
outputQueue.put(job);
}
} catch (InterruptedException e) {
// take() and put() are interruptible: restore the flag and stop
Thread.currentThread().interrupt();
break;
}
}
}
}
### LoadBalancer.java / LBManager.java
**Purpose:** Distribute packets to Fast Path workers using hash-based flow affinity
public class LoadBalancer extends Thread {
private int id;
private LinkedBlockingQueue<PacketJob> inputQueue;
private FastPathProcessor[] fastPaths;
private volatile boolean running = true;
@Override
public void run() {
while (running) {
try {
PacketJob job = inputQueue.take();
// Same 5-tuple always maps to the same FP
int fpIdx = Math.floorMod(
job.getFiveTuple().hashCode(), fastPaths.length
);
// Dispatch to Fast Path
fastPaths[fpIdx].getInputQueue().put(job);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
break;
}
}
}
}
### DPIEngine.java
public class DPIEngine {
private int numLBs = 2;
private int numFPsPerLB = 2;
private LoadBalancer[] loadBalancers;
private FastPathProcessor[] fastPaths;
public void run(String inputFile, String outputFile) {
// Create threads
initializeThreads();
// Start processing (reader is main thread)
PcapReader reader = new PcapReader(inputFile);
PcapWriter writer = new PcapWriter(outputFile);
readAndDistribute(reader); // Main thread reads & distributes
// Shutdown threads gracefully
shutdown();
// Generate report
printStatistics();
}
}
## 8. How SNI Extraction Works
### The TLS Handshake
When you visit `https://www.youtube.com`:
┌──────────┐ ┌──────────┐
│ Browser │ │ Server │
└────┬─────┘ └────┬─────┘
│ │
│ ──── Client Hello ─────────────────────►│
│ (includes SNI: www.youtube.com) │
│ │
│ ◄─── Server Hello ───────────────────── │
│ (includes certificate) │
│ │
│ ──── Key Exchange ─────────────────────►│
│ │
│ ◄═══ Encrypted Data ══════════════════► │
│ (from here on, everything is │
│ encrypted - we can't see it) │
**We can only extract SNI from the Client Hello!**
### TLS Client Hello Structure
Byte 0: Content Type = 0x16 (Handshake)
Bytes 1-2: Version = 0x0301 (TLS 1.0)
Bytes 3-4: Record Length
-- Handshake Layer --
Byte 5: Handshake Type = 0x01 (Client Hello)
Bytes 6-8: Handshake Length
-- Client Hello Body --
Bytes 9-10: Client Version
Bytes 11-42: Random (32 bytes)
Byte 43: Session ID Length (N)
Bytes 44 to 43+N: Session ID
... Cipher Suites ...
... Compression Methods ...
-- Extensions --
Bytes X-X+1: Extensions Length
For each extension:
Bytes: Extension Type (2)
Bytes: Extension Length (2)
Bytes: Extension Data
-- SNI Extension (Type 0x0000) --
Extension Type: 0x0000 (2 bytes)
Extension Length: L (2 bytes)
SNI List Length: M (2 bytes)
SNI Type: 0x00 (hostname, 1 byte)
SNI Length: K (2 bytes)
SNI Value: "www.youtube.com" (K bytes) ← THE GOAL!
### Our Extraction Code (from SNIExtractor.java)
public static String extractSNI(byte[] payload, int length) {
if (length < 43) return null;
// Check TLS record header
if (payload[0] != 0x16) return null; // Not handshake
if (payload[5] != 0x01) return null; // Not Client Hello
int offset = 43; // Skip to session ID
if (offset >= length) return null;
// Skip Session ID
int sessionLen = payload[offset] & 0xFF;
offset += 1 + sessionLen;
if (offset + 2 > length) return null;
// Skip Cipher Suites
int cipherLen = readUint16BE(payload, offset);
offset += 2 + cipherLen;
if (offset >= length) return null;
// Skip Compression Methods
int compLen = payload[offset] & 0xFF;
offset += 1 + compLen;
if (offset + 2 > length) return null;
// Read Extensions Length
int extLen = readUint16BE(payload, offset);
offset += 2;
// Search for SNI extension
int extEnd = offset + extLen;
while (offset + 4 <= extEnd && offset + 4 <= length) {
int extType = readUint16BE(payload, offset);
int extDataLen = readUint16BE(payload, offset + 2);
offset += 4;
if (extType == 0x0000) { // SNI!
// Parse SNI structure
if (offset + 5 <= length) {
int sniLen = readUint16BE(payload, offset + 3);
if (offset + 5 + sniLen <= length) {
return new String(payload, offset + 5, sniLen,
StandardCharsets.US_ASCII);
}
}
return null;
}
offset += extDataLen;
}
return null; // SNI not found
}
private static int readUint16BE(byte[] data, int offset) {
return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
}
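To try the extraction logic without a capture file, a minimal Client Hello can be assembled by hand. The sketch below builds one following the byte layout described earlier and then walks it back. `ClientHelloSketch`, `build`, and `extractSni` are hypothetical names; the parser is a condensed re-implementation included only so the block compiles on its own, and the generated message is a test fixture that no real server would accept.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical test fixture: builds a tiny TLS Client Hello and reads the SNI back out. */
final class ClientHelloSketch {
    static byte[] build(String host) {
        byte[] name = host.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream ext = new ByteArrayOutputStream();
        put16(ext, 0x0000);              // extension type: server_name
        put16(ext, name.length + 5);     // extension data length
        put16(ext, name.length + 3);     // server name list length
        ext.write(0x00);                 // name type: host_name
        put16(ext, name.length);         // host name length
        ext.write(name, 0, name.length);

        ByteArrayOutputStream body = new ByteArrayOutputStream();
        put16(body, 0x0303);             // client version
        body.write(new byte[32], 0, 32); // random (zeros are enough for a fixture)
        body.write(0);                   // session ID length
        put16(body, 2);                  // cipher suites length
        put16(body, 0x002F);             // a single cipher suite
        body.write(1);                   // compression methods length
        body.write(0);                   // null compression
        put16(body, ext.size());         // extensions length
        body.write(ext.toByteArray(), 0, ext.size());

        ByteArrayOutputStream rec = new ByteArrayOutputStream();
        rec.write(0x16);                 // content type: handshake
        put16(rec, 0x0301);              // record version
        put16(rec, body.size() + 4);     // record length
        rec.write(0x01);                 // handshake type: Client Hello
        rec.write(0);                    // 24-bit handshake length, high byte
        put16(rec, body.size());         // 24-bit handshake length, low two bytes
        rec.write(body.toByteArray(), 0, body.size());
        return rec.toByteArray();
    }

    static String extractSni(byte[] p) {
        if (p.length < 44 || p[0] != 0x16 || p[5] != 0x01) return null;
        int off = 43;
        off += 1 + (p[off] & 0xFF);                    // skip session ID
        if (off + 2 > p.length) return null;
        off += 2 + read16(p, off);                     // skip cipher suites
        if (off + 1 > p.length) return null;
        off += 1 + (p[off] & 0xFF);                    // skip compression methods
        if (off + 2 > p.length) return null;
        int end = Math.min(p.length, off + 2 + read16(p, off));
        off += 2;
        while (off + 4 <= end) {
            int type = read16(p, off);
            int len = read16(p, off + 2);
            off += 4;
            if (type == 0x0000) {
                if (off + 5 > end || p[off + 2] != 0x00) return null; // need a host_name entry
                int nameLen = read16(p, off + 3);
                if (off + 5 + nameLen > end) return null;
                return new String(p, off + 5, nameLen, StandardCharsets.US_ASCII);
            }
            off += len;
        }
        return null;
    }

    private static void put16(ByteArrayOutputStream out, int v) {
        out.write(v >>> 8);
        out.write(v);
    }

    private static int read16(byte[] d, int off) {
        return ((d[off] & 0xFF) << 8) | (d[off + 1] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(extractSni(build("www.youtube.com")));
    }
}
```

A fixture like this also makes it easy to test the failure paths, for example by truncating the array or flipping the handshake type byte.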
## 9. How Blocking Works
### Rule Types
| Rule Type | Example | What it Blocks |
|-----------|---------|----------------|
| IP | `192.168.1.50` | All traffic from this source |
| App | `YouTube` | All YouTube connections |
| Domain | `tiktok` | Any SNI containing "tiktok" |
### The Blocking Flow
Packet arrives
│
▼
┌─────────────────────────────────┐
│ Is source IP in blocked list? │──Yes──► DROP
└───────────────┬─────────────────┘
│No
▼
┌─────────────────────────────────┐
│ Is app type in blocked list? │──Yes──► DROP
└───────────────┬─────────────────┘
│No
▼
┌─────────────────────────────────┐
│ Does SNI match blocked domain? │──Yes──► DROP
└───────────────┬─────────────────┘
│No
▼
FORWARD
### Flow-Based Blocking
**Important:** We block at the *flow* level, not packet level.
Connection to YouTube:
Packet 1 (SYN) → No SNI yet, FORWARD
Packet 2 (SYN-ACK) → No SNI yet, FORWARD
Packet 3 (ACK) → No SNI yet, FORWARD
Packet 4 (Client Hello) → SNI: www.youtube.com
→ App: YOUTUBE (blocked!)
→ Mark flow as BLOCKED
→ DROP this packet
Packet 5 (Data) → Flow is BLOCKED → DROP
Packet 6 (Data) → Flow is BLOCKED → DROP
...all subsequent packets → DROP
**Why this approach?**
- We can't identify the app until we see the Client Hello
- Once identified, we block all future packets of that flow
- The connection will fail/timeout on the client
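The "once blocked, always blocked" behaviour can be sketched as a sticky per-flow flag. This is a simplified illustration and not the project's `Connection` class: `StickyBlockSketch` is a hypothetical name, flows are identified by a plain string, and a single domain keyword stands in for the rule set.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: the block decision is remembered per flow, not re-derived per packet. */
final class StickyBlockSketch {
    private final Map<String, Boolean> blockedByFlow = new HashMap<>();

    /**
     * Returns true if the packet should be dropped.
     * sni is null for packets that carry no Client Hello (SYN, ACK, data, ...).
     */
    boolean shouldDrop(String flowId, String sni, String blockedDomain) {
        boolean blocked = blockedByFlow.getOrDefault(flowId, false);
        if (!blocked && sni != null && sni.contains(blockedDomain)) {
            blocked = true; // first packet that reveals the domain flips the flag
        }
        blockedByFlow.put(flowId, blocked);
        return blocked;
    }

    public static void main(String[] args) {
        StickyBlockSketch dpi = new StickyBlockSketch();
        String flow = "192.168.1.100:54321->142.250.185.206:443";
        String[] snis = {null, null, "www.youtube.com", null, null};
        for (int i = 0; i < snis.length; i++) {
            boolean drop = dpi.shouldDrop(flow, snis[i], "youtube");
            System.out.println("packet " + (i + 1) + ": " + (drop ? "DROP" : "FORWARD"));
        }
    }
}
```

Packets that were forwarded before the Client Hello (the TCP handshake) have already left, which is why the client sees a stalled connection rather than an immediate refusal.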
## 10. Building and Running
### Prerequisites
- **Java 17+** - Check with `java -version`
- **Apache Maven** - Check with `mvn -version`
If you don't have these installed, see [WINDOWS_SETUP.md](WINDOWS_SETUP.md).
### Build
**Compile the project:**
mvn clean compile
### Generate Test Data
Before running, generate a sample PCAP file with test traffic:
**Windows PowerShell:**
javac GenerateTestPcap.java
java GenerateTestPcap
**Linux/macOS:**
javac GenerateTestPcap.java
java GenerateTestPcap
This creates `test_dpi.pcap` with various traffic types (HTTPS, DNS, HTTP, etc.).
### Run the Engine
**Windows PowerShell:**
*Simple single-threaded version (for learning):*
mvn "-Dexec.mainClass=dpi.main.MainSimple" "-Dexec.args=test_dpi.pcap" exec:java
*Multi-threaded production version (recommended):*
mvn "-Dexec.mainClass=dpi.main.MainDpi" "-Dexec.args=test_dpi.pcap output.pcap" exec:java
*With blocking rules:*
mvn "-Dexec.mainClass=dpi.main.MainDpi" "-Dexec.args=test_dpi.pcap output.pcap --block-app YouTube --block-domain facebook" exec:java
**Linux/macOS:**
*Simple single-threaded version:*
mvn exec:java -Dexec.mainClass="dpi.main.MainSimple" -Dexec.args="test_dpi.pcap"
*Multi-threaded production version:*
mvn exec:java -Dexec.mainClass="dpi.main.MainDpi" -Dexec.args="test_dpi.pcap output.pcap"
*With blocking rules:*
mvn exec:java -Dexec.mainClass="dpi.main.MainDpi" -Dexec.args="test_dpi.pcap output.pcap --block-app YouTube --block-domain facebook"
### Supported Apps for Blocking
`Google`, `YouTube`, `Facebook`, `Twitter`, `Instagram`, `Netflix`, `Amazon`, `Microsoft`, `Apple`, `WhatsApp`, `TikTok`, `Spotify`, `Discord`, `GitHub`, `Twitch`, `Reddit`
## 11. Understanding the Output
### Sample Output
╔══════════════════════════════════════════════════════════════╗
║ DPI ENGINE v1.0 ║
║ Deep Packet Inspection System ║
╠══════════════════════════════════════════════════════════════╣
║ Configuration: ║
║ Load Balancers: 2 ║
║ FPs per LB: 2 ║
║ Total FP threads: 4 ║
╚══════════════════════════════════════════════════════════════╝
[FPManager] Created 4 fast path processors
[LBManager] Created 2 load balancers, 2 FPs each
[DPIEngine] Initialized successfully
[DPIEngine] Processing: test_dpi.pcap
[DPIEngine] Output to: output.pcap
[FP0] Started
[FP1] Started
[FP2] Started
[FP3] Started
[LB0] Started (serving FP0-FP1)
[LB1] Started (serving FP2-FP3)
[DPIEngine] All threads started
Opened PCAP file: test_dpi.pcap
Version: 2.4
Snaplen: 65535 bytes
Link type: 1 (Ethernet)
[Reader] Starting packet processing...
[Reader] Finished reading 77 packets
[LB0] Stopped
[LB1] Stopped
[FP0] Stopped (processed 25 packets)
[FP1] Stopped (processed 0 packets)
[FP2] Stopped (processed 0 packets)
[FP3] Stopped (processed 52 packets)
[DPIEngine] All threads stopped
╔══════════════════════════════════════════════════════════════╗
║ DPI ENGINE STATISTICS ║
╠══════════════════════════════════════════════════════════════╣
║ PACKET STATISTICS ║
║ Total Packets: 77 ║
║ Total Bytes: 4394 ║
║ TCP Packets: 73 ║
║ UDP Packets: 4 ║
╠══════════════════════════════════════════════════════════════╣
║ FILTERING STATISTICS ║
║ Forwarded: 77 ║
║ Dropped/Blocked: 0 ║
║ Drop Rate: 0.00% ║
╠══════════════════════════════════════════════════════════════╣
║ LOAD BALANCER STATISTICS ║
║ LB Received: 77 ║
║ LB Dispatched: 77 ║
╠══════════════════════════════════════════════════════════════╣
║ FAST PATH STATISTICS ║
║ FP Processed: 77 ║
║ FP Forwarded: 77 ║
║ FP Dropped: 0 ║
║ Active Connections: 43 ║
╠══════════════════════════════════════════════════════════════╣
║ BLOCKING RULES ║
║ Blocked IPs: 0 ║
║ Blocked Apps: 0 ║
║ Blocked Domains: 0 ║
║ Blocked Ports: 0 ║
╚══════════════════════════════════════════════════════════════╝
╔══════════════════════════════════════════════════════════════╗
║ APPLICATION CLASSIFICATION REPORT ║
╠══════════════════════════════════════════════════════════════╣
║ Total Connections: 43 ║
║ Classified: 6 (14.0%) ║
║ Unidentified: 37 (86.0%) ║
╠══════════════════════════════════════════════════════════════╣
║ APPLICATION DISTRIBUTION ║
╠══════════════════════════════════════════════════════════════╣
║ Unknown 37 86.0% ################# ║
║ DNS 4 9.3% # ║
║ HTTPS 2 4.7% ║
╚══════════════════════════════════════════════════════════════╝
Processing complete!
Output written to: output.pcap
### What Each Section Means
| Section | Meaning |
|---------|---------|
| Configuration | Number of threads and their arrangement |
| Packet Statistics | Total TCP/UDP packet counts and sizes |
| Filtering Statistics | How many packets were forwarded (allowed) versus dropped (blocked) |
| Load Balancer Statistics | Work received and dispatched by LBs |
| Fast Path Statistics | How packets were distributed and processed across FPs |
| Blocking Rules | Summary of active blocking rules |
| Application Classification Report | Traffic classification results with percentages |
## Summary
This DPI engine demonstrates:
1. **Network Protocol Parsing** - Understanding packet structure at byte level
2. **Deep Packet Inspection** - Extracting information from the unencrypted handshake of encrypted connections
3. **Flow Tracking** - Managing stateful connections with Java's collections
4. **Multi-threaded Architecture** - Scaling with `LinkedBlockingQueue` and dedicated worker threads
5. **Producer-Consumer Pattern** - Efficient thread-safe work distribution
**The key insight:** Even though HTTPS traffic is encrypted, the domain name (SNI) is visible in the TLS Client Hello, allowing network operators to identify and control application usage.
## Questions?
The code is well-organized with clear package structure and follows the flow described in this document:
- Start with `MainSimple.java` to understand the concepts
- Move to `MainDpi.java` to see how parallelism is added
- Examine individual components in each package for detailed implementation
Happy learning! 🚀