aditya-620/CipherEye-Deep-Packet-Inspection

A pure-Java deep packet inspection engine that analyzes network traffic and enforces domain-blocking rules.

# Java Deep Packet Inspection (DPI) Engine

A multi-threaded, high-performance Deep Packet Inspection engine written in **100% Java**.

This project analyzes raw network traffic (PCAP files), identifies applications (even when encrypted via TLS/HTTPS), and can enforce blocking rules at the network level. It simulates how a corporate firewall or internet service provider analyzes your traffic, without requiring any native C/C++ libraries (no libpcap/winpcap needed).

## Table of Contents

1. [What is DPI?](#1-what-is-dpi)
2. [Networking Background](#2-networking-background)
3. [Project Overview](#3-project-overview)
4. [File Structure](#4-file-structure)
5. [The Journey of a Packet (Simple Version)](#5-the-journey-of-a-packet-simple-version)
6. [The Journey of a Packet (Multi-threaded Version)](#6-the-journey-of-a-packet-multi-threaded-version)
7. [Deep Dive: Each Component](#7-deep-dive-each-component)
8. [How SNI Extraction Works](#8-how-sni-extraction-works)
9. [How Blocking Works](#9-how-blocking-works)
10. [Building and Running](#10-building-and-running)
11. [Understanding the Output](#11-understanding-the-output)

## 1. What is DPI?

**Deep Packet Inspection (DPI)** is a technology used to examine the contents of network packets as they pass through a checkpoint. Unlike simple firewalls that only look at packet headers (source/destination IP), DPI looks *inside* the packet payload.

### Real-World Uses:

- **ISPs**: Throttle or block certain applications (e.g., BitTorrent)
- **Enterprises**: Block social media on office networks
- **Parental Controls**: Block inappropriate websites
- **Security**: Detect malware or intrusion attempts

### What Our DPI Engine Does:

```
User Traffic (PCAP) → [DPI Engine] → Filtered Traffic (PCAP)
                           ↓
                  - Identifies apps (YouTube, Facebook, etc.)
                  - Blocks based on rules
                  - Generates reports
                  - Multi-threaded processing
```

## 2. Networking Background

### The Network Stack (Layers)

When you visit a website, data travels through multiple "layers":

```
┌─────────────────────────────────────────────────────────┐
│ Layer 7: Application │ HTTP, TLS, DNS                   │
├─────────────────────────────────────────────────────────┤
│ Layer 4: Transport   │ TCP (reliable), UDP (fast)       │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Network     │ IP addresses (routing)           │
├─────────────────────────────────────────────────────────┤
│ Layer 2: Data Link   │ MAC addresses (local network)    │
└─────────────────────────────────────────────────────────┘
```

### A Packet's Structure

Every network packet is like a **Russian nesting doll** - headers wrapped inside headers:

```
┌────────────────────────────────────────────────┐
│ Ethernet Header (14 bytes)                     │
│ ┌────────────────────────────────────────────┐ │
│ │ IP Header (20 bytes)                       │ │
│ │ ┌────────────────────────────────────────┐ │ │
│ │ │ TCP Header (20 bytes)                  │ │ │
│ │ │ ┌────────────────────────────────────┐ │ │ │
│ │ │ │ Payload (Application Data)         │ │ │ │
│ │ │ │ e.g., TLS Client Hello with SNI    │ │ │ │
│ │ │ └────────────────────────────────────┘ │ │ │
│ │ └────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────┘ │
└────────────────────────────────────────────────┘
```

### The Five-Tuple

A **connection** (or "flow") is uniquely identified by 5 values:

| Field | Example | Purpose |
|-------|---------|---------|
| Source IP | 192.168.1.100 | Who is sending |
| Destination IP | 172.217.14.206 | Where it's going |
| Source Port | 54321 | Sender's application identifier |
| Destination Port | 443 | Service being accessed (443 = HTTPS) |
| Protocol | TCP (6) | TCP or UDP |

**Why is this important?**

- All packets with the same 5-tuple belong to the same connection
- If we block one packet of a connection, we should block all of them
- This is how we "track" conversations between computers

In our Java code, this is represented by the `FiveTuple` class:

```java
import java.util.Objects;

public class FiveTuple {
    public long srcIp;
    public long dstIp;
    public int srcPort;
    public int dstPort;
    public int protocol;

    // Custom hashCode implementation ensures packets from the
    // same connection always get the same hash value!
    @Override
    public int hashCode() {
        return Objects.hash(srcIp, dstIp, srcPort, dstPort, protocol);
    }

    // equals() has to agree with hashCode(), otherwise two equal tuples
    // would not find each other when used as keys in a HashMap.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FiveTuple)) return false;
        FiveTuple t = (FiveTuple) o;
        return srcIp == t.srcIp && dstIp == t.dstIp
                && srcPort == t.srcPort && dstPort == t.dstPort
                && protocol == t.protocol;
    }
}
```

### What is SNI?

**Server Name Indication (SNI)** is part of the TLS/HTTPS handshake. When you visit `https://www.youtube.com`:

1. Your browser sends a "Client Hello" message before the encryption begins
2. This message includes the domain name in **plaintext** (not encrypted yet!)
3. The server uses this to know which certificate to send

```
TLS Client Hello:
├── Version: TLS 1.2
├── Random: [32 bytes]
├── Cipher Suites: [list]
└── Extensions:
    └── SNI Extension:
        └── Server Name: "www.youtube.com"  ← We extract THIS!
```

**This is the key to DPI**: Even though HTTPS is encrypted, the domain name is visible in the Client Hello, unless the client uses Encrypted Client Hello (ECH).

## 3. Project Overview

### What This Project Does

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Wireshark  │     │ DPI Engine  │     │   Output    │
│   Capture   │ ──► │             │ ──► │    PCAP     │
│ (input.pcap)│     │ - Parse     │     │ (filtered)  │
└─────────────┘     │ - Classify  │     └─────────────┘
                    │ - Block     │
                    │ - Report    │
                    └─────────────┘
```

### Three Main Versions

| Version | File | Use Case |
|---------|------|----------|
| Working (Single-threaded) | `MainWorking.java` | Learning, small captures |
| Simple (Single-threaded) | `MainSimple.java` | Testing and debugging |
| DPI Multi-threaded | `MainDpi.java` | Production, large captures |

## 4. File Structure

```
src/main/java/dpi/
├── parser/                        # Network protocol parsing
│   ├── PacketParser.java          # Extract headers from raw bytes
│   ├── ParsedPacket.java          # Parsed packet representation
│   ├── PcapReader.java            # PCAP file reading
│   ├── PcapGlobalHeader.java      # PCAP global header structure
│   ├── PcapPacketHeader.java      # PCAP packet header structure
│   ├── RawPacket.java             # Raw packet bytes container
│   └── PortableNet.java           # Network utilities (IP parsing, etc.)
│
├── extractor/                     # Application detection
│   ├── DNSExtractor.java          # DNS domain extraction
│   ├── SNIExtractor.java          # TLS SNI extraction ★ KEY COMPONENT
│   ├── HTTPHostExtractor.java     # HTTP Host header extraction
│   └── QUICSNIExtractor.java      # QUIC SNI extraction
│
├── engine/                        # Multi-threaded processing
│   ├── DPIEngine.java             # Main orchestrator
│   ├── FastPathProcessor.java     # Worker thread for DPI processing
│   ├── FPManager.java             # Manages FastPath threads
│   ├── LoadBalancer.java          # Distributes packets to workers
│   ├── LBManager.java             # Manages LoadBalancer threads
│   ├── ConnectionTracker.java     # Tracks connection state
│   ├── RuleManager.java           # Manages blocking rules
│   ├── GlobalConnectionTable.java # Thread-safe connection storage
│   └── ThreadSafeQueue.java       # Thread-safe packet queue
│
├── types/                         # Data structures
│   ├── AppType.java               # Application type enum
│   ├── Connection.java            # Connection state tracking
│   ├── ConnectionState.java       # State machine for flows
│   ├── DPIStats.java              # Statistics collection
│   ├── FiveTuple.java             # Connection identifier
│   ├── PacketAction.java          # Action enum (FORWARD/DROP)
│   └── PacketJob.java             # Work unit (packet + metadata)
│
└── main/                          # Entry points
    ├── DpiMt.java                 # Multi-threaded version
    ├── Main.java                  # Old main (deprecated)
    ├── MainDpi.java               # ★ RECOMMENDED: Production version
    ├── MainSimple.java            # Learning: Single-threaded
    └── MainWorking.java           # Working: Single-threaded
```

## 5. The Journey of a Packet (Simple Version)

Let's trace a single packet through `MainSimple.java`:

### Step 1: Read PCAP File

```java
PcapReader reader = new PcapReader("test_dpi.pcap");
```

**What happens:**

1. Open the file in binary mode
2. Read the 24-byte global header (magic number, version, etc.)
3. Verify it's a valid PCAP file

**PCAP File Format:**

```
┌────────────────────────────┐
│ Global Header (24 bytes)   │ ← Read once at start
│ (magic, version, snaplen)  │
├────────────────────────────┤
│ Packet Header (16 bytes)   │ ← Timestamp, length
│ Packet Data (variable)     │ ← Actual network bytes
├────────────────────────────┤
│ Packet Header (16 bytes)   │
│ Packet Data (variable)     │
├────────────────────────────┤
│ ... more packets ...       │
└────────────────────────────┘
```

### Step 2: Read Each Packet

```java
RawPacket raw;
while ((raw = reader.readPacket()) != null) {
    // raw.data contains the packet bytes
    // raw.header contains timestamp and length
}
```

**What happens:**

1. Read 16-byte packet header
2. Read N bytes of packet data (N = `header.inclLen`)
3. Return null when no more packets

### Step 3: Parse Protocol Headers

```java
ParsedPacket parsed = PacketParser.parse(raw);
```

**What happens (in PacketParser.java):**

```
raw.data bytes:
[0-13]   Ethernet Header
[14-33]  IP Header
[34-53]  TCP Header
[54+]    Payload

After parsing:
parsed.srcMac   = "00:11:22:33:44:55"
parsed.destMac  = "aa:bb:cc:dd:ee:ff"
parsed.srcIp    = "192.168.1.100"
parsed.destIp   = "172.217.14.206"
parsed.srcPort  = 54321
parsed.destPort = 443
parsed.protocol = 6 (TCP)
parsed.hasTcp   = true
```

These offsets assume an IPv4 header and a TCP header without options (20 bytes each).

### Step 4: Create Five-Tuple and Look Up Flow

```java
FiveTuple tuple = new FiveTuple(
    parsed.srcIp, parsed.destIp,
    parsed.srcPort, parsed.destPort,
    parsed.protocol
);
Connection flow = connections.computeIfAbsent(tuple, k -> new Connection(k));
```

**What happens:**

- The flow table is a `ConcurrentHashMap`: `FiveTuple → Connection`
- If this 5-tuple exists, we get the existing flow
- If not, a new flow is created
- All packets with the same 5-tuple share the same flow

### Step 5: Extract SNI (Deep Packet Inspection)

```java
// For HTTPS traffic (port 443)
if (parsed.destPort == 443 && parsed.payloadLength > 5) {
    String sni = SNIExtractor.extractSNI(payload, parsed.payloadLength);
    if (sni != null) {
        flow.setSni(sni);                                          // "www.youtube.com"
        flow.setAppType(ClassificationEngine.sniToAppType(sni));   // YOUTUBE
    }
}
```

### Step 6: Check Blocking Rules

```java
if (ruleManager.isBlocked(tuple.srcIp, flow.getAppType(), flow.getSni())) {
    flow.setBlocked(true);
}
```

### Step 7: Forward or Drop

```java
if (flow.isBlocked()) {
    stats.incrementDropped();
    // Don't write to output
} else {
    stats.incrementForwarded();
    // Write packet to output file
    writer.writePacket(raw);
}
```

### Step 8: Generate Report

After processing all packets:

```java
// Count apps and print statistics
DPIStats.printReport(stats);
```

## 6. The Journey of a Packet (Multi-threaded Version)

The multi-threaded version (`MainDpi.java`) adds **parallelism** for high performance:

### Architecture Overview

```
                    ┌─────────────────┐
                    │  Reader Thread  │
                    │  (reads PCAP)   │
                    └────────┬────────┘
                             │
              ┌──────────────┴──────────────┐
              │      hash(5-tuple) % 2      │
              ▼                             ▼
    ┌─────────────────┐           ┌─────────────────┐
    │   LB0 Thread    │           │   LB1 Thread    │
    │ (LoadBalancer)  │           │ (LoadBalancer)  │
    └────────┬────────┘           └────────┬────────┘
             │                             │
      ┌──────┴──────┐               ┌──────┴──────┐
      │hash % 2     │               │hash % 2     │
      ▼             ▼               ▼             ▼
┌──────────┐ ┌──────────┐     ┌──────────┐ ┌──────────┐
│FP0       │ │FP1       │     │FP2       │ │FP3       │
│Thread    │ │Thread    │     │Thread    │ │Thread    │
└─────┬────┘ └─────┬────┘     └─────┬────┘ └─────┬────┘
      │            │                │            │
      └────────────┴──────────────┴────────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │     Output Queue      │
              └───────────┬───────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │ Output Writer Thread  │
              │   (writes to PCAP)    │
              └───────────────────────┘
```

### Why This Design?

1. **Load Balancers (LBs):** Distribute work across FPs using `LinkedBlockingQueue`
2. **Fast Paths (FPs):** Do the actual DPI processing
3. **Consistent Hashing:** Same 5-tuple always goes to same FP

**Why consistent hashing matters:**

```
Connection: 192.168.1.100:54321 → 142.250.185.206:443

Packet 1 (SYN):          hash → FP2
Packet 2 (SYN-ACK):      hash → FP2 (same FP!)
Packet 3 (ACK):          hash → FP2 (same FP!)
Packet 4 (Client Hello): hash → FP2 (same FP!)

All packets of this connection go to FP2.
FP2 can track the flow state correctly.
```

For the SYN-ACK and other server-to-client packets, this holds only if the flow key is direction-independent (for example, by ordering the two endpoints before hashing); `Objects.hash` over the raw fields gives the two directions different values.
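The two-level dispatch above can be sketched as follows. This is a minimal illustration, not the project's code: `FlowPartitioner`, `symmetricHash`, `lbIndex`, and `fpIndex` are hypothetical names. It makes three assumptions explicit. First, the endpoints are ordered before hashing so that both directions of a connection share one hash. Second, `Math.floorMod` is used because `Math.abs(Integer.MIN_VALUE)` is still negative and can produce a negative index. Third, the FP index is taken from the quotient rather than from the same remainder: with 2 LBs and 2 FPs per LB, reusing `hash % 2` at both levels sends even hashes to LB0/FP0 and odd hashes to LB1/FP3 only, which would be consistent with FP1 and FP2 processing 0 packets in the sample output in Section 11.

```java
import java.util.Objects;

public class FlowPartitioner {

    /** Hypothetical direction-independent flow hash: endpoints are ordered before hashing. */
    public static int symmetricHash(long ipA, int portA, long ipB, int portB, int protocol) {
        // Order the two endpoints so that A->B and B->A produce the same hash.
        boolean swap = ipA > ipB || (ipA == ipB && portA > portB);
        long loIp = swap ? ipB : ipA;
        long hiIp = swap ? ipA : ipB;
        int loPort = swap ? portB : portA;
        int hiPort = swap ? portA : portB;
        return Objects.hash(loIp, hiIp, loPort, hiPort, protocol);
    }

    /** Load balancer index; floorMod never returns a negative value for a positive divisor. */
    public static int lbIndex(int hash, int numLBs) {
        return Math.floorMod(hash, numLBs);
    }

    /** Fast path index inside one LB, derived from the quotient so it is independent of lbIndex. */
    public static int fpIndex(int hash, int numLBs, int fpsPerLB) {
        return Math.floorMod(Math.floorDiv(hash, numLBs), fpsPerLB);
    }

    public static void main(String[] args) {
        long client = 0xC0A80164L; // 192.168.1.100
        long server = 0x8EFAB9CEL; // 142.250.185.206

        int forward = symmetricHash(client, 54321, server, 443, 6);
        int reverse = symmetricHash(server, 443, client, 54321, 6);

        System.out.println("same hash both directions: " + (forward == reverse));
        System.out.println("LB" + lbIndex(forward, 2) + ", local FP" + fpIndex(forward, 2, 2));
    }
}
```

Whether the real `FiveTuple` already normalizes direction is not shown in this document, so treat the symmetric hash as one possible approach rather than a description of the engine.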
### Detailed Flow

#### Step 1: Reader Thread (Main Thread)

```java
PcapReader reader = new PcapReader("test_dpi.pcap");
RawPacket raw;
while ((raw = reader.readPacket()) != null) {
    PacketJob job = new PacketJob(raw);

    // Hash to select Load Balancer.
    // floorMod avoids a negative index when hashCode() is Integer.MIN_VALUE.
    int lbIdx = Math.floorMod(job.getFiveTuple().hashCode(), numLBs);

    // Push to LB's queue
    lbQueues[lbIdx].put(job); // Blocks if queue is full
}
```

#### Step 2: Load Balancer Thread

```java
// In LBManager.java
public void run() {
    while (running) {
        try {
            PacketJob job = inputQueue.take(); // Blocks until available

            // Hash to select Fast Path
            int fpIdx = Math.floorMod(job.getFiveTuple().hashCode(), numFPs);

            // Push to FP's queue
            fpQueues[fpIdx].put(job);
            stats.incrementDispatched();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            break; // Exit instead of spinning on the restored interrupt flag
        }
    }
}
```

#### Step 3: Fast Path Thread

```java
// In FastPathProcessor.java
public void run() {
    while (running) {
        try {
            PacketJob job = inputQueue.take();

            // Look up flow (each FP has its own flow table)
            FiveTuple tuple = job.getFiveTuple();
            Connection flow = localFlows.computeIfAbsent(tuple, k -> new Connection(k));

            // Classify (SNI extraction)
            classifyFlow(job, flow);

            // Check rules
            if (ruleManager.isBlocked(tuple.srcIp, flow.getAppType(), flow.getSni())) {
                stats.incrementDropped();
            } else {
                // Forward: push to output queue
                outputQueue.put(job);
                stats.incrementForwarded();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            break; // Exit instead of spinning on the restored interrupt flag
        }
    }
}
```

#### Step 4: Output Writer Thread

```java
// In DPIEngine.java outputThread()
while (running || !outputQueue.isEmpty()) {
    try {
        // poll with a timeout so the loop can notice running == false
        // even when the queue stays empty (needs java.util.concurrent.TimeUnit)
        PacketJob job = outputQueue.poll(100, TimeUnit.MILLISECONDS);
        if (job == null) continue;

        // Write to output file
        pcapWriter.writePacket(job.getRawPacket());
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
    }
}
```

## 7. Deep Dive: Each Component

### PcapReader.java / RawPacket.java

**Purpose:** Read network captures saved by Wireshark

**Key classes:**

```java
public class PcapGlobalHeader {
    public int magicNumber;   // 0xa1b2c3d4 identifies PCAP
    public int versionMajor;  // Usually 2
    public int versionMinor;  // Usually 4
    public int snapLen;       // Max packet size captured
    public int network;       // 1 = Ethernet
}

public class PcapPacketHeader {
    public int tsSec;    // Timestamp (seconds)
    public int tsUsec;   // Timestamp (microseconds)
    public int inclLen;  // Bytes saved in file
    public int origLen;  // Original packet size
}

public class RawPacket {
    public byte[] data;  // Raw packet bytes
    public PcapPacketHeader header;
}
```

### PacketParser.java / ParsedPacket.java

**Purpose:** Extract protocol fields from raw bytes

**Key method:**

```java
public static ParsedPacket parse(RawPacket raw) {
    ParsedPacket parsed = new ParsedPacket();
    parseEthernet(raw.data, parsed);   // Extract MACs, EtherType
    parseIPv4(raw.data, parsed);       // Extract IPs, protocol, TTL

    if (parsed.protocol == 6) {        // TCP
        parseTCP(raw.data, parsed);    // Extract ports, flags, seq
    } else if (parsed.protocol == 17) { // UDP
        parseUDP(raw.data, parsed);    // Extract ports
    }
    return parsed;
}
```

### SNIExtractor.java / HTTPHostExtractor.java / QUICSNIExtractor.java

**Purpose:** Extract domain names from TLS, HTTP, and QUIC

**For TLS (HTTPS):**

```java
public class SNIExtractor {
    public static String extractSNI(byte[] payload, int length) {
        // 1. Verify TLS record header (0x16 = Handshake)
        // 2. Verify Client Hello (0x01)
        // 3. Skip to extensions
        // 4. Find SNI extension (type 0x0000)
        // 5. Extract hostname string
    }
}
```

**For HTTP:**

```java
public class HTTPHostExtractor {
    public static String extractHost(byte[] payload, int length) {
        // 1. Verify HTTP request (GET, POST, etc.)
        // 2. Search for "Host: " header
        // 3. Extract value until newline
    }
}
```

### Connection.java / ConnectionState.java

**Purpose:** Track state of a connection/flow

```java
public class Connection {
    private FiveTuple tuple;
    private AppType appType = AppType.UNKNOWN;
    private String sni;
    private ConnectionState state = ConnectionState.NEW;
    private boolean blocked = false;
    private long firstSeen;
    private long lastSeen;
    private int packetCount = 0;

    // Getters and setters...
}

public enum ConnectionState {
    NEW,          // Connection just created
    ESTABLISHED,  // SYN-ACK received
    CLASSIFIED,   // SNI or Host extracted
    BLOCKED,      // Blocked by rules
    CLOSED        // FIN received
}
```

### RuleManager.java

**Purpose:** Manage blocking rules and check if traffic should be blocked

```java
public class RuleManager {
    private Set<Long> blockedIPs = new HashSet<>();
    private Set<AppType> blockedApps = new HashSet<>();
    private Set<String> blockedDomains = new HashSet<>();

    public boolean isBlocked(long srcIp, AppType appType, String sni) {
        // Check IP blacklist
        if (blockedIPs.contains(srcIp)) return true;

        // Check app blacklist
        if (blockedApps.contains(appType)) return true;

        // Check domain blacklist (substring match)
        if (sni != null) {
            for (String domain : blockedDomains) {
                if (sni.contains(domain)) return true;
            }
        }
        return false;
    }

    public void blockApp(String appName) { /* ... */ }
    public void blockDomain(String domain) { /* ... */ }
    public void blockIP(String ip) { /* ... */ }
}
```

### FastPathProcessor.java / FPManager.java

```java
public class FastPathProcessor extends Thread {
    private int id;
    private LinkedBlockingQueue<PacketJob> inputQueue;
    private LinkedBlockingQueue<PacketJob> outputQueue;
    private Map<FiveTuple, Connection> localFlows;
    private RuleManager ruleManager;
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            try {
                PacketJob job = inputQueue.take();
                Connection flow = localFlows.computeIfAbsent(
                    job.getFiveTuple(), k -> new Connection(k)
                );

                // DPI processing
                processPacket(job, flow);

                // Decision: forward or drop
                if (!flow.isBlocked()) {
                    outputQueue.put(job);
                }
            } catch (InterruptedException e) {
                // take() and put() are interruptible; stop the worker
                Thread.currentThread().interrupt();
                break;
            }
        }
    }
}
```

### LoadBalancer.java / LBManager.java

**Purpose:** Distribute packets to Fast Path workers using consistent hashing

```java
public class LoadBalancer extends Thread {
    private int id;
    private LinkedBlockingQueue<PacketJob> inputQueue;
    private FastPathProcessor[] fastPaths;
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            try {
                PacketJob job = inputQueue.take();

                // Consistent hash: same 5-tuple always to same FP
                int fpIdx = Math.floorMod(
                    job.getFiveTuple().hashCode(), fastPaths.length
                );

                // Dispatch to Fast Path
                fastPaths[fpIdx].getInputQueue().put(job);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }
}
```

### DPIEngine.java

```java
public class DPIEngine {
    private int numLBs = 2;
    private int numFPsPerLB = 2;
    private LoadBalancer[] loadBalancers;
    private FastPathProcessor[] fastPaths;

    public void run(String inputFile, String outputFile) {
        // Create threads
        initializeThreads();

        // Start processing (reader is main thread)
        PcapReader reader = new PcapReader(inputFile);
        PcapWriter writer = new PcapWriter(outputFile);
        readAndDistribute(reader); // Main thread reads & distributes

        // Shutdown threads gracefully
        shutdown();

        // Generate report
        printStatistics();
    }
}
```

## 8. How SNI Extraction Works

### The TLS Handshake

When you visit `https://www.youtube.com`:

```
┌──────────┐                              ┌──────────┐
│ Browser  │                              │  Server  │
└────┬─────┘                              └────┬─────┘
     │                                         │
     │ ──── Client Hello ─────────────────────►│
     │      (includes SNI: www.youtube.com)    │
     │                                         │
     │ ◄─── Server Hello ───────────────────── │
     │      (includes certificate)             │
     │                                         │
     │ ──── Key Exchange ─────────────────────►│
     │                                         │
     │ ◄═══ Encrypted Data ══════════════════► │
     │      (from here on, everything is       │
     │       encrypted - we can't see it)      │
```

**We can only extract SNI from the Client Hello!**

### TLS Client Hello Structure

```
Byte 0:          Content Type = 0x16 (Handshake)
Bytes 1-2:       Version = 0x0301 (TLS 1.0)
Bytes 3-4:       Record Length

-- Handshake Layer --
Byte 5:          Handshake Type = 0x01 (Client Hello)
Bytes 6-8:       Handshake Length

-- Client Hello Body --
Bytes 9-10:      Client Version
Bytes 11-42:     Random (32 bytes)
Byte 43:         Session ID Length (N)
Bytes 44 to 43+N: Session ID
...              Cipher Suites ...
...              Compression Methods ...

-- Extensions --
Bytes X-X+1:     Extensions Length
For each extension:
  Bytes:         Extension Type (2)
  Bytes:         Extension Length (2)
  Bytes:         Extension Data

-- SNI Extension (Type 0x0000) --
Extension Type:   0x0000
Extension Length: L
SNI List Length:  M
SNI Type:         0x00 (hostname)
SNI Length:       K
SNI Value:        "www.youtube.com"  ← THE GOAL!
```
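To make the SNI extension layout above concrete, here is a small standalone sketch that parses only the extension data (list length, name type, name length, name) and bounds-checks every field. `SniExtensionParser`, `parseServerName`, and `buildExtension` are hypothetical names for illustration; this is not the project's `SNIExtractor`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SniExtensionParser {

    /**
     * Parses the data of one SNI extension, i.e. the bytes that follow the
     * 2-byte extension type and 2-byte extension length.
     * Returns the first host name entry, or null if the data is malformed.
     */
    public static String parseServerName(byte[] ext) {
        if (ext.length < 2) return null;
        int listLen = u16(ext, 0);            // SNI List Length (M)
        int end = 2 + listLen;
        if (end > ext.length) return null;    // list claims more bytes than we have

        int pos = 2;
        while (pos + 3 <= end) {
            int nameType = ext[pos] & 0xFF;   // 0x00 = hostname
            int nameLen = u16(ext, pos + 1);  // SNI Length (K)
            pos += 3;
            if (pos + nameLen > end) return null;
            if (nameType == 0x00) {
                return new String(ext, pos, nameLen, StandardCharsets.US_ASCII);
            }
            pos += nameLen;                   // skip unknown name types
        }
        return null;
    }

    /** Test helper: builds SNI extension data containing a single host name. */
    public static byte[] buildExtension(String host) {
        byte[] name = host.getBytes(StandardCharsets.US_ASCII);
        int listLen = 3 + name.length;        // type (1) + length (2) + name
        byte[] out = new byte[2 + listLen];
        out[0] = (byte) (listLen >> 8);
        out[1] = (byte) listLen;
        out[2] = 0x00;                        // name type: hostname
        out[3] = (byte) (name.length >> 8);
        out[4] = (byte) name.length;
        System.arraycopy(name, 0, out, 5, name.length);
        return out;
    }

    private static int u16(byte[] b, int off) {
        return ((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] ext = buildExtension("www.youtube.com");
        System.out.println(parseServerName(ext));                                   // www.youtube.com
        System.out.println(parseServerName(Arrays.copyOf(ext, ext.length - 1)));    // null (truncated)
    }
}
```

Checking the list length and each name length against the available bytes matters in practice, because a Client Hello can be split across TCP segments and the first segment may end in the middle of the extension.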
### Our Extraction Code (from SNIExtractor.java)

```java
import java.nio.charset.StandardCharsets;

public static String extractSNI(byte[] payload, int length) {
    if (length < 43) return null;

    // Check TLS record header
    if (payload[0] != 0x16) return null; // Not handshake
    if (payload[5] != 0x01) return null; // Not Client Hello

    int offset = 43; // Skip to session ID
    if (offset >= length) return null;

    // Skip Session ID
    int sessionLen = payload[offset] & 0xFF;
    offset += 1 + sessionLen;
    if (offset + 2 > length) return null;

    // Skip Cipher Suites
    int cipherLen = readUint16BE(payload, offset);
    offset += 2 + cipherLen;
    if (offset >= length) return null;

    // Skip Compression Methods
    int compLen = payload[offset] & 0xFF;
    offset += 1 + compLen;
    if (offset + 2 > length) return null;

    // Read Extensions Length
    int extLen = readUint16BE(payload, offset);
    offset += 2;

    // Search for SNI extension
    int extEnd = offset + extLen;
    while (offset + 4 <= extEnd && offset + 4 <= length) {
        int extType = readUint16BE(payload, offset);
        int extDataLen = readUint16BE(payload, offset + 2);
        offset += 4;

        if (extType == 0x0000) { // SNI!
            // Parse SNI structure: list length (2), name type (1), name length (2), name
            if (offset + 5 <= length) {
                int sniLen = readUint16BE(payload, offset + 3);
                if (offset + 5 + sniLen <= length) {
                    return new String(payload, offset + 5, sniLen, StandardCharsets.US_ASCII);
                }
            }
            return null;
        }
        offset += extDataLen;
    }
    return null; // SNI not found
}

private static int readUint16BE(byte[] data, int offset) {
    return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
}
```

## 9. How Blocking Works

### Rule Types

| Rule Type | Example | What it Blocks |
|-----------|---------|----------------|
| IP | `192.168.1.50` | All traffic from this source |
| App | `YouTube` | All YouTube connections |
| Domain | `tiktok` | Any SNI containing "tiktok" |

### The Blocking Flow

```
Packet arrives
      │
      ▼
┌─────────────────────────────────┐
│ Is source IP in blocked list?   │──Yes──► DROP
└───────────────┬─────────────────┘
                │No
                ▼
┌─────────────────────────────────┐
│ Is app type in blocked list?    │──Yes──► DROP
└───────────────┬─────────────────┘
                │No
                ▼
┌─────────────────────────────────┐
│ Does SNI match blocked domain?  │──Yes──► DROP
└───────────────┬─────────────────┘
                │No
                ▼
             FORWARD
```

### Flow-Based Blocking

**Important:** We block at the *flow* level, not packet level.

```
Connection to YouTube:

Packet 1 (SYN)          → No SNI yet, FORWARD
Packet 2 (SYN-ACK)      → No SNI yet, FORWARD
Packet 3 (ACK)          → No SNI yet, FORWARD
Packet 4 (Client Hello) → SNI: www.youtube.com
                        → App: YOUTUBE (blocked!)
                        → Mark flow as BLOCKED
                        → DROP this packet
Packet 5 (Data)         → Flow is BLOCKED → DROP
Packet 6 (Data)         → Flow is BLOCKED → DROP
...all subsequent packets → DROP
```

**Why this approach?**

- We can't identify the app until we see the Client Hello
- Once identified, we block all future packets of that flow
- The connection will fail/timeout on the client

## 10. Building and Running

### Prerequisites

- **Java 17+** - Check with `java -version`
- **Apache Maven** - Check with `mvn -version`

If you don't have these installed, see [WINDOWS_SETUP.md](WINDOWS_SETUP.md).

### Build

**Compile the project:**

```
mvn clean compile
```

### Generate Test Data

Before running, generate a sample PCAP file with test traffic:

**Windows PowerShell:**

```
javac GenerateTestPcap.java
java GenerateTestPcap
```

**Linux/macOS:**

```
javac GenerateTestPcap.java
java GenerateTestPcap
```

This creates `test_dpi.pcap` with various traffic types (HTTPS, DNS, HTTP, etc.).
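The source of `GenerateTestPcap.java` is not shown in this document, so the following is only a sketch of what any such generator has to write, based on the PCAP layout from Section 5: a 24-byte global header followed by one 16-byte record header per packet. `MiniPcapWriter`, `globalHeader`, and `packetRecord` are hypothetical names, and little-endian byte order is an assumption (a reader detects the byte order from the magic number).

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MiniPcapWriter {

    /** Builds the 24-byte PCAP global header (little-endian, Ethernet link type). */
    public static byte[] globalHeader() {
        ByteBuffer b = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(0xa1b2c3d4);   // magic number
        b.putShort((short) 2);  // version major
        b.putShort((short) 4);  // version minor
        b.putInt(0);            // thiszone (GMT offset, normally 0)
        b.putInt(0);            // sigfigs (normally 0)
        b.putInt(65535);        // snaplen
        b.putInt(1);            // network: 1 = Ethernet
        return b.array();
    }

    /** Builds one packet record: 16-byte record header followed by the packet bytes. */
    public static byte[] packetRecord(int tsSec, int tsUsec, byte[] data) {
        ByteBuffer b = ByteBuffer.allocate(16 + data.length).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(tsSec);        // timestamp, seconds
        b.putInt(tsUsec);       // timestamp, microseconds
        b.putInt(data.length);  // incl_len: bytes stored in the file
        b.putInt(data.length);  // orig_len: original size on the wire
        b.put(data);
        return b.array();
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] header = globalHeader();
        byte[] record = packetRecord(0, 0, new byte[60]); // placeholder 60-byte frame of zeros
        out.write(header, 0, header.length);
        out.write(record, 0, record.length);
        System.out.println("capture size: " + out.size() + " bytes");
    }
}
```

Writing the same bytes to a `FileOutputStream` instead of the in-memory stream would produce a file that starts with a valid PCAP header; the zero-filled frame is only a placeholder and would need real Ethernet/IP/TCP bytes to be useful as test traffic.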
### Run the Engine

**Windows PowerShell:**

*Simple single-threaded version (for learning):*

```
mvn "-Dexec.mainClass=dpi.main.MainSimple" "-Dexec.args=test_dpi.pcap" exec:java
```

*Multi-threaded production version (recommended):*

```
mvn "-Dexec.mainClass=dpi.main.MainDpi" "-Dexec.args=test_dpi.pcap output.pcap" exec:java
```

*With blocking rules:*

```
mvn "-Dexec.mainClass=dpi.main.MainDpi" "-Dexec.args=test_dpi.pcap output.pcap --block-app YouTube --block-domain facebook" exec:java
```

**Linux/macOS:**

*Simple single-threaded version:*

```
mvn exec:java -Dexec.mainClass="dpi.main.MainSimple" -Dexec.args="test_dpi.pcap"
```

*Multi-threaded production version:*

```
mvn exec:java -Dexec.mainClass="dpi.main.MainDpi" -Dexec.args="test_dpi.pcap output.pcap"
```

*With blocking rules:*

```
mvn exec:java -Dexec.mainClass="dpi.main.MainDpi" -Dexec.args="test_dpi.pcap output.pcap --block-app YouTube --block-domain facebook"
```

### Supported Apps for Blocking

`Google`, `YouTube`, `Facebook`, `Twitter`, `Instagram`, `Netflix`, `Amazon`, `Microsoft`, `Apple`, `WhatsApp`, `TikTok`, `Spotify`, `Discord`, `GitHub`, `Twitch`, `Reddit`

## 11. Understanding the Output

### Sample Output

```
╔══════════════════════════════════════════════════════════════╗
║                       DPI ENGINE v1.0                        ║
║                Deep Packet Inspection System                 ║
╠══════════════════════════════════════════════════════════════╣
║ Configuration:                                               ║
║   Load Balancers: 2                                          ║
║   FPs per LB: 2                                              ║
║   Total FP threads: 4                                        ║
╚══════════════════════════════════════════════════════════════╝

[FPManager] Created 4 fast path processors
[LBManager] Created 2 load balancers, 2 FPs each
[DPIEngine] Initialized successfully
[DPIEngine] Processing: test_dpi.pcap
[DPIEngine] Output to: output.pcap
[FP0] Started
[FP1] Started
[FP2] Started
[FP3] Started
[LB0] Started (serving FP0-FP1)
[LB1] Started (serving FP2-FP3)
[DPIEngine] All threads started
Opened PCAP file: test_dpi.pcap
  Version: 2.4
  Snaplen: 65535 bytes
  Link type: 1 (Ethernet)
[Reader] Starting packet processing...
[Reader] Finished reading 77 packets
[LB0] Stopped
[LB1] Stopped
[FP0] Stopped (processed 25 packets)
[FP1] Stopped (processed 0 packets)
[FP2] Stopped (processed 0 packets)
[FP3] Stopped (processed 52 packets)
[DPIEngine] All threads stopped

╔══════════════════════════════════════════════════════════════╗
║                    DPI ENGINE STATISTICS                     ║
╠══════════════════════════════════════════════════════════════╣
║ PACKET STATISTICS                                            ║
║   Total Packets: 77                                          ║
║   Total Bytes: 4394                                          ║
║   TCP Packets: 73                                            ║
║   UDP Packets: 4                                             ║
╠══════════════════════════════════════════════════════════════╣
║ FILTERING STATISTICS                                         ║
║   Forwarded: 77                                              ║
║   Dropped/Blocked: 0                                         ║
║   Drop Rate: 0.00%                                           ║
╠══════════════════════════════════════════════════════════════╣
║ LOAD BALANCER STATISTICS                                     ║
║   LB Received: 77                                            ║
║   LB Dispatched: 77                                          ║
╠══════════════════════════════════════════════════════════════╣
║ FAST PATH STATISTICS                                         ║
║   FP Processed: 77                                           ║
║   FP Forwarded: 77                                           ║
║   FP Dropped: 0                                              ║
║   Active Connections: 43                                     ║
╠══════════════════════════════════════════════════════════════╣
║ BLOCKING RULES                                               ║
║   Blocked IPs: 0                                             ║
║   Blocked Apps: 0                                            ║
║   Blocked Domains: 0                                         ║
║   Blocked Ports: 0                                           ║
╚══════════════════════════════════════════════════════════════╝

╔══════════════════════════════════════════════════════════════╗
║              APPLICATION CLASSIFICATION REPORT               ║
╠══════════════════════════════════════════════════════════════╣
║ Total Connections: 43                                        ║
║ Classified: 6 (14.0%)                                        ║
║ Unidentified: 37 (86.0%)                                     ║
╠══════════════════════════════════════════════════════════════╣
║                   APPLICATION DISTRIBUTION                   ║
╠══════════════════════════════════════════════════════════════╣
║ Unknown 37 86.0% #################                           ║
║ DNS 4 9.3% #                                                 ║
║ HTTPS 2 4.7%                                                 ║
╚══════════════════════════════════════════════════════════════╝

Processing complete!
Output written to: output.pcap
```

### What Each Section Means

| Section | Meaning |
|---------|---------|
| Configuration | Number of threads and their arrangement |
| Packet Statistics | Total TCP/UDP packet counts and sizes |
| Filtering Statistics | How many packets were allowed (Forwarded) vs Blocked (Dropped) |
| Load Balancer Statistics | Work received and dispatched by LBs |
| Fast Path Statistics | How packets were distributed and processed across FPs |
| Blocking Rules | Summary of active blocking rules |
| Application Breakdown | Traffic classification results with percentages |

## Summary

This DPI engine demonstrates:

1. **Network Protocol Parsing** - Understanding packet structure at byte level
2. **Deep Packet Inspection** - Extracting information from encrypted connections
3. **Flow Tracking** - Managing stateful connections with Java's collections
4. **Multi-threaded Architecture** - Scaling with `LinkedBlockingQueue` and thread pools
5. **Producer-Consumer Pattern** - Efficient thread-safe work distribution

**The key insight:** Even though HTTPS traffic is encrypted, the domain name (SNI) is visible in the TLS Client Hello, allowing network operators to identify and control application usage.

## Questions?

The code is well-organized with clear package structure and follows the flow described in this document:

- Start with `MainSimple.java` to understand the concepts
- Move to `MainDpi.java` to see how parallelism is added
- Examine individual components in each package for detailed implementation

Happy learning! 🚀