kushanbhagya/AI-Port-Scan-Detection-System


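A machine-learning-based network security project that automatically detects and identifies port scanning attacks by analyzing features of live network traffic.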


```
 █████╗ ██╗    ██████╗  ██████╗ ██████╗ ████████╗    ███████╗ ██████╗ █████╗ ███╗   ██╗
██╔══██╗██║    ██╔══██╗██╔═══██╗██╔══██╗╚══██╔══╝    ██╔════╝██╔════╝██╔══██╗████╗  ██║
███████║██║    ██████╔╝██║   ██║██████╔╝   ██║       ███████╗██║     ███████║██╔██╗ ██║
██╔══██║██║    ██╔═══╝ ██║   ██║██╔══██╗   ██║       ╚════██║██║     ██╔══██║██║╚██╗██║
██║  ██║██║    ██║     ╚██████╔╝██║  ██║   ██║       ███████║╚██████╗██║  ██║██║ ╚████║
╚═╝  ╚═╝╚═╝    ╚═╝      ╚═════╝ ╚═╝  ╚═╝   ╚═╝       ╚══════╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═══╝
```

### `🔐 AI × Cybersecurity · ML-Powered Intrusion Detection · Real Attack Simulation`
![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=flat-square&logo=python&logoColor=white) ![scikit-learn](https://img.shields.io/badge/scikit--learn-ML_Engine-F7931E?style=flat-square&logo=scikitlearn&logoColor=white) ![Kali Linux](https://img.shields.io/badge/Kali_Linux-Attacker-557C94?style=flat-square&logo=kalilinux&logoColor=white) ![Status](https://img.shields.io/badge/Status-In_Progress-orange?style=flat-square&logo=statuspage) ![License](https://img.shields.io/badge/Use-Educational_Only-red?style=flat-square&logo=openssl)

## `> cat overview.txt`

A full-stack AI security project that **simulates real port scanning attacks**, captures live network traffic, extracts behavioral features, and trains a machine learning model to automatically classify traffic as **normal or malicious**.

```
[Kali Linux]               [Ubuntu Target]               [AI Engine]
Nmap Scanner   ──────►   tcpdump / Wireshark   ──────►   Random Forest Model
Port Scanner               Packet Capture                Detection & Alerts
```

## `> cat objectives.txt`

```
[*] Simulate real-world port scanning attacks in a controlled lab
[*] Capture and analyze raw network traffic at the packet level
[*] Engineer meaningful features from raw PCAP data
[*] Train a machine learning model to detect malicious behavior
[*] Build a production-grade AI + Security portfolio project
```

## `> ls tech-stack/`

### 🔹 Cybersecurity Layer

| Tool | Role |
|------|------|
| ![Kali](https://img.shields.io/badge/Kali_Linux-Attacker_Machine-557C94?style=flat-square&logo=kalilinux) | Attack simulation environment |
| ![Ubuntu](https://img.shields.io/badge/Ubuntu_Server-Target_Machine-E95420?style=flat-square&logo=ubuntu) | Vulnerable target host |
| ![Nmap](https://img.shields.io/badge/Nmap-Port_Scanner-0E83CD?style=flat-square&logo=gnubash) | SYN scans, full port sweeps |
| ![Wireshark](https://img.shields.io/badge/tcpdump%20%2F%20Wireshark-Packet_Capture-1679A7?style=flat-square&logo=wireshark) | Traffic captured as `.pcap` |

### 🔹 AI / ML Layer

| Library | Purpose |
|---------|---------|
| ![Python](https://img.shields.io/badge/Python-Core_Language-3776AB?style=flat-square&logo=python) | End-to-end pipeline |
| ![pandas](https://img.shields.io/badge/pandas-Data_Processing-150458?style=flat-square&logo=pandas) | Feature extraction and dataset preparation |
| ![scikit-learn](https://img.shields.io/badge/scikit--learn-Model_Training-F7931E?style=flat-square&logo=scikitlearn) | Random Forest classifier |
| ![matplotlib](https://img.shields.io/badge/matplotlib-Visualization-11557C?style=flat-square&logo=python) | Confusion matrix, metrics |
| ![joblib](https://img.shields.io/badge/joblib-Model_Serialization-grey?style=flat-square&logo=python) | Saving and loading trained models |

### 🔹 Optional

![Streamlit](https://img.shields.io/badge/Streamlit-Live_Dashboard-FF4B4B?style=flat-square&logo=streamlit&logoColor=white)

## `> cat how-it-works.md`

```
STEP 1  Generate Traffic
        ├── Normal: ping, HTTP browsing, SSH sessions
        └── Attack: Nmap SYN scan, full port sweep

STEP 2  Capture Packets
        └── tcpdump → .pcap files

STEP 3  Feature Engineering
        ├── Packet count per source IP
        ├── SYN packet ratio
        ├── Unique destination ports touched
        └── Packet rate (packets/second)

STEP 4  Label & Structure
        └── CSV dataset → normal=0 / portscan=1

STEP 5  Train Model
        └── Random Forest → accuracy + recall + F1

STEP 6  Detect & Alert
        └── New traffic → prediction → 🚨 ALERT or ✅ CLEAN
```

## `> tree ai-portscan-detector/`

```
ai-portscan-detector/
│
├── 📁 data/
│   ├── raw/              # PCAP files (git-ignored)
│   └── processed/        # Structured CSV datasets
│
├── 📁 models/            # Saved trained ML models (.pkl)
│
├── 📁 notebooks/         # Jupyter notebooks: EDA & training
│
├── 📁 screenshots/       # Visual documentation
│
├── 📁 src/
│   ├── capture/          # Traffic capture scripts
│   ├── preprocess/       # PCAP → feature extraction
│   ├── train/            # Model training pipeline
│   └── predict/          # Detection & alert logic
│
├── 📄 README.md
├── 📄 requirements.txt
└── 📄 main.py
```

## `> cat roadmap.md`

### ✅ Phase 1: Project Setup

- [x] Create a modular project structure
- [x] Initialize the GitHub repository
- [ ] Configure the lab environment (Kali + Ubuntu VMs)

### 🚧 Phase 2: Traffic Generation & Capture

- [ ] Start target services (Apache, SSH, FTP)
- [ ] Generate normal traffic (ping, HTTP, SSH sessions)
- [ ] Simulate attacks with Nmap (`-sS`, `-p-`, `-A`)
- [ ] Capture live packets with `tcpdump`
- [ ] Save raw `.pcap` files to `data/raw/`

### 🚧 Phase 3: Traffic Analysis

- [ ] Open the capture files in Wireshark
- [ ] Visually distinguish normal from scanning behavior
- [ ] Filter for SYN floods, port sweeps, and probe patterns

### 🚧 Phase 4: Feature Engineering

- [ ] Convert PCAP to structured CSV using `scapy` / `pyshark`
- [ ] Extract core features:

| Feature | Description |
|---------|-------------|
| `packet_count` | Total packets from a source IP |
| `syn_count` | Number of SYN packets sent |
| `unique_dst_ports` | Distinct destination ports probed |
| `packet_rate` | Packets per second |

- [ ] Assign labels: `0 = normal`, `1 = port scan`

### 🚧 Phase 5: Model Training

- [ ] Split the dataset (train / test)
- [ ] Train a Random Forest classifier
- [ ] Evaluate: accuracy, precision, recall, F1 score
- [ ] Export the model via `joblib`

### 🚧 Phase 6: Detection System

- [ ] Build a real-time prediction script
- [ ] Parse incoming traffic features
- [ ] Trigger `🚨 ALERT: Port Scan Detected` on a positive classification

### 🚧 Phase 7: Dashboard *(optional)*

- [ ] Build a Streamlit web UI
- [ ] Live detection feed
- [ ] Attack statistics and model confidence scores

### 🚧 Phase 8: Wrap-up

- [ ] Document all findings with screenshots
- [ ] Clean up and optimize the codebase
- [ ] Publish a full write-up

## `> cat progress.log`

```
Phases Completed  : 1 / 8
Overall Progress  : [██░░░░░░░░░░░░░░░░░░] 10%

Phase 1  Setup      : ██████████████████████  ✅ Done
Phase 2  Capture    : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Pending
Phase 3  Analysis   : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Pending
Phase 4  Features   : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Pending
Phase 5  Training   : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Pending
Phase 6  Detection  : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Pending
Phase 7  Dashboard  : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Optional
Phase 8  Final      : ░░░░░░░░░░░░░░░░░░░░░░  ⏳ Pending
```

## `> ls screenshots/ --coming-soon`

```
[ ] lab-setup.png            ← VM network topology
[ ] tcpdump-capture.png      ← Live packet capture
[ ] wireshark-analysis.png   ← SYN flood visualization
[ ] dataset-preview.png      ← Engineered feature table
[ ] model-results.png        ← Confusion matrix & metrics
[ ] detection-output.png     ← Alert system in action
```

## `> cat future.txt`

```
[→] Real-time live traffic detection pipeline
[→] Deep learning models (LSTM for sequence-based detection)
[→] Multi-attack classification (DDoS, brute-force, ARP spoofing)
[→] SIEM integration (Splunk / ELK Stack)
[→] Dockerized deployment for portable demo
```

## `> cat disclaimer.txt`

```
╔══════════════════════════════════════════════════════════════╗
║                   ⚠️  EDUCATIONAL USE ONLY                   ║
║                                                              ║
║  All attack simulations in this project are performed        ║
║  exclusively in isolated, controlled lab environments.       ║
║                                                              ║
║  Running these techniques against systems without explicit   ║
║  permission is ILLEGAL and unethical.                        ║
║                                                              ║
║  The author accepts no responsibility for misuse.            ║
╚══════════════════════════════════════════════════════════════╝
```

## `> whoami --author`

### **Kushan Bhagya**

*Cybersecurity Enthusiast · AI + Security Learner · Hands-on Builder*

[![GitHub](https://img.shields.io/badge/GitHub-Follow-181717?style=for-the-badge&logo=github)](https://github.com/) [![LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-0A66C2?style=for-the-badge&logo=linkedin)](https://linkedin.com/)

*Where machine learning meets offensive security.*

`[★ Star this repo if you find it useful]` `[🍴 Fork to build your own detector]`
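
### 🔹 Sketch: per-source feature extraction

The roadmap's feature-engineering phase names four per-source-IP features (`packet_count`, `syn_count`, `unique_dst_ports`, `packet_rate`), and the pipeline overview also mentions a SYN packet ratio. The repository does not yet contain this code, so the following is only a minimal sketch of how those aggregates could be computed. The `Packet` record and `extract_features` function are hypothetical names; packets are assumed to have already been parsed out of a `.pcap` file (for example with `scapy` or `pyshark`), and the fallback used for a zero-length time window is an assumption too.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical pre-parsed packet record (not a scapy/pyshark type)."""
    timestamp: float  # seconds since the start of the capture
    src_ip: str
    dst_port: int
    is_syn: bool      # True for a TCP SYN without ACK


def extract_features(packets):
    """Aggregate the per-source-IP features listed in the roadmap."""
    by_src = defaultdict(list)
    for pkt in packets:
        by_src[pkt.src_ip].append(pkt)

    features = {}
    for src_ip, pkts in by_src.items():
        times = [p.timestamp for p in pkts]
        duration = max(times) - min(times)
        packet_count = len(pkts)
        syn_count = sum(p.is_syn for p in pkts)
        features[src_ip] = {
            "packet_count": packet_count,
            "syn_count": syn_count,
            "syn_ratio": syn_count / packet_count,
            "unique_dst_ports": len({p.dst_port for p in pkts}),
            # Assumption: with a zero-length window (single packet or
            # identical timestamps) fall back to the raw packet count.
            "packet_rate": (packet_count / duration
                            if duration > 0 else float(packet_count)),
        }
    return features


# Toy capture: one host sweeping 100 ports quickly, one ordinary HTTP client.
scan = [Packet(0.01 * i, "10.0.0.5", 1 + i, True) for i in range(100)]
normal = [Packet(float(i), "10.0.0.9", 80, i == 0) for i in range(5)]
features = extract_features(scan + normal)
```

In this toy data the scanning host touches 100 distinct ports with a SYN ratio of 1.0, while the normal client touches one port with a SYN ratio of 0.2, which is the kind of separation the classifier is expected to learn from.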

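
### 🔹 Sketch: labeling the dataset

The "Label & Structure" step turns extracted features into a CSV dataset with `normal=0` and `portscan=1`. One simple way to do that, sketched below, is to label by capture session: every row from a normal-traffic capture gets `0` and every row from an Nmap capture gets `1`. This assumes each capture contains only one class of traffic, which the README does not state. The `to_csv` function, the `FIELDS` column order, and the sample rows are all hypothetical.

```python
import csv
import io

# Assumed column order; the feature names follow the roadmap's feature table.
FIELDS = ["src_ip", "packet_count", "syn_count",
          "unique_dst_ports", "packet_rate", "label"]


def to_csv(normal_rows, scan_rows):
    """Return CSV text with label 0 for normal rows and 1 for port scan rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for label, rows in ((0, normal_rows), (1, scan_rows)):
        for row in rows:
            # Copy the feature row and attach the session-level label.
            writer.writerow({**row, "label": label})
    return buf.getvalue()


# Illustrative feature rows (made-up values, not measured results).
normal_rows = [{"src_ip": "10.0.0.9", "packet_count": 5, "syn_count": 1,
                "unique_dst_ports": 1, "packet_rate": 1.25}]
scan_rows = [{"src_ip": "10.0.0.5", "packet_count": 100, "syn_count": 100,
              "unique_dst_ports": 100, "packet_rate": 101.0}]

dataset_csv = to_csv(normal_rows, scan_rows)
parsed = list(csv.DictReader(io.StringIO(dataset_csv)))
```

Writing the text returned by `to_csv` to a file under `data/processed/` would match the directory layout shown in the project tree.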