kushanbhagya/AI-Port-Scan-Detection-System
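A machine-learning-based network security project that automatically detects and identifies port scanning attacks by analyzing features of live network traffic.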
```
█████╗ ██╗ ██████╗ ██████╗ ██████╗ ████████╗ ███████╗ ██████╗ █████╗ ███╗ ██╗
██╔══██╗██║ ██╔══██╗██╔═══██╗██╔══██╗╚══██╔══╝ ██╔════╝██╔════╝██╔══██╗████╗ ██║
███████║██║ ██████╔╝██║ ██║██████╔╝ ██║ ███████╗██║ ███████║██╔██╗ ██║
██╔══██║██║ ██╔═══╝ ██║ ██║██╔══██╗ ██║ ╚════██║██║ ██╔══██║██║╚██╗██║
██║ ██║██║ ██║ ╚██████╔╝██║ ██║ ██║ ███████║╚██████╗██║ ██║██║ ╚████║
╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝ ╚══════╝ ╚═════╝╚═╝ ╚═╝╚═╝ ╚═══╝
```
### `🔐 AI × Cybersecurity · ML-Powered Intrusion Detection · Real Attack Simulation`
    
## `> cat overview.txt`
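A full-stack AI security project that **simulates real port scanning attacks**, captures live network traffic, extracts behavioral features, and trains a machine learning model to automatically classify traffic as **normal or malicious**.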
```
[Kali Linux] [Ubuntu Target] [AI Engine]
Nmap Scanner ──────► tcpdump / Wireshark ──────► Random Forest Model
Port Scanner Packet Capture Detection & Alerts
```
## `> cat objectives.txt`
```
[*] Simulate real-world port scanning attacks in a controlled lab
[*] Capture and analyze raw network traffic at the packet level
[*] Engineer meaningful features from raw PCAP data
[*] Train a machine learning model to detect malicious behavior
[*] Build a production-grade AI + Security portfolio project
```
## `> ls tech-stack/`
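### 🔹 Cybersecurity Layer
| Tool | Role |
|------|------|
| Kali Linux | Attack simulation environment |
| Ubuntu | Vulnerable target host |
| Nmap | SYN scans, full port scans |
| tcpdump | Traffic capture to `.pcap` |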
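### 🔹 AI / ML Layer
| Library | Purpose |
|---------|---------|
| Python | End-to-end pipeline |
|  | Feature extraction and dataset preparation |
| scikit-learn | Random Forest classifier |
|  | Confusion matrix, metrics |
| joblib | Saving and loading trained models |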
### 🔹 Optional

## `> cat how-it-works.md`
```
STEP 1 Generate Traffic
├── Normal: ping, HTTP browsing, SSH sessions
└── Attack: Nmap SYN scan, full port sweep
STEP 2 Capture Packets
└── tcpdump → .pcap files
STEP 3 Feature Engineering
├── Packet count per source IP
├── SYN packet ratio
├── Unique destination ports touched
└── Packet rate (packets/second)
STEP 4 Label & Structure
└── CSV dataset → normal=0 / portscan=1
STEP 5 Train Model
└── Random Forest → accuracy + recall + F1
STEP 6 Detect & Alert
└── New traffic → prediction → 🚨 ALERT or ✅ CLEAN
```
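As a rough illustration of the "SYN packet ratio" feature in STEP 3, the sketch below works directly on TCP flag bytes. The helper names (`is_syn_probe`, `syn_ratio`) and the sample flag values are assumptions made for this example, not code from the repository. A packet counts as a probe when SYN is set and ACK is clear, which is what a half-open scan probe looks like on the wire.

```python
# TCP flag bit masks as defined in the TCP specification (RFC 9293).
SYN = 0x02
ACK = 0x10


def is_syn_probe(flags: int) -> bool:
    """True for a bare SYN: SYN bit set, ACK bit clear."""
    return bool(flags & SYN) and not (flags & ACK)


def syn_ratio(flag_bytes: list[int]) -> float:
    """Fraction of packets that are bare SYNs (0.0 for an empty list)."""
    if not flag_bytes:
        return 0.0
    return sum(is_syn_probe(f) for f in flag_bytes) / len(flag_bytes)


# Hypothetical flag bytes sent by one source.
normal_session = [0x02, 0x10, 0x18, 0x18, 0x10, 0x11]  # SYN, ACK, PSH+ACK x2, ACK, FIN+ACK
scan_burst = [0x02, 0x02, 0x02, 0x02, 0x04]            # four bare SYNs and one RST

print(f"normal: {syn_ratio(normal_session):.2f}")
print(f"scan:   {syn_ratio(scan_burst):.2f}")
```

A normal client sends one SYN per connection followed by many ACK-bearing packets, so its ratio stays low, while a scanner's traffic is dominated by bare SYNs.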
## `> tree ai-portscan-detector/`
```
ai-portscan-detector/
│
├── 📁 data/
│ ├── raw/ # PCAP files (git-ignored)
│ └── processed/ # Structured CSV datasets
│
├── 📁 models/ # Saved trained ML models (.pkl)
│
├── 📁 notebooks/ # Jupyter notebooks — EDA & training
│
├── 📁 screenshots/ # Visual documentation
│
├── 📁 src/
│ ├── capture/ # Traffic capture scripts
│ ├── preprocess/ # PCAP → feature extraction
│ ├── train/ # Model training pipeline
│ └── predict/ # Detection & alert logic
│
├── 📄 README.md
├── 📄 requirements.txt
└── 📄 main.py
```
## `> cat roadmap.md`
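### ✅ Phase 1: Project Setup
- [x] Create a modular project structure
- [x] Initialize the GitHub repository
- [ ] Configure the lab environment (Kali + Ubuntu VMs)
### 🚧 Phase 2: Traffic Generation & Capture
- [ ] Start target services (Apache, SSH, FTP)
- [ ] Generate normal traffic (ping, HTTP, SSH sessions)
- [ ] Simulate attacks with Nmap (`-sS`, `-p-`, `-A`)
- [ ] Capture live packets with `tcpdump`
- [ ] Save raw `.pcap` files to `data/raw/`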
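One way the capture and scan steps above might be scripted is to build the command lines in Python and hand them to `subprocess.run`. This is only a sketch: the helper names, the `eth0` interface, the `192.168.56.10` address, and the output file name are all hypothetical lab values. The code only builds and prints the commands; it does not execute them.

```python
import shlex


def tcpdump_command(interface: str, out_file: str) -> list[str]:
    """argv for a tcpdump capture written to a .pcap file (run on the target VM)."""
    # -i selects the interface, -n skips name resolution, -w writes raw packets.
    return ["tcpdump", "-i", interface, "-n", "-w", out_file]


def nmap_command(target: str, scan_flags: list[str]) -> list[str]:
    """argv for an Nmap scan against a lab host (run on the attacker VM)."""
    return ["nmap", *scan_flags, target]


# Hypothetical lab values: adjust the interface and address to your own VMs.
capture = tcpdump_command("eth0", "data/raw/syn_scan.pcap")
scan = nmap_command("192.168.56.10", ["-sS", "-p-"])

print(shlex.join(capture))
print(shlex.join(scan))
```

Passing the lists to `subprocess.run(capture)` rather than a single shell string avoids quoting problems. Both tools typically need root privileges for raw packet access, and the scan should only ever target hosts inside the isolated lab.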
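### 🚧 Phase 3: Traffic Analysis
- [ ] Open the capture files in Wireshark
- [ ] Visually distinguish normal from scanning behavior
- [ ] Filter for SYN floods, port scans, and probing patterns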
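### 🚧 Phase 4: Feature Engineering
- [ ] Convert PCAP files to structured CSV with `scapy` / `pyshark`
- [ ] Extract the core features:

| Feature | Description |
|---------|-------------|
| `packet_count` | Total packets from a source IP |
| `syn_count` | Number of SYN packets sent |
| `unique_dst_ports` | Distinct destination ports probed |
| `packet_rate` | Packets per second |

- [ ] Assign labels: `0 = normal`, `1 = port scan`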
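The four features in the table can be sketched as a per-source-IP aggregation. The example below assumes the packets have already been parsed out of a PCAP (for instance with `scapy` or `pyshark`, not shown here) into a simple `Packet` record. That record, the `extract_features` helper, and the sample traffic are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict
from typing import NamedTuple

SYN, ACK = 0x02, 0x10  # TCP flag bit masks


class Packet(NamedTuple):
    """Hypothetical per-packet record, assumed already parsed from a PCAP."""
    timestamp: float  # seconds
    src_ip: str
    dst_port: int
    tcp_flags: int


def extract_features(packets: list[Packet], label: int) -> list[dict]:
    """Aggregate packets into one feature row per source IP."""
    by_src = defaultdict(list)
    for pkt in packets:
        by_src[pkt.src_ip].append(pkt)

    rows = []
    for src_ip, pkts in by_src.items():
        times = [p.timestamp for p in pkts]
        duration = max(times) - min(times)
        rows.append({
            "src_ip": src_ip,
            "packet_count": len(pkts),
            # Bare SYNs only: SYN set, ACK clear.
            "syn_count": sum(1 for p in pkts if (p.tcp_flags & SYN) and not (p.tcp_flags & ACK)),
            "unique_dst_ports": len({p.dst_port for p in pkts}),
            # A single packet has no time window, so report a rate of 0.
            "packet_rate": round(len(pkts) / duration, 2) if duration > 0 else 0.0,
            "label": label,  # 0 = normal, 1 = port scan
        })
    return rows


# Made-up sample: one host probing 50 ports in about half a second.
scan = [Packet(i * 0.01, "10.0.0.5", 1 + i, SYN) for i in range(50)]
rows = extract_features(scan, label=1)

# Write the rows as CSV (to memory here; a real run would target data/processed/).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Labeling by capture file (everything in a scan capture gets `1`) is the simplest approach, but it assumes the scan capture holds no background traffic from other hosts.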
### 🚧 Phase 5: Model Training
- [ ] Split the dataset (train / test)
- [ ] Train a Random Forest classifier
- [ ] Evaluate: accuracy, precision, recall, F1 score
- [ ] Export the model with `joblib`
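A minimal sketch of how these training steps might look with scikit-learn and joblib (both must be installed). The feature rows are generated synthetically because no real dataset exists yet, and the `synthetic_row` helper, its value ranges, and the `portscan_rf.pkl` file name are assumptions for this example. The synthetic classes are trivially separable, so the printed scores say nothing about performance on real traffic.

```python
import random
from pathlib import Path

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

FEATURES = ["packet_count", "syn_count", "unique_dst_ports", "packet_rate"]


def synthetic_row(is_scan: bool, rng: random.Random) -> list[float]:
    """Made-up feature row; real rows would come from data/processed/."""
    if is_scan:
        count = rng.randint(500, 5000)
        return [count, count * rng.uniform(0.9, 1.0), rng.randint(200, 1000), rng.uniform(100, 1000)]
    count = rng.randint(10, 300)
    return [count, count * rng.uniform(0.0, 0.1), rng.randint(1, 5), rng.uniform(0.5, 20)]


rng = random.Random(42)
y = [i % 2 for i in range(400)]  # 0 = normal, 1 = port scan
X = [synthetic_row(bool(label), rng) for label in y]

# Stratify so both classes keep the same proportion in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Precision, recall and F1 per class, plus overall accuracy.
print(classification_report(y_test, model.predict(X_test), target_names=["normal", "portscan"]))

# Export the trained model (hypothetical file name under models/).
model_path = Path("models") / "portscan_rf.pkl"
model_path.parent.mkdir(parents=True, exist_ok=True)
joblib.dump(model, str(model_path))
```

For intrusion detection, recall on the `portscan` class is usually the number to watch, since a missed scan costs more than a false alarm.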
### 🚧 Phase 6: Detection System
- [ ] Build a real-time prediction script
- [ ] Parse incoming traffic features
- [ ] Trigger `🚨 ALERT: Port Scan Detected` on a positive classification
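The detection step could be sketched as a small function that takes any object with a scikit-learn style `predict()` method. So that the example runs without a saved `.pkl` file, it uses a `ThresholdStub` class in place of the trained model. That stub, the `check_traffic` helper, and the sample feature values are hypothetical.

```python
from typing import Protocol, Sequence

FEATURES = ("packet_count", "syn_count", "unique_dst_ports", "packet_rate")


class Classifier(Protocol):
    """Anything with a scikit-learn style predict(), e.g. a joblib-loaded model."""
    def predict(self, X: Sequence[Sequence[float]]) -> Sequence[int]: ...


class ThresholdStub:
    """Stand-in for the trained model so the sketch runs without a .pkl file."""
    def predict(self, X):
        # Flags a row as a scan when it touches more than 100 distinct ports.
        return [1 if row[2] > 100 else 0 for row in X]


def check_traffic(model: Classifier, features: dict[str, float], src_ip: str) -> str:
    """Classify one feature row and return the alert line to display."""
    row = [features[name] for name in FEATURES]  # keep the training column order
    if model.predict([row])[0] == 1:
        return f"🚨 ALERT: Port Scan Detected (source {src_ip})"
    return f"✅ CLEAN (source {src_ip})"


model = ThresholdStub()  # in practice: a model loaded with joblib.load(...)
print(check_traffic(model, {"packet_count": 1200, "syn_count": 1190,
                            "unique_dst_ports": 1000, "packet_rate": 400.0}, "10.0.0.5"))
print(check_traffic(model, {"packet_count": 40, "syn_count": 2,
                            "unique_dst_ports": 2, "packet_rate": 3.5}, "10.0.0.9"))
```

Building the row from a fixed `FEATURES` tuple matters: the model expects columns in the same order used during training, and a dictionary alone does not guarantee that.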
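### 🚧 Phase 7: Dashboard *(optional)*
- [ ] Build a Streamlit web UI
- [ ] Live detection feed
- [ ] Attack statistics and model confidence scores
### 🚧 Phase 8: Wrap-Up
- [ ] Document all findings with screenshots
- [ ] Clean up and optimize the codebase
- [ ] Publish a full write-up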
## `> cat progress.log`
```
Phases Completed : 1 / 8
Overall Progress : [██░░░░░░░░░░░░░░░░░░] 10%
Phase 1 Setup : ██████████████████████ ✅ Done
Phase 2 Capture : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Pending
Phase 3 Analysis : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Pending
Phase 4 Features : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Pending
Phase 5 Training : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Pending
Phase 6 Detection : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Pending
Phase 7 Dashboard : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Optional
Phase 8 Final : ░░░░░░░░░░░░░░░░░░░░░░ ⏳ Pending
```
## `> ls screenshots/ --coming-soon`
```
[ ] lab-setup.png ← VM network topology
[ ] tcpdump-capture.png ← Live packet capture
[ ] wireshark-analysis.png ← SYN flood visualization
[ ] dataset-preview.png ← Engineered feature table
[ ] model-results.png ← Confusion matrix & metrics
[ ] detection-output.png ← Alert system in action
```
## `> cat future.txt`
```
[→] Real-time live traffic detection pipeline
[→] Deep learning models (LSTM for sequence-based detection)
[→] Multi-attack classification (DDoS, brute-force, ARP spoofing)
[→] SIEM integration (Splunk / ELK Stack)
[→] Dockerized deployment for portable demo
```
## `> cat disclaimer.txt`
```
╔══════════════════════════════════════════════════════════════╗
║ ⚠️ EDUCATIONAL USE ONLY ║
║ ║
║ All attack simulations in this project are performed ║
║ exclusively in isolated, controlled lab environments. ║
║ ║
║ Running these techniques against systems without explicit ║
║ permission is ILLEGAL and unethical. ║
║ ║
║ The author accepts no responsibility for misuse. ║
╚══════════════════════════════════════════════════════════════╝
```
## `> whoami --author`
    
### **Kushan Bhagya**
*Cybersecurity Enthusiast · AI + Security Learner · Hands-On Builder*
*Where machine learning meets offensive security.* `[★ Star this repo if you find it useful]` `[🍴 Fork to build your own detector]`