Saksham-Bawa/Multi-Channel-Phishing-Detection-using-URL-Project

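A web application that detects phishing URLs and web pages by combining multi-channel deep-learning fusion with external threat intelligence.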

# Multi-Channel Phishing Detection Web App

This project implements a complete web application for multi-channel detection of phishing URLs and web pages. The system fuses AI model predictions with results from VirusTotal and Google Safe Browsing to provide a comprehensive threat analysis.

## Features

- **AI-powered analysis**: multi-channel neural network (Transformer + CNN + LSTM)
- **VirusTotal integration**: real-time malware scanning across more than 70 engines
- **Google Safe Browsing**: advanced threat detection
- **Professional UI**: dark, cybersecurity-themed interface
- **Scan history**: persistent storage of past analyses
- **REST API**: FastAPI backend for programmatic access

## Project Structure

```
phishing-detection/
├── backend/
│   ├── app.py                     # FastAPI main application
│   ├── routes/
│   │   ├── scan.py                # Scan endpoint
│   │   └── history.py             # History management
│   ├── services/
│   │   ├── model_service.py       # PyTorch model inference
│   │   ├── virustotal_service.py  # VT API integration
│   │   └── google_sb_service.py   # GSB API integration
│   ├── utils/
│   │   ├── url_utils.py           # URL processing
│   │   └── fusion.py              # Result fusion logic
│   └── requirements.txt           # Python dependencies
├── frontend/
│   ├── index.html                 # Main web interface
│   ├── css/
│   │   └── style.css              # Styling
│   └── js/
│       ├── main.js                # UI logic
│       ├── results.js             # Result rendering
│       └── api.js                 # API calls
├── models/
│   ├── models.py                  # Neural network architectures
│   └── multi_channel_phishing.pth # Trained model weights
├── data/
│   └── dataset.py                 # Data loading utilities
├── .env                           # API keys
├── scan_history.json              # Scan history (auto-generated)
└── README.md
```

## Setup

### 1. Install backend dependencies

```
cd backend
pip install -r requirements.txt
```

### 2. Set environment variables

Create a `.env` file in the root directory:

```
GOOGLE_API_KEY=your_google_api_key
VT_API_KEY=your_virustotal_api_key
```

### 3. Run the application

Option A (recommended):

```
python run_server.py
```

Option B (alternative):

```
python -m uvicorn backend.app:app --reload --host 0.0.0.0 --port 8000
```

The application will be available at `http://localhost:8000`.

### 4. Access the web interface

Open a browser and navigate to `http://localhost:8000/static/index.html`.

## API Usage

### Scan a URL

```
curl -X POST "http://localhost:8000/api/scan" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "js_trace": "optional js code"}'
```

### Get scan history

```
curl "http://localhost:8000/api/history"
```

## Architecture

The system uses a three-channel approach:

1. **URL Transformer channel**: DistilBERT processes URL tokens for semantic analysis
2. **Character CNN channel**: a convolutional network detects obfuscation patterns
3. **JS trace LSTM channel**: a bidirectional LSTM analyzes JavaScript execution traces

The model's results are fused, through weighted scoring, with external threat intelligence from VirusTotal and Google Safe Browsing.

## Security Notes

The external APIs integrated into this application may have usage limits and require API keys. Make sure you comply with the terms of service of VirusTotal and Google Safe Browsing.

```
docker run -it phishing-detector
```

By default, the container runs the `predict.py` script.

## Next Steps

The core foundation is fully functional. As a next step, consider collecting a large-scale multimodal dataset (for example, URLs together with scraped JS traces and image frames) and replacing the pandas DataFrame loader in `dataset.py` with your own data source.
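The README says the model's prediction is fused with VirusTotal and Google Safe Browsing results through weighted scoring, but it does not give the weights or the formula. The following is a minimal sketch of one way such a fusion could work, not the contents of `fusion.py`. The names `ExternalIntel`, `fuse_scores`, and `verdict`, the weights, and the 0.5 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExternalIntel:
    """Hypothetical container for external threat-intelligence results."""
    vt_malicious: int   # VirusTotal engines that flagged the URL
    vt_total: int       # VirusTotal engines that returned a result
    gsb_flagged: bool   # True if Google Safe Browsing reported a match


def fuse_scores(model_prob: float, intel: ExternalIntel,
                w_model: float = 0.5, w_vt: float = 0.3,
                w_gsb: float = 0.2) -> float:
    """Return a weighted risk score in [0, 1].

    The default weights are illustrative assumptions, not the project's values.
    """
    if not 0.0 <= model_prob <= 1.0:
        raise ValueError("model_prob must be in [0, 1]")
    # Fraction of engines that flagged the URL; 0 when no engine responded.
    vt_ratio = intel.vt_malicious / intel.vt_total if intel.vt_total else 0.0
    gsb_score = 1.0 if intel.gsb_flagged else 0.0
    # Normalize by the weight sum so the result stays in [0, 1].
    total = w_model + w_vt + w_gsb
    return (w_model * model_prob + w_vt * vt_ratio + w_gsb * gsb_score) / total


def verdict(score: float, threshold: float = 0.5) -> str:
    """Map a fused score to a label; the threshold is an assumption."""
    return "phishing" if score >= threshold else "benign"


if __name__ == "__main__":
    intel = ExternalIntel(vt_malicious=7, vt_total=70, gsb_flagged=True)
    score = fuse_scores(0.8, intel)
    print(f"{score:.2f} {verdict(score)}")
```

Normalizing by the weight sum keeps the score bounded even if the weights are later tuned and no longer sum to one.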
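The `curl` call to `/api/scan` in the API usage section can also be made from Python. This sketch uses only the standard library and assumes the server runs at `http://localhost:8000` as described in the setup steps. The helper names `build_scan_request` and `scan_url` are hypothetical, and since the README does not document the response schema, the code simply returns the decoded JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address given in the setup section


def build_scan_request(url, js_trace=None, base_url=BASE_URL):
    """Build a POST request for /api/scan (hypothetical helper)."""
    payload = {"url": url}
    if js_trace is not None:
        # "js_trace" is the optional field shown in the README's curl example.
        payload["js_trace"] = js_trace
    return urllib.request.Request(
        f"{base_url}/api/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan_url(url, js_trace=None, timeout=30.0):
    """Send the scan request and return the decoded JSON response.

    The response schema is not documented in the README, so the caller
    should inspect the returned object rather than rely on specific keys.
    """
    request = build_scan_request(url, js_trace)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the backend to be running locally.
    print(scan_url("https://example.com"))
```

Separating request construction from sending makes the payload easy to inspect or unit-test without a running server.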

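Tags: AI security, AMSI bypass, Apex, Ask search, AV bypass, Chat Copilot, CNN, FastAPI, Google Safe Browsing, LSTM, Python, PyTorch, REST API, Transformer, URL analysis, VirusTotal, artificial intelligence, credential scanning, backend development, multi-channel fusion, threat intelligence, threat detection, developer tools, malware scanning, search queries (dork), no backdoor, machine learning, deep learning, user-mode hook bypass, neural networks, cybersecurity, request interception, reverse-engineering tools, phishing detection, privacy protection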