zrnge/Malyzer

GitHub: zrnge/Malyzer

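A malware analysis framework driven by a local Ollama LLM. An agentic loop orchestrates static and dynamic analysis tools inside a FlareVM environment, covering the full workflow from sample identification to threat intelligence correlation.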

Stars: 3 | Forks: 0

# Malyze: AI-Driven Malware Analysis Framework

![Python](https://img.shields.io/badge/Python-3.10%2B-3776AB?style=flat&logo=python&logoColor=white) ![Platform](https://img.shields.io/badge/Platform-Windows-0078D4?style=flat&logo=windows&logoColor=white) ![License](https://img.shields.io/badge/License-MIT-green?style=flat) ![Ollama](https://img.shields.io/badge/LLM-Ollama-black?style=flat&logo=ollama&logoColor=white) ![FlareVM](https://img.shields.io/badge/FlareVM-Compatible-red?style=flat) ![MITRE ATT&CK](https://img.shields.io/badge/MITRE-ATT%26CK-E2231A?style=flat) ![MCP](https://img.shields.io/badge/MCP-Server-6B46C1?style=flat&logo=anthropic&logoColor=white) ![Version](https://img.shields.io/badge/Version-2.1-blue?style=flat)

## Overview

Malyze automates the complete malware analysis workflow. Instead of running every tool and dumping the output, the AI acts as the analyst: it picks **one tool** at a time, reads the result, forms a hypothesis, and decides what to investigate next, just as a human analyst would.

**Core features at a glance:**

- Agentic static analysis loop (up to 20 AI-driven iterations)
- Agentic dynamic analysis loop with live Procmon/FakeNet/tshark/Regshot/ProcDump
- Sample-filtered behavioral events (noise from Explorer/MsMpEng/Edge is excluded)
- CPU emulation via Speakeasy (Mandiant) for analyzing unpacked code
- Automatic UPX unpacking before analysis
- DGA domain detection on observed DNS queries
- IOC enrichment (GeoIP, URLhaus, passive DNS)
- Threat intelligence: MalwareBazaar, CIRCL hash lookup, VirusTotal, Shodan, AlienVault OTX
- Automatic YARA detection rule generation for HIGH/CRITICAL samples
- STIX 2.1 bundle export
- Web UI with live log streaming and an API
- MCP server for Claude / AI agent integration
- Local SQLite sample database with cross-session correlation by SHA256 and imphash

## Architecture

```
main.py ── web / CLI / MCP
  └── AnalysisWorkflow (workflow.py)
        └── MalyzeAgent (agent.py)
              │
              ├── Step 1  OS & tool inventory      (environment.py, tool_registry.py)
              ├── Step 2  File ID + threat intel   (file_identifier.py, intel/)
              ├── Step 3  Agentic static loop      (orchestrator.AgenticOrchestrator)
              │           AI picks tool → run → AI sees result → repeat
              ├── Step 4  Tool inventory report
              ├── Step 5  Agentic dynamic loop     (orchestrator.DynamicOrchestrator)
              │           Pre-exec:  FakeNet · Procmon · tshark · Autoruns baseline
              │           Execute:   sample runs in a detached process (psutil-supervised)
              │           Post-exec: Autoruns diff · Regshot diff · ProcDump
              │           RAG DB:    Procmon CSV → SQLite for active threat hunting
              ├── Step 6  Final AI synthesis       (ai/ollama_analyzer.py)
              └── Step 7  DB save + auto-YARA      (intel/sample_db.py, static/yara_generator.py)
```

## Static Analysis Tools

| Tool | Purpose |
|---|---|
| FLOSS | String extraction + deobfuscation |
| strings64 (Sysinternals) | Raw printable string extraction |
| Detect-It-Easy (DIE) | Packer / compiler / protector detection |
| CAPA | Capability detection mapped to MITRE ATT&CK |
| UPX | Automatic unpacking of UPX-packed binaries |
| pefile (Python) | PE headers, imports, sections, rich header, overlay |
| Capstone / objdump | Disassembly (x86/x64) |
| YARA | Custom rule matching |
| Speakeasy (Mandiant) | CPU emulation: API tracing, dynamic strings, network stubs |
| Entropy analysis | Per-section entropy + high-entropy block detection |
| XOR brute force | 1-byte and 2-byte key deobfuscation |
| pyelftools / readelf | ELF binary analysis |
| oletools | Office macro analysis, VBA extraction |
| pdfminer | PDF text, stream extraction, JavaScript detection |
| Script analyzer | PowerShell / JS / VBS / batch obfuscation scoring |

## Dynamic Analysis Tools

| Tool | Purpose |
|---|---|
| Procmon64 | Process / file / registry event capture |
| FakeNet-NG | Fake DNS/HTTP/SMTP services for network interception |
| tshark | Live packet capture → pcap |
| autorunsc | Persistence baseline snapshot → diff |
| Regshot | Full registry snapshot → diff |
| ProcDump | Memory dumps of processes spawned by the malware |

## Threat Intelligence Sources

| Source | Key required | Notes |
|---|---|---|
| MalwareBazaar (abuse.ch) | No | Hash lookup; free, always available |
| CIRCL hash lookup | No | Community hash database (NSRL + malware) |
| VirusTotal | Optional | 60-70 AV engines; free tier allows 4 requests/minute |
| Shodan | Optional | IP intelligence: open ports, banners |
| AlienVault OTX | Optional | Threat pulse feeds |
| URLhaus | No | Malicious URL/IP lookup built into IOC enrichment |
| Passive DNS | No | Historical DNS resolutions for extracted domains |
| DGA detector | n/a | Statistical scoring of observed DNS queries |

## Requirements

### System

- **Windows 10 / 11** (FlareVM recommended for dynamic analysis)
- **Python 3.10+**
- **[Ollama](https://ollama.com)** running locally (or on a host reachable from the sandbox)

### Recommended models

```
ollama pull mistral
ollama pull llama3.1
ollama pull gemma2
ollama pull deepseek-r1
```

### Python dependencies

```
pip install -r requirements.txt
```

### FlareVM tools (optional; each tool enables additional analysis modules)

| Tool | Source |
|---|---|
| FLOSS | github.com/mandiant/flare-floss |
| Detect-It-Easy (diec.exe) | github.com/horsicq/Detect-It-Easy |
| CAPA | github.com/mandiant/capa |
| strings64.exe | learn.microsoft.com/sysinternals |
| Procmon64.exe | learn.microsoft.com/sysinternals |
| autorunsc.exe | learn.microsoft.com/sysinternals |
| procdump64.exe | learn.microsoft.com/sysinternals |
| tshark.exe | wireshark.org |
| FakeNet.exe | github.com/mandiant/flare-fakenet-ng |
| Regshot-x64-Unicode.exe | sourceforge.net/projects/regshot |
| yara64.exe | virustotal.github.io/yara |
| upx.exe | github.com/upx/upx |

## Installation

```
git clone https://github.com/zrnge/malyzer.git
cd malyzer
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```

Alternatively, use the bundled batch installer on FlareVM:

```
install.bat
```

## Configuration

Edit `config.yaml` before first use:

```
ollama:
  host: "http://localhost:11434"   # or your host machine's LAN IP if running in a sandbox
  model: "mistral"                 # any model you have pulled
  timeout: 900

flarevm:
  floss: "floss.exe"
  capa: "capa.exe"
  die: "diec.exe"
  strings: "strings64.exe"
  procmon: "Procmon64.exe"
  tshark: "tshark.exe"
  autorunsc: "autorunsc.exe"
  procdump: "procdump64.exe"
  fakenet: "FakeNet.exe"
  regshot: "Regshot-x64-Unicode.exe"
  upx: "upx.exe"

analysis:
  max_static_iterations: 20    # AI tool-selection loops (static phase)
  max_dynamic_iterations: 10   # AI tool-selection loops (dynamic phase)
  tshark_capture_seconds: 60
  dynamic_timeout: 60          # seconds the sample runs before being killed

intel:
  malwarebazaar: true          # free, no key needed
  circl_hashlookup: true       # free, no key needed
  virustotal_api_key: ""       # optional
  shodan_api_key: ""           # optional
  otx_api_key: ""              # optional

output:
  dir: "./output"
  report_format: html          # html | pdf | json | all

analyst:
  name: "Security Analyst"
  org: "Malware Analysis Lab"
```

### Remote Ollama (sandbox → host)

If Malyze runs inside a VM/sandbox while Ollama runs on your host machine:

1. On the host, let Ollama listen on all network interfaces:

       # Windows
       $env:OLLAMA_HOST="0.0.0.0:11434"; ollama serve

2. Set `host` in `config.yaml` to the host's LAN IP, or override it at runtime:

       $env:OLLAMA_HOST="http://192.168.1.10:11434"
       python main.py analyze sample.exe

## Usage

### Web UI

```
python main.py web
```

Opens at `http://localhost:5000`, with drag-and-drop file upload, a static/dynamic analysis toggle, live log streaming, and report download.

### Full analysis (CLI)

```
python main.py analyze malware_sample.exe
```

### With dynamic analysis (run inside a sandbox!)

```
python main.py analyze malware_sample.exe --dynamic
```

### Specify the report format

```
python main.py analyze sample.exe --format html   # html | pdf | json | all
```

### Quick file identification

```
python main.py identify suspicious_file
```

### Extract strings only

```
python main.py strings sample.exe
```

### Regenerate a report from saved JSON

```
python main.py analyze output/sample_analysis.json --report-only
```

### MCP server (Claude / AI agent integration)

```
python main.py mcp-server
```

## Report Formats

| Format | Contents |
|---|---|
| **HTML** | Full interactive report: TTPs, IOCs, per-tool findings, YARA rules, STIX export link |
| **PDF** | Printable version generated with ReportLab |
| **JSON** | Raw machine-readable analysis data |
| **STIX 2.1** | Threat intelligence bundle (malware SDO + indicators + attack patterns) |

Reports are saved to `./output/` by default, in one subdirectory per sample.

## YARA Auto-Generation

For samples rated **HIGH** or **CRITICAL**, Malyze automatically generates YARA detection rules based on:

- Unique suspicious strings extracted from the sample
- Suspicious imported functions (e.g. `VirtualAlloc`, `WriteProcessMemory`)
- File metadata and section characteristics

Rules are saved to `output//sample.yar`.

## MCP Server

Malyze exposes all analysis capabilities through the **Model Context Protocol (MCP)**, which lets Claude Desktop and other MCP-compatible agents call the tools directly. Add it to your MCP client configuration:

```
{
  "mcpServers": {
    "malyze": {
      "command": "python",
      "args": ["main.py", "mcp-server"]
    }
  }
}
```

## Project Structure

```
malyze/
├── main.py                       # CLI entry point (click)
├── config.yaml                   # All configuration
├── requirements.txt
├── install.bat                   # FlareVM quick installer
├── mcp_config.json               # MCP client config template
├── rules/
│   └── packers.yar               # YARA rules for packer detection
└── malyze/
    ├── core/
    │   ├── agent.py              # 7-step analysis pipeline
    │   ├── orchestrator.py       # Agentic loops (AgenticOrchestrator + DynamicOrchestrator)
    │   ├── tool_registry.py      # Tool definitions + availability scanning
    │   ├── environment.py        # OS detection + tool inventory
    │   ├── file_identifier.py    # File type detection + hash computation
    │   └── workflow.py           # CLI / Web / MCP entry wrapper
    ├── static/
    │   ├── pe_analyzer.py        # PE headers, imports, sections
    │   ├── strings_extractor.py  # String extraction + XOR brute-force
    │   ├── entropy_analyzer.py   # Section entropy analysis
    │   ├── packer_detector.py    # Packer / protector identification
    │   ├── disassembler.py       # Capstone disassembly
    │   ├── emulation_analyzer.py # Speakeasy CPU emulation
    │   ├── unpacker.py           # Automatic UPX unpacking
    │   ├── yara_generator.py     # Auto-YARA rule generation
    │   ├── office_analyzer.py    # Office macro / OLE analysis
    │   ├── pdf_analyzer.py       # PDF analysis
    │   └── script_analyzer.py    # Script obfuscation scoring
    ├── dynamic/
    │   ├── behavior_monitor.py   # Procmon, FakeNet, Regshot orchestration
    │   └── rag_db.py             # Procmon CSV → SQLite for AI threat hunting
    ├── intel/
    │   ├── lookup.py             # MalwareBazaar + VirusTotal + Shodan + OTX
    │   ├── deep_intel.py         # Extended intelligence analysis
    │   ├── enrichment.py         # GeoIP + URLhaus IOC enrichment
    │   ├── dga_detector.py       # DGA domain scoring
    │   ├── pdns.py               # Passive DNS resolution
    │   └── sample_db.py          # SQLite cross-sample correlation database
    ├── ai/
    │   └── ollama_analyzer.py    # Ollama LLM integration + prompt construction
    ├── report/
    │   ├── generator.py          # HTML / PDF / JSON report generation
    │   ├── stix_export.py        # STIX 2.1 bundle export
    │   └── templates/
    ├── mcp/
    │   └── server.py             # MCP server (tool exposure)
    └── web/
        ├── server.py             # Flask web server + REST API
        └── templates/
            └── index.html        # Web UI
```

## Security Warning

**Dynamic analysis executes malware samples on the host.**

- Always run `--dynamic` inside an isolated sandbox (FlareVM with a clean snapshot, a network-isolated VM, or a similar environment)
- Never run dynamic analysis on a production or personal machine
- The Web UI binds only to `localhost` by default; do not expose it on a network interface without setting a strong `web.api_key` in `config.yaml`

## License

MIT
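
The README describes the agentic loop only at a high level: the AI picks one tool, the tool runs, the AI reads the result, and the cycle repeats up to a fixed iteration budget. The following is a minimal sketch of that control flow, not Malyze's actual code. Every name in it (`run_agentic_loop`, `rule_based_chooser`, the demo tools) is hypothetical, and a simple rule-based function stands in for the Ollama model so the example runs without any external service.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration; none of these names come from Malyze.
Tool = Callable[[str], str]                       # sample path -> tool output
History = List[Tuple[str, str]]                   # (tool name, output) pairs
Chooser = Callable[[History, List[str]], Optional[str]]


def run_agentic_loop(sample: str, tools: Dict[str, Tool], choose: Chooser,
                     max_iterations: int = 20) -> History:
    """Run one tool per iteration until the chooser stops or the budget is spent."""
    history: History = []
    for _ in range(max_iterations):
        used = {name for name, _ in history}
        remaining = [name for name in tools if name not in used]
        if not remaining:
            break                                 # every tool has already run
        pick = choose(history, remaining)
        if pick is None or pick not in remaining:
            break                                 # chooser is done or gave a bad name
        history.append((pick, tools[pick](sample)))
    return history


def rule_based_chooser(history: History, remaining: List[str]) -> Optional[str]:
    """Stand-in for the LLM: decide the next tool from the latest result."""
    if not history:
        return "packer_check" if "packer_check" in remaining else remaining[0]
    last_name, last_output = history[-1]
    if last_name == "packer_check" and "UPX" in last_output and "unpack" in remaining:
        return "unpack"                           # follow up on the packer finding
    if "strings" in remaining:
        return "strings"
    return None                                   # nothing left worth running


# Fake tools that return canned output instead of touching a real sample.
demo_tools: Dict[str, Tool] = {
    "packer_check": lambda path: "packer: UPX",
    "unpack": lambda path: "unpacked to " + path + ".unpacked",
    "strings": lambda path: "http://example.invalid/gate.php",
}

demo_history = run_agentic_loop("sample.exe", demo_tools, rule_based_chooser)
for tool_name, output in demo_history:
    print(tool_name, "->", output)
```

In a real setup the chooser would send the history to the model and parse the tool name from its reply. Checking the reply against `remaining` guards the loop against a model that names a tool that does not exist or has already run.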

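
The static tool list mentions per-section entropy with high-entropy block detection, and the DGA detector is described as statistical scoring. The README does not say which statistics Malyze computes, so the sketch below is an assumption: it uses plain Shannon entropy over bytes, a common building block for both tasks. The function names, the 256-byte block size, and the 7.0 threshold are illustrative choices, not values taken from the project.

```python
import math
from collections import Counter
from typing import List, Tuple


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty or constant input, 8.0 max)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def high_entropy_blocks(data: bytes, block_size: int = 256,
                        threshold: float = 7.0) -> List[Tuple[int, float]]:
    """Return (offset, entropy) for each fixed-size block at or above the threshold.

    block_size and threshold are illustrative defaults, not Malyze settings.
    """
    hits = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        entropy = shannon_entropy(block)
        if entropy >= threshold:
            hits.append((offset, entropy))
    return hits


# Two blocks of zero padding followed by one block that uses every byte value once.
demo_data = b"\x00" * 512 + bytes(range(256))
demo_hits = high_entropy_blocks(demo_data)
print(demo_hits)
```

The same `shannon_entropy` function can be applied to an encoded domain label as one input to a DGA score, although entropy alone tends to misclassify short or dictionary-based domains, so a real detector would likely combine it with other features.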
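Tags: AI agents, AI risk mitigation, Ask search, Cloudflare, CPU emulation, DAST, DGA domain detection, DNS information, DNS brute forcing, reverse DNS lookup, FlareVM, IOC extraction, bulk IP address processing, LLM evaluation, MCP server, MITRE ATT&CK, Ollama, Python, Speakeasy, SQLite, STIX 2.1, TCP SYN scanning, VirusTotal, Web UI, Windows platform, YARA rules, cloud security monitoring, cloud asset inventory, threat intelligence, developer tools, malware analysis, packet sniffing, no backdoors, local LLM, sandbox, network information gathering, cybersecurity, network security auditing, automated analysis, cross-site scripting, reverse engineering tools, reverse engineering, privacy protection, static analysis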