Veryyes/BINocular

GitHub: Veryyes/BINocular

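A unified static binary analysis framework that hides the differences between the Ghidra and Rizin disassemblers behind an abstraction layer, providing a single Python API and database persistence.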

Stars: 9 | Forks: 0

# BINocular - Common Binary Analysis Framework

![Static Badge](https://img.shields.io/badge/Version-1.1-navy) ![Static Badge](https://img.shields.io/badge/license-GPLv3-green) ![Static Badge](https://img.shields.io/badge/Python-3.10-blue) ![Static Badge](https://img.shields.io/badge/Disassembler-Rizin-yellow) ![Static Badge](https://img.shields.io/badge/Disassembler-Ghidra-red)

BINocular is a Python package for static analysis of compiled binaries through a common API layer. It is an abstraction layer between different disassemblers and provides:

- Disassembler-agnostic representations of common binary analysis primitives and concepts
  * Assembly instructions
  * Intermediate representation (e.g., pcode)
  * Functions
    - Compiled
    - Source code
  * Control flow graphs
- A CLI and API for installing the supported disassemblers
- Serialization/deserialization of concepts (e.g., functions, basic blocks, instructions)
- Persistent storage of objects in a SQL database

## Disassembler Backend Support

### [Ghidra](https://www.ghidra-sre.org/)

### [Rizin](https://rizin.re/)

## Installation

`pip install BINocular`

## Example CLI Usage

**List the Ghidra versions available for installation**

```
$ binocular install ghidra -l
11.1.1
11.1
11.0.3
11.0.2
11.0.1
11.0
10.4
10.3.3
10.3.2
10.3.1
10.3
```

**Install Ghidra from the command line**

```
$ binocular install ghidra -v 11.1 -p ~/Documents/ghidra_install_location
2024-06-15 13:41:04 binocular.ghidra[472653] INFO Installing Ghidra 11.1 to /home/brandon/Documents/ghidra_install_location
2024-06-15 13:41:27 binocular.ghidra[472653] INFO Extracting Ghidra
2024-06-15 13:41:31 pyhidra.javac[472653] INFO WARNING
2024-06-15 13:41:32 pyhidra.launcher[472653] INFO Installed plugin: pyhidra 1.1.0
```

**Parse a binary and load it into a SQLite database**

```
$ binocular parse ./test/example rizin --uri sqlite:///$(pwd)/example.db
2024-06-15 13:46:23 binocular.disassembler[473064] INFO [Rizin] Analyzing test/example
2024-06-15 13:46:23 binocular.disassembler[473064] INFO [Rizin] Analysis Complete: 0.03s
2024-06-15 13:46:23 binocular.disassembler[473064] INFO [Rizin] Binary Data Loaded: 0.00s
2024-06-15 13:46:25 binocular.disassembler[473064] INFO [Rizin] 49 Basic Blocks Loaded
2024-06-15 13:46:25 binocular.disassembler[473064] INFO [Rizin] 18 Functions Loaded
2024-06-15 13:46:25 binocular.disassembler[473064] INFO [Rizin] Function Data Loaded: 2.26s
2024-06-15 13:46:25 binocular.disassembler[473064] INFO [Rizin] Ave Function Load Time: 0.13s
2024-06-15 13:46:25 binocular.disassembler[473064] INFO [Rizin] Parsing Complete: 2.26s
Binary:
    Name: example
    Arch: x86
    Bits: 64
    Endian: Endian.LITTLE
    SHA256: a7f9141c1781c20d13b8442f24fcddba4b75b4b73ae04e734a92a79fcf0869c3
    Size: 18088
    Num Functions: 18
Inserting to DB
```

## Example Python Usage

### Installing Ghidra at commit `dee48e9`

This assumes you already have all the build dependencies needed to build Ghidra (the same applies to the other disassemblers).

```
from binocular import Ghidra

install_dir = "./test_install"

if not Ghidra.is_installed(install_dir=install_dir):
    # Install Ghidra @ commit dee48e9 if Ghidra isn't installed already
    # This may take a while since it builds Ghidra from scratch
    Ghidra.install(version='dee48e9', install_dir=install_dir, build=True)
```

### Serializing Objects

All the basic primitives (such as `Instruction`, `Basic Block`, and `NativeFunction`) are built on [Pydantic](https://docs.pydantic.dev/latest/) and Python type hints. This means we get all the benefits of Pydantic, such as type validation and JSON serialization.

```
from binocular import Ghidra

with Ghidra() as g:
    g.load("./test/example")
    b = g.binary

    # Look up the function by its symbol name and take its first basic block
    f = g.function_sym("fib")
    bb = list(f.basic_blocks)[0]
    print(bb.model_dump_json())
```

**Output (after piping through jq)**

```
{
  "endianness": 0,
  "architecture": "x86",
  "bitness": 64,
  "address": 1053275,
  "pie": 3,
  "instructions": [
    {
      "endianness": 0,
      "architecture": "x86",
      "bitness": 64,
      "address": 1053275,
      "data": "837dec01",
      "asm": "CMP",
      "comment": "",
      "ir": {
        "lang_name": 2,
        "data": "(unique, 0x4400, 8) INT_ADD (register, 0x28, 8) , (const, 0xffffffffffffffec, 8);(unique, 0xdb00, 4) LOAD (const, 0x1b1, 4) , (unique, 0x4400, 8);(unique, 0x27600, 4) COPY (unique, 0xdb00, 4);(register, 0x200, 1) INT_LESS (unique, 0x27600, 4) , (const, 0x1, 4);(register, 0x20b, 1) INT_SBORROW (unique, 0x27600, 4) , (const, 0x1, 4);(unique, 0x27700, 4) INT_SUB (unique, 0x27600, 4) , (const, 0x1, 4);(register, 0x207, 1) INT_SLESS (unique, 0x27700, 4) , (const, 0x0, 4);(register, 0x206, 1) INT_EQUAL (unique, 0x27700, 4) , (const, 0x0, 4);(unique, 0x15080, 4) INT_AND (unique, 0x27700, 4) , (const, 0xff, 4);(unique, 0x15100, 1) POPCOUNT (unique, 0x15080, 4);(unique, 0x15180, 1) INT_AND (unique, 0x15100, 1) , (const, 0x1, 1);(register, 0x202, 1) INT_EQUAL (unique, 0x15180, 1) , (const, 0x0, 1)"
      }
    },
    {
      "endianness": 0,
      "architecture": "x86",
      "bitness": 64,
      "address": 1053279,
      "data": "7507",
      "asm": "JNZ",
      "comment": "",
      "ir": {
        "lang_name": 2,
        "data": "(unique, 0xe480, 1) BOOL_NEGATE (register, 0x206, 1); --- CBRANCH (ram, 0x101268, 8) , (unique, 0xe480, 1)"
      }
    }
  ],
  "branches": [
    {
      "btype": 1,
      "target": 1053288
    },
    {
      "btype": 1,
      "target": 1053281
    }
  ],
  "is_prologue": false,
  "is_epilogue": false,
  "xrefs": [
    {
      "from_": 1053279,
      "to": 1053288,
      "type": 1
    }
  ]
}
```

### Loading a Binary and Uploading It to a Database

Every primitive has a corresponding [SQLAlchemy](https://www.sqlalchemy.org/) ORM class whose name carries the suffix "ORM" (e.g., `NativeFunctionORM`, `BinaryORM`).

```
from sqlalchemy.orm import Session
from binocular import Ghidra, Backend, FunctionSource

Backend.set_engine('sqlite:////home/brandon/Documents/BINocular/example.db')

# If the install_dir argument is not specified, the built-in default path
# (inside the Python package) is used
with Ghidra() as g:
    g.load("./test/example")
    b = g.binary

    for f in b.functions:
        name = f.names[0]
        # Auto parse the source code and associate the functions within
        # the source to the parsed functions that Ghidra has found
        src = FunctionSource.from_file(name, './test/example.c')
        if src is not None:
            f.sources.add(src)

    # Load the entire binary into the database configured with Backend.set_engine above
    with Session(Backend.engine) as s:
        b.db_add(s)
        s.commit()
```

### Querying Data from the Database

This is an example of querying a binary by name. It is plain SQL/SQLAlchemy, so you can run any query you want. Use the `.from_orm()` function to convert an ORM object back into a Pydantic BaseModel object.

```
from sqlalchemy import select
from sqlalchemy.orm import Session
from binocular import Backend, Binary
from binocular.db import BinaryORM, NameORM

Backend.set_engine('sqlite:////home/brandon/Documents/BINocular/example.db')

with Session(Backend.engine) as session:
    # Select a binary whose file name is "example"
    binary = session.execute(
        select(BinaryORM).join(NameORM, BinaryORM.names).where(NameORM.name == 'example')
    ).all()
    # Each result row is a one-element tuple; take the first match
    binary = [b[0] for b in binary][0]

    # Convert the BinaryORM object to a Binary object
    # and get all its functions
    funcs = Binary.from_orm(binary).functions
    print(f"example has {len(funcs)} functions")
```
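
### Post-processing Serialized Output

Because the primitives serialize to plain JSON, the serialized basic block shown earlier can be inspected without BINocular or a disassembler installed. The following is a minimal sketch using only the Python standard library. It relies on a trimmed copy of that output (the CMP instruction's IR string is shortened to its first two operations), and it assumes, based solely on that sample, that addresses are serialized as decimal integers, `data` is a hex string of the instruction bytes, and the operations inside `ir.data` are separated by `;`. The helper name `summarize_block` is hypothetical and not part of BINocular.

```python
import json

# Trimmed copy of the serialized basic block shown earlier.
# Only the fields used below are kept, and the CMP IR string is shortened.
BLOCK_JSON = """
{
  "address": 1053275,
  "instructions": [
    {"address": 1053275, "data": "837dec01", "asm": "CMP",
     "ir": {"lang_name": 2, "data": "(unique, 0x4400, 8) INT_ADD (register, 0x28, 8) , (const, 0xffffffffffffffec, 8);(unique, 0xdb00, 4) LOAD (const, 0x1b1, 4) , (unique, 0x4400, 8)"}},
    {"address": 1053279, "data": "7507", "asm": "JNZ",
     "ir": {"lang_name": 2, "data": "(unique, 0xe480, 1) BOOL_NEGATE (register, 0x206, 1); --- CBRANCH (ram, 0x101268, 8) , (unique, 0xe480, 1)"}}
  ],
  "branches": [{"btype": 1, "target": 1053288}, {"btype": 1, "target": 1053281}]
}
"""


def summarize_block(block: dict) -> list[str]:
    """Hypothetical helper: one summary line per instruction, plus branch targets."""
    lines = []
    for ins in block["instructions"]:
        # Assumption: "data" holds the raw instruction bytes as a hex string
        raw = bytes.fromhex(ins["data"])
        # Assumption: IR operations are separated by ';' (inferred from the sample)
        ops = [op.strip() for op in ins["ir"]["data"].split(";") if op.strip()]
        lines.append(
            f"{ins['address']:#x}  {ins['asm']:<4} {len(raw)} bytes, {len(ops)} IR ops"
        )
    # Addresses are decimal in the JSON; print them as hex for readability
    targets = ", ".join(f"{b['target']:#x}" for b in block["branches"])
    lines.append(f"branches -> {targets}")
    return lines


block = json.loads(BLOCK_JSON)
summary = summarize_block(block)
print("\n".join(summary))
```

Printed in hexadecimal, the first branch target (`0x101268`) matches the `ram` address in the JNZ instruction's `CBRANCH` operation, and the second (`0x101261`) is the address immediately after the two-byte JNZ at `0x10125f`, which is consistent with a fall-through edge.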
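Tags: API, CFG, CLI, Cloudflare Workers, DAST, Ghidra, Python, Rizin, SQLite, WiFi technology, intermediate language, binary analysis, binary security, cloud security monitoring, cloud security operations, cloud asset inventory, code analysis, credential management, disassembler, malware analysis, intelligence gathering, abstraction layer, control flow graph, backdoor-free, vulnerability research, network debugging, automation, reverse engineering tools, reverse engineering, image verification, static analysis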