cea-sec/miasm
GitHub: cea-sec/miasm
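An open-source reverse engineering framework written in Python. It provides multi-architecture disassembly, an intermediate language, JIT emulation and symbolic execution for analyzing, modifying and automatically processing binary programs.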
Stars: 3844 | Forks: 487
Disassemble the shellcode at address `0` (`c` and `loc_db` are the container and location database created beforehand):
```
>>> from miasm.analysis.machine import Machine
>>> machine = Machine('x86_32')
>>> mdis = machine.dis_engine(c.bin_stream, loc_db=loc_db)
>>> asmcfg = mdis.dis_multiblock(0)
>>> for block in asmcfg.blocks:
...     print(block)
...
loc_0
LEA ECX, DWORD PTR [ECX + 0x4]
LEA EBX, DWORD PTR [EBX + 0x1]
CMP CL, 0x1
JZ loc_10
-> c_next:loc_b c_to:loc_10
loc_10
LEA EBX, DWORD PTR [EBX + 0x1]
-> c_next:loc_13
loc_b
LEA EBX, DWORD PTR [EBX + 0xFFFFFFFF]
JMP loc_13
-> c_to:loc_13
loc_13
MOV EAX, EBX
RET
```
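The listing above can be read as a small control-flow graph: `c_next` is the fall-through successor and `c_to` is the branch target. As a reading aid only, the sketch below re-encodes those edges in a plain dictionary and enumerates the paths from `loc_0` to the block ending in `RET`. It does not use Miasm's own graph API; `CFG` and `paths` are names made up for this illustration.

```python
# Plain-Python model of the edges printed above; not Miasm's AsmCFG API.
CFG = {
    "loc_0": ["loc_b", "loc_10"],   # c_next:loc_b, c_to:loc_10
    "loc_10": ["loc_13"],           # c_next:loc_13
    "loc_b": ["loc_13"],            # c_to:loc_13
    "loc_13": [],                   # ends with RET
}

def paths(cfg, start, end, prefix=()):
    """Yield every acyclic path from start to end as a tuple of block names."""
    prefix = prefix + (start,)
    if start == end:
        yield prefix
        return
    for succ in cfg[start]:
        if succ not in prefix:      # skip blocks already on the current path
            yield from paths(cfg, succ, end, prefix)

for path in paths(CFG, "loc_0", "loc_13"):
    print(" -> ".join(path))
```

Both paths end in `loc_13`; which one is taken depends on the `CMP CL, 0x1` / `JZ` pair in `loc_0`.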
Initialize the JIT engine with a stack:
```
>>> jitter = machine.jitter(loc_db, jit_type='python')
>>> jitter.init_stack()
```
Add the shellcode at an arbitrary memory location:
```
>>> run_addr = 0x40000000
>>> from miasm.jitter.csts import PAGE_READ, PAGE_WRITE
>>> jitter.vm.add_memory_page(run_addr, PAGE_READ | PAGE_WRITE, s)
```
Create a sentinel to catch the return of the shellcode:
```
>>> def code_sentinelle(jitter):
...     jitter.running = False
...     jitter.pc = 0
...     return True
...
>>> jitter.add_breakpoint(0x1337beef, code_sentinelle)
>>> jitter.push_uint32_t(0x1337beef)
```
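The sentinel works because the shellcode's final `RET` pops the fake return address that was pushed beforehand, and the breakpoint registered on that address then stops the run. The toy model below illustrates only this control flow. `ToyJitter` and `stop_at_sentinel` are hypothetical names, the "program" is a dictionary of mnemonics rather than real machine code, and none of this reflects how Miasm implements its jitter.

```python
# Toy model of the sentinel pattern; ToyJitter is hypothetical, not Miasm code.
SENTINEL = 0x1337BEEF

class ToyJitter:
    def __init__(self):
        self.stack = []          # holds return addresses only
        self.breakpoints = {}    # address -> callback
        self.pc = 0
        self.running = False
        self.trace = []

    def add_breakpoint(self, addr, callback):
        self.breakpoints[addr] = callback

    def push(self, value):
        self.stack.append(value)

    def run(self, program, start):
        """program maps an address to (mnemonic, next address or None for RET)."""
        self.pc = start
        self.running = True
        while self.running:
            callback = self.breakpoints.get(self.pc)
            if callback is not None:
                callback(self)   # the callback is expected to stop the run
                continue
            mnemonic, next_pc = program[self.pc]
            self.trace.append(mnemonic)
            # RET pops the return address: here, the sentinel pushed earlier
            self.pc = self.stack.pop() if next_pc is None else next_pc

def stop_at_sentinel(jitter):
    jitter.running = False
    return True

toy = ToyJitter()
toy.add_breakpoint(SENTINEL, stop_at_sentinel)
toy.push(SENTINEL)
toy.run({0x40000013: ("MOV EAX, EBX", 0x40000015),
         0x40000015: ("RET", None)}, 0x40000013)
print(toy.trace, hex(toy.pc))
```

The sentinel address never has to be mapped in memory: execution stops in the callback before anything is fetched from it.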
Enable logging:
```
>>> jitter.set_trace_log()
```
Run at an arbitrary address:
```
>>> jitter.init_run(run_addr)
>>> jitter.continue_run()
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000000 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000000
40000000 LEA ECX, DWORD PTR [ECX+0x4]
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
....
4000000e JMP loc_0000000040000013:0x40000013
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000013
40000013 MOV EAX, EBX
RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000004 RDX 0000000000000000
RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
RIP 0000000040000013
40000015 RET
>>>
```
Interact with the jitter:
```
>>> jitter.vm
ad 1230000 size 10000 RW_ hpad 0x2854b40
ad 40000000 size 16 RW_ hpad 0x25e0ed0
>>> hex(jitter.cpu.EAX)
'0x0L'
>>> jitter.cpu.ESI = 12
```
## Symbolic execution
Initialize the IR pool:
```
>>> lifter = machine.lifter_model_call(loc_db)
>>> ircfg = lifter.new_ircfg_from_asmcfg(asmcfg)
```
Initialize the engine with default symbolic values:
```
>>> from miasm.ir.symbexec import SymbolicExecutionEngine
>>> sb = SymbolicExecutionEngine(lifter)
```
Launch the execution:
```
>>> symbolic_pc = sb.run_at(ircfg, 0)
>>> print(symbolic_pc)
((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
```
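The returned expression reads as a ternary: `cond?(a,b)` evaluates to `a` when `cond` is non-zero and to `b` otherwise, and `[0:8]` slices the low 8 bits. A minimal sketch, using plain Python integers rather than Miasm expressions, evaluates it for concrete values of `ECX` (`next_pc` is a hypothetical helper name):

```python
def next_pc(ecx):
    """Evaluate ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10) for a concrete 32-bit ECX."""
    low_byte = ((ecx + 0x4) & 0xFFFFFFFF) & 0xFF   # (ECX + 0x4)[0:8]
    cond = (low_byte + 0xFF) & 0xFF                # 8-bit addition, i.e. CL - 1
    return 0xB if cond != 0 else 0x10              # cond ? (0xB, 0x10)

print(hex(next_pc(0)))                  # 0xb
print(hex(next_pc(-3 & 0xFFFFFFFF)))    # 0x10
```

The branch to `0x10` is taken only when the low byte of `ECX + 4` equals 1, which is the condition tested by `CMP CL, 0x1` followed by `JZ`.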
Same, with step logs (only changes are displayed):
```
>>> sb = SymbolicExecutionEngine(lifter, machine.mn.regs.regs_init)
>>> symbolic_pc = sb.run_at(ircfg, 0, step=True)
Instr LEA ECX, DWORD PTR [ECX + 0x4]
Assignblk:
ECX = ECX + 0x4
________________________________________________________________________________
ECX = ECX + 0x4
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
EBX = EBX + 0x1
ECX = ECX + 0x4
________________________________________________________________________________
Instr CMP CL, 0x1
Assignblk:
zf = (ECX[0:8] + -0x1)?(0x0,0x1)
nf = (ECX[0:8] + -0x1)[7:8]
pf = parity((ECX[0:8] + -0x1) & 0xFF)
of = ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1))[7:8]
cf = (((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1)) ^ ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1)))[7:8]
af = ((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1))[4:5]
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
pf = parity((ECX + 0x4)[0:8] + 0xFF)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = ECX + 0x4
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x1
________________________________________________________________________________
Instr JZ loc_key_1
Assignblk:
IRDst = zf?(loc_key_1,loc_key_2)
EIP = zf?(loc_key_1,loc_key_2)
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = ECX + 0x4
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x1
________________________________________________________________________________
>>>
```
Retry the execution with a concrete ECX. Here, the symbolic / concolic execution reaches the end of the shellcode:
```
>>> from miasm.expression.expression import ExprInt
>>> sb.symbols[machine.mn.regs.ECX] = ExprInt(-3, 32)
>>> symbolic_pc = sb.run_at(ircfg, 0, step=True)
Instr LEA ECX, DWORD PTR [ECX + 0x4]
Assignblk:
ECX = ECX + 0x4
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = 0x1
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x1
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
af = (((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[4:5]
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = parity((ECX + 0x4)[0:8] + 0xFF)
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = ((ECX + 0x4)[0:8] + 0xFF)?(0x0,0x1)
ECX = 0x1
of = ((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1))[7:8]
nf = ((ECX + 0x4)[0:8] + 0xFF)[7:8]
cf = (((((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8]) & ((ECX + 0x4)[0:8] ^ 0x1)) ^ ((ECX + 0x4)[0:8] + 0xFF) ^ (ECX + 0x4)[0:8] ^ 0x1)[7:8]
EBX = EBX + 0x2
________________________________________________________________________________
Instr CMP CL, 0x1
Assignblk:
zf = (ECX[0:8] + -0x1)?(0x0,0x1)
nf = (ECX[0:8] + -0x1)[7:8]
pf = parity((ECX[0:8] + -0x1) & 0xFF)
of = ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1))[7:8]
cf = (((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1)) ^ ((ECX[0:8] ^ (ECX[0:8] + -0x1)) & (ECX[0:8] ^ 0x1)))[7:8]
af = ((ECX[0:8] ^ 0x1) ^ (ECX[0:8] + -0x1))[4:5]
________________________________________________________________________________
af = 0x0
EIP = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
pf = 0x1
IRDst = ((ECX + 0x4)[0:8] + 0xFF)?(0xB,0x10)
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x2
________________________________________________________________________________
Instr JZ loc_key_1
Assignblk:
IRDst = zf?(loc_key_1,loc_key_2)
EIP = zf?(loc_key_1,loc_key_2)
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x10
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x2
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
EBX = EBX + 0x1
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x10
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
________________________________________________________________________________
Instr LEA EBX, DWORD PTR [EBX + 0x1]
Assignblk:
IRDst = loc_key_3
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x13
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
________________________________________________________________________________
Instr MOV EAX, EBX
Assignblk:
EAX = EBX
________________________________________________________________________________
af = 0x0
EIP = 0x10
pf = 0x1
IRDst = 0x13
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
EAX = EBX + 0x3
________________________________________________________________________________
Instr RET
Assignblk:
IRDst = @32[ESP[0:32]]
ESP = {ESP[0:32] + 0x4 0 32}
EIP = @32[ESP[0:32]]
________________________________________________________________________________
af = 0x0
EIP = @32[ESP]
pf = 0x1
IRDst = @32[ESP]
zf = 0x1
ECX = 0x1
of = 0x0
nf = 0x0
cf = 0x0
EBX = EBX + 0x3
ESP = ESP + 0x4
EAX = EBX + 0x3
________________________________________________________________________________
>>>
```
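The flag expressions listed under `Assignblk` for `CMP CL, 0x1` can be checked by hand. The sketch below transcribes them into plain Python integer arithmetic; it is not Miasm code, and `parity`, `bit` and `cmp_cl_1_flags` are hypothetical helpers. `parity` is assumed to follow the x86 PF convention (1 when the low byte has an even number of set bits), which agrees with `pf = 0x1` for a zero result in the log.

```python
def parity(value):
    """x86-style parity flag: 1 if the low byte has an even number of set bits."""
    return 1 - (bin(value & 0xFF).count("1") & 1)

def bit(value, index):
    """Single-bit slice, i.e. value[index:index + 1] in the log's notation."""
    return (value >> index) & 1

def cmp_cl_1_flags(ecx):
    cl = ecx & 0xFF                  # ECX[0:8]
    res = (cl + -0x1) & 0xFF         # ECX[0:8] + -0x1, truncated to 8 bits
    return {
        "zf": 0 if res else 1,       # res ? (0x0, 0x1)
        "nf": bit(res, 7),
        "pf": parity(res),
        "of": bit((cl ^ res) & (cl ^ 0x1), 7),
        "cf": bit(((cl ^ 0x1) ^ res) ^ ((cl ^ res) & (cl ^ 0x1)), 7),
        "af": bit((cl ^ 0x1) ^ res, 4),
    }

print(cmp_cl_1_flags(0x1))
```

With `ECX = 0x1`, as in the concrete run, this gives `zf = 1`, `pf = 1` and all other flags 0, matching the values printed after the `CMP CL, 0x1` step.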
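# How does it work?
Miasm embeds its own disassembler, intermediate language and instruction semantics. It is written in Python.
To emulate code, it uses LLVM, GCC, Clang or Python to JIT the intermediate representation. It can emulate shellcode as well as all or part of a binary. Python callbacks can be executed to interact with the execution, for instance to emulate the effects of library functions.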
# Documentation
Some documentation resources are available in the [doc](doc) folder.
Auto-generated documentation is available here:
* [Doxygen](http://miasm.re/miasm_doxygen)
* [pdoc](http://miasm.re/miasm_pdoc)
# Obtaining Miasm
* Clone the repository: [Miasm on GitHub](https://github.com/cea-sec/miasm/)
* Get one of the Docker images from [Docker Hub](https://registry.hub.docker.com/u/miasm/)
## Software requirements
Miasm uses:
* python-pyparsing
* python-dev
* optionally, python-pycparser (version >= 2.17)

To enable code JIT, one of the following is mandatory:
* GCC
* Clang
* LLVM with Numba llvmlite, see below

Optionally, Miasm can also use:
* Z3, the [Theorem Prover](https://github.com/Z3Prover/z3)
## Configuration
To use the jitter, GCC or LLVM is recommended:
* GCC (any version)
* Clang (any version)
* LLVM
  * Debian (testing/unstable): not tested
  * Debian stable/Ubuntu/Kali/others: `pip install llvmlite` or install from [llvmlite](https://github.com/numba/llvmlite)
  * Windows: not tested
* Build and install Miasm:
```
$ cd miasm_directory
$ python setup.py build
$ sudo python setup.py install
```
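If something goes wrong while compiling one of the jitter modules, Miasm skips the error and disables the corresponding module (see the compilation output).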
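Because a jitter module that fails to compile is silently disabled, it can help to check beforehand which of the tools listed above are present on the machine. The following is a standalone sketch that uses only the Python standard library. `available_jit_tools` is a hypothetical helper, not part of Miasm, and it does not reproduce Miasm's own detection logic: it only tests whether `gcc` and `clang` are on `PATH` and whether `llvmlite` is importable. The returned strings are tool names and are not guaranteed to be valid `jit_type` values.

```python
import importlib.util
import shutil

def available_jit_tools():
    """List the JIT-related tools found locally (names are tool names only)."""
    tools = ["python"]                  # jit_type='python' is used in the examples above
    for compiler in ("gcc", "clang"):
        if shutil.which(compiler):      # executable found on PATH
            tools.append(compiler)
    if importlib.util.find_spec("llvmlite") is not None:
        tools.append("llvmlite")        # Python bindings needed for the LLVM backend
    return tools

print(available_jit_tools())
```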
## Windows & IDA
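Most of Miasm's IDA plugins use a subset of Miasm's functionality.
A quick way to get them working is to add:
* `pyparsing.py` to `C:\...\IDA\python\`, or run `pip install pyparsing`
* the `miasm/miasm` directory to `C:\...\IDA\python\`

All features except the JITter-related ones will then be available. For a more complete installation, refer to the paragraphs above.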
# Testing
Miasm comes with a set of regression tests. To run all of them:
```
cd miasm_directory/test
# Run tests using our own test runner
python test_all.py
# Run tests using standard frameworks (slower, requires 'parameterized')
python -m unittest test_all.py # sequential, requires 'unittest'
python -m pytest test_all.py # sequential, requires 'pytest'
python -m pytest -n auto test_all.py # parallel, requires 'pytest' and 'pytest-xdist'
```
Some options can be specified:
* Single-threaded: `-m`
* Code coverage instrumentation: `-c`
* Only fast tests: `-t long` (excludes the long-running tests)
# They already use Miasm
## Tools
* [Sibyl](https://github.com/cea-sec/Sibyl): a function divination tool
* [R2M2](https://github.com/guedou/r2m2): use miasm as a radare2 plugin
* [CGrex](https://github.com/mechaphish/cgrex): targeted patcher for CGC binaries
* [ethRE](https://github.com/jbcayrou/ethRE): reversing tool for Ethereum EVM (with the corresponding Miasm2 architecture)
## Blog posts / papers / conferences
* [Deobfuscation: recovering an OLLVM-protected program](http://blog.quarkslab.com/deobfuscation-recovering-an-ollvm-protected-program.html)
* [Taming a Wild Nanomite-protected MIPS Binary With Symbolic Execution: No Such Crackme](https://doar-e.github.io/blog/2014/10/11/taiming-a-wild-nanomite-protected-mips-binary-with-symbolic-execution-no-such-crackme/)
* [Génération rapide de DGA avec Miasm](https://www.lexsi.com/securityhub/generation-rapide-de-dga-avec-miasm/): quick computation of DGA (French article)
* [Enabling Client-Side Crash-Resistance to Overcome Diversification and Information Hiding](https://www.internetsociety.org/sites/default/files/blogs-media/enabling-client-side-crash-resistance-overcome-diversification-information-hiding.pdf): detect potential arguments of undirected calls
* [Miasm: Framework de reverse engineering](https://www.sstic.org/2012/presentation/miasm_framework_de_reverse_engineering/) (French)
* [Tutorial Miasm](https://www.sstic.org/2014/presentation/Tutorial_miasm/) (French video)
* [Graphes de dépendances: Petit Poucet style](https://www.sstic.org/2016/presentation/graphes_de_dpendances__petit_poucet_style/): DepGraph (French)
## Books
* [Practical Reverse Engineering: X86, X64, Arm, Windows Kernel, Reversing Tools, and Obfuscation](http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1118787315,subjectCd-CSJ0.html): introduction to Miasm (Chapter 5 "Obfuscation")
* [BlackHat Python - Appendix](https://github.com/oreilly-japan/black-hat-python-jp-support/tree/master/appendix-A): samples from a Japanese security book