airbus-cert/minusone
A tree-sitter-based script deobfuscation engine that unpacks PowerShell and JavaScript and simplifies their expressions through pluggable rule sets.
# minusone
$$\textit{obfuscation}^{-1}$$
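The inverse of script obfuscation
🌐 An online version is available at https://minusone.skyblue.team/ 🌐
## Usage
MinusOne is written in Rust and can be built, deployed, or run with the Cargo package manager: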
```
cargo run -- --path test.ps1 # Run default ruleset
cargo run -- --path test.ps1 --debug # Run debug mode to print the inferred tree
cargo run -- --list # List available rules
cargo run -- --path test.ps1 -r forward,addint # Only use Forward and AddInt
cargo run -- --path test.ps1 -R foreach # Do not use foreach rule
```
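By default, cargo builds the minusone library and runs the minusone-cli binary.
## Bindings
The following bindings are available:
- Python, which makes it easy to integrate MinusOne into Jupyter notebooks
- JS (WASM), which allows embedding minusone in web applications such as https://minusone.skyblue.team/
To build and publish these packages, use the `justfile` recipes: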
```
just py build # Build the python wheel
just js build # Build the WASM module
just js serve # Build the WASM module and serve it on localhost to test it
```
## Project structure
- `core`: the minusone core library
  - `src/ps`: PowerShell-specific rules for minusone
- `crates`
  - `minusone-cli`: a simple CLI for using minusone from a terminal
  - `pyminusone`: Python bindings for minusone
  - `minusone-cli`: JS bindings for minusone, built with WASM
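## Description
MinusOne is a deobfuscation engine focused on scripting languages. MinusOne relies on [tree-sitter](https://tree-sitter.github.io/tree-sitter/) for parsing and applies a set of rules to infer node values and simplify expressions.
MinusOne supports the following languages:
* PowerShell
The following example comes from [`Invoke-PSObfuscation`](https://github.com/gh0x0st/Invoke-PSObfuscation/blob/main/layer-0-obfuscation.md#final-payload):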
```
${Pop-pKkAp}=1;${Clear-OK3Emf}=4;${Push-Jh8ps}=9;${Format-qqM9C}=16;${Redo-kSQuo}=86;${Format-LyC}=51;${Pop-ASPJ}=74;${Join-pIuV}=112;${Hide-Rhpet}=100;${Copy-TWaj}=71;${Set-yYE}=85;${Exit-shq}=116;${Skip-5qa}=83;${Push-bAik}=57;${Split-f7hDr6}=122;${Open-YGi}=65;${Open-LPQk}=61;${Select-YUyq}=84;${Move-sS6mJ}=87;${Search-wa0}=108;${Join-YJq}=117;${Hide-iQ5}=88;${Select-iV0F7}=78;${Select-cI9j}=80;${Open-Hec}=98;${Reset-4QePz}=109;${Format-4e7UHy}=103;${Lock-UyaF}=97;${Select-ZGdxB}=77;${Move-FtkTLt}=104;${Push-VUUQsE}=73;${Add-LHgggw}=99;${Reset-sc3}=81;${Format-AlmdYS}=50;${Resize-mYqZ}=121;${Reset-hp9}=66;${Reset-qC3Yd}=48;${Find-6QywvV}=120;${Select-v7sja}=110;${Step-7WvUL}=82;$DJ2=[System.Text.Encoding];$1Ro=[System.Convert];${Step-xE2}=-join'8FTU'[-${Pop-pKkAp}..-${Clear-OK3Emf}];${Unlock-Zdbkvh}=-join'gnirtSteG'[-${Pop-pKkAp}..-${Push-Jh8ps}];${Close-yjy}=-join'gnirtS46esaBmorF'[-${Pop-pKkAp}..-${Format-qqM9C}];. ($DJ2::${Step-xE2}.${Unlock-Zdbkvh}($1Ro::${Close-yjy}(([char]${Redo-kSQuo}+[char]${Format-LyC}+[char]${Pop-ASPJ}+[char]${Join-pIuV}+[char]${Hide-Rhpet}+[char]${Copy-TWaj}+[char]${Set-yYE}+[char]${Exit-shq}+[char]${Skip-5qa}+[char]${Copy-TWaj}+[char]${Push-bAik}+[char]${Split-f7hDr6}+[char]${Hide-Rhpet}+[char]${Open-YGi}+[char]${Open-LPQk}+[char]${Open-LPQk})))) 
($DJ2::${Step-xE2}.${Unlock-Zdbkvh}($1Ro::${Close-yjy}(([char]${Select-YUyq}+[char]${Move-sS6mJ}+[char]${Search-wa0}+[char]${Join-YJq}+[char]${Hide-Rhpet}+[char]${Hide-iQ5}+[char]${Select-iV0F7}+[char]${Select-cI9j}+[char]${Open-Hec}+[char]${Reset-4QePz}+[char]${Set-yYE}+[char]${Format-4e7UHy}+[char]${Lock-UyaF}+[char]${Hide-iQ5}+[char]${Select-ZGdxB}+[char]${Format-4e7UHy}+[char]${Hide-Rhpet}+[char]${Copy-TWaj}+[char]${Move-FtkTLt}+[char]${Search-wa0}+[char]${Push-VUUQsE}+[char]${Copy-TWaj}+[char]${Pop-ASPJ}+[char]${Search-wa0}+[char]${Add-LHgggw}+[char]${Format-LyC}+[char]${Reset-sc3}+[char]${Format-4e7UHy}+[char]${Add-LHgggw}+[char]${Format-AlmdYS}+[char]${Select-iV0F7}+[char]${Resize-mYqZ}+[char]${Lock-UyaF}+[char]${Hide-iQ5}+[char]${Reset-hp9}+[char]${Reset-qC3Yd}+[char]${Push-VUUQsE}+[char]${Copy-TWaj}+[char]${Find-6QywvV}+[char]${Join-pIuV}+[char]${Open-Hec}+[char]${Select-v7sja}+[char]${Step-7WvUL}+[char]${Search-wa0}+[char]${Add-LHgggw}+[char]${Format-4e7UHy}+[char]${Open-LPQk}+[char]${Open-LPQk}))))
```
It produces the following output:
```
Write-Host "MinusOne is the best script linter"
```
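To make the layers of that sample concrete, here is a minimal standalone Rust sketch that undoes them by hand: reversed strings (`-join 'gnirtSteG'[-1..-9]`), strings built from character codes (`[char]86 + [char]51 + ...`), and Base64. It does not use minusone; the helper names `reverse`, `from_codes`, and `base64_decode` are hypothetical and exist only for this illustration.

```rust
/// Reverse a string, mimicking `-join 'gnirtSteG'[-1..-9]` in the sample.
fn reverse(s: &str) -> String {
    s.chars().rev().collect()
}

/// Build a string from character codes, mimicking `[char]86 + [char]51 + ...`.
fn from_codes(codes: &[u8]) -> String {
    codes.iter().map(|&c| c as char).collect()
}

/// Minimal standard-alphabet Base64 decoder (illustration only, padding is skipped).
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    const ALPHABET: &[u8] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = Vec::new();
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for byte in input.bytes().filter(|&b| b != b'=') {
        // Each Base64 character contributes 6 bits; unknown characters abort.
        let value = ALPHABET.iter().position(|&a| a == byte)? as u32;
        buffer = (buffer << 6) | value;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1 << bits) - 1;
        }
    }
    Some(out)
}

fn main() {
    // The first three helper variables of the sample are reversed method names.
    println!("{}", reverse("8FTU"));
    println!("{}", reverse("gnirtSteG"));
    println!("{}", reverse("gnirtS46esaBmorF"));

    // The first payload is a Base64 string assembled from character codes.
    let encoded = from_codes(&[
        86, 51, 74, 112, 100, 71, 85, 116, 83, 71, 57, 122, 100, 65, 61, 61,
    ]);
    let decoded = String::from_utf8(base64_decode(&encoded).unwrap()).unwrap();
    println!("{encoded} -> {decoded}");
}
```

Each helper corresponds to one family of rules listed later in this document (range and join, cast and concatenation, Base64 decoding), which is why the engine has to chain several inferences before the final command becomes readable.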
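## What is a rule?
A rule produces a result when a specific node is visited, based on that node's children or parent. Rules are called both when entering and when leaving a node.
Creating a rule for PowerShell is as simple as implementing the `RuleMut` trait: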
```
#[derive(Default)]
pub struct MyRule;

impl<'a> RuleMut<'a> for MyRule {
    type Language = Powershell;

    fn enter(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()> {
        Ok(())
    }

    fn leave(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()> {
        Ok(())
    }
}
```
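The `enter()` method is called before a node is visited; the `leave()` method is called when leaving the node, that is, after the node and all of its children have been visited.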
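The calling order can be illustrated with a small standalone sketch. The `Node`, `Visitor`, and `Trace` types below are hypothetical stand-ins invented for this example, not minusone's `NodeMut` or `RuleMut`; the sketch only assumes a plain depth-first walk.

```rust
/// Hypothetical toy tree node; minusone's real nodes wrap tree-sitter nodes.
struct Node {
    kind: &'static str,
    children: Vec<Node>,
}

/// Hypothetical visitor trait mirroring the enter/leave pair.
trait Visitor {
    fn enter(&mut self, node: &Node);
    fn leave(&mut self, node: &Node);
}

/// Depth-first walk: `enter` fires before the children, `leave` after them.
fn walk(node: &Node, visitor: &mut impl Visitor) {
    visitor.enter(node);
    for child in &node.children {
        walk(child, visitor);
    }
    visitor.leave(node);
}

/// Records the order in which the callbacks fire.
#[derive(Default)]
struct Trace(Vec<String>);

impl Visitor for Trace {
    fn enter(&mut self, node: &Node) {
        self.0.push(format!("enter {}", node.kind));
    }
    fn leave(&mut self, node: &Node) {
        self.0.push(format!("leave {}", node.kind));
    }
}

/// A tiny tree shaped like `40 + 2`.
fn sample_tree() -> Node {
    let leaf = |kind| Node { kind, children: Vec::new() };
    Node {
        kind: "additive_expression",
        children: vec![leaf("integer"), leaf("+"), leaf("integer")],
    }
}

/// Walk a tree and return the recorded callback order.
fn trace(tree: &Node) -> Vec<String> {
    let mut visitor = Trace::default();
    walk(tree, &mut visitor);
    visitor.0
}

fn main() {
    for line in trace(&sample_tree()) {
        println!("{line}");
    }
}
```

Because `leave` fires only after every child has been processed, it is the natural place for a rule that needs its children's inferred values.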
### Example: a rule that adds two integers
In this example, we will see how to infer:
```
$a = 40 + 2
```
into:
```
$a = 42
```
The first rule we need is one that can parse integers:
```
#[derive(Default)]
pub struct ParseInt;

impl<'a> RuleMut<'a> for ParseInt {
    type Language = Powershell;

    fn enter(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()> {
        Ok(())
    }

    fn leave(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()> {
        let view = node.view();
        let token = view.text()?;
        match view.kind() {
            "decimal_integer_literal" => {
                // The integer type is inferred from what `Num` holds.
                if let Ok(number) = token.parse() {
                    node.set(Raw(Num(number)));
                }
            },
            _ => ()
        }
        Ok(())
    }
}
```
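This rule runs when leaving a node of type `decimal_integer_literal` in the `tree-sitter-powershell` grammar.
It then tries to parse the token with the [`std::str::parse`](https://doc.rust-lang.org/std/primitive.str.html#method.parse) method (`token.parse()`).
A more complete implementation of this rule can be found [here](src/ps/integer.rs).
Now we will create a new rule that infers the value of two nodes involved in a `+` operation. This rule focuses on the `additive_expression` node type.
It checks that the node has three children:
* the first must have been inferred as an integer by the previous rule
* the second must be the `+` token
* the third must have been inferred as an integer by the previous rule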
```
#[derive(Default)]
pub struct AddInt;

impl<'a> RuleMut<'a> for AddInt {
    type Language = Powershell;

    fn enter(&mut self, _node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()> {
        Ok(())
    }

    fn leave(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()> {
        let node_view = node.view();
        if node_view.kind() == "additive_expression" {
            if let (Some(left_op), Some(operator), Some(right_op)) = (node_view.child(0), node_view.child(1), node_view.child(2)) {
                match (left_op.data(), operator.text()?, right_op.data()) {
                    (Some(Raw(Num(number_left))), "+", Some(Raw(Num(number_right)))) => node.set(Raw(Num(number_left + number_right))),
                    _ => {}
                }
            }
        }
        Ok(())
    }
}
```
We can then apply these rules to the PowerShell syntax tree generated by `tree-sitter-powershell`:
```
let mut tree = build_powershell_tree("40 + 2").unwrap();
tree.apply_mut(&mut (
    ParseInt::default(),
    Forward::default(),
    AddInt::default()
)).unwrap();
```
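The `Forward` rule is a special rule that forwards a node's inferred value when the node carries no semantics of its own; it is needed mainly because of how the PowerShell syntax tree is generated.
You can then print the PowerShell result with the `Linter` object: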
```
let mut ps_linter_view = Linter::default();
ps_linter_view.print(&tree.root().unwrap()).unwrap();
// => 42
```
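The interplay of the three rules in this walkthrough can be mimicked without minusone. The sketch below is a simplified standalone model: the `Node` struct, the `infer` function, and the wrapper node kinds `pipeline` and `unary_expression` are assumptions made for illustration and do not reflect the real grammar or API. Each node stores an optional inferred value, and a single post-order pass plays the role of `ParseInt`, `AddInt`, and `Forward`.

```rust
/// Hypothetical toy node: a kind, its source text, children, and an inferred value.
struct Node {
    kind: &'static str,
    text: &'static str,
    children: Vec<Node>,
    value: Option<i64>,
}

impl Node {
    fn leaf(kind: &'static str, text: &'static str) -> Node {
        Node { kind, text, children: Vec::new(), value: None }
    }
    fn branch(kind: &'static str, children: Vec<Node>) -> Node {
        Node { kind, text: "", children, value: None }
    }
}

/// Post-order pass: children are inferred first, then one toy rule runs on the way out.
fn infer(node: &mut Node) {
    for child in &mut node.children {
        infer(child);
    }
    match node.kind {
        // "ParseInt": integer literal tokens get a value.
        "decimal_integer_literal" => node.value = node.text.parse().ok(),
        // "AddInt": `<int> + <int>` folds into a single value (None on overflow).
        "additive_expression" => {
            if let [left, op, right] = node.children.as_slice() {
                if let (Some(l), "+", Some(r)) = (left.value, op.text, right.value) {
                    node.value = l.checked_add(r);
                }
            }
        }
        // "Forward": a wrapper with exactly one child inherits that child's value.
        _ => {
            if let [only] = node.children.as_slice() {
                node.value = only.value;
            }
        }
    }
}

/// Builds a toy tree for `40 + 2`, with wrapper nodes around the interesting ones.
fn sample_tree() -> Node {
    let int = |text| {
        Node::branch("unary_expression", vec![Node::leaf("decimal_integer_literal", text)])
    };
    let sum = Node::branch("additive_expression", vec![int("40"), Node::leaf("+", "+"), int("2")]);
    Node::branch("pipeline", vec![sum])
}

fn main() {
    let mut tree = sample_tree();
    infer(&mut tree);
    println!("{:?}", tree.value);
}
```

Without the forwarding arm, the additive node would see wrapper children carrying no value and nothing would be folded, which is the situation the `Forward` rule addresses.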
## Rule sets for PowerShell
### Static rule set
When you use the `Engine` object, the predefined rules designed for PowerShell are applied automatically. These rules can be found in [src/ps/mod.rs](src/ps/mod.rs):
```
impl_powershell_ruleset!(
    Forward,                // Special rule that forwards the inferred value when the node is transparent
    ParseInt,               // Parse integers
    AddInt,                 // +, - operations on integers
    MultInt,                // *, / operations on integers
    ParseString,            // Parse string tokens, including multiline strings
    ConcatString,           // String concatenation
    Cast,                   // Cast operations, like [char]0x65
    ParseArrayLiteral,      // Parse arrays declared as comma-separated values (integers or strings)
    ParseRange,             // Parse the .. operator and generate an array
    AccessString,           // The access operator [] applied to a string: "foo"[0] => "f"
    JoinComparison,         // Infer string joins using the -join operator: @('a', 'b', 'c') -join '' => "abc"
    JoinStringMethod,       // Infer string joins using the [string]::join method: [string]::join('', @('a', 'b', 'c'))
    JoinOperator,           // Infer string joins using the unary -join operator: -join @('a', 'b', 'c')
    PSItemInferrator,       // PSItem is used to infer cmdlet patterns like % { [char] $_ }
    ForEach,                // Uses the PSItem rules to infer foreach-object commands
    StringReplaceMethod,    // Infer the replace method applied to a string: "foo".replace("oo", "aa") => "faa"
    ComputeArrayExpr,       // Infer arrays that start with @
    NewObjectArray,         // Infer arrays constructed via the New-Object cmdlet
    StringReplaceOp,        // Infer string replacement using the -replace operator
    StaticVar,              // Infer the value of known variables: $pshome, $shellid
    CastNull,               // Infer the value of +$() or -$(), which produces 0
    ParseHash,              // Parse hashtables
    FormatString,           // Infer strings built with the format operator: "{1}-{0}" -f "Debug", "Write"
    ParseBool,              // Infer boolean values
    Comparison,             // Infer comparisons when possible
    Not,                    // Infer the ! operator
    ParseType,              // Parse types
    DecodeBase64,           // Decode calls to FromBase64
    FromUTF,                // Decode calls to UTF{8,16}.GetString
    Length,                 // Infer the length attribute of strings and arrays
    BoolAlgebra,            // Add support for boolean algebra (or, and)
    Var,                    // Variable replacement when the flow is predictable
    AddArray,               // Array concatenation using the + operator
    StringSplitMethod,      // Handle the split method
    AccessArray,            // Handle static array element access
    AccessHashMap,          // Handle hashmap access
    ForStatementCondition,  // Infer for conditions to remove fake loops
    ForStatementFlowControl // Simplify for statements based on flow control
);
```
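By default, when you choose to use the full deobfuscation rule set for a language, `minusone` uses the static implementation.
It allows the `PowershellDefaultRuleSet` type to be declared as a tuple of types that implement `RuleMut`.
Thanks to the `impl_data` macro, this tuple type also implements `RuleMut`, so it can be passed to the deobfuscation engine.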
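The idea of a tuple of rules that is itself a rule can be sketched in a few lines of standalone Rust. Everything below is hypothetical and far simpler than minusone's macros: the `Rule` trait, the two toy rules, and `impl_rule_for_tuple!` are invented for this illustration and only show the general technique of implementing a trait for tuples.

```rust
/// Hypothetical, simplified stand-in for a rule trait: each rule rewrites a string.
trait Rule {
    fn apply(&mut self, text: &mut String);
}

#[derive(Default)]
struct TrimSpaces;

#[derive(Default)]
struct Lowercase;

impl Rule for TrimSpaces {
    fn apply(&mut self, text: &mut String) {
        *text = text.trim().to_string();
    }
}

impl Rule for Lowercase {
    fn apply(&mut self, text: &mut String) {
        *text = text.to_lowercase();
    }
}

/// Implements `Rule` for a tuple of rules by applying each element in order.
macro_rules! impl_rule_for_tuple {
    ($($name:ident),+) => {
        impl<$($name: Rule),+> Rule for ($($name,)+) {
            #[allow(non_snake_case)]
            fn apply(&mut self, text: &mut String) {
                // Destructure the tuple; each binding reuses its type parameter's name.
                let ($($name,)+) = self;
                $($name.apply(text);)+
            }
        }
    };
}

impl_rule_for_tuple!(A);
impl_rule_for_tuple!(A, B);
impl_rule_for_tuple!(A, B, C);

/// The rule set is a plain type alias over a tuple, resolved at compile time.
type DefaultRuleSet = (TrimSpaces, Lowercase);

/// Runs the whole rule set over an input string.
fn run(input: &str) -> String {
    let mut rules: DefaultRuleSet = Default::default();
    let mut text = input.to_string();
    rules.apply(&mut text);
    text
}

fn main() {
    println!("{:?}", run("  Write-HOST  "));
}
```

With this approach the set of rules is fixed at compile time, so dispatch is static and no boxing is needed; the trade-off is that the selection cannot change at run time.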
### 动态规则集
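`minusone` can also select which rules to use at run time: the `-r` flag includes rules and the `-R` flag excludes them.
Rule names are case-insensitive.
Under the hood, the engine creates a vector containing all available rules and then filters out the unused ones.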
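A minimal sketch of that filtering step, written as standalone Rust: the `AVAILABLE` list, `contains_rule`, and `select_rules` are hypothetical names, and the way the include and exclude lists combine here (an empty include list meaning "all rules") is an assumption rather than the engine's documented behavior.

```rust
/// Hypothetical subset of rule names; the real list is printed by `--list`.
const AVAILABLE: [&str; 4] = ["Forward", "ParseInt", "AddInt", "ForEach"];

/// Case-insensitive membership test.
fn contains_rule(list: &[&str], name: &str) -> bool {
    list.iter().any(|candidate| candidate.eq_ignore_ascii_case(name))
}

/// Keep the rules named in `include` (or every rule when `include` is empty),
/// then drop the rules named in `exclude`.
fn select_rules(include: &[&str], exclude: &[&str]) -> Vec<&'static str> {
    AVAILABLE
        .iter()
        .copied()
        .filter(|name| include.is_empty() || contains_rule(include, name))
        .filter(|name| !contains_rule(exclude, name))
        .collect()
}

fn main() {
    // Mirrors `-r forward,addint`.
    println!("{:?}", select_rules(&["forward", "addint"], &[]));
    // Mirrors `-R foreach`.
    println!("{:?}", select_rules(&[], &["foreach"]));
}
```

Unlike a tuple-based rule set, a vector of rules can be built from command-line input, at the cost of dynamic dispatch if the rules are stored as trait objects.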
## Roadmap
* More accurate PowerShell HashTable parsing
* Basic JavaScript support