airbus-cert/minusone

GitHub: airbus-cert/minusone

A tree-sitter-based script deobfuscation engine that unpacks PowerShell and JavaScript and simplifies expressions through pluggable rulesets.

Stars: 93 | Forks: 7

# minusone

$$\textit{obfuscation}^{-1}$$

The inverse of script obfuscation.

🌐 An online version is available at https://minusone.skyblue.team/ 🌐

## Usage

MinusOne is written in Rust and can be built, deployed, or run with the Cargo package manager:

```
cargo run -- --path test.ps1                     # Run the default ruleset
cargo run -- --path test.ps1 --debug             # Run in debug mode to print the inferred tree
cargo run -- --list                              # List available rules
cargo run -- --path test.ps1 -r forward,addint   # Only use the Forward and AddInt rules
cargo run -- --path test.ps1 -R foreach          # Do not use the ForEach rule
```

By default, cargo builds the minusone library and runs the minusone-cli binary.

## Bindings

The following bindings are available:

- Python, which makes it easy to integrate MinusOne into Jupyter notebooks
- JS (WASM), which allows embedding minusone in web applications such as https://minusone.skyblue.team/

To build and release these packages, use the `justfile` modules:

```
just py build   # Build the Python wheel
just js build   # Build the WASM module
just js serve   # Build the WASM module and serve it on localhost to test it
```

## Project structure

- `core`: the minusone core library
  - `src/ps`: PowerShell-specific rules for minusone
- `crates`
  - `minusone-cli`: a simple CLI for using minusone in a terminal
  - `pyminusone`: Python bindings for minusone
  - `minusone-cli`: JS bindings for minusone, built with WASM

## Description

MinusOne is a deobfuscation engine focused on scripting languages. MinusOne relies on [tree-sitter](https://tree-sitter.github.io/tree-sitter/) for parsing and applies a set of rules to infer node values and simplify expressions.

MinusOne supports the following languages:

* PowerShell

The following example is taken from [`Invoke-PSObfuscation`](https://github.com/gh0x0st/Invoke-PSObfuscation/blob/main/layer-0-obfuscation.md#final-payload):

```
${Pop-pKkAp}=1;${Clear-OK3Emf}=4;${Push-Jh8ps}=9;${Format-qqM9C}=16;${Redo-kSQuo}=86;${Format-LyC}=51;${Pop-ASPJ}=74;${Join-pIuV}=112;${Hide-Rhpet}=100;${Copy-TWaj}=71;${Set-yYE}=85;${Exit-shq}=116;${Skip-5qa}=83;${Push-bAik}=57;${Split-f7hDr6}=122;${Open-YGi}=65;${Open-LPQk}=61;${Select-YUyq}=84;${Move-sS6mJ}=87;${Search-wa0}=108;${Join-YJq}=117;${Hide-iQ5}=88;${Select-iV0F7}=78;${Select-cI9j}=80;${Open-Hec}=98;${Reset-4QePz}=109;${Format-4e7UHy}=103;${Lock-UyaF}=97;${Select-ZGdxB}=77;${Move-FtkTLt}=104;${Push-VUUQsE}=73;${Add-LHgggw}=99;${Reset-sc3}=81;${Format-AlmdYS}=50;${Resize-mYqZ}=121;${Reset-hp9}=66;${Reset-qC3Yd}=48;${Find-6QywvV}=120;${Select-v7sja}=110;${Step-7WvUL}=82;$DJ2=[System.Text.Encoding];$1Ro=[System.Convert];${Step-xE2}=-join'8FTU'[-${Pop-pKkAp}..-${Clear-OK3Emf}];${Unlock-Zdbkvh}=-join'gnirtSteG'[-${Pop-pKkAp}..-${Push-Jh8ps}];${Close-yjy}=-join'gnirtS46esaBmorF'[-${Pop-pKkAp}..-${Format-qqM9C}];. ($DJ2::${Step-xE2}.${Unlock-Zdbkvh}($1Ro::${Close-yjy}(([char]${Redo-kSQuo}+[char]${Format-LyC}+[char]${Pop-ASPJ}+[char]${Join-pIuV}+[char]${Hide-Rhpet}+[char]${Copy-TWaj}+[char]${Set-yYE}+[char]${Exit-shq}+[char]${Skip-5qa}+[char]${Copy-TWaj}+[char]${Push-bAik}+[char]${Split-f7hDr6}+[char]${Hide-Rhpet}+[char]${Open-YGi}+[char]${Open-LPQk}+[char]${Open-LPQk})))) 
($DJ2::${Step-xE2}.${Unlock-Zdbkvh}($1Ro::${Close-yjy}(([char]${Select-YUyq}+[char]${Move-sS6mJ}+[char]${Search-wa0}+[char]${Join-YJq}+[char]${Hide-Rhpet}+[char]${Hide-iQ5}+[char]${Select-iV0F7}+[char]${Select-cI9j}+[char]${Open-Hec}+[char]${Reset-4QePz}+[char]${Set-yYE}+[char]${Format-4e7UHy}+[char]${Lock-UyaF}+[char]${Hide-iQ5}+[char]${Select-ZGdxB}+[char]${Format-4e7UHy}+[char]${Hide-Rhpet}+[char]${Copy-TWaj}+[char]${Move-FtkTLt}+[char]${Search-wa0}+[char]${Push-VUUQsE}+[char]${Copy-TWaj}+[char]${Pop-ASPJ}+[char]${Search-wa0}+[char]${Add-LHgggw}+[char]${Format-LyC}+[char]${Reset-sc3}+[char]${Format-4e7UHy}+[char]${Add-LHgggw}+[char]${Format-AlmdYS}+[char]${Select-iV0F7}+[char]${Resize-mYqZ}+[char]${Lock-UyaF}+[char]${Hide-iQ5}+[char]${Reset-hp9}+[char]${Reset-qC3Yd}+[char]${Push-VUUQsE}+[char]${Copy-TWaj}+[char]${Find-6QywvV}+[char]${Join-pIuV}+[char]${Open-Hec}+[char]${Select-v7sja}+[char]${Step-7WvUL}+[char]${Search-wa0}+[char]${Add-LHgggw}+[char]${Format-4e7UHy}+[char]${Open-LPQk}+[char]${Open-LPQk}))))
```

It produces the following output:

```
Write-Host "MinusOne is the best script linter"
```

## What is a rule?
A rule produces a result when it visits a specific node, based on that node's children or parent. A rule is invoked both when entering and when leaving a node.

Creating a rule for PowerShell is as simple as implementing the `RuleMut` trait:

```
#[derive(Default)]
pub struct MyRule;

impl<'a> RuleMut<'a> for MyRule {
    type Language = Powershell;

    fn enter(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()>{
        Ok(())
    }

    fn leave(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()>{
        Ok(())
    }
}
```

The `enter()` method is called before a node is visited. The `leave()` method is called when leaving the node, that is, after the node and all of its children have been visited.

### Example: a rule that adds two integers

In this example, we will see how to infer:

```
$a = 40 + 2
```

into:

```
$a = 42
```

The first rule we need is one that can parse integers:

```
#[derive(Default)]
pub struct ParseInt;

impl<'a> RuleMut<'a> for ParseInt {
    type Language = Powershell;

    fn enter(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()>{
        Ok(())
    }

    fn leave(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()>{
        let view = node.view();
        let token = view.text()?;
        match view.kind() {
            "decimal_integer_literal" => {
                // The target integer type is inferred from `Num`
                if let Ok(number) = token.parse() {
                    node.set(Raw(Num(number)));
                }
            },
            _ => ()
        }
        Ok(())
    }
}
```

This rule does its work when leaving a node of type `decimal_integer_literal` in the `tree-sitter-powershell` grammar. It then tries to parse the token with the [`std::str::parse`](https://doc.rust-lang.org/std/primitive.str.html#method.parse) method (`token.parse()`).

A more complete implementation of this rule can be found [here](src/ps/integer.rs).

Next, we create a new rule that infers the value of two nodes joined by the `+` operation. This rule focuses on the `additive_expression` node type.

It checks that the node has three children:

* The first must have been inferred as an integer by the previous rule
* The second must be the `+` token
* The third must have been inferred as an integer by the previous rule

```
#[derive(Default)]
pub struct AddInt;

impl<'a> RuleMut<'a> for AddInt {
    type Language = Powershell;

    fn enter(&mut self, _node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()>{
        Ok(())
    }

    fn leave(&mut self, node: &mut NodeMut<'a, Self::Language>, flow: BranchFlow) -> MinusOneResult<()>{
        let node_view = node.view();
        if node_view.kind() == "additive_expression" {
            if let (Some(left_op), Some(operator), Some(right_op)) = (node_view.child(0), node_view.child(1), node_view.child(2)) {
                match (left_op.data(), operator.text()?, right_op.data()) {
                    (Some(Raw(Num(number_left))), "+", Some(Raw(Num(number_right)))) => node.set(Raw(Num(number_left + number_right))),
                    _ => {}
                }
            }
        }
        Ok(())
    }
}
```

We can then apply these rules to the PowerShell syntax tree generated by `tree-sitter-powershell`:

```
let mut tree = build_powershell_tree("40 + 2").unwrap();
tree.apply_mut(&mut (
    ParseInt::default(),
    Forward::default(),
    AddInt::default()
)).unwrap();
```

The `Forward` rule is a special rule that forwards a node's inferred type when the node itself carries no semantics, which happens mainly because of how the PowerShell syntax tree is generated.

You can then print the PowerShell result with the `Linter` object:

```
let mut ps_linter_view = Linter::default();
ps_linter_view.print(&tree.root().unwrap()).unwrap();

// => 42
```

## Rulesets for PowerShell

### Static ruleset

When you use the `Engine` object, the predefined rules designed for PowerShell are applied automatically. These rules can be found in [src/ps/mod.rs](src/ps/mod.rs):

```
impl_powershell_ruleset!(
    Forward, // Special rule that forwards the inferred value when the node is transparent
    ParseInt, // Parse integers
    AddInt, // +, - operations on integers
    MultInt, // *, / operations on integers
    ParseString, // Parse string tokens, including multiline strings
    ConcatString, // String concatenation operation
    Cast, // Cast operations, like [char]0x65
    ParseArrayLiteral, // Parse arrays declared as comma-separated values (integers or strings)
    ParseRange, // Parse the .. operator and generate an array
    AccessString, // The access operator [] applied to a string: "foo"[0] => "f"
    JoinComparison, // Infer string joins using the -join operator: @('a', 'b', 'c') -join '' => "abc"
    JoinStringMethod, // Infer string joins using the [string]::join method: [string]::join('', @('a', 'b', 'c'))
    JoinOperator, // Infer string joins using the unary -join operator: -join @('a', 'b', 'c')
    PSItemInferrator, // PSItem is used to infer cmdlet patterns like % { [char] $_ }
    ForEach, // Use the PSItem rules to infer the foreach-object command
    StringReplaceMethod, // Infer the replace method applied to a string: "foo".replace("oo", "aa") => "faa"
    ComputeArrayExpr, // Infer arrays that start with @
    NewObjectArray, // Infer arrays constructed via the New-Object cmdlet
    StringReplaceOp, // Infer string replacement using the -replace operator
    StaticVar, // Infer the value of known variables: $pshome, $shellid
    CastNull, // Infer the value of +$() or -$(), which produces 0
    ParseHash, // Parse hashtables
    FormatString, // Infer strings built with the format operator: "{1}-{0}" -f "Debug", "Write"
    ParseBool, // Infer boolean operators
    Comparison, // Infer comparisons when possible
    Not, // Infer the ! operator
    ParseType, // Parse types
    DecodeBase64, // Decode calls to FromBase64
    FromUTF, // Decode calls to FromUTF{8,16}.GetText
    Length, // Decode the length attribute of strings and arrays
    BoolAlgebra, // Add support for boolean algebra (or, and)
    Var, // Variable replacement when the flow is predictable
    AddArray, // Array concatenation using the + operator
    StringSplitMethod, // Handle the split method
    AccessArray, // Handle static array element access
    AccessHashMap, // Handle hashmap access
    ForStatementCondition, // Infer for conditions to remove fake loops
    ForStatementFlowControl // Simplify for statements based on flow control
);
```

By default, when you choose to use a language's full deobfuscation ruleset, `minusone` uses a static implementation. This makes it possible to declare the `PowershellDefaultRuleSet` type as a tuple of types that implement `RuleMut`. Thanks to the `impl_data` macro, that type also implements `RuleMut`, so it can be passed to the deobfuscation engine.

### Dynamic ruleset

`minusone` can also select which rules to use at run time: the `-r` flag includes rules and the `-R` flag excludes them. Rule names are case-insensitive.

Under the hood, the engine creates a vector containing all available rules and then filters out the unused ones.

## Roadmap

* More accurate PowerShell HashTable parsing
* Basic support for JavaScript
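
The dynamic ruleset selection described above (build a vector of every available rule, then filter it by name) can be sketched as follows. This is a minimal, self-contained illustration and not minusone's actual code: the `Rule` trait, the `select_rules`, `listed`, and `rule_names` functions, and the three empty rule structs are hypothetical names introduced here. Only the include/exclude semantics of `-r` and `-R` and the case-insensitive matching come from the description.

```rust
/// Hypothetical stand-in for a deobfuscation rule (minusone's real trait is `RuleMut`).
trait Rule {
    fn name(&self) -> &'static str;
}

// Toy rules that only carry a name; they perform no inference.
struct Forward;
struct AddInt;
struct ForEach;

impl Rule for Forward {
    fn name(&self) -> &'static str { "Forward" }
}
impl Rule for AddInt {
    fn name(&self) -> &'static str { "AddInt" }
}
impl Rule for ForEach {
    fn name(&self) -> &'static str { "ForEach" }
}

/// Returns true if `name` appears in `names`, ignoring ASCII case.
fn listed(names: &[&str], name: &str) -> bool {
    names.iter().any(|n| n.eq_ignore_ascii_case(name))
}

/// Builds the full rule vector, then filters it.
/// `include` mimics `-r` (keep only these rules),
/// `exclude` mimics `-R` (drop these rules).
fn select_rules(include: Option<&[&str]>, exclude: Option<&[&str]>) -> Vec<Box<dyn Rule>> {
    let all: Vec<Box<dyn Rule>> = vec![Box::new(Forward), Box::new(AddInt), Box::new(ForEach)];
    all.into_iter()
        .filter(|rule| include.map_or(true, |names| listed(names, rule.name())))
        .filter(|rule| exclude.map_or(true, |names| !listed(names, rule.name())))
        .collect()
}

/// Collects the names of the selected rules, in order.
fn rule_names(rules: &[Box<dyn Rule>]) -> Vec<&'static str> {
    rules.iter().map(|r| r.name()).collect()
}

fn main() {
    // Equivalent of `-r FORWARD`: the name matches despite the different case.
    let only = select_rules(Some(&["FORWARD"][..]), None);
    println!("{:?}", rule_names(&only)); // ["Forward"]

    // Equivalent of `-R foreach`: every rule except ForEach is kept.
    let without = select_rules(None, Some(&["foreach"][..]));
    println!("{:?}", rule_names(&without)); // ["Forward", "AddInt"]
}
```

Boxed trait objects are used here because a filtered list has no fixed length at compile time, which is the main difference from the static tuple-based ruleset.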