GitHub: CybercentreCanada/Maco
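Maco is a standardised framework for malware configuration extractors. It defines a common output model for parsers and simplifies managing and running multiple extractors.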
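# Maco - Malware config extractor framework
## Maco is a framework for malware config extractors.
It aims to solve two problems:
- Define a standardised ontology (or model) for extractor output. This greatly helps when storing extracted values in a database.
- Provide a standard way of identifying which parsers to run and how to execute them.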
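## Maco components
- `model.py`
  - A data model for the common output of an extractor
- `extractor.py`
  - Base class for extractors to implement
- `collector.py`
  - Utilities for loading and running extractors
- `cli.py`
  - A CLI tool, `maco`, to assist with running your extractors locally
- `base_test.py`
  - Assists with writing unit tests for your extractors
**Note: If you're interested in using only the model in your project, you can `pip install maco-model`, a smaller package containing only the model definition.**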
## Project Integrations 🛠️
This framework is actively being used by:
| Project | Description | License |
| :---: | :--- | :---: |
| [assemblyline](https://github.com/CybercentreCanada/assemblyline) | A malware analysis platform that uses the MACO model to export malware configuration extractions into a parseable, machine-friendly format | [License](https://github.com/CybercentreCanada/assemblyline/blob/main/LICENSE.md) |
| [configextractor-py](https://github.com/CybercentreCanada/configextractor-py) | A tool designed to run extractors from multiple frameworks, using the MACO model to harmonise their output | [License](https://github.com/CybercentreCanada/configextractor-py/blob/main/LICENSE.md) |
| [rat_king_parser](https://github.com/jeFF0Falltrades/rat_king_parser) | A robust, multiprocessing-capable, multi-family RAT config parser/extractor that is compatible with MACO | [License](https://github.com/jeFF0Falltrades/rat_king_parser/blob/master/LICENSE) |
| CAPE community | A parser/extractor repository containing MACO extractors, authored by the CAPE community but integrated in [CAPE](https://github.com/kevoreilly/CAPEv2) deployments. **Note: These MACO extractors wrap and parse the original CAPE extractors.** | [License](https://github.com/kevoreilly/CAPEv2/blob/master/LICENSE) |
| [SEKOIA-IO/Community](https://github.com/SEKOIA-IO/Community) | A parser/extractor repository containing MACO extractors, authored by the SEKOIA community. | [License](https://github.com/SEKOIA-IO/Community/blob/main/LICENSE.md) |
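## Model Example
See [the model definition](https://github.com/CybercentreCanada/Maco/blob/0f447a66de5e5ce8770ef3fe2325aec002842e63/maco/model.py#L127) for all the supported fields.
You can use the model independently of the rest of the framework.
This is still useful for compatibility between systems!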
```
from maco import model
# 'family' is the only required property on the model
output = model.ExtractorModel(family="wanabee")
output.version = "2019" # variant first found in 2019
output.category.extend([model.CategoryEnum.cryptominer, model.CategoryEnum.clickfraud])
output.http.append(model.ExtractorModel.Http(protocol="https",
                                             uri="https://bad-domain.com/c2_payload",
                                             usage="c2"))
output.tcp.append(model.ExtractorModel.Connection(server_ip="127.0.0.1",
                                                  usage="ransom"))
output.campaign_id.append("859186-3224-9284")
output.inject_exe.append("explorer.exe")
output.binaries.append(
    output.Binary(
        data=b"sam I am",
        datatype=output.Binary.TypeEnum.config,
        encryption=output.Binary.Encryption(
            algorithm="rot26",
            mode="block",
        ),
    )
)
# data about the malware that doesn't fit the model
output.other["author_lunch"] = "green eggs and ham"
output.other["author_lunch_time"] = "3pm"
print(output.model_dump(exclude_defaults=True))
# Generated model
{
    'family': 'wanabee',
    'version': '2019',
    'category': ['cryptominer', 'clickfraud'],
    'campaign_id': ['859186-3224-9284'],
    'inject_exe': ['explorer.exe'],
    'other': {'author_lunch': 'green eggs and ham', 'author_lunch_time': '3pm'},
    'http': [{'uri': 'https://bad-domain.com/c2_payload', 'usage': 'c2', 'protocol': 'https'}],
    'tcp': [{'server_ip': '127.0.0.1', 'usage': 'ransom'}],
    'binaries': [{
        'datatype': 'config', 'data': b'sam I am',
        'encryption': {'algorithm': 'rot26', 'mode': 'block'}
    }]
}
```
You can also create model instances from dictionaries:
```
from maco import model
output = {
"family": "wanabee2",
"version": "2022",
"ssh": [
{
"username": "wanna",
"password": "bee2",
"hostname": "10.1.10.100",
}
],
}
print(model.ExtractorModel(**output))
# Generated model
family='wanabee2' version='2022' category=[] attack=[] capability_enabled=[]
capability_disabled=[] campaign_id=[] identifier=[] decoded_strings=[]
password=[] mutex=[] pipe=[] sleep_delay=None inject_exe=[] other={}
binaries=[] ftp=[] smtp=[] http=[]
ssh=[SSH(username='wanna', password='bee2', hostname='10.1.10.100', port=None, usage=None)]
proxy=[] dns=[] tcp=[] udp=[] encryption=[] service=[] cryptocurrency=[]
paths=[] registry=[]
```
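Because the model maps cleanly to a plain dictionary, storing extracted values in a database can be fairly direct. The following is a minimal sketch using only the standard library, not something Maco provides: the table layout and the helpers `store_config` and `configs_for_family` are hypothetical, the SHA-256 value is a placeholder, and the dictionary reuses the values from the example above.

```python
import json
import sqlite3

# A plain dictionary in the shape of the model (values from the example above).
config = {
    "family": "wanabee2",
    "version": "2022",
    "ssh": [{"username": "wanna", "password": "bee2", "hostname": "10.1.10.100"}],
}


def store_config(conn, sample_sha256, cfg):
    """Store one extracted config as JSON, keyed by sample hash and family (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS configs (sha256 TEXT, family TEXT, config TEXT)"
    )
    conn.execute(
        "INSERT INTO configs VALUES (?, ?, ?)",
        (sample_sha256, cfg["family"], json.dumps(cfg, sort_keys=True)),
    )
    conn.commit()


def configs_for_family(conn, family):
    """Return every stored config for a family, decoded back into dictionaries."""
    rows = conn.execute(
        "SELECT config FROM configs WHERE family = ? ORDER BY rowid", (family,)
    )
    return [json.loads(row[0]) for row in rows]


conn = sqlite3.connect(":memory:")
store_config(conn, "0" * 64, config)  # placeholder hash, not a real sample
stored = configs_for_family(conn, "wanabee2")
print(stored[0]["ssh"][0]["hostname"])
```

Configs that carry raw `bytes` (such as `binaries[].data` in the first example) would need encoding, for instance as base64, before `json.dumps` can serialise them.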
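## Extractor Example
The following extractor will trigger on any file with more than 50 ELF sections,
and set some properties in the model.
Your extractors will do a better job of finding useful information than this one!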
```
from io import BytesIO
from typing import List, Optional

# Assumption: recent Maco releases bundle a yara wrapper; older ones use `import yara`.
from maco import extractor, model, yara


class Elfy(extractor.Extractor):
    """Check basic elf property."""

    family = "elfy"
    author = "blue"
    last_modified = "2022-06-14"
    yara_rule = """
        import "elf"

        rule Elfy
        {
            condition:
                elf.number_of_sections > 50
        }
        """

    def run(
        self, stream: BytesIO, matches: List[yara.Match]
    ) -> Optional[model.ExtractorModel]:
        # return config model formatted results
        ret = model.ExtractorModel(family=self.family)
        # the list for campaign_id already exists and is empty, so we just add an item
        ret.campaign_id.append(str(len(stream.read())))
        return ret
```
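## Writing an extractor
There are several examples that use Maco in the '`demo_extractors`' folder.
Some things to keep in mind:
- The Yara rule names must be prefixed with the extractor class name.
  - e.g. class 'MyScript' has Yara rules named 'MyScriptDetect1' and 'MyScriptDetect2', not 'Detect1'
- You can load other scripts contained within the same folder via a Python relative import
  - See `complex.py` for details
- You can standardise your usage of the '`other`' dict
  - This is optional, see `limit_other.py` for details
  - Consider instead making a PR with the properties you are frequently using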
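The rule-naming convention above can be checked before an extractor is loaded. The sketch below is a naive, regex-based illustration and not Maco's own validation: `rule_names` and `badly_named_rules` are hypothetical helpers, and the pattern only looks for `rule <name>` at the start of a line, so it does not understand YARA comments or strings.

```python
import re

# Matches "rule Name" at the start of a line, allowing the optional
# "private" / "global" modifiers that YARA permits before the keyword.
RULE_NAME = re.compile(r"^\s*(?:(?:private|global)\s+)*rule\s+(\w+)", re.MULTILINE)


def rule_names(yara_source):
    """Return the names of all rules declared in a YARA source string."""
    return RULE_NAME.findall(yara_source)


def badly_named_rules(class_name, yara_source):
    """Return the rule names that are not prefixed with the extractor class name."""
    return [name for name in rule_names(yara_source) if not name.startswith(class_name)]


source = """
rule MyScriptDetect1 { condition: true }
rule Detect1 { condition: false }
"""
print(badly_named_rules("MyScript", source))  # ['Detect1']
```

An empty list means every rule in the source follows the convention for that class.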
# Requirements
Python 3.8+.
Install this package with `pip install maco`.
All required Python packages are in `requirements.txt`.
# CLI Usage
```
> maco --help
usage: maco [-h] [-v] [--pretty] [--base64] [--logfile LOGFILE] [--include INCLUDE] [--exclude EXCLUDE] [-f] [--create_venv] extractors samples
Run extractors over samples.
positional arguments:
extractors path to extractors
samples path to samples
optional arguments:
-h, --help show this help message and exit
-v, --verbose print debug logging. -v extractor info, -vv extractor debug, -vvv cli debug
--pretty pretty print json output
--base64 Include base64 encoded binary data in output (can be large, consider printing to file rather than console)
--logfile LOGFILE file to log output
--include INCLUDE comma separated extractors to run
--exclude EXCLUDE comma separated extractors to not run
-f, --force ignore yara rules and execute all extractors
--create_venv Creates venvs for every requirements.txt found (only applies when extractor path is a directory)
```
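When driving the CLI from a script, the options in the help text above compose into a single argument list. The helper below is a hypothetical sketch, not part of Maco: it only uses flags listed in the help output and joins `--include` / `--exclude` values with commas, as the help text describes.

```python
def maco_command(extractors, samples, include=(), exclude=(), pretty=False, force=False):
    """Build an argument list for the `maco` CLI from the documented options."""
    cmd = ["maco"]
    if pretty:
        cmd.append("--pretty")
    if force:
        cmd.append("--force")
    if include:
        # --include takes comma separated extractor names
        cmd += ["--include", ",".join(include)]
    if exclude:
        # --exclude takes comma separated extractor names
        cmd += ["--exclude", ",".join(exclude)]
    # positional arguments: path to extractors, then path to samples
    cmd += [extractors, samples]
    return cmd


print(maco_command("demo_extractors/", "/usr/lib", include=["Complex"], pretty=True))
```

The resulting list can be passed to `subprocess.run` on a machine where `maco` is installed.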
## CLI output example
The CLI is helpful for using your extractors in a standalone system, such as in a reverse engineering environment.
```
> maco demo_extractors/ /usr/lib --include Complex
extractors loaded: ['Complex']
complex by blue 2022-06-14 TLP:WHITE
This script has multiple yara rules and coverage of the data model.
path: /usr/lib/udev/hwdb.bin
run Complex extractor from rules ['ComplexAlt']
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
"binaries": [{"datatype": "payload", "size": 9, "hex_sample": "736F6D652064617461", "sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee",
"encryption": {"algorithm": "something"}}],
"http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/9956330", "usage": "c2"}],
"encryption": [{"algorithm": "sha256"}]}
path: /usr/lib/udev/hwdb.d/20-OUI.hwdb
run Complex extractor from rules ['ComplexAlt']
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
"binaries": [{"datatype": "payload", "size": 9, "hex_sample": "736F6D652064617461", "sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee",
"encryption": {"algorithm": "something"}}],
"http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/1986908", "usage": "c2"}],
"encryption": [{"algorithm": "sha256"}]}
path: /usr/lib/udev/hwdb.d/20-usb-vendor-model.hwdb
run Complex extractor from rules ['ComplexAlt']
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
"binaries": [{"datatype": "payload", "size": 9, "hex_sample": "736F6D652064617461", "sha256": "1307990e6ba5ca145eb35e99182a9bec46531bc54ddf656a602c780fa0240dee",
"encryption": {"algorithm": "something"}}],
"http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/1257481", "usage": "c2"}],
"encryption": [{"algorithm": "sha256"}]}
15884 analysed, 3 hits, 3 extracted
```
The demo extractors are designed to trigger when run over the '`demo_extractors`' folder.
e.g. `maco demo_extractors demo_extractors`
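Each hit in the CLI output above is a JSON object in the shape of the model, so downstream tooling can read it with a standard JSON parser. As a small sketch, the hypothetical helper `c2_urls` below pulls C2 URLs out of one such object. The sample is abridged from the output above, and falling back to `http` when no protocol is present is an assumption made for this example only.

```python
import json

# One hit, abridged from the CLI output above.
hit = json.loads("""
{"family": "complex", "version": "5", "decoded_strings": ["Paradise"],
 "http": [{"protocol": "https", "hostname": "blarg5.com", "path": "/malz/9956330", "usage": "c2"}],
 "encryption": [{"algorithm": "sha256"}]}
""")


def c2_urls(config):
    """Collect URLs from 'http' entries whose usage is 'c2'."""
    urls = []
    for entry in config.get("http", []):
        if entry.get("usage") != "c2":
            continue
        if "uri" in entry:
            # a full URI was extracted, use it as-is
            urls.append(entry["uri"])
        elif "hostname" in entry:
            # rebuild a URL from its parts; the "http" default is an assumption
            protocol = entry.get("protocol", "http")
            urls.append(f"{protocol}://{entry['hostname']}{entry.get('path', '')}")
    return urls


print(c2_urls(hit))  # ['https://blarg5.com/malz/9956330']
```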
# Contributions
Please use ruff to format and lint PRs. This may be the cause of PR test failures.
Ruff will attempt to fix most issues, but some may require manual resolution.
```
pip install ruff
ruff format
ruff check --fix
```