Netomize/HTTP-Basma

GitHub: Netomize/HTTP-Basma

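HTTP-Basma is an active web server fingerprinting algorithm that generates hashes through multi-layered probing, for use in threat detection and network mapping.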


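# Adaptive Fingerprinting: HTTP-Basma's Multi-Stage Probing for Fine-Grained Server Differentiation

# Introduction

In network security, accurately identifying and characterizing web servers is essential for threat detection, vulnerability assessment, and network mapping. We introduce **HTTP-Basma**, a novel active fingerprinting algorithm that addresses this challenge with a multi-layered approach that reveals distinctive server profiles.

Key features:

- **Crafted requests, revealing responses:** HTTP-Basma sends 8 carefully designed HTTP probes to elicit distinctive responses that reflect the server's configuration.
- **Dual hashes, versatile use:** the algorithm generates two hash values:
  - a 38-byte fuzzy hash, "verbosus", which is reversible
  - a 16-byte one-way hash, "pacto", derived from verbosus, which enhances privacy and security
- **Clustering and tracking:** the hashes make server clustering possible, identifying unique and similar servers and tracing malicious actors with higher confidence.
- **Modular design, easy to extend:** the algorithm's architecture supports adding new hash variants, encouraging collaboration and adaptability.

In this article, we first survey important existing work in HTTP fingerprinting, then explore the algorithm's functionality, design, architecture, and results. In addition, we present notable findings from [scanning the Majestic top 1 million websites](https://github.com/Netomize/HTTP-Basma/releases/download/v1.0/majestic_1_million_http_basma_fingerprints_csv.zip), including the identification and clustering of C&C HTTP servers belonging to various malware families.

The core idea of the HTTP-Basma algorithm is to send 8 specially crafted HTTP requests, each with different requirements, to elicit different responses from the server. Once the server responses are obtained, the HTTP status line is parsed precisely and all of its elements are optimally encoded. Selected headers in the server response are also inspected and encoded. The request types it sends are:

1. **P1** - GET normal - valid request
2. **P2** - GET invalid HTTP version request
3. **P3** - GET random resource request
4. **P4** - Random verb request
5. **P5** - GET lowercase verb
6. **P6** - GET request - Accept-Encoding - full
7. **P7** - GET request - Accept-Encoding - less
8. **P8** - OPTIONS request

After each request, the server's response is analyzed to extract specific headers and their values. The extracted data is then parsed and encoded to produce a reversible fingerprint. Full technical details of how the algorithm works are in the accompanying [paper](https://github.com/Netomize/HTTP-Basma/blob/main/http-basma_paper_mfmokbel_v1.pdf). This modular design philosophy treats each request's fingerprint as a building block, which allows clean refactoring and makes it possible to add or remove any request fingerprint.

Fingerprint examples:

[**CobaltStrike**](https://www.cobaltstrike.com/)

```
- verbosus fp: 011420958a0014514bd5221420958a221420958a221420958a2200001420958a22000000001f
- pacto fp: 02464ae8b7d86f82c9918e2c2b9d6b91
- note: false-positive rate (72/986,910)
```

[**Havoc**](https://github.com/HavocFramework/Havoc)

```
- verbosus fp: 01142494d60914514bd522142494d6221420958a701420958a220000140e04922032c37f1609
- pacto fp: 020769322f3d94ac2f258ddf5ce08502
- note-1: false-positive rate 0
- note-2: tevedadav.site/43.209.165.126:443 (TLS)
- sample-(sha-256): 9aa1dec8dd12f8adc7fc1274e1958f3613450109ee8b4ec6442a0fcf06df0972
```

[**BruteRatel**](https://bruteratel.com/)

```
- verbosus fp: 01140a85e40014512f3612140a85e422140a85e422140a85e4220000140a85e4220000000001
- pacto fp: 0207292309a7a7e798e417d69df5f2a5
- note: false-positive rate (73/986,910)
```

[**Google**](https://google.com/)

```
- verbosus fp: 01140a85e4001320958a22142494d62214254c5e2214254c5e22080014254c5e220000000000
- pacto fp: 0202be780e1eaae0eaa6184e20c909b6
- note: false-positive rate (4/986,910)
```

[**YouTube**](https://youtube.com/)

```
- verbosus fp: 01140a85e4011320958a22142494d67214254c5e2214254c5e22080014254c5e220000000000
- pacto fp: 02cc5be6d05192e17de041538508bc22
- note: false-positive rate (38/986,910)
```

[**X**](https://x.com/)

```
- verbosus fp: 01140a85e40914514bd522140a85e4721420958a701420958a220800140a85e4720000001609
- pacto fp: 0221b4e46bbd0e5c037f5a852ca3fdc0
- note: false-positive rate (6/986,910)
```

# HTTP-Basma tool

HTTP-Basma is a C++ tool I developed to demonstrate the practicality and feasibility of the algorithm. It uses Chilkat's library for all HTTP socket interaction, along with other supporting classes from that library. The tool also includes a demangler, which parses and reverses a verbosus fuzzy hash and outputs a comprehensive JSON object, and a comparator, which outputs the differences between two verbosus fingerprints. Note that some of the tool's output may use slightly different probe numbering, but the underlying order stays the same: `P1->P1, P2->P2, P3->P3, P4->P4, P5->P5, P6->P6F, P7->P6L, P8->P7a`.

## Tool options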


```
Usage:
  HTTP-Basma [OPTION...]

  -d, --domain arg         domains/IPs (you may query multiple domains, comma separated)
  -p, --port arg           port number
  -s, --ssl                does the HTTP connection have to be carried over SSL/TLS?
  -q, --qpath              check domain with url path included (not recommended)
  -w, --redirect           enable/disable HTTP redirects. If disabled/false, only the next redirect is followed, otherwise, all redirects are followed (default: true)
  -t, --ctimeout arg       socket connection timeout value in seconds (default: 1)
  -g, --rtimeout arg       socket read (from the server) timeout value in seconds (default: 1)
  -e, --sleep arg          the duration (in milliseconds) to pause between each request (default: 100)
  -x, --proxy arg          proxy config: <"socks4|socks5|http">,,,,, all values are comma-separated. is ignored with a non-HTTP proxy
  -f, --file arg           file with list of domains/IPs (requires "-c/--csv" or "-j/--json")
  -P, --parallel           Scan list of domains passed via the "-f/--file" option in parallel
  -c, --csv                save to csv file; if the option 'n' is not specified, the CSV filename will be auto generated
  -n, --csvfile arg        name of the CSV file
  -j, --json               save to json file; if the option 'l' is not specified, the JSON filename will be auto generated
  -l, --jsonfile arg       name of the JSON file
  -r, --saveh              save request response headers
  -o, --pjson              display fingerprint dissection to the console as a JSON object
  -i, --demangle_json arg  demangle a fingerprint into a detailed json format (you can have more than one, comma separated)
  -u, --demangle_txt arg   output a concise text format of the fingerprint, comma-separated for multiple results
  -C, --compare arg        compare two verbosus fingerprints (comma-separated)
  -a, --pacto arg          obtain the Pacto fingerprint using Verbosus
  -h, --help               print usage
```
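When the tool is driven from a script, the options above can be assembled into an argument list. The following is a minimal sketch, not part of the project: the helper name `build_scan_command` and its parameters are hypothetical, and the default executable name `HTTPBasma.exe` is an assumption taken from the examples in this README. Only flags listed in the usage text are used.

```python
from typing import List, Optional


def build_scan_command(
    target: str,
    *,
    port: Optional[int] = None,
    use_tls: bool = False,
    save_json: bool = False,
    save_csv: bool = False,
    save_headers: bool = False,
    executable: str = "HTTPBasma.exe",  # assumed binary name; adjust per platform
) -> List[str]:
    """Build an argv list for a single-target scan (hypothetical helper)."""
    argv = [executable, "-d", target]
    if port is not None:
        argv += ["-p", str(port)]
    if use_tls:
        argv.append("-s")
    if save_json:
        argv.append("--json")
    if save_csv:
        argv.append("--csv")
    if save_headers:
        argv.append("--saveh")
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where the binary is not installed.
    print(" ".join(build_scan_command("example.com", port=443, use_tls=True, save_json=True)))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a machine where the binary is available; passing a list rather than a shell string avoids quoting problems with targets that contain special characters.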

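## Detailed output

When a domain/IP is queried, the response can be saved as a CSV or JSON file containing extensive information about the server's response headers and the individual fingerprint of each probe. For example, to fingerprint the server https://google.com, save the results as both JSON and CSV, and also save the HTTP response headers of every probe:

```
HTTPBasma.exe -d https://google.com --json --csv --saveh
```

In the `Output` folder you will find the CSV file [google_hb_results_2026-05-19_08-35-38_am.csv](https://github.com/Netomize/HTTP-Basma/blob/main/Output/google_hb_results_2026-05-19_08-35-38_am.csv) and the JSON file [google_hb_results_2026-05-19_08-35-38_am.json](https://github.com/Netomize/HTTP-Basma/blob/main/Output/google_hb_results_2026-05-19_08-35-38_am.json).

## Demangler

The tool's demangle option "-i/--demangle_json" takes a verbosus fingerprint and reconstructs the attributes of each probe, outputting a comprehensive JSON object. Notably, when it attempts to reverse FNV-1a hashes, the demangler relies on two local databases: `options.csv` for allowed HTTP methods and `status_line_db.csv` for status-line reason phrases. If either database file is missing, the corresponding hash-reversal feature is disabled automatically. These databases were compiled by scanning the Majestic top 1 million websites. To demangle the verbosus fingerprint of the domain example.com:

```
HTTPBasma.exe --demangle_json 01140a85e40014514bd522142494d67214254c5e721420958a22020214254c5e720000001609
```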
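Reversing a hash with a local database amounts to a reverse lookup: hash every known phrase once, then map a hash value back to the phrases that produce it. The sketch below illustrates that idea with the standard 32-bit FNV-1a parameters. It is an assumption-laden illustration, not the tool's code: how HTTP-Basma reduces the hash to the two bytes stored in a fingerprint is not described here, so these values are not expected to match the tool's, and the in-memory phrase list merely stands in for `status_line_db.csv`, whose format is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Standard 32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def build_reverse_table(phrases: Iterable[str]) -> Dict[int, List[str]]:
    """Map each hash value to every known phrase that produces it."""
    table: Dict[int, List[str]] = defaultdict(list)
    for phrase in phrases:
        table[fnv1a_32(phrase.encode("ascii"))].append(phrase)
    return dict(table)


# Hypothetical stand-in for the reason-phrase database.
KNOWN_REASON_PHRASES = ["OK", "Not Found", "Bad Request", "Method Not Allowed"]
REVERSE_TABLE = build_reverse_table(KNOWN_REASON_PHRASES)


def reverse_hash(value: int) -> List[str]:
    """Return candidate phrases for a hash, or an empty list if it is unknown."""
    return REVERSE_TABLE.get(value, [])


if __name__ == "__main__":
    h = fnv1a_32(b"Not Found")
    print(f"{h:08x} -> {reverse_hash(h)}")
```

A lookup of this kind can only recover phrases that are already in the database, which is consistent with the reversal being disabled when a database file is missing.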

Demangler output:

```
{
  "type": "verbosus",
  "fp": "01140a85e40014514bd522142494d67214254c5e721420958a22020214254c5e720000001609",
  "p1": {
    "type": "get_normal",
    "fp": "140a85e400",
    "status_line": {
      "http_version": { "fp": "14", "val_cmt": "HTTP/1.1" },
      "status_code": { "fp": "0a", "val_cmt": "200" },
      "http_reason": { "fp": "85e4", "val_cmt": "OK" },
      "sl_reversed_db": {
        "http_version": "HTTP/1.1",
        "status_code": [ 200, 404, 403, 500, 204, 999, 888, 603 ],
        "http_reason": "OK"
      }
    },
    "sts_hdr": { "fp": "00", "cmt": "this header is not used" }
  },
  "p2": {
    "type": "get_invalid_ver_nb",
    "fp": "14514bd522",
    "status_line": {
      "http_version": { "fp": "14", "val_cmt": "HTTP/1.1" },
      "status_code": { "fp": "51", "val_cmt": "505" },
      "http_reason": { "fp": "4bd5", "val_cmt": "HTTP Version Not Supported" },
      "sl_reversed_db": {
        "http_version": "HTTP/1.1",
        "status_code": [ 505 ],
        "http_reason": "HTTP Version Not Supported"
      }
    },
    "cont_len_hdr": {
      "fp": "22",
      "name": "Content-Length",
      "value": ">1",
      "cmt": "content-length/transfer-encoding:chunked header is present with either of the size values: [0,1,>1]"
    },
    "cnx": { "ka": false, "c": true }
  },
  "p3": {
    "type": "get_rnd_resource",
    "fp": "142494d672",
    "status_line": {
      "http_version": { "fp": "14", "val_cmt": "HTTP/1.1" },
      "status_code": { "fp": "24", "val_cmt": "404" },
      "http_reason": { "fp": "94d6", "val_cmt": "Not Found" },
      "sl_reversed_db": {
        "http_version": "HTTP/1.1",
        "status_code": [ 404, 403, 501, 410, 204, 400, 200, 418 ],
        "http_reason": "Not Found"
      }
    },
    "cont_len_hdr": {
      "fp": "72",
      "name": "Transfer-Encoding",
      "value": ">1",
      "cmt": "content-length/transfer-encoding:chunked header is present with either of the size values: [0,1,>1]"
    },
    "cnx": { "ka": true, "c": false }
  },
  "p4": {
    "type": "get_rnd_verb",
    "fp": "14254c5e72",
    "status_line": {
      "http_version": { "fp": "14", "val_cmt": "HTTP/1.1" },
      "status_code": { "fp": "25", "val_cmt": "405" },
      "http_reason": { "fp": "4c5e", "val_cmt": "Method Not Allowed" },
      "sl_reversed_db": {
        "http_version": "HTTP/1.1",
        "status_code": [ 405, 403, 204, 418, 404 ],
        "http_reason": "Method Not Allowed"
      }
    },
    "cont_len_hdr": {
      "fp": "72",
      "name": "Transfer-Encoding",
      "value": ">1",
      "cmt": "content-length/transfer-encoding:chunked header is present with either of the size values: [0,1,>1]"
    },
    "cnx": { "ka": true, "c": false }
  },
  "p5": {
    "type": "get_lowercase_verb",
    "fp": "1420958a22",
    "status_line": {
      "http_version": { "fp": "14", "val_cmt": "HTTP/1.1" },
      "status_code": { "fp": "20", "val_cmt": "400" },
      "http_reason": { "fp": "958a", "val_cmt": "Bad Request" },
      "sl_reversed_db": {
        "http_version": "HTTP/1.1",
        "status_code": [ 400, 422, 405, 401 ],
        "http_reason": "Bad Request"
      }
    },
    "cont_len_hdr": {
      "fp": "22",
      "name": "Content-Length",
      "value": ">1",
      "cmt": "content-length/transfer-encoding:chunked header is present with either of the size values: [0,1,>1]"
    },
    "cnx": { "ka": false, "c": true }
  },
  "p6f": {
    "type": "get_accept_encoding_full",
    "fp": "02",
    "cont_enc_hdr": { "value": "br", "empty_value": false, "total_plus": 0 }
  },
  "p6l": {
    "type": "get_accept_encoding_less",
    "fp": "02",
    "cont_enc_hdr": { "value": "br", "empty_value": false, "total_plus": 0 }
  },
  "p7a": {
    "type": "options_allow_hdr",
    "fp": "14254c5e72000000",
    "status_line": {
      "http_version": { "fp": "14", "val_cmt": "HTTP/1.1" },
      "status_code": { "fp": "25", "val_cmt": "405" },
      "http_reason": { "fp": "4c5e", "val_cmt": "Method Not Allowed" },
      "sl_reversed_db": {
        "http_version": "HTTP/1.1",
        "status_code": [ 405, 403, 204, 418, 404 ],
        "http_reason": "Method Not Allowed"
      }
    },
    "cont_len_hdr": {
      "fp": "72",
      "name": "Transfer-Encoding",
      "value": ">1",
      "cmt": "content-length/transfer-encoding:chunked header is present with either of the size values: [0,1,>1]"
    },
    "allow_hdr": { "fp": "000000", "cmt": "this header is not used" },
    "cnx": { "ka": true, "c": false }
  }
}
```

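The per-probe `fp` values in the demangler output concatenate back into the full 76-character verbosus string, which suggests a fixed segment layout. The sketch below slices a fingerprint along that layout. The offsets are an assumption inferred from the example.com sample alone, not taken from the paper; the names `prefix` and `tail` are hypothetical labels for the leading byte and the two trailing bytes, whose meaning the sample output does not spell out.

```python
import string
from typing import Dict, List

# (segment name, length in bytes), inferred from the sample demangler output.
LAYOUT = [
    ("prefix", 1),  # "01" in every verbosus sample in this README; meaning assumed
    ("p1", 5), ("p2", 5), ("p3", 5), ("p4", 5), ("p5", 5),
    ("p6f", 1), ("p6l", 1),
    ("p7a", 8),
    ("tail", 2),    # trailing bytes not broken out in the sample output
]
EXPECTED_HEX_LEN = 2 * sum(nbytes for _, nbytes in LAYOUT)  # 38 bytes -> 76 hex chars


def split_verbosus(fp: str) -> Dict[str, str]:
    """Slice a verbosus hex string into named segments (assumed layout)."""
    if len(fp) != EXPECTED_HEX_LEN or any(c not in string.hexdigits for c in fp):
        raise ValueError(f"expected {EXPECTED_HEX_LEN} hex characters")
    segments, pos = {}, 0
    for name, nbytes in LAYOUT:
        segments[name] = fp[pos:pos + 2 * nbytes]
        pos += 2 * nbytes
    return segments


def split_status_line(segment: str) -> Dict[str, str]:
    """Split the first four bytes of a probe segment, following the sample."""
    return {
        "http_version": segment[0:2],
        "status_code": segment[2:4],
        "http_reason": segment[4:8],
    }


def differing_segments(fp_a: str, fp_b: str) -> List[str]:
    """Names of the segments in which two fingerprints differ, in layout order."""
    a, b = split_verbosus(fp_a), split_verbosus(fp_b)
    return [name for name in a if a[name] != b[name]]


if __name__ == "__main__":
    example = "01140a85e40014514bd522142494d67214254c5e721420958a22020214254c5e720000001609"
    for name, value in split_verbosus(example).items():
        print(f"{name:>6}: {value}")
```

Applied to the Google and YouTube fingerprints listed earlier, `differing_segments` reports only `p1` and `p3`, which shows how a segment-wise view narrows a difference down to individual probes.
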
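Note that the "status_code" arrays in the demangler output contain multiple HTTP status codes. This is because different servers may use the same reason phrase with different status codes, which results in the same FNV-1a hash.

## Comparator option

The comparator option "-C/--compare" compares two verbosus fingerprints and prints the differences between the main components of each probe. For example, comparing the following two fingerprints, for Google and YouTube:

```
HTTPBasma.exe --compare 01140a85e4001320958a22142494d62214254c5e2214254c5e22080014254c5e220000000000,01140a85e4011320958a22142494d67214254c5e2214254c5e22080014254c5e220000000000
```

produces the following output:

```
< FPrnt-1 Vs. FPrnt-2 >
[ P1 ] {Strict-Transport-Security} sts header: 00 != 01
[ P2 ]
[ P3 ] {Content-Length} cl_name: 2 != 7
[ P4 ]
[ P5 ]
[ P6F ]
[ P6L ]
[ P7a ]
```

The output shows the specific differences in the hash components of probe P1, where the STS header is present in one fingerprint and absent in the other. The encoding of "Content-Length" also differs between the two fingerprints for probe P3.

# Releases

Netomize provides compiled Windows and Linux x64 builds of the public code in this repository. In addition, the [Majestic 1-million HTTP-Basma fingerprints CSV file - dataset](https://github.com/Netomize/HTTP-Basma/releases/download/v1.0/majestic_1_million_http_basma_fingerprints_csv.zip) is included in the first release.

# Third-party libraries used

- [Chilkat v11.4.0](https://www.chilkatsoft.com/)
- [rang: for console coloring](https://github.com/agauniyal/rang)
- [cxxopts (v3.3.1)](https://github.com/jarro2783/cxxopts)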
