FuzzAnything/Hopper

GitHub: FuzzAnything/Hopper

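Hopper is a tool that uses interpretative fuzzing to automatically generate fuzzing test cases for libraries, aiming to improve the coverage and efficiency of library-level fuzzing.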

Stars: 263 | Forks: 36

# Hopper

Hopper is a tool that automatically generates fuzzing test cases for libraries using **interpretative fuzzing**. It turns the problem of library fuzzing into the problem of interpreter fuzzing, and it explores a wide range of library API usages out of the box.

Key features of Hopper include:

- Interpretative API invocation without any fuzz driver.
- Type-aware mutation for arguments.
- Automatic learning of intra- and inter-API constraints.
- Binary instrumentation support.

To learn more about Hopper, see our [paper](https://arxiv.org/pdf/2309.03496), published at CCS '23.

## Building Hopper

### Build requirements

- Linux-amd64 (tested on Ubuntu 20.04 and Debian Buster)
- Rust stable (>= 1.60), available via [rustup](https://rustup.rs/)
- Clang (>= 5.0, [install Clang](https://rust-lang.github.io/rust-bindgen/requirements.html)). [rust-bindgen](https://rust-lang.github.io/rust-bindgen/) uses libclang to preprocess, parse, and type-check C and C++ header files.

### Build Hopper itself

```
./build.sh
```

The script creates an `install` directory in Hopper's root directory, after which the `hopper` command is available. To use the command from anywhere, add the project directory to your `PATH` environment variable.

### Using Docker

Alternatively, you can use the Dockerfile, which builds the required dependencies and compiles Hopper automatically.

```
docker build -t hopper ./
docker run --name hopper_dev --privileged -v /path-to-lib:/fuzz -it --rm hopper /bin/bash
```

## Compiling a library with Hopper

Take `cJSON` as an example ([more examples](./examples/)).

```
hopper compile --header ./cJSON.h --library ./libcjson.so --output output
```

Run `hopper compile --help` for detailed usage. If the compiler reports errors about the header file, refer to the usage of [rust-bindgen](https://rust-lang.github.io/rust-bindgen/), which is used to parse the header. You may need to add missing definitions to the header file.

By default, Hopper uses [E9Patch](https://github.com/GJDuck/e9patch) to instrument binaries. Optionally, you can use [LLVM](./hopper-instrument/llvm-mode/) for source-code instrumentation.

After running `compile`, the output directory contains the following files:

- `bin/hopper-fuzzer`: generates inputs, maintains state, and executes inputs through the `harness`.
- `bin/hopper-harness`: executes inputs.
- `bin/hopper-translate`: translates inputs into C source code.
- `bin/hopper-generator`: replays the generation process.
- `bin/hopper-sanitizer`: sanitizes and minimizes crashes.

### Header files

- If there are multiple header files, create a new header file that *includes* all of them.
- If compiling the header depends on specific environment settings, pass them via `BINDGEN_EXTRA_CLANG_ARGS`.
- If the header contains API functions you do not want to test, filter them at run time with `--func-pattern`.

### Environment variables for compilation

- `HOPPER_MAP_SIZE_POW2`: controls the size of the coverage map (as a power of two). The default is 16 and the range is [16, 20]. Example: `HOPPER_MAP_SIZE_POW2=18`.
- `HOPPER_INST_RATIO`: controls the probability that a block is selected for instrumentation. The default is 100 and the range is (0, 100]. Example: `HOPPER_INST_RATIO=75`.
- `HOPPER_INCLUDE_SEARCH_PATH`: search path for files included by the header. Example: `HOPPER_INCLUDE_SEARCH_PATH=../`.
- `HOPPER_FUNC_BLACKLIST`: a list of functions that Hopper will not compile. `bindgen` generates no code for these functions. Example: `HOPPER_FUNC_BLACKLIST=f1,f2`.
- `HOPPER_TYPE_BLACKLIST`: a list of types that Hopper will not compile. `bindgen` generates no code for these types. Example: `HOPPER_TYPE_BLACKLIST=type1,type2`.
- `HOPPER_ITEM_BLACKLIST`: a list of constants/variables that Hopper will not compile. `bindgen` generates no code for these items. Example: `HOPPER_ITEM_BLACKLIST=IPPORT_RESERVED`.
- `HOPPER_CUSTOM_OPAQUE_LIST`: a list of user-defined opaque types. Example: `HOPPER_CUSTOM_OPAQUE_LIST=type1`.
- `HOPPER_FUZZ_INLINE_FUNCTION`: treats inline functions as target functions; see the bindgen [FAQ](https://rust-lang.github.io/rust-bindgen/faq.html#why-isnt-bindgen-generating-bindings-to-inline-functions).

#### Tips

- Compile-time and run-time arguments can be set in a configuration file named `hopper.config`; see `examples/*` for details.
- Reduce density: if the density exceeds 20%, edge IDs may suffer hash collisions. You can a) increase `HOPPER_MAP_SIZE_POW2`, or b) decrease `HOPPER_INST_RATIO`.
- Multiple libraries: either (1) merge multiple archives into one shared library, e.g. `gcc -shared -o c.so -Wl,--whole-archive a.a b.a -Wl,--no-whole-archive`, or (2) pass all libraries to the Hopper compiler with `--library a.so b.so`.

## Fuzzing with Hopper

```
hopper fuzz output --func-pattern cJSON_*
```

Run `hopper fuzz output --help` for detailed usage. After running `fuzz`, the following directories are created:

- `queue`: generated normal inputs.
- `hangs`: generated inputs that timed out.
- `crashes`: generated inputs that crashed.
- `misc`: temporary files and statistics.

### Environment variables at run time

- `DISABLE_CALL_DET`: disables deterministic mutation of calls.
- `DISABLE_GEN_FAIL`: disables program generation for functions whose invocation failed.
- `HOPPER_SEED_DIR`: provides seeds for byte-like arguments (default: `output/seeds`, if it exists).
- `HOPPER_DICT`: provides a dictionary for byte-like arguments, using the same syntax as AFL.
- `HOPPER_API_INSENSITIVE_COV`: disables API-sensitive branch counting.
- `HOPPER_FAST_EXECUTE_LOOP`: number of programs executed in a loop per fork; set it to 0 or 1 to disable the loop. Example: `HOPPER_FAST_EXECUTE_LOOP=10`.

#### System configuration

Set the system core dump pattern as AFL does (run this on the host if Hopper runs inside a Docker container).

```
echo core | sudo tee /proc/sys/kernel/core_pattern
```

### Function patterns

By default, Hopper generates inputs for all functions that appear in both the header and the library. There are two ways to filter functions, excluding them or including them, so that Hopper focuses on the targets you care about.

#### `--func-pattern`

```
hopper fuzz output --func-pattern @cJSON_parse,!cJSON_InitHook,cJSON_*
```

- A pattern can be a function name, e.g. `cJSON_parse`, or a simple pattern, e.g. `cJSON_*`.
- Multiple patterns are joined with `,`, e.g. `cJSON_*,HTTP_*`.
- The `@` prefix restricts the fuzzer to fuzzing only the specified functions, while the other functions remain candidates for providing field or argument values, e.g. `@cJSON_parse,cJSON_*`.
- The `!` prefix excludes specific functions, e.g. `!cJSON_InitHook,cJSON_*`.

#### `--custom-rules`

Patterns can also be defined in the file passed to `--custom-rules`.

```
// hopper fuzz output --custom-rules path-to-file
func_target cJSON_parse
func_exclude cJSON_InitHook
func_include cJSON_*,HTTP_*
```

### Constraints

Hopper infers intra- and inter-API constraints to ensure that APIs are invoked correctly. The constraints are written to `output/misc/constraint.config`; you can delete this file to reset them. In addition, you can pass a file describing custom constraints for API invocations via `--custom-rules`. These constraints override the inferred ones.

```
// hopper fuzz output --custom-rules path-to-file
// Grammar:
// func, type : prefix for adding a rule for function or type
// $[0-9]+    : function's i-th argument, or index in array
// [a-zA-Z_]+ : object field
// 0, 128 ..  : integer constants
// "xxxx"     : string constants
// methods    : $len, $range, $null, $non_null, $need_init, $read_file, $write_file, $ret_from, $cast_from, $use, $arr_len, $opaque, $len_factors
// others     : pointer(&) , option(?), e.g &.$0.len, `len` field in the pointer's first element
//
// Set one argument of a function to a specific constant
func test_add[$0] = 128
// One argument must be the length of another one
func test_arr[$1] = $len($0)
// Or one field must be the length of another field
func test_arr[$0][len] = $len([$0][name])
// One argument must be in a certain range
func test_arr[$1] = $range(0, $len($0))
// Argument should be non-null
func test_non_null[$0] = $non_null
// Argument should be null
func test_null[$0] = $null
// Argument should be a specific string
func test_magic[$0] = "magic"
// Argument should be a file, and the file will be read
func test_path[$0] = $read_file
// Argument should use the return value of a specific function
func test_use[$0] = $ret_from(test_create)
// Argument should be a specific type for a void pointer. The type should start with *mut or *const.
func test_void[$0] = $cast_from(*mut u8)
// The array is supposed to have a minimal length
func test_void[$0][&] = $arr_len(256)
// The array's length is formed by the factors
func fread[$0][&] = $len_factors(1, $2)
// Or
func gzfread[$0][&] = $len_factors($1, $2)
// Field in argument should be a specific constant
func test_field[$0][len] = 128
// Deeper fields
func test_field[$0][&.elements.$0] = 128
// One field `len` in a type must be the length of another field `p`
type ArrayWrap[len] = $len(p)
// One nested union `inner_union` in a type must be set to `member2`
type ComplicatedStruct[inner_union] = $use(member2)
// Type is opaque and used as an opaque pointer
type Partial = $opaque
// A type should be initialized with a specific function
type Partial = $init_with(test_init, 0)
// ctx: set context for a specific function
// Add a context for a function
ctx test_use[$0] <- test_init
// Add implicit context
ctx test_use[*] <- test_init
// Add optional context that is preferred
ctx test_use[$0] <- test_init ?
// Add forbidden context
ctx test_use[$0] <- ! test_init
// alias: alias types across different functions
alias handleA <- useA($0),createA($ret),freeA($0)
// assert: add specific assertions for calls
assert test_one == 1
assert test_non_zero != 0
```

### Seeds for byte arguments

If a directory named `seeds` exists (as specified by `HOPPER_SEED_DIR`), Hopper tries to read the files in it and uses them as seeds for byte-like arguments (e.g. `char*`). You can also provide seeds for a specific argument by its name, e.g. a subdirectory named `@buf` supplies seeds for the argument named `buf`.

### Logging

Hopper uses Rust's log crate to print log messages. The default log level is `INFO`. To print all log messages (including `DEBUG` and `TRACE`), set the environment variable `LOG_TYPE` when running Hopper, e.g. `LOG_TYPE=trace ./hopper`. Detailed logs are written to `output/fuzzer_r*.log` and `output/harness_r*.log`.

### Reproducing executions

Hopper can reproduce program executions from the output directory.

- `hopper-harness` parses and interprets Hopper's run-time inputs, and prints the internal state in detail during execution.

```
./bin/hopper-harness ./queue/id_000000
```

- `hopper-translate` translates an input into C source code. The generated C file can serve as a witness when reporting issues.

```
./bin/hopper-translate --input ./queue/id_000000 --header path-to/xx.h --output test.c
# Then compile it with the specific library
gcc -I/path-to-head -L/path-to-lib -l:libcjson.so test.c -o test
```

- `hopper-generator` replays the input generation process (without execution). Use it to analyze how an input was generated or mutated.
- `hopper-sanitizer` minimizes and verifies the crashes generated by Hopper. It excludes crashes that violate constraints and deduplicates them by call stack.

## Testing

### Testing Rust code

- Run all test cases

```
RUST_BACKTRACE=1 cargo test -- --nocapture
```

### Testsuite (test libraries)

- [How to run and write the testsuite](./testsuite/README.md)

### Real-world examples

- [Examples](./examples/)

## Evaluating results with source-based code coverage

- Compile the library sources with the [LLVM source-based code coverage tools](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html). You need to set the compile flags, e.g.:

```
export CFLAGS="${CFLAGS:-} -fprofile-instr-generate -fcoverage-mapping -gline-tables-only -g"
make
```

- Compile the library with the `cov` instrumentation mode, e.g.:

```
hopper compile --instrument cov --header ./cJSON.h --library ./libcjson_cov.so --output output_cov
```

- Run the interpreter with all the generated seed inputs (`SEED_DIR`).

```
# Run Hopper and compute the coverage with llvm-cov.
SEED_DIR=./output/queue hopper cov output_cov
```

## Contributing

We have listed some pending tasks in the [Roadmap](https://github.com/FuzzAnything/hopper/discussions/2). If you are interested, feel free to discuss them with us and contribute your code.

### Coding conventions

- *Zero* `cargo check` warnings
- *Zero* `cargo clippy` warnings
- *Zero* `FAILED` in `cargo test`
- *Try* to write tests for your code

### Profiling

- [Profiling Rust applications](https://gist.github.com/KodrAus/97c92c07a90b1fdd6853654357fd557a)
- [Inferno](https://github.com/jonhoo/inferno)

```
perf record --call-graph=dwarf ./bin/hopper-fuzzer
# Use flamegraph directly
perf script | stackcollapse-perf.pl | rust-unmangle | flamegraph.pl > flame.svg
# Use inferno
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg
```

Profiling produces a large amount of intermediate data, so do *not* run the fuzzer for more than 2 minutes while profiling.
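
As a side illustration of the `--func-pattern` syntax described earlier (`@` for fuzz targets, `!` for exclusions, `*` as a wildcard), the Python sketch below shows one plausible reading of how a pattern string partitions function names. It is not Hopper's implementation: the helper name `classify`, the use of shell-style globbing, and the precedence (exclusions first, then `@` targets, then plain patterns) are assumptions made for this example.

```python
from fnmatch import fnmatchcase


def classify(functions, pattern):
    """Partition function names according to a --func-pattern style string.

    Hypothetical helper for illustration only; the precedence rules are assumed.
    """
    targets, includes, excludes = [], [], []
    for item in pattern.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("@"):
            targets.append(item[1:])   # fuzz only these functions
        elif item.startswith("!"):
            excludes.append(item[1:])  # never use these functions
        else:
            includes.append(item)      # plain name or simple glob

    result = {"target": [], "candidate": [], "excluded": []}
    for name in functions:
        if any(fnmatchcase(name, p) for p in excludes):
            result["excluded"].append(name)
        elif any(fnmatchcase(name, p) for p in targets):
            result["target"].append(name)
        elif any(fnmatchcase(name, p) for p in includes):
            result["candidate"].append(name)
        else:
            result["excluded"].append(name)

    # Assumption: without any '@' pattern, every included function is a target.
    if not targets:
        result["target"], result["candidate"] = result["candidate"], []
    return result


if __name__ == "__main__":
    names = ["cJSON_parse", "cJSON_InitHook", "cJSON_Print", "HTTP_get"]
    print(classify(names, "@cJSON_parse,!cJSON_InitHook,cJSON_*"))
```

Under these assumptions, `@cJSON_parse,!cJSON_InitHook,cJSON_*` leaves `cJSON_parse` as the only fuzz target, keeps the other `cJSON_*` functions as value-providing candidates, and drops `cJSON_InitHook` together with anything that matches no pattern.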

Tags: API fuzzing, binary instrumentation, bindgen, CCS paper, Clang, Docker, E9Patch, fuzzing test cases, interpretative fuzzing, interpreter fuzzing, intra- and inter-API constraints learning, library fuzzing, Linux build, Rust, type-aware mutation, GUI, security defense evaluation, constraint learning, network traffic auditing, automatic fuzz drivers, request interception, notification system