teilomillet/ordeal

GitHub: teilomillet/ordeal

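An automated chaos testing framework for Python. It finds hidden defects through fault injection, property assertions, and coverage-guided stateful exploration, with no hand-written tests required.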


# ordeal

[![CI](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/35cfd05500103723.svg)](https://github.com/teilomillet/ordeal/actions/workflows/ci.yml) [![Docs](https://static.pigsec.cn/wp-content/uploads/repos/2026/04/dd051e26dc103724.svg)](https://docs.byordeal.com/) [![PyPI](https://img.shields.io/pypi/v/ordeal)](https://pypi.org/project/ordeal/) [![Python 3.12+](https://img.shields.io/pypi/pyversions/ordeal)](https://pypi.org/project/ordeal/) [![License](https://img.shields.io/github/license/teilomillet/ordeal)](LICENSE)

**Your tests pass. Your code still breaks.**

Ordeal finds what you missed: edge cases, untested code paths, and bugs that only show up in production. There is no test code to write. Point it at a target and run.

Open a terminal and paste the following ([uvx](https://docs.astral.sh/uv/guides/tools/) runs Python tools without installing them):

```
uvx ordeal mine ordeal.demo
```

```
mine(score): 500 examples
  ALWAYS  output in [0, 1] (500/500)     ← always returns a value between 0 and 1
  ALWAYS  monotonically non-decreasing   ← bigger input = bigger output, always

mine(normalize): 500 examples
     97%  idempotent (29/30)             ← normalizing twice should give the same result
                                           ...but ordeal found a case where it doesn't
```

Now point it at your own code. If you have a file `myapp/scoring.py`, the module path is `myapp.scoring`:

```
uvx ordeal mine myapp.scoring     # what do my functions actually do?
uvx ordeal audit myapp.scoring    # what are my tests missing?
```

Or let your AI assistant do it. Open Claude Code, Cursor, or any coding assistant: ordeal ships with an [AGENTS.md](https://github.com/teilomillet/ordeal/blob/main/AGENTS.md) file that your AI reads automatically, so it knows how to use every command.

```
pip install ordeal    # or: uv tool install ordeal
```

## 30-second example

```
import math

from ordeal import ChaosTest, rule, invariant, always
from ordeal.faults import timing, numerical

class MyServiceChaos(ChaosTest):
    faults = [
        timing.timeout("myapp.api.call"),          # API times out
        numerical.nan_injection("myapp.predict"),  # model returns NaN
    ]

    # self.service is your system under test (its setup is omitted here)

    @rule()
    def call_service(self):
        result = self.service.process("input")
        always(result is not None, "process never returns None")

    @invariant()
    def no_corruption(self):
        for item in self.service.results:
            always(not math.isnan(item), "no NaN in output")

TestMyServiceChaos = MyServiceChaos.TestCase
```

```
pytest --chaos                    # explore fault interleavings
pytest --chaos --chaos-seed 42    # reproduce exactly
ordeal explore                    # coverage-guided, reads ordeal.toml
```

You declare what can go wrong (faults), what your system does (rules), and what must stay true (assertions). Ordeal explores the combinations for you.

## Why ordeal

Tests catch the bugs you can imagine. The dangerous bugs are the ones you can't: a timeout *during* a retry, a NaN *inside* a recovery path, a permission error *after* the cache has warmed up. These bugs come from combinations, and the space of combinations is too large to explore by hand.

Ordeal automates that exploration. It brings ideas from the world's most rigorous engineering cultures to Python:

| What | Idea | From |
|---|---|---|
| Stateful chaos testing with a nemesis | An adversary toggles faults while Hypothesis explores interleavings | [Jepsen](https://jepsen.io) + [Hypothesis](https://hypothesis.works) |
| Coverage-guided exploration | Save checkpoints at new code paths and branch from high-value states | [Antithesis](https://antithesis.com) |
| Property assertions | `always`, `sometimes`, `reachable`, `unreachable`: accumulate evidence across runs | [Antithesis](https://antithesis.com/docs/properties_assertions/) |
| Inline fault injection | `buggify()`: a no-op in production, a probabilistic fault in tests | [FoundationDB](https://apple.github.io/foundationdb/testing.html) |
| Boundary-biased generation | Test at 0, -1, empty, and max-length values, the boundaries where bugs actually cluster | [Jane Street QuickCheck](https://blog.janestreet.com/quickcheck-for-core/) |
| Mutation testing | Flip `+` to `-` and `<` to `<=` to verify that your tests actually catch real bugs | [Meta ACH](https://engineering.fb.com) |
| Differential testing | Compare two implementations on random inputs to catch regressions | Equivalence testing |
| Property mining | Discover invariants from execution traces: types, bounds, monotonicity | Specification mining |
| Metamorphic testing | Check the *relationships* between outputs when inputs are transformed | [Metamorphic relations](https://en.wikipedia.org/wiki/Metamorphic_testing) |

**Read the full [philosophy](https://docs.byordeal.com/philosophy) to understand why this matters.**

## The ordeal standard

When a project uses ordeal and its tests pass, it means:

- The explorer ran thousands of operation sequences under coverage guidance
- Fault combinations that no human would think of were injected
- Property assertions held across all runs
- Mutations were caught

This is not "the tests passed." It is evidence, far stronger than unit tests that merely turn green, that your code holds up under adversity.

## Install

```
# From PyPI
pip install ordeal

# With extras
pip install ordeal[atheris]    # coverage-guided fuzzing via Atheris
pip install ordeal[all]        # everything

# As a CLI tool
uv tool install ordeal         # global install
uvx ordeal explore             # ephemeral, no install

# Development
git clone https://github.com/teilomillet/ordeal
cd ordeal && uv sync
uv run pytest                  # 205 tests
```

## What's in the box

### Stateful chaos testing

`ChaosTest` extends Hypothesis's `RuleBasedStateMachine`. You declare faults and rules, and ordeal automatically injects a **nemesis** that toggles faults during exploration. The nemesis is just another Hypothesis rule, so the engine explores fault schedules the same way it explores any other state transition. Shrinking works automatically.

```
from ordeal import ChaosTest, rule, invariant, always
from ordeal.faults import io, numerical, timing

class StorageChaos(ChaosTest):
    faults = [
        io.error_on_call("myapp.storage.save", IOError),
        timing.intermittent_crash("myapp.worker.process", every_n=3),
        numerical.nan_injection("myapp.scoring.predict"),
    ]
    swarm = True  # random fault subsets per run, for better aggregate coverage

    @rule()
    def write_data(self):
        self.service.save({"key": "value"})

    @rule()
    def read_data(self):
        result = self.service.load("key")
        always(result is not None, "reads never return None after write")
```

Swarm mode: each run activates a random subset of the faults. Over many runs, this covers more fault combinations than keeping every fault on all the time. Hypothesis drives the subset selection, so shrinking can isolate the exact fault combination that triggers a failure.

### Property assertions

Four kinds, each with different semantics:

```
from ordeal import always, sometimes, reachable, unreachable

always(len(results) > 0, "never empty")     # must hold every time; fails immediately
sometimes(cache_hit, "cache is used")       # must hold at least once; checked at session end
reachable("error-recovery-path")            # code path must execute at least once
unreachable("silent-data-corruption")       # code path must never execute; fails immediately
```

`always` and `unreachable` fail immediately and trigger Hypothesis shrinking. `sometimes` and `reachable` accumulate evidence across the whole session and are checked when it ends. This is the [Antithesis assertion model](https://antithesis.com/docs/properties_assertions/assertions/) brought to Python.

### Inline fault injection (BUGGIFY)

Place `buggify()` gates in your production code. Normally they return `False`. During chaos testing they return `True` with some probability:

```
import random
import time

from ordeal.buggify import buggify, buggify_value

def process(data):
    if buggify():                       # sometimes inject delay
        time.sleep(random.random() * 5)
    result = compute(data)              # compute() is your own function
    return buggify_value(result, float('nan'))  # sometimes corrupt output
```

Seed-controlled. Thread-local. A no-op when inactive. This is a Python take on [FoundationDB BUGGIFY](https://apple.github.io/foundationdb/testing.html): the code under test *is* the test harness.

### Coverage-guided exploration

The Explorer tracks the code paths that each run discovers (AFL-style edge hashing). When a run finds new coverage, it saves a checkpoint. Later runs branch from high-value checkpoints and explore the state space systematically:

```
from ordeal.explore import Explorer

explorer = Explorer(
    MyServiceChaos,
    target_modules=["myapp"],
    checkpoint_strategy="energy",  # favor productive checkpoints
)
result = explorer.run(max_time=60)
print(result.summary())
# Exploration: 5000 runs, 52000 steps, 60.0s
# Coverage: 287 edges, 43 checkpoints
# Failures found: 2
#   Run 342, step 15: ValueError (3 steps after shrinking)
```

Failures are **shrunk**: delta debugging removes unnecessary steps, then fault simplification removes unnecessary faults. You get the minimal sequence that reproduces the bug.

Scale out with `workers`. Each process gets its own random seed and explores independently, and the results are aggregated:

```
explorer = Explorer(MyServiceChaos, target_modules=["myapp"], workers=8)
```

### Configuration

```
# ordeal.toml: one file, human- and machine-readable
[explorer]
target_modules = ["myapp"]
max_time = 60
seed = 42
checkpoint_strategy = "energy"

[[tests]]
class = "tests.test_chaos:MyServiceChaos"

[report]
format = "both"
traces = true
verbose = true
```

See [`ordeal.toml.example`](ordeal.toml.example) for the full schema, with every option documented.

### QuickCheck with boundary bias

`@quickcheck` infers strategies from type hints. It is biased toward boundary values (0, -1, empty lists, maximum lengths), which is where implementation bugs cluster:

```
from ordeal.quickcheck import quickcheck

@quickcheck
def test_sort_idempotent(xs: list[int]):
    assert sorted(sorted(xs)) == sorted(xs)

@quickcheck
def test_score_bounded(x: float, y: float):
    result = score(x, y)
    assert 0 <= result <= 1
```

### Composable invariants

```
from ordeal.invariants import no_nan, no_inf, bounded, finite

valid_score = finite & bounded(0, 1)
valid_score(model_output)  # raises AssertionError with clear message
```

Invariants compose with `&`. They work on scalars, sequences, and numpy arrays.

### Simulation primitives

A deterministic Clock and FileSystem: no mocks, no real I/O, and everything completes instantly:

```
from ordeal.simulate import Clock, FileSystem

clock = Clock()
fs = FileSystem()

clock.advance(3600)                        # instant, no real waiting
fs.inject_fault("/data.json", "corrupt")   # reads return random bytes
```

### Differential testing

Compare two implementations on the same random inputs to catch regressions and validate refactors:

```
from ordeal.diff import diff

result = diff(score_v1, score_v2, rtol=1e-6)
assert result.equivalent, result.summary()
# diff(score_v1, score_v2): 100 examples, EQUIVALENT
```

### Mutation testing

Verify that your chaos tests actually catch bugs. If you change a `+` to a `-` in your code and the tests still pass, your tests have a blind spot:

```
from ordeal.mutations import mutate_function_and_test

result = mutate_function_and_test("myapp.scoring.compute", my_tests)
print(result.summary())
# Mutation score: 15/18 (83%)
#   SURVIVED  L42:8  + -> -
#   SURVIVED  L67:4  negate if-condition
```

### Fault library

```
from ordeal.faults import io, numerical, timing, network, concurrency

# I/O faults
io.error_on_call("mod.func")                 # raise IOError
io.disk_full()                               # writes fail with ENOSPC
io.corrupt_output("mod.func")                # return random bytes
io.truncate_output("mod.func", 0.5)          # truncate to half

# Numerical faults
numerical.nan_injection("mod.func")          # output becomes NaN
numerical.inf_injection("mod.func")          # output becomes Inf
numerical.wrong_shape("mod.func", (1,512), (1,256))

# Timing faults
timing.timeout("mod.func")                   # raise TimeoutError
timing.slow("mod.func", delay=2.0)           # add delay
timing.intermittent_crash("mod.func", every_n=3)
timing.jitter("mod.func", magnitude=0.01)

# Network faults
network.http_error("mod.client.post", status_code=503)
network.connection_reset("mod.client.post")
network.rate_limited("mod.client.get", retry_after=60)
network.dns_failure("mod.client.resolve")

# Concurrency faults
concurrency.contended_call("mod.pool.acquire", contention=0.1)
concurrency.thread_boundary("mod.cache.get")
concurrency.stale_state(service, "config", old_config)
```

### Integrations

```
# Atheris: coverage-guided fuzzing steers buggify() decisions
from ordeal.integrations.atheris_engine import fuzz
fuzz(my_function, max_time=60)

# API chaos testing (built in, no extra install)
from ordeal.integrations.openapi import chaos_api_test
chaos_api_test("http://localhost:8080/openapi.json", faults=[...])
```

### Audit: justify adoption with data

```
ordeal audit myapp.scoring --test-dir tests/
```

```
myapp.scoring
  current:   33 tests | 343 lines | 98% coverage [verified]
  migrated:  12 tests | 130 lines | 96% coverage [verified]
  saving:    64% fewer tests | 62% less code | same coverage
  mined:     compute: output in [0, 1] (500/500, >=99% CI)
  mutation:  14/18 (78%)
  suggest:
    - L42 in compute(): test when x < 0
    - L67 in normalize(): test that ValueError is raised
```

Every number is either `[verified]` (measured through coverage.py's JSON output and cross-checked) or `FAILED: reason`. The audit never silently reports 0%. Mined properties come with Wilson confidence intervals. Where there are coverage gaps, the audit reads the source and tells you exactly what to test. Use `--show-generated` to inspect the generated test file and `--save-generated` to keep it.

## CLI

```
ordeal audit myapp.scoring                           # compare existing tests vs ordeal
ordeal explore                                       # run from ordeal.toml
ordeal explore -w 8                                  # parallel with 8 workers
ordeal explore -c ci.toml -v                         # custom config, verbose
ordeal explore --max-time 300 --seed 99              # override settings
ordeal replay trace.json                             # reproduce a failure
ordeal replay --shrink trace.json                    # minimize a failure trace
ordeal explore --generate-tests tests/test_gen.py    # turn traces into pytest tests
```

## Find what you need

Every goal maps to a starting point: a command to run, a module to import, and a doc to read. Nothing is hidden.

| I want to... | Start here | In the codebase | Docs |
|---|---|---|---|
| Find bugs without writing tests | `ordeal mine mymodule` | `ordeal/auto.py` | [Auto Testing](https://docs.byordeal.com/guides/auto) |
| Check whether my tests are good enough | `ordeal audit mymodule` | `ordeal/mutations.py` | [Mutations](https://docs.byordeal.com/guides/mutations) |
| Write a chaos test | `from ordeal import ChaosTest` | `ordeal/chaos.py` | [Getting Started](https://docs.byordeal.com/getting-started) |
| Inject a specific fault (timeout, NaN, etc.) | `from ordeal.faults import timing` | `ordeal/faults/` directory | [Fault Injection](https://docs.byordeal.com/concepts/fault-injection) |
| Explore all fault combinations | `ordeal explore` | `ordeal/explore.py` | [Explorer](https://docs.byordeal.com/guides/explorer) |
| Reproduce and shrink a failure | `ordeal replay trace.json` | `ordeal/trace.py` | [Shrinking](https://docs.byordeal.com/concepts/shrinking) |
| Add fault gates to production code | `from ordeal.buggify import buggify` | `ordeal/buggify.py` | [Fault Injection](https://docs.byordeal.com/concepts/fault-injection) |
| Assert across all runs | `from ordeal import always, sometimes` | `ordeal/assertions.py` | [Assertions](https://docs.byordeal.com/concepts/property-assertions) |
| Control time and the filesystem in tests | `from ordeal.simulate import Clock` | `ordeal/simulate.py` | [Simulation](https://docs.byordeal.com/guides/simulate) |
| Compare two implementations | `ordeal mine-pair mod.fn1 mod.fn2` | `ordeal/diff.py` | [Auto Testing](https://docs.byordeal.com/guides/auto) |
| Compose validation rules | `from ordeal.invariants import no_nan` | `ordeal/invariants.py` | [API Reference](https://docs.byordeal.com/reference/api) |
| Test API endpoints against faults | `from ordeal.integrations.openapi import chaos_api_test` | `ordeal/integrations/openapi.py` | [Integrations](https://docs.byordeal.com/guides/integrations) |
| Extend ordeal with a new fault | Follow the pattern in `faults/*.py` | `ordeal/faults/` | [Fault Injection](https://docs.byordeal.com/concepts/fault-injection) |
| Configure reproducible runs | Create `ordeal.toml` | `ordeal/config.py` | [Configuration](https://docs.byordeal.com/guides/configuration) |
| Discover all faults, assertions, and strategies | `from ordeal import catalog; catalog()` | `ordeal/__init__.py` | [API Reference](https://docs.byordeal.com/reference/api) |

## Architecture: code map

Each module does one thing. When you want to understand, use, or extend ordeal, this tells you where to look.

```
ordeal/
├── chaos.py           Your tests extend this: ChaosTest base class, nemesis, swarm mode
├── explore.py         The exploration engine: coverage tracking, checkpoints, energy scheduling
├── assertions.py      always / sometimes / reachable / unreachable: the assertion model
├── buggify.py         Inline fault gates: thread-local, seed-controlled, no-op when inactive
├── quickcheck.py      @quickcheck: type-driven strategies with boundary bias
├── simulate.py        Deterministic Clock and FileSystem: no mocks, no real I/O
├── invariants.py      Composable checks: no_nan & bounded(0, 1), works with numpy
├── mutations.py       AST mutation testing: 14 operators, count-and-apply pattern
├── auto.py            Auto-testing: scan_module, fuzz, mine, diff, chaos_for
├── trace.py           Trace recording, JSON serialization, replay, delta-debugging shrink
├── config.py          ordeal.toml loader: strict validation
├── cli.py             CLI entry point: explore, replay, mine, audit
├── plugin.py          Pytest plugin: --chaos, --chaos-seed, --buggify-prob
├── strategies.py      Adversarial Hypothesis strategies for fuzzing
├── faults/            All fault types, organized by what can go wrong:
│   ├── io.py              disk_full, corrupt_output, permission_denied
│   ├── numerical.py       nan_injection, inf_injection, wrong_shape
│   ├── timing.py          timeout, slow, intermittent_crash, jitter
│   ├── network.py         http_error, connection_reset, dns_failure
│   └── concurrency.py     contended_call, thread_boundary, stale_state
└── integrations/      Optional bridges to specialized tools:
    ├── openapi.py         Built-in API chaos testing (no extra deps)
    └── atheris_engine.py  Coverage-guided fuzzing (pip install ordeal[atheris])
```

## License

Apache 2.0
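
## Appendix: reading a Wilson interval

The audit output annotates mined properties with Wilson confidence intervals, for example `500/500, >=99% CI`. The sketch below shows how such an interval can be computed by hand with the standard Wilson score formula. It is not taken from ordeal's source: the function name `wilson_interval` and the 95% confidence level are assumptions made for illustration, and ordeal's own computation may differ.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, trials: int, confidence: float = 0.95) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a binomial proportion.

    Hypothetical helper for illustration; not part of ordeal's API.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    margin = z * sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - margin), min(1.0, center + margin)


if __name__ == "__main__":
    # The two sample counts that appear in the example outputs of this README.
    for hits, n in [(500, 500), (29, 30)]:
        low, high = wilson_interval(hits, n)
        print(f"{hits}/{n}: [{low:.3f}, {high:.3f}]")
```

Under these assumptions, 500 passing examples out of 500 give a lower bound just above 0.99, which is consistent with the `>=99% CI` annotation, while 29 out of 30 leaves a much wider interval. Unlike the naive normal approximation, the Wilson interval stays informative when the observed rate is exactly 100%, which is the common case for a mined `ALWAYS` property.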