teilomillet/ordeal
GitHub: teilomillet/ordeal
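An automated chaos-testing framework for Python. It finds hidden defects through fault injection, property assertions, and coverage-guided stateful exploration, with no hand-written tests required.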
# ordeal
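**Your tests pass. Your code still breaks.**
Ordeal finds what you missed: edge cases, untested code paths, and bugs that only show up in production. There is no test code to write. Point it at a target and run.
Open a terminal and paste this ([uvx](https://docs.astral.sh/uv/guides/tools/) runs Python tools without installing them):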
```
uvx ordeal mine ordeal.demo
```
```
mine(score): 500 examples
ALWAYS output in [0, 1] (500/500) ← always returns a value between 0 and 1
ALWAYS monotonically non-decreasing ← bigger input = bigger output, always
mine(normalize): 500 examples
97% idempotent (29/30) ← normalizing twice should give the same result
...but ordeal found a case where it doesn't
```
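To make the `idempotent` line in that output concrete, here is a minimal plain-Python sketch of the kind of check it summarizes: how often `f(f(x)) == f(x)` holds over a set of inputs. The `normalize` function and the input list are hypothetical stand-ins, not ordeal's demo code, and ordeal's own mining logic may differ.

```python
def normalize(text: str) -> str:
    """Hypothetical function under test: trim, then collapse double spaces in one pass."""
    return text.strip().replace("  ", " ")


def idempotence_rate(fn, inputs):
    """Count how many inputs satisfy fn(fn(x)) == fn(x)."""
    holds = [x for x in inputs if fn(fn(x)) == fn(x)]
    return len(holds), len(inputs)


# Three spaces survive a single pass as two spaces, so a second pass changes the result.
inputs = ["a b", " a", "a  b", "a   b"]
held, total = idempotence_rate(normalize, inputs)
print(f"idempotent: {held}/{total}")
```

A property that holds on most inputs but not all is often more interesting than one that always holds, because the failing input points at a concrete bug.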
Now point it at your code. If you have a file `myapp/scoring.py`, its module path is `myapp.scoring`:
```
uvx ordeal mine myapp.scoring # what do my functions actually do?
uvx ordeal audit myapp.scoring # what are my tests missing?
```
Or let your AI assistant do it: open Claude Code, Cursor, or any coding assistant and paste:
ordeal ships with an [AGENTS.md](https://github.com/teilomillet/ordeal/blob/main/AGENTS.md) file. Your AI reads it automatically and knows how to use every command.
```
pip install ordeal # or: uv tool install ordeal
```
## 30-second example
```
import math

from ordeal import ChaosTest, rule, invariant, always
from ordeal.faults import timing, numerical


class MyServiceChaos(ChaosTest):
    faults = [
        timing.timeout("myapp.api.call"),          # API times out
        numerical.nan_injection("myapp.predict"),  # model returns NaN
    ]

    @rule()
    def call_service(self):
        # self.service is assumed to be created by your own test setup
        result = self.service.process("input")
        always(result is not None, "process never returns None")

    @invariant()
    def no_corruption(self):
        for item in self.service.results:
            always(not math.isnan(item), "no NaN in output")


TestMyServiceChaos = MyServiceChaos.TestCase
```
```
pytest --chaos # explore fault interleavings
pytest --chaos --chaos-seed 42 # reproduce exactly
ordeal explore # coverage-guided, reads ordeal.toml
```
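You declare what can go wrong (faults), how the system behaves (rules), and what must stay true (assertions). Ordeal explores the combinations for you.
## Why ordeal
Tests catch the bugs you can imagine. The dangerous bugs are the ones you can't: a timeout *during* a retry, a NaN *inside* a recovery path, a permission error *after* the cache has warmed up. These come from combinations, and the space of combinations is too large to explore by hand.
Ordeal automates this. It brings ideas from some of the most rigorous engineering cultures in the world to Python: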
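| What | Idea | From |
|---|---|---|
| Stateful chaos testing with a nemesis | An adversary toggles faults while Hypothesis explores interleavings | [Jepsen](https://jepsen.io) + [Hypothesis](https://hypothesis.works) |
| Coverage-guided exploration | Checkpoint at new code paths and branch from high-value states | [Antithesis](https://antithesis.com) |
| Property assertions | `always`, `sometimes`, `reachable`, `unreachable`: evidence accumulated across many runs | [Antithesis](https://antithesis.com/docs/properties_assertions/) |
| Inline fault injection | `buggify()`: a no-op in production, probabilistic faults under test | [FoundationDB](https://apple.github.io/foundationdb/testing.html) |
| Boundary-biased generation | Test at 0, -1, empty, and max length, the boundaries where bugs actually cluster | [Jane Street QuickCheck](https://blog.janestreet.com/quickcheck-for-core/) |
| Mutation testing | Flip `+` to `-`, `<` to `<=`, and verify that your tests really catch the bug | [Meta ACH](https://engineering.fb.com) |
| Differential testing | Compare two implementations on random inputs to catch regressions | Equivalence testing |
| Property mining | Discover invariants from execution traces: types, bounds, monotonicity | Specification mining |
| Metamorphic testing | Check the *relationship* between outputs when the inputs are transformed | [Metamorphic relations](https://en.wikipedia.org/wiki/Metamorphic_testing) |
**Read the full [philosophy](https://docs.byordeal.com/philosophy) to understand why this matters.**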
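## The ordeal standard
When a project uses ordeal and passes, it means:
- The explorer ran thousands of operation sequences under coverage guidance
- Fault combinations no human would think of were injected
- Property assertions held across every run
- Mutations were caught
This is not "the tests pass." It is evidence, far stronger than a unit-test suite that merely turned green, that your code holds up under adversity.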
## Install
```
# From PyPI
pip install ordeal
# With extras
pip install ordeal[atheris] # coverage-guided fuzzing via Atheris
pip install ordeal[all] # everything
# As a CLI tool
uv tool install ordeal # global install
uvx ordeal explore # ephemeral, no install
# Development
git clone https://github.com/teilomillet/ordeal
cd ordeal && uv sync
uv run pytest # 205 tests
```
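## What's in the box
### Stateful chaos testing
`ChaosTest` extends Hypothesis's `RuleBasedStateMachine`. You declare faults and rules, and ordeal automatically injects a **nemesis** that toggles faults during exploration. The nemesis is just an ordinary Hypothesis rule, so the engine explores fault schedules the same way it explores any other state transition. Shrinking works automatically.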
```
from ordeal import ChaosTest, rule, invariant, always
from ordeal.faults import io, numerical, timing


class StorageChaos(ChaosTest):
    faults = [
        io.error_on_call("myapp.storage.save", IOError),
        timing.intermittent_crash("myapp.worker.process", every_n=3),
        numerical.nan_injection("myapp.scoring.predict"),
    ]
    swarm = True  # random fault subsets per run, for better aggregate coverage

    @rule()
    def write_data(self):
        self.service.save({"key": "value"})

    @rule()
    def read_data(self):
        result = self.service.load("key")
        always(result is not None, "reads never return None after write")
```
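Swarm mode: each run activates a random subset of the faults. Over many runs, this covers more fault combinations than keeping every fault on all the time. Hypothesis handles the subset selection, so shrinking isolates the exact fault combination that triggers a failure.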
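The subset idea can be sketched in a few lines of plain Python. This is an illustration only: the fault names are made up, and ordeal delegates the choice to Hypothesis rather than to `random` as done here.

```python
import random

FAULTS = ["io_error", "timeout", "nan", "crash"]  # hypothetical fault names


def swarm_subset(rng, faults):
    """Pick a random non-empty subset of faults for one run."""
    chosen = [f for f in faults if rng.random() < 0.5]
    return chosen or [rng.choice(faults)]


def configurations_seen(runs, seed=0):
    """Collect the distinct fault subsets used across many runs."""
    rng = random.Random(seed)
    return {frozenset(swarm_subset(rng, FAULTS)) for _ in range(runs)}


seen = configurations_seen(runs=200)
print(f"{len(seen)} distinct fault configurations in 200 runs")
```

With every fault always enabled there is exactly one configuration, and a fault that fires early (a crash, say) can keep the run from ever reaching the code where another fault would matter. Varying the subset gives each fault some runs where it is not masked.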
### Property assertions
Four kinds, each with its own semantics:
```
from ordeal import always, sometimes, reachable, unreachable
always(len(results) > 0, "never empty") # must hold every time — fails immediately
sometimes(cache_hit, "cache is used") # must hold at least once — checked at session end
reachable("error-recovery-path") # code path must execute at least once
unreachable("silent-data-corruption") # code path must never execute — fails immediately
```
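`always` and `unreachable` fail immediately and trigger Hypothesis shrinking. `sometimes` and `reachable` accumulate evidence over the whole session and are checked when the test run ends. This is the [Antithesis assertion model](https://antithesis.com/docs/properties_assertions/assertions/), brought to Python.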
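The difference between "fail now" and "accumulate, then check at the end" can be sketched with a small tracker. `PropertyTracker` and its `finish()` method are hypothetical names for illustration; they are not ordeal's internals.

```python
class PropertyTracker:
    """Toy model of immediate vs. deferred property assertions."""

    def __init__(self):
        self._sometimes = {}  # property name -> True once it has held

    def always(self, condition, name):
        # Immediate: a single violation fails on the spot.
        if not condition:
            raise AssertionError(f"always violated: {name}")

    def sometimes(self, condition, name):
        # Deferred: only record whether the condition has ever held.
        self._sometimes[name] = self._sometimes.get(name, False) or bool(condition)

    def finish(self):
        # End of session: every 'sometimes' must have held at least once.
        missing = sorted(n for n, hit in self._sometimes.items() if not hit)
        if missing:
            raise AssertionError(f"sometimes never held: {missing}")


tracker = PropertyTracker()
for cache_hit in (False, False, True):
    tracker.always(cache_hit in (True, False), "cache_hit is a bool")
    tracker.sometimes(cache_hit, "cache is used")
tracker.finish()  # passes: the cache was hit on the third iteration
```

`reachable` and `unreachable` follow the same split: one is a deferred "this marker was hit at least once", the other an immediate failure when the marker is hit.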
### Inline fault injection (BUGGIFY)
Place `buggify()` gates in your production code. They normally return `False`. During chaos testing, they return `True` probabilistically:
```
import random
import time

from ordeal.buggify import buggify, buggify_value


def process(data):
    if buggify():  # sometimes inject delay
        time.sleep(random.random() * 5)
    result = compute(data)  # compute() stands in for your real logic
    return buggify_value(result, float('nan'))  # sometimes corrupt output
```
Seed-controlled. Thread-local. A no-op when inactive. This is [FoundationDB's BUGGIFY](https://apple.github.io/foundationdb/testing.html) for Python: the code under test *is* the test harness.
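As a rough mental model, a gate with those three properties might look like the sketch below. The `activate` and `deactivate` helpers are hypothetical names chosen for this example; ordeal's real activation mechanism is not shown in this README and may work differently.

```python
import random
import threading

_state = threading.local()  # each thread gets its own gate state


def activate(seed, probability=0.1):
    """Turn the gate on for the current thread with a reproducible RNG."""
    _state.rng = random.Random(seed)
    _state.probability = probability


def deactivate():
    """Turn the gate off; buggify() becomes a no-op again."""
    _state.rng = None


def buggify():
    """Return False when inactive, otherwise True with the configured probability."""
    rng = getattr(_state, "rng", None)
    if rng is None:
        return False
    return rng.random() < _state.probability


activate(seed=42, probability=0.5)
decisions = [buggify() for _ in range(10)]
deactivate()
print(decisions)  # the same seed yields the same list on every run
```

Because the decisions come from a seeded generator, a failing run can be replayed exactly by reusing the seed.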
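### Coverage-guided exploration
The Explorer tracks the code paths each run discovers (AFL-style edge hashing). When a run finds new coverage, it saves a checkpoint. Future runs branch from high-value checkpoints, exploring the state space systematically: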
```
from ordeal.explore import Explorer

explorer = Explorer(
    MyServiceChaos,
    target_modules=["myapp"],
    checkpoint_strategy="energy",  # favor productive checkpoints
)
result = explorer.run(max_time=60)
print(result.summary())
# Exploration: 5000 runs, 52000 steps, 60.0s
# Coverage: 287 edges, 43 checkpoints
# Failures found: 2
#   Run 342, step 15: ValueError (3 steps after shrinking)
```
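The checkpoint-and-branch loop described above can be illustrated on a toy system. Everything in this sketch is an assumption made for illustration: coverage is modeled as "depths reached" instead of hashed edges, checkpoints are chosen uniformly instead of by energy, and `SECRET`, `step`, `replay`, and `explore` are hypothetical names.

```python
import random

SECRET = ["a", "b", "c", "d"]  # hypothetical action sequence leading to the deepest state
ACTIONS = ["a", "b", "c", "d"]


def step(depth, action):
    """Advance one level on the right action, otherwise fall back to depth 0."""
    if depth < len(SECRET) and action == SECRET[depth]:
        return depth + 1
    return 0


def replay(trace):
    """Re-run a saved action prefix and return the depth it ends at."""
    depth = 0
    for action in trace:
        depth = step(depth, action)
    return depth


def explore(runs, steps=3, seed=0):
    rng = random.Random(seed)
    coverage = {0}       # depths seen so far (stand-in for covered edges)
    checkpoints = [[]]   # action prefixes that reached new coverage
    for _ in range(runs):
        trace = list(rng.choice(checkpoints))  # branch from a saved checkpoint
        depth = replay(trace)
        for _ in range(steps):
            action = rng.choice(ACTIONS)
            trace.append(action)
            depth = step(depth, action)
            if depth not in coverage:          # new coverage: save a checkpoint here
                coverage.add(depth)
                checkpoints.append(list(trace))
    return coverage, checkpoints


coverage, checkpoints = explore(runs=2000)
print(sorted(coverage), len(checkpoints))
```

Without checkpoints, every run would have to guess the whole sequence from scratch. Resuming from the deepest saved prefix means each run only has to get the next step right.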
Failures are **shrunk**: delta debugging removes unnecessary steps, then fault simplification removes unnecessary faults. You get the minimal sequence that reproduces the bug.
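A greatly simplified relative of delta debugging is sketched below: it drops one step at a time and keeps the shorter trace whenever the failure still reproduces. This is a greedy reduction for illustration, not the full ddmin algorithm and not ordeal's implementation; the `fails` predicate is a hypothetical bug.

```python
def shrink(steps, fails):
    """Greedily remove single steps while the failure still reproduces."""
    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the shorter trace
    return current


def fails(steps):
    """Hypothetical bug: a 'crash' that happens at some point after a 'write'."""
    return "write" in steps and "crash" in steps[steps.index("write"):]


trace = ["read", "write", "read", "timeout", "crash", "read"]
print(shrink(trace, fails))  # ['write', 'crash']
```

The result is 1-minimal: removing any single remaining step makes the failure disappear.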
Scale out with `workers`: each process gets its own random seed for independent exploration, and the results are aggregated:
```
explorer = Explorer(MyServiceChaos, target_modules=["myapp"], workers=8)
```
### Configuration
```
# ordeal.toml: a single file, readable by humans and machines
[explorer]
target_modules = ["myapp"]
max_time = 60
seed = 42
checkpoint_strategy = "energy"
[[tests]]
class = "tests.test_chaos:MyServiceChaos"
[report]
format = "both"
traces = true
verbose = true
```
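See [`ordeal.toml.example`](ordeal.toml.example) for the full schema, with every option documented.
### QuickCheck with boundary bias
`@quickcheck` infers strategies from type hints. It is biased toward boundary values (0, -1, empty lists, max length), which is where implementation bugs cluster: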
```
from ordeal.quickcheck import quickcheck


@quickcheck
def test_sort_idempotent(xs: list[int]):
    assert sorted(sorted(xs)) == sorted(xs)


@quickcheck
def test_score_bounded(x: float, y: float):
    result = score(x, y)  # score() is your own function under test
    assert 0 <= result <= 1
```
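Boundary bias itself is easy to picture: with some probability, return a known edge value instead of a uniformly random one. The sketch below assumes a 32-bit integer range and a 30% boundary probability; both numbers, and the `biased_int` name, are illustrative choices rather than ordeal's actual settings.

```python
import random

INT_MIN, INT_MAX = -(2**31), 2**31 - 1
BOUNDARIES = [0, -1, 1, INT_MIN, INT_MAX]  # values where off-by-one bugs tend to live


def biased_int(rng, boundary_prob=0.3):
    """Return a boundary value with probability boundary_prob, else a uniform int."""
    if rng.random() < boundary_prob:
        return rng.choice(BOUNDARIES)
    return rng.randint(INT_MIN, INT_MAX)


rng = random.Random(0)
samples = [biased_int(rng) for _ in range(1000)]
hits = sum(1 for s in samples if s in BOUNDARIES)
print(f"{hits} of 1000 samples landed on a boundary")
```

A uniform draw over a 32-bit range would almost never produce exactly 0 or -1, so without the bias those inputs would go untested.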
### Composable invariants
```
from ordeal.invariants import no_nan, no_inf, bounded, finite
valid_score = finite & bounded(0, 1)
valid_score(model_output) # raises AssertionError with clear message
```
Invariants compose with `&`. They work on scalars, sequences, and numpy arrays.
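Composition with `&` is typically built on Python's `__and__` hook. The following sketch re-creates the idea for scalars only; the `Invariant` class is hypothetical, and the local `finite` and `bounded` are stand-ins that merely mirror the names above, not ordeal's implementations.

```python
import math


class Invariant:
    """A named check that raises AssertionError when a value violates it."""

    def __init__(self, name, check):
        self.name = name
        self._check = check

    def __call__(self, value):
        self._check(value)

    def __and__(self, other):
        def both(value):
            self(value)   # the left-hand check runs first
            other(value)
        return Invariant(f"{self.name} & {other.name}", both)


def _check_finite(value):
    if not math.isfinite(value):
        raise AssertionError(f"finite: got {value!r}")


finite = Invariant("finite", _check_finite)


def bounded(lo, hi):
    def check(value):
        if not lo <= value <= hi:
            raise AssertionError(f"bounded({lo}, {hi}): got {value!r}")
    return Invariant(f"bounded({lo}, {hi})", check)


valid_score = finite & bounded(0, 1)
valid_score(0.5)         # passes silently
print(valid_score.name)  # finite & bounded(0, 1)
```

Running the checks in order keeps the error message specific: a NaN is reported by `finite` and an out-of-range value by `bounded`.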
### Simulation primitives
A deterministic Clock and FileSystem: no mocks, no real I/O, instant:
```
from ordeal.simulate import Clock, FileSystem
clock = Clock()
fs = FileSystem()
clock.advance(3600) # instant — no real waiting
fs.inject_fault("/data.json", "corrupt") # reads return random bytes
```
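To show why a controllable clock helps, here is a self-contained sketch of a fake clock driving a TTL cache. `FakeClock` and `TTLCache` are hypothetical classes written for this example; they are not ordeal's `Clock`, whose interface beyond `advance()` is not shown in this README.

```python
class FakeClock:
    """A clock that only moves when told to."""

    def __init__(self, start=0.0):
        self._now = start

    def time(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds  # no real waiting


class TTLCache:
    """Cache whose entries expire after ttl seconds, according to the injected clock."""

    def __init__(self, clock, ttl):
        self._clock = clock
        self._ttl = ttl
        self._items = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self._items[key] = (value, self._clock.time())

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self._clock.time() - stored_at >= self._ttl:
            del self._items[key]  # expired
            return None
        return value


clock = FakeClock()
cache = TTLCache(clock, ttl=3600)
cache.put("token", "abc")
before = cache.get("token")  # still fresh
clock.advance(3600)          # one hour passes instantly
after = cache.get("token")   # expired
print(before, after)
```

Injecting the clock as a dependency is what makes the expiry path testable in microseconds instead of an hour.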
### Differential testing
Compare two implementations on the same random inputs to catch regressions and validate refactors:
```
from ordeal.diff import diff
result = diff(score_v1, score_v2, rtol=1e-6)
assert result.equivalent, result.summary()
# diff(score_v1, score_v2): 100 examples, EQUIVALENT
```
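For intuition, a hand-rolled differential check might look like this. It assumes that `rtol` means a relative tolerance in the sense of `math.isclose`, which is a guess about the `diff` call above; the three `score_*` functions and the input range are hypothetical.

```python
import math
import random


def differential(f, g, examples=100, rtol=1e-6, seed=0):
    """Run f and g on the same random inputs; return the inputs where they disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(examples):
        x = rng.uniform(-10.0, 10.0)
        a, b = f(x), g(x)
        if not math.isclose(a, b, rel_tol=rtol):
            mismatches.append((x, a, b))
    return mismatches


def score_v1(x):
    return 1.0 / (1.0 + math.exp(-x))       # logistic function


def score_v2(x):
    return 0.5 * (1.0 + math.tanh(x / 2))   # same function, rewritten with tanh


def score_broken(x):
    return max(0.0, min(1.0, 0.5 + x / 4))  # linear approximation: a regression


print(len(differential(score_v1, score_v2)))          # 0: the refactor is equivalent
print(len(differential(score_v1, score_broken)) > 0)  # True: the regression is caught
```

The tolerance matters: two mathematically equal formulas rarely agree bit for bit in floating point, so an exact `==` comparison would report false differences.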
### Mutation testing
Verify that your chaos tests actually catch bugs. If you change `+` to `-` in your code and the tests still pass, your tests have a blind spot:
```
from ordeal.mutations import mutate_function_and_test
result = mutate_function_and_test("myapp.scoring.compute", my_tests)
print(result.summary())
# Mutation score: 15/18 (83%)
# SURVIVED L42:8 + -> -
# SURVIVED L67:4 negate if-condition
```
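The mechanics of a single mutation operator can be shown with the standard `ast` module. This sketch implements only the `+` to `-` flip on a hypothetical `total` function; ordeal's own operators and reporting are more extensive and may be implemented differently.

```python
import ast

SOURCE = """
def total(price, tax):
    return price + tax
"""


class FlipAddToSub(ast.NodeTransformer):
    """Mutation operator: replace every binary + with -."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def load(tree):
    """Compile a module AST and return its 'total' function."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["total"]


def weak_test(fn):
    return fn(10, 0) == 10  # cannot tell + from -, so the mutant survives


def strong_test(fn):
    return fn(10, 2) == 12  # distinguishes + from -, so the mutant is killed


original = load(ast.parse(SOURCE))
mutant_tree = ast.fix_missing_locations(FlipAddToSub().visit(ast.parse(SOURCE)))
mutant = load(mutant_tree)

print("weak test kills mutant:", not weak_test(mutant))      # False
print("strong test kills mutant:", not strong_test(mutant))  # True
```

A surviving mutant is a concrete, reproducible description of a gap: here, no test uses a non-zero `tax`.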
### Fault library
```
from ordeal.faults import io, numerical, timing, network, concurrency
# I/O faults
io.error_on_call("mod.func") # raise IOError
io.disk_full() # writes fail with ENOSPC
io.corrupt_output("mod.func") # return random bytes
io.truncate_output("mod.func", 0.5) # truncate to half
# Numerical faults
numerical.nan_injection("mod.func") # output becomes NaN
numerical.inf_injection("mod.func") # output becomes Inf
numerical.wrong_shape("mod.func", (1,512), (1,256))
# Timing faults
timing.timeout("mod.func") # raise TimeoutError
timing.slow("mod.func", delay=2.0) # add delay
timing.intermittent_crash("mod.func", every_n=3)
timing.jitter("mod.func", magnitude=0.01)
# Network faults
network.http_error("mod.client.post", status_code=503)
network.connection_reset("mod.client.post")
network.rate_limited("mod.client.get", retry_after=60)
network.dns_failure("mod.client.resolve")
# Concurrency faults
concurrency.contended_call("mod.pool.acquire", contention=0.1)
concurrency.thread_boundary("mod.cache.get")
concurrency.stale_state(service, "config", old_config)
```
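Faults like these are commonly built by temporarily replacing the target callable and restoring it afterwards. The sketch below shows that pattern with a context manager; `inject_error` and the `storage` object are hypothetical, and this README does not say whether ordeal's faults are implemented the same way.

```python
import types
from contextlib import contextmanager


@contextmanager
def inject_error(owner, name, exc=IOError):
    """While active, make owner.<name> raise exc; restore the original on exit."""
    original = getattr(owner, name)

    def faulty(*args, **kwargs):
        raise exc(f"injected fault in {name}")

    setattr(owner, name, faulty)
    try:
        yield
    finally:
        setattr(owner, name, original)  # always restore, even if the body raised


# Hypothetical service object standing in for a real module.
storage = types.SimpleNamespace(save=lambda data: "ok")

with inject_error(storage, "save"):
    try:
        storage.save({"key": "value"})
        outcome = "no error"
    except IOError as exc:
        outcome = str(exc)

print(outcome)                         # injected fault in save
print(storage.save({"key": "value"}))  # ok
```

Restoring in a `finally` block is the important part: a fault that leaks past its scope would corrupt every later test.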
### Integrations
```
# Atheris: coverage-guided fuzzing steers buggify() decisions
from ordeal.integrations.atheris_engine import fuzz
fuzz(my_function, max_time=60)
# API chaos testing (built in, no extra install)
from ordeal.integrations.openapi import chaos_api_test
chaos_api_test("http://localhost:8080/openapi.json", faults=[...])
```
### Audit: justify adoption with numbers
```
ordeal audit myapp.scoring --test-dir tests/
```
```
myapp.scoring
current: 33 tests | 343 lines | 98% coverage [verified]
migrated: 12 tests | 130 lines | 96% coverage [verified]
saving: 64% fewer tests | 62% less code | same coverage
mined: compute: output in [0, 1] (500/500, >=99% CI)
mutation: 14/18 (78%)
suggest:
- L42 in compute(): test when x < 0
- L67 in normalize(): test that ValueError is raised
```
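Every number is either `[verified]` (measured through coverage.py's JSON output and cross-checked) or `FAILED: reason`; the audit never silently reports 0%. Mined properties come with Wilson confidence intervals. When there is a coverage gap, it reads the source and tells you exactly what to test. Use `--show-generated` to inspect the generated test file and `--save-generated` to keep it.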
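For reference, the Wilson score interval for a property that held in `successes` out of `trials` examples can be computed as below. The 95% level (`z=1.96`) is an assumption made for this sketch; the README does not state which confidence level ordeal uses.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


low, high = wilson_interval(500, 500)
print(f"500/500 -> [{low:.4f}, {high:.4f}]")
```

Unlike the naive estimate, the interval does not claim certainty from a perfect sample: 500 passes out of 500 still leaves a lower bound just above 0.99, not 1.0.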
## CLI
```
ordeal audit myapp.scoring # compare existing tests vs ordeal
ordeal explore # run from ordeal.toml
ordeal explore -w 8 # parallel with 8 workers
ordeal explore -c ci.toml -v # custom config, verbose
ordeal explore --max-time 300 --seed 99 # override settings
ordeal replay trace.json # reproduce a failure
ordeal replay --shrink trace.json # minimize a failure trace
ordeal explore --generate-tests tests/test_gen.py # turn traces into pytest tests
```
## Find what you need
Every goal maps to a starting point: a command to run, a module to import, and a doc to read. Nothing is hidden.
| I want to... | Start here | In the codebase | Docs |
|---|---|---|---|
| Find bugs without writing tests | `ordeal mine mymodule` | `ordeal/auto.py` | [Auto Testing](https://docs.byordeal.com/guides/auto) |
| Check whether my tests are good enough | `ordeal audit mymodule` | `ordeal/mutations.py` | [Mutations](https://docs.byordeal.com/guides/mutations) |
| Write a chaos test | `from ordeal import ChaosTest` | `ordeal/chaos.py` | [Getting Started](https://docs.byordeal.com/getting-started) |
| Inject specific faults (timeouts, NaN, and so on) | `from ordeal.faults import timing` | `ordeal/faults/` directory | [Fault Injection](https://docs.byordeal.com/concepts/fault-injection) |
| Explore every fault combination | `ordeal explore` | `ordeal/explore.py` | [Explorer](https://docs.byordeal.com/guides/explorer) |
| Reproduce and shrink a failure | `ordeal replay trace.json` | `ordeal/trace.py` | [Shrinking](https://docs.byordeal.com/concepts/shrinking) |
| Add safe fault gates to production code | `from ordeal.buggify import buggify` | `ordeal/buggify.py` | [Fault Injection](https://docs.byordeal.com/concepts/fault-injection) |
| Make assertions across all runs | `from ordeal import always, sometimes` | `ordeal/assertions.py` | [Assertions](https://docs.byordeal.com/concepts/property-assertions) |
| Control time and the filesystem in tests | `from ordeal.simulate import Clock` | `ordeal/simulate.py` | [Simulation](https://docs.byordeal.com/guides/simulate) |
| Compare two implementations | `ordeal mine-pair mod.fn1 mod.fn2` | `ordeal/diff.py` | [Auto Testing](https://docs.byordeal.com/guides/auto) |
| Compose validation rules | `from ordeal.invariants import no_nan` | `ordeal/invariants.py` | [API Reference](https://docs.byordeal.com/reference/api) |
| Test API endpoints against faults | `from ordeal.integrations.openapi import chaos_api_test` | `ordeal/integrations/openapi.py` | [Integrations](https://docs.byordeal.com/guides/integrations) |
| Extend ordeal with new faults | Follow the pattern in `faults/*.py` | `ordeal/faults/` | [Fault Injection](https://docs.byordeal.com/concepts/fault-injection) |
| Configure reproducible runs | Create `ordeal.toml` | `ordeal/config.py` | [Configuration](https://docs.byordeal.com/guides/configuration) |
| Discover all faults, assertions, and strategies | `from ordeal import catalog; catalog()` | `ordeal/__init__.py` | [API Reference](https://docs.byordeal.com/reference/api) |
## Architecture: a map of the code
Each module does one thing. When you want to understand, use, or extend ordeal, this tells you where to look.
```
ordeal/
├── chaos.py Your tests extend this — ChaosTest base class, nemesis, swarm mode
├── explore.py The exploration engine — coverage tracking, checkpoints, energy scheduling
├── assertions.py always / sometimes / reachable / unreachable — the assertion model
├── buggify.py Inline fault gates — thread-local, seed-controlled, no-op when inactive
├── quickcheck.py @quickcheck — type-driven strategies with boundary bias
├── simulate.py Deterministic Clock and FileSystem — no mocks, no real I/O
├── invariants.py Composable checks — no_nan & bounded(0, 1), works with numpy
├── mutations.py AST mutation testing — 14 operators, count-and-apply pattern
├── auto.py Auto-testing — scan_module, fuzz, mine, diff, chaos_for
├── trace.py Trace recording, JSON serialization, replay, delta-debugging shrink
├── config.py ordeal.toml loader — strict validation
├── cli.py CLI entry point — explore, replay, mine, audit
├── plugin.py Pytest plugin — --chaos, --chaos-seed, --buggify-prob
├── strategies.py Adversarial Hypothesis strategies for fuzzing
├── faults/ All fault types, organized by what can go wrong:
│ ├── io.py disk_full, corrupt_output, permission_denied
│ ├── numerical.py nan_injection, inf_injection, wrong_shape
│ ├── timing.py timeout, slow, intermittent_crash, jitter
│ ├── network.py http_error, connection_reset, dns_failure
│ └── concurrency.py contended_call, thread_boundary, stale_state
└── integrations/ Optional bridges to specialized tools:
├── openapi.py Built-in API chaos testing (no extra deps)
└── atheris_engine.py Coverage-guided fuzzing (pip install ordeal[atheris])
```
## License
Apache 2.0