# conformlock
`conformlock` is a small CPU-only Python library that wraps a streaming ML predictor
with four runtime checks and a tamper-evident audit log:
1. **Split conformal prediction** (and an adaptive variant in the style of Gibbs & Candès 2021) for per-decision prediction intervals / sets.
2. **Finite-trace temporal-logic property automata** (LTLf, evaluated incrementally) for "the system did *X* before *Y*" rules over the recent decision stream.
3. **Online drift detectors** (ADWIN, CUSUM, Page-Hinkley, sliding KS, PSI) to flag when calibration is no longer trustworthy.
4. **Append-only audit ledger** using BLAKE3 hash chaining + ULID identifiers, so a verifier can later show that a recorded decision was not edited after the fact. The ledger is **tamper-evident only against in-place edits under a single-writer assumption**: an attacker who can replace the whole file undetected, or two concurrent writers across processes, can rewrite history. External-verifiability anchoring (Sigstore/Rekor or a public chain) is on the v0.2 roadmap. Within a single process, `Ledger.append` is now thread-safe (added in v0.1.0a3).
If any check rejects a decision, `conformlock` returns an `abstain` verdict instead of the model's prediction; the caller decides what to do (escalate to a human, return a default, retry, etc.).
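
To make the first check concrete, the split conformal step can be sketched in plain NumPy. This is a minimal sketch under the usual exchangeability assumption, not `conformlock`'s implementation; the helper names `conformal_quantile` and `prediction_interval`, the absolute-residual score, and the calibration values are all hypothetical.

```python
import math
from typing import Tuple

import numpy as np


def conformal_quantile(cal_scores: np.ndarray, alpha: float) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    # Rank of the order statistic to use: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return float(np.sort(cal_scores)[k - 1])


def prediction_interval(point_pred: float, q: float) -> Tuple[float, float]:
    """Symmetric interval for absolute-residual scores |y - y_hat|."""
    return point_pred - q, point_pred + q


# Hypothetical calibration residuals |y - y_hat| from a held-out set.
cal_scores = np.array([0.12, 0.08, 0.31, 0.05, 0.22, 0.17, 0.09, 0.27, 0.14, 0.19])
q = conformal_quantile(cal_scores, alpha=0.2)
lo, hi = prediction_interval(1.0, q)
print(f"q = {q:.2f}, interval = [{lo:.2f}, {hi:.2f}]")
```

In this sketch, a very small calibration set can make the corrected rank exceed `n`, in which case the quantile is infinite and the interval carries no information; a caller could reasonably treat that case as a reason to abstain.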
## What this is — and is not
| Is | Is not |
|---|---|
| An engineering convenience layer over well-known statistical methods | A formal verification system |
| CPU-only, deterministic given seeds | A guarantee of safety, correctness, or compliance |
| Useful for streaming tabular / time-series inference | An LLM agent oversight tool (see [Subjunctor](https://github.com/hinanohart/subjunctor) for that scope) |
| A starting point for runtime monitoring policies | A drop-in replacement for human review |
The word "lock" in the name is metaphorical; this library does not lock anything down.
## Install
```bash
# Not on PyPI yet; install from the GitHub release tag.
pip install "conformlock @ git+https://github.com/hinanohart/conformlock@v0.1.0a3"
# or with notebook ML extras:
pip install "conformlock[ml] @ git+https://github.com/hinanohart/conformlock@v0.1.0a3"
```
Extras: `[ml]` adds `scikit-learn` and `torch` (notebook examples only); the core has no ML framework dependency.
## 30-second example
```python
import numpy as np

from conformlock import (
    ConformalCalibrator,
    Decision,
    LTLfSpec,
    Verifier,
)

# 1. Calibrate split conformal on a held-out set.
cal = ConformalCalibrator(alpha=0.1)  # target 90% coverage
cal.fit(scores=np.array([0.12, 0.08, 0.31, 0.05, 0.22]))

# 2. Write the LTLf rule:
#    "if a 'risky' decision is made, an 'audit' decision must follow within 5 steps".
spec = LTLfSpec.parse("G (risky -> F[0,5] audit)")

# 3. Build a verifier. The calibrator already carries the target ``alpha``;
#    the verifier just orchestrates conformal → property → drift.
v = Verifier(calibrator=cal, spec=spec)

# 4. Use it on the streaming predictor. ``stream`` is whatever your caller
#    yields; one ``Decision`` per inference is the contract.
for record_id, score, atom in stream:  # e.g. atom in {"risky", "audit", ""}
    decision = Decision(
        record_id=record_id,
        score=score,
        atoms=frozenset({atom}) if atom else frozenset(),
    )
    verdict = v.observe(decision)
    if verdict.action == "abstain":
        handle_escalation(verdict)  # e.g. route to a human reviewer
    else:
        act_on(verdict)  # the caller decides what acting means
```
Numbers in the snippet above are illustrative inputs, not benchmark claims; `stream`, `handle_escalation`, and `act_on` are placeholders the caller supplies. See `examples/` for runnable scripts with deterministic seeds and printable output, and `tests/test_readme_example.py` for the CI-enforced regression test that runs this exact snippet end-to-end.
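
To show what a rule like `G (risky -> F[0,5] audit)` asks of a decision stream, here is a hand-rolled check of that single bounded-response pattern over a finite trace. It is a minimal sketch, not `conformlock`'s LTLf evaluator: the function name `check_bounded_response` is hypothetical, and the three-valued verdict (`satisfied` / `pending` / `violated`) is an assumption of the sketch. A strict finite-trace reading would count an unanswered trigger at the end of the trace as a violation rather than as pending.

```python
from typing import FrozenSet, List


def check_bounded_response(
    trace: List[FrozenSet[str]], trigger: str, response: str, bound: int
) -> str:
    """Evaluate G (trigger -> F[0,bound] response) on a finite trace.

    Returns "violated" if some trigger's window has fully elapsed without a
    response, "pending" if an unanswered trigger's window is still open at
    the end of the trace, and "satisfied" otherwise.
    """
    verdict = "satisfied"
    for i, atoms in enumerate(trace):
        if trigger not in atoms:
            continue
        window = trace[i : i + bound + 1]  # steps i .. i+bound inclusive
        if any(response in step for step in window):
            continue
        if i + bound < len(trace):
            return "violated"  # window closed with no response
        verdict = "pending"  # window still open; more steps may arrive
    return verdict


risky, audit, idle = frozenset({"risky"}), frozenset({"audit"}), frozenset()
print(check_bounded_response([risky, idle, audit], "risky", "audit", 5))
print(check_bounded_response([risky] + [idle] * 6, "risky", "audit", 5))
print(check_bounded_response([risky, idle], "risky", "audit", 5))
```

The sketch re-scans the whole trace on every call, which is fine for illustration; an incremental monitor would instead track only the triggers whose windows are still open.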
## What `conformlock` does *not* try to do
- It does **not** prove that the underlying model is calibrated, fair, or correct.
- It does **not** claim coverage guarantees in the strict statistical sense once the data-generating distribution drifts; drift detection only tells you that the assumption has likely broken.
- It does **not** verify LLM agents, function-call traces, or tool use — see [Subjunctor](https://github.com/hinanohart/subjunctor).
- It is **not** evaluated against any regulatory certification scheme.
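
The point about drift can be illustrated with one of the detectors named earlier. Below is a textbook one-sided Page-Hinkley test, written as a minimal sketch: the class name, defaults, and synthetic stream are hypothetical and do not reflect `conformlock`'s own detector API. It raises a flag when the running mean shifts upward, which is a signal that calibration may no longer hold, not a proof of it.

```python
class PageHinkley:
    """One-sided Page-Hinkley test for an upward shift in the mean."""

    def __init__(self, delta: float = 0.005, threshold: float = 5.0) -> None:
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (often written lambda)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation m_t
        self.cum_min = 0.0  # running minimum M_t

    def update(self, x: float) -> bool:
        """Feed one observation; return True if drift is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


# Synthetic stream: 50 stable observations, then an abrupt upward shift.
stream = [0.0] * 50 + [1.0] * 50
detector = PageHinkley()
first_alarm = next((i for i, x in enumerate(stream) if detector.update(x)), None)
print(f"first alarm at index {first_alarm}")
```

The alarm arrives a few steps after the shift rather than at it; lowering `threshold` shortens that delay at the cost of more false alarms on noisy streams.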
## Why another conformal library?
| Library | Latest release | Streaming online conformal | LTLf/MTL property layer | Tamper-evident ledger | License |
|---|---|---|---|---|---|
| [MAPIE](https://github.com/scikit-learn-contrib/MAPIE) | v1.4.0 (2026-04-30) | Partial (batch time-series only, no ACI) | No | No | BSD-3-Clause |
| [crepes](https://github.com/henrikbostrom/crepes) | active | No streaming hook | No | No | BSD-3-Clause |
| [nonconformist](https://github.com/donlnz/nonconformist) | maintenance only | No | No | No | MIT |
| **conformlock** | **v0.1.0a3 (2026-05-24)** | **Yes (split CP + ACI; ACI advances through `Verifier.record_outcome`)** | **Yes (self-implemented LTLf, finite-trace re-evaluator; no DFA pre-compile)** | **Yes (BLAKE3 chain, single-writer threat model; see the ledger note above)** | **MIT** |
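
For readers unfamiliar with the "ACI" entry in the table, the update rule from Gibbs & Candès (2021) can be sketched as follows. This is an illustration of the published rule on synthetic data, not `conformlock`'s code path: `aci_step`, the step size `gamma`, and the simulated scores are assumptions of the sketch. The rule needs to know whether each realised outcome was covered, so some feedback call after the true label arrives is required.

```python
import numpy as np


def aci_step(alpha_t: float, target_alpha: float, miscovered: bool,
             gamma: float = 0.01) -> float:
    """One ACI update: alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)."""
    err_t = 1.0 if miscovered else 0.0
    return alpha_t + gamma * (target_alpha - err_t)


rng = np.random.default_rng(0)
cal_scores = np.abs(rng.normal(0.0, 1.0, size=500))
# Test-time scores are drawn at twice the calibration scale (simulated drift).
test_scores = np.abs(rng.normal(0.0, 2.0, size=2000))

target_alpha = 0.1
alpha_t = target_alpha
errors = []
for s in test_scores:
    if alpha_t <= 0.0:
        q = np.inf    # working level at or below 0: cover everything
    elif alpha_t >= 1.0:
        q = -np.inf   # working level at or above 1: cover nothing
    else:
        q = np.quantile(cal_scores, 1.0 - alpha_t)
    miscovered = bool(s > q)
    errors.append(miscovered)
    alpha_t = aci_step(alpha_t, target_alpha, miscovered)

miscoverage = float(np.mean(errors))
print(f"long-run miscoverage: {miscoverage:.3f} (target {target_alpha})")
```

The working level `alpha_t` drops when the intervals miss too often, which widens them, so the long-run miscoverage is pulled toward the target even though the calibration scores are stale.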
### Adjacent OSS we are *not* trying to replace
The combination *split conformal + temporal-logic monitor + drift + ledger* is the unit of value `conformlock` claims to add; on any single one of those four axes, several mature OSS projects already exist and `conformlock` is not trying to replace them:
- [alibi-detect](https://github.com/SeldonIO/alibi-detect) — drift and outlier detection (no conformal, no temporal logic, no ledger).
- [Frouros](https://github.com/IFCA-Advanced-Computing/frouros) — 31 drift-detection methods (no conformal, no temporal logic, no ledger).
- [NannyML](https://github.com/NannyML/nannyml) — drift + performance estimation under absent labels (no conformal interval, no temporal logic, no ledger).
- [Evidently](https://github.com/evidentlyai/evidently), [deepchecks](https://github.com/deepchecks/deepchecks), [whylogs](https://github.com/whylabs/whylogs) — ML-monitoring dashboards (no per-decision conformal interval, no temporal logic).
- [river](https://github.com/online-ml/river) — online learning primitives (no conformal, no ledger).
If you only need the *drift* axis you should look at those first; `conformlock` exists for the case where you specifically need the per-decision conformal abstain + finite-trace temporal-property monitor + tamper-evident log together. No equivalent OSS combination was found at scaffold time (2026-05-24); please file an issue if one exists.
## Regulatory framing — read carefully
The EU AI Act ([Article 15 — Accuracy, Robustness and Cybersecurity](https://artificialintelligenceact.eu/article/15/), enforceable for high-risk AI systems from **2 August 2026**) requires high-risk systems to "achieve an appropriate level of accuracy, robustness and cybersecurity, and perform consistently … throughout their lifecycle," and to disclose accuracy metrics in the instructions for use.
`conformlock` is **designed with reference to** that text: it gives operators a programmatic way to produce per-decision uncertainty bounds, detect distributional drift, and retain an audit log. It does **not** by itself make any AI system "Article 15 compliant"; compliance is an organisational and process determination that an operator's notified body or supervisory authority makes.
Similarly, the library is **not** certified against ISO/IEC 23894:2023, NIST AI RMF, FDA SaMD Good Machine Learning Practice, or any other framework. We deliberately avoid the marketing register that would imply such a posture (see `docs/honest-marketing-policy.md` for the exact CI-enforced exclusion list).
## How this release was assembled — read carefully
`conformlock` v0.1.0a1 and v0.1.0a3 were assembled by an LLM-driven autonomous workflow under the project author's account (`hinanohart`). The only human in the loop is the author; no third party has independently reviewed the implementation or the marketing claims at release time. The git tag `v0.1.0a3` supersedes `v0.1.0a1`, which is retained on GitHub solely to keep the audit trail intact. New users should install `v0.1.0a3` and treat the whole v0.1.0a line as unreviewed alpha software until an external reviewer signs off. Issues filed against either tag are welcome.
## Related work and prior art
- Vovk, Gammerman, Shafer — *Algorithmic Learning in a Random World* (Springer 2005) — split / inductive conformal prediction.
- Gibbs & Candès (2021) — *Adaptive Conformal Inference Under Distribution Shift* (NeurIPS 2021) — ACI update rule.
- De Giacomo & Vardi (2013) — *Linear Temporal Logic and Linear Dynamic Logic on Finite Traces* (IJCAI 2013) — LTLf semantics.
- Bauer, Leucker, Schallhart (2011) — *Runtime Verification for LTL and TLTL* (ACM TOSEM 20(4)) — three-valued LTL₃ semantics; this library's permanent-verdict heuristic follows the same spirit.
- Lindemann, Qin, Fan, Pappas, Bastani (2022) — *Conformal Prediction for STL Runtime Verification* ([arXiv:2211.01539](https://arxiv.org/abs/2211.01539)) — academic predecessor targeting **STL** (continuous-time temporal logic); `conformlock` targets **LTLf** (discrete-step finite-trace) and ships a public MIT implementation.
- Bifet & Gavaldà (2007) — *Learning from Time-Changing Data with Adaptive Windowing* — ADWIN.
- O'Connor, Aumasson, Neves, Wilcox-O'Hearn — *BLAKE3* (2020) — hash function used for the ledger.
## Project layout
```text
conformlock/
├─ src/conformlock/     # core library (numpy + scipy + blake3 + python-ulid)
├─ tests/               # unit + property + ledger-tamper tests
├─ examples/            # runnable CPU-only examples
├─ notebooks/           # optional [ml] extra: scikit-learn / torch
├─ docs/                # background and design notes
├─ CHANGELOG.md
├─ ROADMAP.md
└─ LICENSE              # MIT
```
## Roadmap
See [ROADMAP.md](ROADMAP.md). Highlights: a more general MTL fragment, an offline `conformlock-replay` tool to re-verify an existing ledger, and HuggingFace dataset bindings (deferred to v0.1.1).
The names `verielle` and `decisionscope` appear in the roadmap as **v0.3 backlog** items, not promises.
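
As a rough picture of what re-verifying a hash-chained ledger involves, the sketch below builds and checks a tiny chain. It is not the planned `conformlock-replay` tool or the library's ledger format: the entry layout and the `append` / `verify` helpers are hypothetical, and `hashlib.blake2b` from the standard library stands in for BLAKE3 so that the example has no third-party dependency.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: Dict) -> str:
    """Hash the previous link together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    # Stand-in hash: hashlib.blake2b, NOT the BLAKE3 the library names.
    return hashlib.blake2b((prev_hash + body).encode(), digest_size=32).hexdigest()


def append(ledger: List[Dict], payload: Dict) -> None:
    """Append an entry whose hash commits to everything before it."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(ledger: List[Dict]) -> bool:
    """Recompute every link; any in-place edit breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


ledger: List[Dict] = []
append(ledger, {"record_id": "r1", "action": "accept"})
append(ledger, {"record_id": "r2", "action": "abstain"})
intact = verify(ledger)

ledger[0]["payload"]["action"] = "abstain"  # simulate an in-place edit
tampered_ok = verify(ledger)
print(intact, tampered_ok)
```

The check catches the edited entry because its stored hash no longer matches its payload. It would not catch an attacker who rebuilds the entire chain from scratch, which is the whole-file replacement case the ledger's threat model excludes.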
## License
MIT License — see [LICENSE](LICENSE).