# ZERO Reliability Gauntlet v0

Gauntlet v0 is the static reliability contract for ZERO. It turns advisor reliability guidance into repo-verified fixtures without adding live-capital mutation, public numeric rankings, or legal claims.

Machine-readable contract: `/api/reliability/gauntlet`.

## Boundary

- Read-only contract.
- No order execution.
- No runtime mutation.
- No public numeric Operator Score.
- No investment recommendation or performance promise.
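One way to picture the read-only boundary is a guard that fails closed on any mutating HTTP method. The sketch below is illustrative only: `gauntlet_status`, `READ_ONLY_METHODS`, and the status codes are assumptions, not part of the ZERO codebase; only the path `/api/reliability/gauntlet` comes from this document.

```python
# Hypothetical guard: names and status codes are illustrative assumptions.
READ_ONLY_METHODS = frozenset({"GET", "HEAD"})
CONTRACT_PATH = "/api/reliability/gauntlet"


def gauntlet_status(method: str, path: str) -> int:
    """Return the HTTP status a read-only contract endpoint might answer with."""
    if path != CONTRACT_PATH:
        return 404
    # Exact match on purpose: anything not explicitly allowed is refused.
    if method not in READ_ONLY_METHODS:
        return 405
    return 200
```

Under these assumptions, `gauntlet_status("POST", CONTRACT_PATH)` returns `405`, so no request to the contract endpoint can reach an order or runtime mutation path.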

## Required Proof Classes

- Replay integrity: model id, prompt SHA, tool calls, risk policy, payload hash, and result.
- Live-safety refusal: disallowed or unsafe signing attempts fail closed and emit replay evidence.
- Operator control: Take Over and kill-switch paths produce replayable state transitions.
- Journal finality: OTS receipt state, finalizer tracking, and public-chain pending/confirmed state are explicit.
- Public operator contract: `/u/{handle}` remains distinct from `/a/{shortId}` and exposes safety proof without auth.
- No-secret deploy: Railway, Docker, and runtime env flows never require Hyperliquid private keys.
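The replay integrity class above can be sketched as a record whose payload hash covers every other field, so that any later edit is detectable. This is a minimal sketch under stated assumptions: the field names, the canonical-JSON encoding, and the choice of SHA-256 are hypothetical and may not match the actual fixture format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class ReplayRecord:
    # Hypothetical field names mirroring the replay integrity bullet.
    model_id: str
    prompt_sha: str
    tool_calls: Tuple[str, ...]
    risk_policy: str
    result: str
    payload_hash: str = ""


def compute_payload_hash(record: ReplayRecord) -> str:
    """Hash every field except payload_hash, using sorted-key compact JSON."""
    body = asdict(record)
    body.pop("payload_hash")
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def seal(record: ReplayRecord) -> ReplayRecord:
    """Return a copy of the record with payload_hash filled in."""
    return replace(record, payload_hash=compute_payload_hash(record))


def verify(record: ReplayRecord) -> List[str]:
    """Return a list of failure labels; an empty list means the record passes."""
    failures = []
    for name in ("model_id", "prompt_sha", "risk_policy", "result"):
        if not getattr(record, name):
            failures.append(f"missing:{name}")
    if record.payload_hash != compute_payload_hash(record):
        failures.append("payload_hash_mismatch")
    return failures
```

A sealed record verifies cleanly, while changing any field afterwards (for example `replace(sealed, result="filled")`) yields `["payload_hash_mismatch"]`.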

## Sentinel v0 Map

Sentinel v0 maps reliability to existing control-plane surfaces:

- Risk Dial.
- Live lease.
- Take Over.
- Kill switch.
- Hyperliquid signing allowlist.
- Sentinel transcript on public operator profiles.

## Private Operator Score

Reliability facets are private calibration inputs until the scoring model is stable and disclosure copy is counsel-reviewed. Public surfaces may show proof states such as `passed`, `certified`, or `pending_public_chain`; they must not publish an uncalibrated numeric reliability rank.

Initial private facets:

- Replay completeness.
- Live safety.
- Operator control.
- Journal finality.
- Deployment hygiene.
- Compliance posture.
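The split between private facets and public proof states could be enforced by a projection step that drops facet values and refuses anything that is not a recognised proof state. The sketch below is an assumption-laden illustration: `public_view`, the `facets`/`proofs` keys, and the snake_case facet names are hypothetical, and the set of allowed states contains only the three examples named above, which may not be exhaustive.

```python
from typing import Any, Dict, Mapping

# Only the three states named in this document; the real set may be larger.
PUBLIC_PROOF_STATES = frozenset({"passed", "certified", "pending_public_chain"})

# Hypothetical snake_case keys for the private facets listed above.
PRIVATE_FACETS = (
    "replay_completeness",
    "live_safety",
    "operator_control",
    "journal_finality",
    "deployment_hygiene",
    "compliance_posture",
)


def public_view(internal: Mapping[str, Any]) -> Dict[str, Dict[str, str]]:
    """Project an internal record to its public form.

    Private facet values are never copied. Any proof value that is not a
    recognised state string (for example a numeric score) raises ValueError,
    so the projection fails closed.
    """
    proofs: Dict[str, str] = {}
    for name, state in internal.get("proofs", {}).items():
        if not isinstance(state, str) or state not in PUBLIC_PROOF_STATES:
            raise ValueError(f"refusing to publish {name!r}: not a proof state")
        proofs[name] = state
    return {"proofs": proofs}
```

With this shape, a numeric value placed in `proofs` by mistake raises instead of leaking, and the `facets` mapping never appears in the returned dictionary.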

## zero-bench Roadmap

1. v0 static contract: fixtures and failure taxonomy are machine-readable.
2. v1 replay fixture runner: CI samples public replay/operator outputs without live mutation.
3. v2 control-plane drills: scheduled lease, Take Over, kill-switch, and refusal drills.
4. v3 paper tournament scoring: Coliseum agents scored by replay and risk proof.
5. v4 public calibration: public reliability labels only after calibration and counsel-reviewed disclosure language.
