Signed attestation · v1

Training you can
prove, not just describe.

Attestable is a constitution-bound training pipeline that provably runs its safety rewards — and ships with the signed receipts proving the model was hardened. The model is the demo. The proof is the product.

Selling the training run's testimony instead of the model's weights, with the model kept on hand as the witness.
sha256 · 74 principles, 0 fully-bound · 108 findings reviewed adversarially · 5 recorded decisions, reversals kept in · one GPU, lease-disciplined
The problem 01

A model card is a claim. Nobody can price a claim.

Every model on the market documents its safety in a model card — a vendor's written description of what the model is supposed to do. It is prose, not a verified control. You can't trust it, audit it, or insure it.

Buyers can't verify

Trust & safety teams and regulated fine-tuners take safety on faith. There is no artifact that says which controls are actually active.

Regulators demand proof

The EU AI Act turns "document what your training did" into a procurement question with a deadline. No one sells the answer.

Insurers can't underwrite it

AI liability carries a heavy uncertainty loading or goes unwritten. Opacity is expensive — and the black box is the whole book.

Why now 02

Training got commoditized. Provable training didn't.

Anyone can fine-tune a model with an open framework and a rented GPU. The differentiated, expensive asset is everything around the run — the discipline, the gates, the bindings, the ledger of what was tried and refuted.

Model cards describe. Nothing on the market attests. That gap — between a safety claim and a verifiable safety control — is opening exactly as regulation and an emerging AI-liability market start demanding evidence on both sides of the transaction.

THE MARKET FORMING AROUND THE GAP
EU AI Act (GPAI documentation duties) · deadline-driven
AI assurance / model-audit firms · forming
AI / tech-E&O liability underwriting · unpriced
The product 03

Constitution in.
Model and receipts out.

Attestable is the reference pipeline that turns a written constitution into a trained model and a signed, machine-readable proof bundle — an SBOM for training. Four surfaces, one narrative spine.

Attestation artifact

The signed per-run proof: constitution → per-principle bindings → gate verdicts → eval receipts → findings ledger.
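
As a rough illustration of what a signed, machine-readable bundle over that chain could look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the function names, the section payloads, and the use of HMAC-SHA256 as a stand-in for whatever signature scheme the real pipeline uses. Only the section names follow the chain listed above.

```python
import hashlib
import hmac
import json


def canonical_bytes(obj) -> bytes:
    # Sorted keys and fixed separators so identical content always hashes identically.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _digests(sections: dict) -> dict:
    # One SHA-256 digest per section of the bundle.
    return {name: hashlib.sha256(canonical_bytes(body)).hexdigest()
            for name, body in sections.items()}


def build_bundle(sections: dict, signing_key: bytes) -> dict:
    digests = _digests(sections)
    # The root digest commits to every section digest at once.
    root = hashlib.sha256(canonical_bytes(digests)).hexdigest()
    # ASSUMPTION: HMAC stands in for a real asymmetric signature.
    signature = hmac.new(signing_key, root.encode("ascii"), hashlib.sha256).hexdigest()
    return {"sections": sections, "digests": digests, "root": root, "signature": signature}


def verify_bundle(bundle: dict, signing_key: bytes) -> bool:
    # Recompute everything from the section bodies; trust nothing stored.
    digests = _digests(bundle["sections"])
    root = hashlib.sha256(canonical_bytes(digests)).hexdigest()
    expected = hmac.new(signing_key, root.encode("ascii"), hashlib.sha256).hexdigest()
    return (digests == bundle["digests"]
            and root == bundle["root"]
            and hmac.compare_digest(expected, bundle["signature"]))


# Hypothetical payloads; only the section names come from the chain above.
example = build_bundle(
    {
        "constitution": {"principles": ["p1"]},
        "bindings": {"p1": ["reward_a"]},
        "gate_verdicts": {"p1": "pass"},
        "eval_receipts": [],
        "findings_ledger": [],
    },
    signing_key=b"demo-key",
)
print(verify_bundle(example, b"demo-key"))
```

Any change to a section body after signing changes its digest, so verification fails without needing to trust the stored digests.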

Harness engine

The lease-disciplined, single-GPU pipeline that emits attestations. Production-hardened reference implementation.

/detect honesty gate

The classification surface — and where our own gate caught us. The current per-principle heads tested 0/64 under strict ablation; disclosed, not shipped. A designed fix, not a claim.

The exhibit witness

w1tch — an exhaustively-audited small model; provenance, not capability, is its claim. The model proves the pipeline ran; the receipts prove the model was bound.

The principle 04

Measured from inside the system — not asserted beside it.

Every safety claim on the market is out-of-band: a document next to the model, or a reward the training can game. Attestable measures and enforces in-band — the control is part of the running system, so it cannot be bypassed, gamed, or faked.

Out-of-band · the industry

A model card asserted beside the model · a reward oracle divorced from the real output · a gate that sits off the serving path. Assertions, not controls — bypassable, gameable, unverifiable.

In-band · Attestable

The reward registered on every serving path · within-group σ read from the live run · the judge gate in the render path · coverage computed from running state. The measurement is part of the system it measures — re-verifiable, unbypassable.

Why it matters: three failures that out-of-band invites and in-band forecloses. (1) A control active on one path and missing on another ran in production for 19 days undetected; in-band means every path, verified. (2) A reward divorced from the output gets reward-hacked; in-band rewards bind to what the model actually does. (3) An underwriter can price a control read from the running system; a claim beside it is just a model card. In-band is what makes binding coverage a warranty, not a brochure.
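
Failure (1) is the easiest to picture in code. The sketch below assumes a hypothetical registry that maps each path to the set of controls registered on it; the path names, control names, and function names are all illustrative and not taken from the pipeline itself. The point is only that the check enumerates every path and fails closed on any gap.

```python
def find_gaps(required_controls: set, registrations: dict) -> dict:
    """Return {path: [missing controls]} for every path lacking a required control."""
    gaps = {}
    for path, registered in registrations.items():
        missing = required_controls - set(registered)
        if missing:
            gaps[path] = sorted(missing)
    return gaps


def require_in_band(required_controls: set, registrations: dict) -> None:
    # Fail closed: an empty registry or any gap on any path stops the run.
    if not registrations:
        raise RuntimeError("no paths registered; refusing to run")
    gaps = find_gaps(required_controls, registrations)
    if gaps:
        raise RuntimeError(f"controls missing on paths: {gaps}")


# Hypothetical state: the reward is wired on one path and absent on another.
live_state = {
    "path_a": {"safety_reward"},
    "path_b": set(),
}
print(find_gaps({"safety_reward"}, live_state))
```

Reading `registrations` from running state rather than from configuration is the design choice that matters here: a config file can say a control exists on a path where nothing is actually wired.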
The metric 05

Binding coverage: the honest number.

The fraction of a system's declared safety controls that are provably wired, active, and exerting force in the live run — measured, not asserted. A control only counts if the whole chain holds: trained on real harm examples · registered on every serving path · fails safe when uncertain · demonstrably changes behavior · no output reaches a user unchecked.

The number is not 100%. Uncovered controls stay visibly marked. That is the product — not a bug in it.
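
A minimal sketch of the arithmetic, assuming each control carries one boolean per link. The link identifiers below are paraphrased from the five conditions named above, and the real chain may define its links differently; the `Control` type and function names are hypothetical.

```python
from dataclasses import dataclass, field

# ASSUMPTION: link names paraphrase the conditions listed in the text.
LINKS = (
    "trained_on_real_harm_examples",
    "registered_on_every_serving_path",
    "fails_safe_when_uncertain",
    "demonstrably_changes_behavior",
    "no_output_unchecked",
)


@dataclass
class Control:
    name: str
    links: dict = field(default_factory=dict)  # link name -> bool


def is_bound(control: Control) -> bool:
    # A link that was never measured counts as failing (fail closed).
    return all(control.links.get(link, False) for link in LINKS)


def binding_coverage(controls: list) -> float:
    # Fraction of declared controls whose whole chain holds.
    if not controls:
        return 0.0
    return sum(is_bound(c) for c in controls) / len(controls)


def uncovered(controls: list) -> list:
    # Uncovered controls stay visible instead of being dropped from the report.
    return [c.name for c in controls if not is_bound(c)]
```

The denominator is every declared control, not every control that happened to be tested, which is why a single missing link keeps a control out of the numerator.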

The project's worst defect — production training silently running without its safety rewards for 19 days — was invisible because no surface rendered the binding. Binding coverage makes that impossible to miss.
74 principles
live · fail-closed
read from running state
full 6-link bound: 0 · sub-axis wired: 6 · judge-only: 62 · dedicated link failing: 6
All 74 carry holistic-judge coverage; 0 reach the full binding. The scope-tier gate makes a narrow reward structurally unable to report as a full-principle green.
The wedge 06

We make AI risk underwritable.

Binding coverage is an actuarial input. It turns "we can't price AI risk because it's a black box" into a number an underwriter's desk can work with — and it hands them the honest denominator on purpose.

Model card | Binding-coverage attestation
Vendor asserts "it's safe" | System proves which controls are active, and which are not
Point-in-time marketing | A live, re-verifiable warranty, re-checked continuously through the policy period
Zero residual risk (uninsurable) | Disclosed residual risk: the priceable denominator, standardized across insureds
Annual snapshot | A monitoring feed that flags a lapsed control when it lapses, not at claim time
We supply the measurement; the actuary sets the price. We remove the reason the rate had to guess.
The moat 07

The receipts include what marketing would delete.

The ledger ships the findings the hardening process refuted and later reversed — the reversal of our own "#1 critical", the 148-agent audit that changed nothing, the paid-for negative results.

0/64

Classifiers we refused to ship

We adversarially tested our own /detect heads. Apparent AUC ~1.0 collapsed to chance under strict ablation — representation leakage, not detection. We disclosed the honest number and retracted the catalog.
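
To make the shape of that test concrete, here is a hedged sketch of an honesty gate for a single head. It assumes "strict ablation" yields a second set of scores produced with the suspected leaking signal removed; how the ablation itself is performed is not shown. The function names and the 0.75 threshold are illustrative assumptions, not the pipeline's actual values.

```python
def auc(labels: list, scores: list) -> float:
    # Probability that a random positive outscores a random negative (ties count half).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def passes_honesty_gate(labels: list, ablated_scores: list, min_auc: float = 0.75) -> bool:
    # ASSUMPTION: only the post-ablation AUC counts; the apparent AUC is ignored.
    return auc(labels, ablated_scores) >= min_auc


def count_passing(heads: dict, min_auc: float = 0.75) -> tuple:
    # heads: name -> (labels, ablated_scores). Returns (passed, total), e.g. the 0/64 form.
    passed = sum(passes_honesty_gate(y, s, min_auc) for y, s in heads.values())
    return passed, len(heads)
```

A head whose scores separate the classes perfectly before ablation but become indistinguishable afterwards has an ablated AUC of 0.5 and fails the gate, which is the pattern described above.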

108

Findings, reviewed adversarially

Ranked, each linked finding → verdict → fix → pinning test — the confirmed, the plausible, and the refuted, all kept in.

5

Recorded decisions

Reversals kept in — including reversing our own once-"#1 critical". The reasoning stays in the record, not the trash.

This is differentiation no incumbent can copy quickly — their ledgers were curated from day one. Trust is the one asset you cannot retrofit.
Who buys 08

Four doors, one standard.

Trust & Safety teams · /detect

UGC platforms, marketplaces, gaming, dating — standing budget for classification APIs; child-safety and grooming/blackmail shapes are the corpus's deepest coverage. Nearest revenue — once the classifiers pass their own honesty gate.

Regulated fine-tuners · attestation

Health, finance, legal, edtech tuning open models. The EU AI Act makes "prove what your training did" a procurement question — nobody sells them an answer today.

AI assurance / audit firms · format

They need machine-readable evidence to audit against. A firm that standardizes on the format becomes a channel, not just a customer.

Insurers & on-prem · warranty

Underwriters pricing AI/tech-E&O exposure; privacy-critical, air-gapped operators who can't tell a cloud vendor their story.

Explicit non-targets: frontier labs (build in-house), benchmark chasers (wrong product), anyone whose use case requires weakening the gates.
How it earns 09

High-margin proof, not compute-hungry platform.

The binding constraint is one GPU and one operator, so the model favors metered artifacts over managed services. None of these is live revenue yet — this is the intended model, and each rung names the gate that opens it.

1. /detect, metered classification: Per-request harm-shape endpoints. CPU-tier serving already runs; the heads must first pass strict ablation to be honest. [gate · honest heads]
2. Attestation-as-a-service: Per-run fee for teams who fine-tune elsewhere and need a provable record. Compliance-shaped willingness to pay. [gate · assembler]
3. Open-core harness + support: Reference implementation open (credibility engine); paid tier = assembler, signing, hosted verification, SLA. [credibility engine]
4. Bespoke discriminators & managed runs: Custom classifiers on a customer's policy; boutique attested runs, each of which doubles as a case study. [as capacity allows]
What's real today 10

Not a slide — a running system.

Production-hardened pipeline · live

Lease-safe single-GPU discipline, judge gates, per-principle reward bindings — running, with the operator console rendering binding coverage from live state.

/detect serving path · infra live, heads not

CPU-tier serving runs. The current per-principle heads are disclosed at 0/64 under strict ablation — honest, not shipped; the redesigned architecture is specified.

The exhibit model · in run

A full fine-tune is training now (44%-pretrained base — provenance, not capability, is the claim). GRPO, judged eval, and the attestation export follow.

Attestation assembler · building

The bundle joins over data that already exists (constitution ↔ reward registry ↔ discriminator index ↔ receipts). Mostly plumbing.

The ask 11

Fund the run that ships with its receipts.

We're raising to close the launch gaps, ship the first publicly-attested model, and take the standard to design partners — turning a hardened pipeline into a category.

Round size, terms, and milestones — to be set with the operator. This brief carries no projected financials on purpose: an Attestable deck that inflated its own numbers would refute its own thesis.

Twenty days spent buying belief.
The product is to sell what it bought.

Attestable
provable AI training · investor brief · draft