Process artifact Report artifact 2026-07-02

Factory Run Schema v1

A specialist factory needs proof folders, not vibes. This schema defines the local run directory, required metrics, and decision vocabulary that every public artifact should eventually satisfy.

Headline Numbers

Required files

8 config, dataset, train log, evals (baseline and candidate), report, artifact, decision

Decisions

6 ship, reject, retry-data, retry-training, retry-eval, park

First-class outputs

5 data, training, eval, package, report
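
The six decision values above form a closed vocabulary, so a run can be rejected early if its decision falls outside the list. A minimal sketch of that check follows. The names Decision and parse_decision are hypothetical and not part of the schema, and trimming whitespace and lowercasing the input is an assumption about how lenient a reader should be.

```python
from enum import Enum


class Decision(Enum):
    """The six decision values listed under Headline Numbers."""

    SHIP = "ship"
    REJECT = "reject"
    RETRY_DATA = "retry-data"
    RETRY_TRAINING = "retry-training"
    RETRY_EVAL = "retry-eval"
    PARK = "park"


def parse_decision(value: str) -> Decision:
    """Return the matching Decision, or raise ValueError for anything else.

    Hypothetical helper: normalizing case and whitespace is an assumption,
    not something the schema states.
    """
    try:
        return Decision(value.strip().lower())
    except ValueError:
        allowed = ", ".join(d.value for d in Decision)
        raise ValueError(
            f"unknown decision {value!r}; expected one of: {allowed}"
        ) from None
```

Using an enum rather than free-form strings keeps a typo such as "retry_data" from passing as a valid decision.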

Competitive Context

System | Metric | Score | Size / Class | Comparable? | Readout
TinyGPT factory schema | required public run files | 8 | repo-local contract | Direct | Defines the minimum evidence bundle each future public model artifact must carry.
Ad hoc model card only | before/after reproducibility | weak | single document | Directional | Useful for release notes, but insufficient for a factory claim without eval JSON, decision, and blockers.

Direct rows share this artifact's eval setup. Directional rows are useful market context but should not be read as leaderboard claims.

Run folder contract

File | Purpose | Public relevance
config.json | Target, base, method, thresholds | Explains what was attempted
dataset.json | Sources, rows, filtering, heldout | Provenance
eval-baseline.json | Frozen baseline result | Before number
eval-candidate.json | Candidate result | After number
decision.json | Ship/reject/retry call | Honest release status
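
The contract above can be checked mechanically before a run is published. The sketch below only tests that the five files named in the table exist. The names CONTRACT_FILES and missing_contract_files are hypothetical, and the schema may require further files that this table does not name.

```python
from pathlib import Path

# File names copied from the run folder contract table.
# Assumption: only these five are checked; files the table does not
# name (train log, report, artifact) are left out of this sketch.
CONTRACT_FILES = (
    "config.json",
    "dataset.json",
    "eval-baseline.json",
    "eval-candidate.json",
    "decision.json",
)


def missing_contract_files(run_dir):
    """Return the contract files absent from run_dir, in table order."""
    root = Path(run_dir)
    return [name for name in CONTRACT_FILES if not (root / name).is_file()]
```

An empty list means the folder carries every file in the table. Any other result names exactly what is still missing, which maps naturally onto a release blocker.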

Release Blockers

Needs a canonical rendered example

The schema is real, but the website should show one complete run folder as the public example.

Unblock: Promote the SQL routed result into a small report-only rendered artifact.

Evidence

Next Release Action

Turn the SQL routed result into the first website-native factory report that follows this schema.