SQL factory POC
Current-best candidate. A two-adapter routed SQL artifact: public schema-only SQL routes to a b-mc2 adapter; local SQLite execution routes to a synthetic execution adapter.
- Public exact: 0.531
- T5-small baseline: 0.484
- Synthetic execution: 0.860
Competition: TinyGPT routed SQL v1 vs T5-small local baseline
Next: Publish this as a report artifact first. Do not present it as a shipped SQL model until public execution eval and clean-output gates pass.
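The two-adapter routing described in this card can be sketched roughly as follows. This is a minimal illustration, not the artifact's actual router: the `SqlRequest` type, the adapter label strings, and the rule "a local SQLite path is present" as the routing signal are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical adapter labels; the artifact's real identifiers are not given here.
PUBLIC_SCHEMA_ADAPTER = "b-mc2"
SYNTHETIC_EXECUTION_ADAPTER = "synthetic-execution"


@dataclass(frozen=True)
class SqlRequest:
    """A text-to-SQL request (hypothetical shape, for illustration only)."""
    question: str
    schema_ddl: str
    # Assumption: a request counts as "local SQLite execution" when it
    # carries a path to a database the generated query can be run against.
    sqlite_path: Optional[str] = None


def route(request: SqlRequest) -> str:
    """Return the adapter label that should handle this request."""
    if request.sqlite_path is not None:
        # Local SQLite execution goes to the synthetic execution adapter.
        return SYNTHETIC_EXECUTION_ADAPTER
    # Schema-only requests go to the b-mc2 adapter.
    return PUBLIC_SCHEMA_ADAPTER
```

A schema-only request such as `route(SqlRequest("count users", "CREATE TABLE users(id)"))` would select the b-mc2 adapter, while the same request with `sqlite_path` set would select the synthetic execution adapter.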
First specialist package
Release-ready metadata. A routed 4B file-operation specialist distilled from frontier/gold trajectories, with the breadth regression disclosed in the package.
- File-ops hard gate: 100%
- Heldout file-ops: 95%
- Breadth after tuning: 42.3%
Competition: TinyGPT Qwen3-4B file-ops specialist vs Stock Qwen3-4B
Next: Release as metadata/model-card first, or publish the fused weights only with routed-only warnings attached.
Process artifact
Report artifact. The canonical target -> data -> post-training -> eval -> package -> report shape for TinyGPT runs.
- Required files: 8
- Decisions: 6
- First-class outputs: 5
Competition: TinyGPT factory schema vs Ad hoc model card only
Next: Turn the SQL routed result into the first website-native factory report that follows this schema.
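As a rough sketch of how a run could be checked against the target -> data -> post-training -> eval -> package -> report shape, the snippet below tracks stage order only. The function names are hypothetical, and the schema's required files, decisions, and first-class outputs are not modeled because their contents are not listed here.

```python
from typing import Sequence

# Stage names taken from the shape above, in order.
STAGES = ("target", "data", "post-training", "eval", "package", "report")


def follows_shape(completed: Sequence[str]) -> bool:
    """True if the completed stages are an in-order prefix of STAGES."""
    return tuple(completed) == STAGES[: len(completed)]


def missing_stages(completed: Sequence[str]) -> list:
    """Return the stages not yet completed, in canonical order."""
    done = set(completed)
    return [stage for stage in STAGES if stage not in done]
```

For a run that has finished everything through eval, `missing_stages` would report that the package and report stages are still outstanding.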
Browser performance artifact
Report artifact. The original browser TinyGPT track: hand-written WebGPU kernels beat WASM SIMD by a wider margin as model width grows.
- WebGPU speedup: 12.1x
- Small-width speedup: 2.6x
- Browser track: shipped
Competition: TinyGPT WebGPU vs TinyGPT WASM SIMD
Next: Keep as a public performance artifact and cross-link it from factory reports when browser-local training matters.
Browser memory artifact
Report artifact. A WebAssembly Memory64 build lifted the browser model allocation ceiling past the old 4 GB tab limit.
- Allocated params: 473M
- Allocation time: 3.7s
- Train step: 82.2s
Competition: TinyGPT Memory64 build vs TinyGPT wasm32 build
Next: Keep this as a public technical artifact; do not make it active factory work unless a browser-run specialist needs it.
Mac runtime benchmark
Report artifact. The native Mac runtime reached high local decode throughput on the Huge preset, showing the serving path is viable for local eval loops.
- Huge decode: 696 tok/s
- Mega pilot: 293 tok/s
- Warm TTFT p99: 5.8ms
Competition: TinyGPT Huge preset vs TinyGPT Mega pilot
Next: Use this as the baseline expectation for future artifact performance tables.