← TinyGPT · docs · devlog · roadmap · speedup
source: docs/roadmap/index.md · view on GitHub ↗

Single-machine roadmap — index

A complete inventory of techniques that run on one Mac (or in a browser tab), ROI-ranked for TinyGPT. The original 1,400-line master plan is split across the files below — each is short enough to read in one sitting.

How to read this

Filter — single-machine only. Anything requiring a GPU cluster (ZeRO/FSDP, tensor parallelism, large RLHF runs) is in tier4_skip.md.

Three views of the same landscape:

Status legend (used throughout):

Markers last verified against the codebase: 2026-05-30 (post-CPU-bundle merge). Ship count: Tier 1: 8/9 · Tier 2: 13/16 · Tier 3: 17/21 · Tier 5: 0/5 (frontier research).

For current external benchmark landscape, see docs/research/inference_benchmarks_may_2026.md and docs/research/quality_benchmarks_may_2026.md.

The files

FileWhat it covers
tier1.mdHigh ROI — build next. Distillation, sequence packing, QLoRA, ORPO, SimPO, NEFTune, gradient checkpointing, speculative decoding, browser benchmark runner.
tier2.mdMedium ROI. KTO, IPO, DoRA, GaLore, VeRA, LoftQ, AWQ/GPTQ readers, HQQ, sliding window, ALiBi, KIVI, MTP, MQA, attention sink, prefix caching, prefix tuning.
tier3.mdNiche / specialized. RLAIF, GPTQ from-scratch, pruning, LASER, RLHF, Medusa, MoE, differential attention, MoD, LoRA variants (LoRA+, AdaLoRA, ReLoRA, PISSA, etc.).
tier4_skip.mdSkip — not worth it for us. fp16, ZeRO at single device, SSMs, etc.
tier5_frontier_2026.md2026 research frontier. Reasoning training, test-time compute scaling, vision-language toy, diffusion LM micro-impl, real sparse MoE kernels. Plus cloud pipeline + pausable training as enabler infrastructure.
categories.mdOrthogonal categories — optimizers, training stability, data, tokenization, interpretability, inference, browser perf, architecture, PEFT taxonomy, infra.
recommended_order.mdThe top-10 order to build in.
datasets.mdOpen-source datasets we’d actually use (pretrain / SFT / DPO / code / math / eval).
recent_research.md2024-2026 highlights with arxiv links (alignment, PEFT, quantization, inference, optimizers, distillation, reasoning RL).
phased_plan.mdThe executable 10-phase plan, 7 weeks of sequenced work.
blockers.mdWhat we can’t build right now and why (hardware, library, budget, integration). Includes the Phase 9/10 detailed status appendix.
honest_summary.mdThe “what we can / can’t / shouldn’t build” summary, plus cross-references.

One-line answer to “what do I build first?”

Phase 1 of phased_plan.md: NEFTune + gradient clipping