Tiny GPT in pure Zig
Tests cross-language generalisation of Karpathy's minGPT/nanoGPT lineage, rebuilt in Zig without ML libraries.
This evaluation tests low-level systems programming and ML fundamentals. The model must implement a character-level GPT from scratch in Zig, including matrix operations, AdamW optimizer, and learning rate scheduling. The acceptance criteria include successful build, training, and sample generation.
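The AdamW step the spec requires is small enough to sketch directly. Below is a minimal Python sketch of one decoupled-weight-decay update over flat parameter lists (the function name, flat-list layout, and hyperparameter defaults are illustrative assumptions, not values mandated by the eval; submissions implement the equivalent in Zig over their own matrix types):

```python
import math

def adamw_step(params, grads, m, v, t, lr,
               beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.1):
    """One AdamW update at step t (1-indexed); mutates params, m, v in place."""
    for i in range(len(params)):
        # exponential moving averages of the gradient and its square
        m[i] = beta1 * m[i] + (1 - beta1) * grads[i]
        v[i] = beta2 * v[i] + (1 - beta2) * grads[i] ** 2
        # bias correction for the zero-initialised moments
        m_hat = m[i] / (1 - beta1 ** t)
        v_hat = v[i] / (1 - beta2 ** t)
        # decoupled weight decay: applied straight to the weight rather
        # than folded into the gradient (the "W" in AdamW)
        params[i] -= lr * (m_hat / (math.sqrt(v_hat) + eps)
                           + weight_decay * params[i])
```

The decoupling is the detail a from-scratch port most often gets wrong: plain Adam with L2 regularisation adds the decay term to the gradient before the moment updates, whereas AdamW subtracts it directly from the weight.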
Spec: medium-spec prompt; character-level GPT, AdamW optimizer, warmup + cosine LR schedule, CPU-only; acceptance = successful build, train, and sample.
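"Warmup/cosine" in the spec means linear LR warmup followed by cosine decay to a floor, as popularised by nanoGPT's training loop. A minimal Python sketch of the schedule (step counts and LR values are illustrative assumptions, and submissions compute this in Zig):

```python
import math

def lr_schedule(step, warmup_steps=100, max_steps=1000,
                max_lr=3e-4, min_lr=3e-5):
    """Learning rate at a given step: linear warmup, then cosine decay."""
    if step < warmup_steps:                      # linear warmup to max_lr
        return max_lr * (step + 1) / warmup_steps
    if step >= max_steps:                        # past the decay horizon: hold floor
        return min_lr
    # cosine decay from max_lr down to min_lr over the remaining steps
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (max_lr - min_lr)
```

Note the schedule is continuous at the warmup boundary: at `step == warmup_steps` the cosine coefficient is 1, so the rate is exactly `max_lr`.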
RESULTS BY MODEL

| Model | Harness |
| --- | --- |
| GPT-5.2-codex medium | Codex CLI |
| GPT-5.2 xhigh | Codex CLI |
| GPT-5.2 medium | Codex CLI |
| Opus 4.5 thinking | Claude Code |
| GPT-5.1-codex-max medium | Codex CLI |
| Gemini 3 Pro | Gemini CLI |
KEY TAKEAWAYS
- GPT-5.2 xhigh (82) beats Gemini 3 Pro (81), the first OpenAI model to win the Zig eval; GPT-5.2 medium (70) is also solid.
- Claude Opus 4.5 (62) succeeded on its third attempt via self-correction; GPT-5.1-codex-max (36) still crashes on matmul assertions.
Claude recovers via best-of-3
GPT-5.2 xhigh (82) beats Gemini 3 Pro (81) on the Zig eval, making it the first OpenAI model to win Zig. Claude Opus 4.5 (62) initially crashed but succeeded on its third attempt through self-correction. GPT-5.1-codex-max (36) still crashes on matmul assertions.