Tiny GPT in pure Zig
Tests cross-language generalisation of Karpathy's minGPT/nanoGPT lineage, rebuilt in Zig without ML libraries.
This evaluation tests low-level systems programming and ML fundamentals. The model must implement a character-level GPT from scratch in Zig, including matrix operations, AdamW optimizer, and learning rate scheduling. The acceptance criteria include successful build, training, and sample generation.
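The AdamW step the spec requires is small enough to sketch directly. Below is a minimal Python sketch of one decoupled-weight-decay update over flat parameter lists (the function name, flat-list layout, and hyperparameter defaults are illustrative assumptions, not values mandated by the eval; submissions implement the equivalent in Zig over their own matrix types):

```python
import math

def adamw_step(params, grads, m, v, t, lr,
               beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.1):
    """One AdamW update at step t (1-indexed); mutates params, m, v in place."""
    for i in range(len(params)):
        # exponential moving averages of the gradient and its square
        m[i] = beta1 * m[i] + (1 - beta1) * grads[i]
        v[i] = beta2 * v[i] + (1 - beta2) * grads[i] ** 2
        # bias correction for the zero-initialised moments
        m_hat = m[i] / (1 - beta1 ** t)
        v_hat = v[i] / (1 - beta2 ** t)
        # decoupled weight decay: applied straight to the weight rather
        # than folded into the gradient (the "W" in AdamW)
        params[i] -= lr * (m_hat / (math.sqrt(v_hat) + eps)
                           + weight_decay * params[i])
```

The decoupling is the detail a from-scratch port most often gets wrong: plain Adam with L2 regularisation adds the decay term to the gradient before the moment updates, whereas AdamW subtracts it directly from the weight.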
Spec: medium-spec prompt; character-level GPT, AdamW optimizer, warmup + cosine LR schedule, CPU-only; acceptance = successful build, train, and sample.
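"Warmup/cosine" in the spec means linear LR warmup followed by cosine decay to a floor, as popularised by nanoGPT's training loop. A minimal Python sketch of the schedule (step counts and LR values are illustrative assumptions, and submissions compute this in Zig):

```python
import math

def lr_schedule(step, warmup_steps=100, max_steps=1000,
                max_lr=3e-4, min_lr=3e-5):
    """Learning rate at a given step: linear warmup, then cosine decay."""
    if step < warmup_steps:                      # linear warmup to max_lr
        return max_lr * (step + 1) / warmup_steps
    if step >= max_steps:                        # past the decay horizon: hold floor
        return min_lr
    # cosine decay from max_lr down to min_lr over the remaining steps
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (max_lr - min_lr)
```

Note the schedule is continuous at the warmup boundary: at `step == warmup_steps` the cosine coefficient is 1, so the rate is exactly `max_lr`.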
RESULTS BY MODEL

| Model | Harness |
| --- | --- |
| GPT-5.2-codex medium | Codex CLI |
| GPT-5.2 xhigh | Codex CLI |
| GPT-5.2 medium | Codex CLI |
| Opus 4.5 thinking | Claude Code |
| GPT-5.1-codex-max medium | Codex CLI |
| Gemini 3 Pro | Gemini CLI |
KEY TAKEAWAYS
- GPT-5.2 xhigh (82) beats Gemini 3 Pro (81), the first OpenAI model to win the Zig eval; GPT-5.2 medium (70) is also solid.
- Claude Opus 4.5 (62) succeeded on its third attempt via self-correction; GPT-5.1-codex-max (36) still crashes on matmul assertions.
Claude recovers via best-of-3
GPT-5.2 xhigh (82) beats Gemini 3 Pro (81) on the Zig eval, making it the first OpenAI model to win Zig. Claude Opus 4.5 (62) initially crashed but succeeded on its third attempt through self-correction. GPT-5.1-codex-max (36) still crashes on matmul assertions.