Permissions
Eval Details
Convex document & folder permissions
Tests whether agents can extend ACL patterns without regressions when specs are intentionally light.
Methodology
This evaluation tests incremental ACL extension on an existing codebase. The spec is intentionally sparse, requiring the model to infer patterns from existing code. Key challenges include maintaining owner checks, implementing guest filters, and handling activation hooks without breaking existing functionality.
Spec: Extend existing ACL whitelist to docs/folders; inheritance, Better Auth invites, guest filtering, tests.
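To make the challenges above concrete, here is a minimal in-memory sketch of the three checks the spec touches: folder inheritance, the owner check, and guest filtering. It is not taken from the evaluated codebase and does not use the Convex or Better Auth APIs. Every type and function name (`Doc`, `Folder`, `resolveRole`, `canDelete`, `listVisibleDocs`, `guestVisible`) is hypothetical, and the rule that folder owners get editor access to contained documents is an assumption made for illustration.

```typescript
// Hypothetical data model; none of these names come from the evaluated codebase.
type Role = "editor" | "viewer" | "guest";

interface AclEntry {
  userId: string;
  role: Role;
}

interface Folder {
  id: string;
  ownerId: string;
  parentId: string | null;
  acl: AclEntry[];
}

interface Doc {
  id: string;
  ownerId: string;
  folderId: string | null;
  acl: AclEntry[];
  guestVisible: boolean; // assumed flag: may guests see this document?
}

// Resolve the role a user holds on a document: a direct grant wins,
// otherwise walk up the folder chain and take the nearest grant.
function resolveRole(
  doc: Doc,
  userId: string,
  folders: Map<string, Folder>,
): Role | null {
  const direct = doc.acl.find((e) => e.userId === userId);
  if (direct) return direct.role;

  let folderId = doc.folderId;
  const seen = new Set<string>(); // guards against cyclic parent links
  while (folderId !== null && !seen.has(folderId)) {
    seen.add(folderId);
    const folder = folders.get(folderId);
    if (!folder) break;
    // Assumption: a folder owner gets editor access to contained documents.
    if (folder.ownerId === userId) return "editor";
    const entry = folder.acl.find((e) => e.userId === userId);
    if (entry) return entry.role;
    folderId = folder.parentId;
  }
  return null;
}

// Owner check kept separate from ACL grants, so that extending the ACL
// cannot accidentally widen destructive permissions.
function canDelete(doc: Doc, userId: string): boolean {
  return doc.ownerId === userId;
}

// Guest filter: owners see their documents, guests see only documents
// flagged guest-visible, and users with no role see nothing.
function listVisibleDocs(
  docs: Doc[],
  userId: string,
  folders: Map<string, Folder>,
): Doc[] {
  return docs.filter((doc) => {
    if (doc.ownerId === userId) return true;
    const role = resolveRole(doc, userId, folders);
    if (role === null) return false;
    return role !== "guest" || doc.guestVisible;
  });
}
```

Keeping `canDelete` independent of `resolveRole` is one way to avoid the owner-check regression the evaluation looks for: inherited or invited roles can grow without changing who may perform owner-only actions.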
RESULTS BY MODEL
- GPT-5.2-codex medium (Codex CLI)
- GPT-5.2 xhigh (Codex CLI)
- GPT-5.2 medium (Codex CLI)
- Opus 4.5 thinking (Claude Code)
- GPT-5.1-codex-max medium (Codex CLI)
- Gemini 3 Pro (Gemini CLI)
KEY TAKEAWAYS
- GPT-5.2 medium and xhigh tie at 78 and show the best ACL inference; Opus 4.5 thinking (Claude Code) is solid at 65; Gemini 3 Pro struggles at 49.
- Owner checks, guest filters, and activation hooks are the most common gaps when the spec is light.