ARC-AGI-3: Frontier Models Score 0.3%
active
Origin: François Chollet + ARC Prize, March 25, 2026 (YC HQ launch event)
ARC-AGI-3 is the third version of Chollet's general reasoning benchmark. It launched at a fireside with Sam Altman at YC HQ on March 25, 2026. GPT-5.4 and Claude Opus 4.6 both scored 0.3%; humans score 100%. It is the most concrete 'AGI claims are still wildly premature' data point of 2026.
Usage
Use as the benchmark reality-check when AGI claims are made. '0.3% on ARC-AGI-3' is the most specific, credible counter to 'we're near AGI' claims.
Mutations & variants
'0.3% vs. 100%' — the specific score comparison
'Chollet's ARC' — shorthand for the benchmark
Cross-platform escapes
- ARC Prize official announcement: arcprize.org/blog/arc-agi-3-launch
- Mainstream press: widely covered
Shelf life: Active through 2026 — each new model version's ARC score will be a discourse event
The fact that both OpenAI's and Anthropic's best models scored 0.3% on the same day is the strongest 'scaling is not enough' data point in the discourse.