ARC-AGI-3: Frontier Models Score 0.3%

active

Origin: François Chollet + ARC Prize, March 25, 2026 (YC HQ launch event)

ARC-AGI-3 is the third version of Chollet's general reasoning benchmark. Launch event: fireside with Sam Altman at YC HQ, March 25, 2026. GPT-5.4 and Claude Opus 4.6 both scored 0.3% on ARC-AGI-3. Humans score 100%. The most concrete 'AGI claims are still wildly premature' data point of 2026.

Usage

Use as the benchmark reality-check when AGI claims are made. '0.3% on ARC-AGI-3' is the most specific, credible counter to 'we're near AGI' claims.

Mutations & variants

'0.3% vs. 100%' — the specific score comparison 'Chollet's ARC' — shorthand for the benchmark

Cross-platform escapes

ARC Prize official announcement: arcprize.org/blog/arc-agi-3-launch
Mainstream press: widely covered

Shelf life: Active through 2026 — each new model version's ARC score will be a discourse event

The fact that both OpenAI's and Anthropic's best models scored 0.3% on the same day is the strongest 'scaling is not enough' data point in the discourse.

← Back to Memes & Discourse