Logical Reasoning · 2025 · en

Abstraction and Reasoning Corpus for AGI (v2)

Harder successor to ARC-AGI-1, released in 2025 and designed to be more resistant to test-time compute scaling. Scores are reported as percent accuracy on the public evaluation set.

Samples: 400
Metrics: accuracy
Paper / Website
Current State of the Art

Gemini 2.5 Pro (Google), accuracy: 5

Top Models Performance Comparison

Top 3 models ranked by accuracy

Accuracy, with each score also shown as % of best:
1. Gemini 2.5 Pro  5.0  (100.0% of best)
2. o3  4.0  (80.0% of best)
3. o4-mini  3.0  (60.0% of best)
Best Score: 5.0
Top Model: Gemini 2.5 Pro
Models Compared: 3
Score Range: 2.0
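
The derived figures on this page (the "% of best" values and the score range) follow from the three accuracy scores by simple arithmetic. A minimal Python sketch, assuming "% of best" means a score divided by the top score and "Score Range" means the highest score minus the lowest; the `scores` dictionary and both function names are illustrative and not taken from the leaderboard's own code:

```python
# Accuracy scores (%) as listed on this page.
scores = {"Gemini 2.5 Pro": 5.0, "o3": 4.0, "o4-mini": 3.0}


def percent_of_best(scores):
    """Express each score as a percentage of the top score (assumed definition)."""
    best = max(scores.values())
    return {model: round(100.0 * s / best, 1) for model, s in scores.items()}


def score_range(scores):
    """Highest score minus lowest score (assumed definition of 'Score Range')."""
    return max(scores.values()) - min(scores.values())


for model, pct in percent_of_best(scores).items():
    print(f"{model}: {scores[model]} ({pct}% of best)")
print("Score range:", score_range(scores))
```

Under these assumptions the computed values match the ones shown on the page, which suggests the chart normalizes to the top model rather than to a 100% ceiling.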

accuracy (Primary)

#  Model                          Score  Paper / Code  Date
1  Gemini 2.5 Pro (API, Google)   5      -             Mar 2026
2  o3 (API, OpenAI)               4      -             Mar 2026
3  o4-mini (API, OpenAI)          3      -             Mar 2026

Other Logical Reasoning Datasets