Mathematical Reasoning | 2024 | en

American Invitational Mathematics Examination 2024

30 challenging math problems from the 2024 AIME competition. The benchmark tests advanced mathematical reasoning.

Metrics: accuracy, pass@1
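As a rough illustration of the two metrics, the sketch below scores a run by exact match on the final answer. It assumes one integer answer per problem (AIME answers are integers from 0 to 999) and one attempt per problem, in which case pass@1 and accuracy coincide. The function name `accuracy` and the sample data are hypothetical; this is not the scoring code used for the leaderboard.

```python
def accuracy(predictions, answers):
    """Return the percentage of problems answered correctly (exact match)."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one problem is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical data: 30 reference answers, with the last 5 predictions wrong.
reference = list(range(100, 130))
predicted = reference[:25] + [0] * 5

score = accuracy(predicted, reference)
print(f"{score:.1f}")
```

With 25 of 30 problems correct the function returns about 83.3, which shows the scale of the scores on this page. It is arithmetic only, not a statement about how any listed score was obtained.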
Current State of the Art

o1-preview (OpenAI): 83.3 accuracy

Top Models Performance Comparison

Top 3 models ranked by accuracy

[Bar chart: accuracy, shown as % of best]
1. o1-preview: 83.3 (100.0% of best)
2. Claude 3.5 Opus: 16.0 (19.2% of best)
3. GPT-4o: 13.4 (16.1% of best)
Best Score: 83.3
Top Model: o1-preview
Models Compared: 3
Score Range: 69.9
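The summary figures follow directly from the three listed scores. The sketch below recomputes them, assuming "% of best" means each score divided by the top score and "Score Range" means the top score minus the lowest score. The variable names are illustrative.

```python
# Accuracy scores as listed on this page.
scores = {
    "o1-preview": 83.3,
    "Claude 3.5 Opus": 16.0,
    "GPT-4o": 13.4,
}

# Top model and its score.
top_model = max(scores, key=scores.get)
best_score = scores[top_model]

# Score range: best minus worst, rounded to one decimal place.
score_range = round(best_score - min(scores.values()), 1)

# Each score as a percentage of the best score.
pct_of_best = {
    model: round(100.0 * value / best_score, 1)
    for model, value in scores.items()
}

print(top_model, best_score, score_range)
for model, pct in pct_of_best.items():
    print(f"{model}: {pct}% of best")
```

Under those assumptions the results match the page: a range of 69.9, with 19.2% and 16.1% of best for the second and third entries.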

accuracy (primary metric)

#   Model             Organization   Score   Date
1   o1-preview        OpenAI         83.3    Dec 2025
2   Claude 3.5 Opus   Anthropic      16      Dec 2025
3   GPT-4o (API)      OpenAI         13.4    Dec 2025


AIME 2024 Benchmark - Mathematical Reasoning | CodeSOTA