Codesota · Natural Language Processing · Language Modeling · AIME 2024
Language Modeling · benchmark dataset · EN

AIME 2024

The AIME 2024 dataset contains problems from the 2024 American Invitational Mathematics Examination (AIME). It is primarily used to evaluate the mathematical reasoning and problem-solving capabilities of large language models (LLMs) on hard competition problems. Each record includes an ID, the problem statement, a detailed solution process, and the final numerical answer; AIME answers are always integers from 0 to 999. The dataset covers several mathematical domains (geometry, algebra, number theory, etc.) and is known for its high difficulty level.
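The record layout and scoring described above can be sketched as follows. The field names (`id`, `problem`, `solution`, `answer`) are assumptions taken from the description, not necessarily the dataset's actual column names, and the sample IDs and answers are illustrative:

```python
from dataclasses import dataclass

@dataclass
class AimeRecord:
    id: str        # problem identifier
    problem: str   # problem statement
    solution: str  # detailed solution process
    answer: int    # final numerical answer (AIME answers are integers 0-999)

def score(predictions: dict[str, int], records: list[AimeRecord]) -> float:
    """Exact-match accuracy: an AIME answer is either right or wrong."""
    correct = sum(1 for r in records if predictions.get(r.id) == r.answer)
    return correct / len(records)

records = [
    AimeRecord("2024-I-1", "…problem text…", "…worked solution…", 204),
    AimeRecord("2024-I-2", "…problem text…", "…worked solution…", 25),
]
print(score({"2024-I-1": 204, "2024-I-2": 113}, records))  # 0.5
```

Because answers are small integers, exact match is the natural metric here; there is no partial credit.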

§ 01 · Leaderboard

Best published scores.

No results indexed yet — be the first to submit a score.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

Submit a result · Read submission guide
What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with frozen commit + seed
  • 03 · Declared evaluation environment (Python, deps)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
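The checklist above can be sketched as a minimal reproduction-script skeleton: a frozen seed, a declared environment, and one output row per metric. All names here (`run_model`, `COMMIT`, the sample problem ID) are placeholders for illustration, not a Codesota API:

```python
import json
import platform
import random

COMMIT = "abcdef0"  # frozen commit of the evaluation code (placeholder)
SEED = 42           # frozen seed so the run is reproducible

def run_model(problem: str, seed: int) -> int:
    # Placeholder for the actual model call (public checkpoint or API endpoint).
    random.seed(seed)
    return random.randint(0, 999)  # AIME answers are integers 0-999

def main() -> None:
    # Declared evaluation environment, recorded alongside the result.
    env = {"python": platform.python_version(), "commit": COMMIT, "seed": SEED}
    problems = {"2024-I-1": 204}  # id -> reference answer (illustrative)
    correct = sum(run_model(pid, SEED) == ans for pid, ans in problems.items())
    # One row per metric declared by the dataset:
    print(json.dumps({"metric": "accuracy",
                      "value": correct / len(problems),
                      "env": env}))

if __name__ == "__main__":
    main()
```

Freezing the seed and recording the environment is what makes a discrepancy traceable: re-running the script with the same commit and seed should reproduce the submitted row exactly.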