Arithmetic Reasoning
Performing arithmetic calculations and solving equations.
Arithmetic reasoning is a core reasoning task. The standard benchmarks used to evaluate models on it are listed below, along with current state-of-the-art results.
Benchmarks & SOTA
MAWPS
Math Word Problem Repository
3,320 arithmetic word problems from various sources, testing basic arithmetic reasoning.
State of the Art: GPT-4o (OpenAI), accuracy 97.2
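The accuracy figure reported for benchmarks like this is typically the share of problems whose final numeric answer matches the reference. The sketch below shows one way such a score could be computed. The helper names (`extract_number`, `accuracy`), the "last number in the output" extraction rule, and the tolerance are assumptions made for illustration, not the scoring procedure behind the number above.

```python
import re

def extract_number(text):
    """Return the last number found in the text as a float, or None.

    Assumption: the model's final answer is the last number it prints.
    Commas are dropped so that "3,320" parses as 3320.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def accuracy(predictions, gold_answers, tol=1e-6):
    """Percentage of predictions whose extracted number matches the gold answer."""
    correct = 0
    for pred, gold in zip(predictions, gold_answers):
        value = extract_number(pred)
        if value is not None and abs(value - gold) <= tol:
            correct += 1
    return 100.0 * correct / len(gold_answers)

# Hypothetical model outputs and reference answers, for illustration only.
predictions = ["The answer is 13.", "8 - 5 = 3", "She has 7 apples", "I am not sure"]
gold_answers = [13, 3, 6, 4]
print(accuracy(predictions, gold_answers))  # 2 of 4 correct -> 50.0
```

Real evaluation harnesses differ in how they extract and normalize answers, so scores computed this way may not be directly comparable to published results.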
SVAMP
Simple Variations on Arithmetic Math Word Problems
1,000 elementary-level math word problems testing robustness of arithmetic reasoning.
State of the Art: GPT-4o (OpenAI), accuracy 93.7
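To illustrate what "robustness" means here, the sketch below runs a deliberately shallow baseline that ignores the question and adds up every number in the text. It answers a simple problem correctly but fails once the question is varied while the numbers stay the same. The problem pair and the `add_all_numbers` helper are hypothetical, written for illustration and not taken from SVAMP.

```python
import re

def add_all_numbers(problem):
    """Shallow baseline: ignore the question and sum every number in the text."""
    return sum(float(n) for n in re.findall(r"\d+(?:\.\d+)?", problem))

# Hypothetical problem pair (text, gold answer); same numbers, different question.
original = (
    "Jack has 8 pens and Mary has 5 pens. How many pens do they have together?",
    13.0,
)
variation = (
    "Jack has 8 pens and Mary has 5 pens. How many more pens does Jack have than Mary?",
    3.0,
)

for text, gold in (original, variation):
    # Prints True for the original and False for the variation.
    print(add_all_numbers(text) == gold)
```

A solver that scores well on the original phrasing but drops on such variations is relying on surface patterns rather than on the arithmetic the question asks for.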
Related Tasks
Mathematical Reasoning
Solving math word problems (GSM8K, MATH, Minerva).
Commonsense Reasoning
Reasoning about everyday situations (CommonsenseQA, HellaSwag).
Logical Reasoning
Solving logic puzzles and deductive problems.
Multi-step Reasoning
Complex reasoning requiring multiple inference steps (HotpotQA).