Multi-step Reasoning | 2021 | en

StrategyQA

2,780 yes/no questions requiring implicit multi-step reasoning to answer.

Metrics: accuracy
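As a rough illustration of the metric, accuracy on yes/no questions is the share of predictions that match the gold answers. The sketch below assumes scores are reported as percentages and uses a hypothetical helper name (`accuracy`) and made-up toy labels; it is not the benchmark's official scoring script.

```python
def accuracy(predictions, gold):
    """Percentage of yes/no predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set")
    # Count positions where the predicted answer equals the gold answer.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical toy labels (True = yes, False = no), not taken from the dataset.
gold = [True, False, True, True]
predictions = [True, False, False, True]
print(accuracy(predictions, gold))  # 75.0
```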
Current State of the Art

GPT-4o (OpenAI): 82.1 accuracy

Primary metric: accuracy

#  Model                               Score  Paper / Code  Date
1  GPT-4o (API), OpenAI                82.1   -             Dec 2025
2  Claude 3.5 Sonnet (API), Anthropic  79.8   -             Dec 2025

StrategyQA Benchmark - Multi-step Reasoning | CodeSOTA