MMLU

15,908 multiple-choice questions across 57 subjects, ranging from elementary to professional level.

Benchmark Stats

Models: 6
Papers: 6
Metrics: 1

accuracy

Higher is better

Rank  Model             Code  Score  Paper / Source
1     o1-preview        -     92.3   openai-blog
2     gpt-4o            -     88.7   openai-blog
3     claude-35-sonnet  -     88.7   anthropic-blog
4     deepseek-v3       -     88.5   deepseek-blog
5     gemini-15-pro     -     85.9   google-blog
6     llama-3-70b       HF    82     meta-blog

MMLU: Massive Multitask Language Understanding. 57 subjects.
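
The scores in the leaderboard are accuracy percentages. As a minimal sketch, assuming accuracy here means the plain share of multiple-choice questions answered correctly (the listed sources may compute it differently, for example by averaging per subject or using different prompting setups), it could be calculated like this. The function name `accuracy` and the toy data are hypothetical and only for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of predictions that match the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count exact matches between each predicted choice and the correct choice.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical toy data: four questions with answer choices A-D.
answers = ["A", "C", "B", "D"]
predictions = ["A", "C", "D", "D"]
print(accuracy(predictions, answers))  # 75.0
```

Under this assumption, a model that matched the answer key on three of four questions would score 75.0, on the same 0 to 100 scale as the table.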