MMLU

15,908 multiple-choice questions across 57 subjects, ranging from elementary to professional level.

Benchmark Stats

Models: 18
Papers: 18
Metrics: 1

accuracy

Higher is better

Rank  Model            Code  Score  Paper / Source
1     o3               -     92.9   openai-simple-evals
2     o1               -     91.8   openai-simple-evals
3     gpt-45-preview   -     90.8   openai-simple-evals
4     o1-preview       -     90.8   openai-simple-evals
5     gpt-41           -     90.2   openai-simple-evals
6     o4-mini          -     90     openai-simple-evals
7     llama-31-405b    -     88.6   openai-simple-evals
8     deepseek-v3      -     88.5   openai-simple-evals
9     claude-35-sonnet -     88.3   openai-simple-evals
10    grok-2           -     87.5   openai-simple-evals
11    gpt-4o           -     87.2   openai-simple-evals
12    claude-3-opus    -     86.8   openai-simple-evals
13    gpt-4-turbo      -     86.7   openai-simple-evals
14    o3-mini          -     85.9   openai-simple-evals
15    gemini-15-pro    -     85.9   openai-simple-evals
16    o1-mini          -     85.2   openai-simple-evals
17    gpt-4o-mini      -     82     openai-simple-evals
18    llama-31-70b     -     82     openai-simple-evals
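
The scores listed above are accuracy values. As a minimal sketch, assuming accuracy here means the percentage of multiple-choice questions answered correctly, the metric could be computed as follows. The function name `accuracy_percent` and the toy answer lists are hypothetical and are not taken from any evaluation harness.

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Count exact matches between the predicted and reference choice letters.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical toy data: four questions with answer choices A-D.
predicted = ["A", "C", "B", "D"]
reference = ["A", "C", "D", "D"]
print(accuracy_percent(predicted, reference))  # 3 of 4 correct -> 75.0
```

Real evaluation setups may differ in prompting, answer extraction, and whether scores are averaged per question or per subject, so figures from different sources are not always directly comparable.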