MMLU
Total Results: 6
Models Tested: 6
Metrics: 1
Last Updated: 2025-12-19
Metric: accuracy (higher is better)

Massive Multitask Language Understanding, covering 57 subjects.

| Rank | Model | Score | Source |
|---|---|---|---|
| 1 | o1-preview | 92.3 | openai-blog |
| 2 | gpt-4o | 88.7 | openai-blog |
| 3 | claude-35-sonnet | 88.7 | anthropic-blog |
| 4 | deepseek-v3 | 88.5 | deepseek-blog |
| 5 | gemini-15-pro | 85.9 | google-blog |
| 6 | llama-3-70b | 82 | meta-blog |