Model card
GPT-4o mini
OpenAI · commercial · Multimodal LLM
Smaller, faster GPT-4o class model. Released July 2024.
§ 01 · Benchmarks
Every benchmark with a recorded score for GPT-4o mini.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | KITAB-Bench | Computer Vision · Optical Character Recognition | cer | 0.4% | #4 | — | source ↗ |
| 02 | IAM | Computer Vision · Handwriting Recognition | wer | 3.3% | #10 | — | source ↗ |
| 03 | OCRBench v2 | Computer Vision · General OCR Capabilities | overall-en-private | 44.1% | #16 | 2024-07-18 | source ↗ |
| 04 | IAM | Computer Vision · Handwriting Recognition | cer | 1.7% | #21 | — | source ↗ |
| 05 | HumanEval | Computer Code · Code Generation | pass@1 | 87.2% | #27 | — | source ↗ |
| 06 | MATH | Reasoning · Mathematical Reasoning | accuracy | 70.2% | #31 | — | source ↗ |
| 07 | GPQA | Reasoning · Multi-step Reasoning | accuracy | 40.2% | #33 | — | source ↗ |
| 08 | MMLU | Reasoning · Commonsense Reasoning | accuracy | 82.0% | #40 | — | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric. #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
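The CER and WER rows above are edit-distance error rates: total character (or word) edits needed to turn the model's output into the reference, divided by the reference length. A minimal sketch of how such rates are typically computed (not the evaluation harness behind these specific numbers):

```python
def edit_distance(ref, hyp):
    # Levenshtein distance via dynamic programming, one row at a time.
    n = len(hyp)
    dp = list(range(n + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,       # deletion
                        dp[j - 1] + 1,   # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[n]

def cer(ref, hyp):
    # Character error rate: character edits / reference length.
    return edit_distance(ref, hyp) / len(ref)

def wer(ref, hyp):
    # Word error rate: same idea over word tokens.
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())
```

Note that both rates can exceed 100% when the hypothesis is much longer than the reference.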
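The HumanEval row reports pass@1, the fraction of problems solved by a sampled completion. It is commonly computed with the unbiased estimator from the HumanEval paper; a sketch, assuming n completions per problem of which c pass the tests:

```python
from math import comb

def pass_at_k(n, c, k):
    # Unbiased pass@k estimate: probability that at least one of k
    # completions drawn without replacement from n passes, given that
    # c of the n passed. Equals 1 - C(n-c, k) / C(n, k).
    if n - c < k:
        return 1.0  # too few failures to fill k draws: guaranteed hit
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With k=1 this reduces to c/n, i.e. the plain pass rate of a single sample.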
§ 02 · Strengths by area
Where GPT-4o mini performs best, by task area.
§ 04 · Related models
Other OpenAI models scored on Codesota.
§ 05 · Sources & freshness
Where these numbers come from.
- openai-simple-evals — 4 results
- arxiv — 2 results
- alphaxiv-leaderboard — 1 result
- ocrbench-v2-leaderboard — 1 result
2 of 8 rows marked verified.