OCR Arena: Speed vs Quality
Human preference rankings from head-to-head battles. Lower latency + higher ELO = better.
[Scatter plot: ELO Score (Quality), roughly 1300–1700, vs. Latency per Page (Speed), 0–50s, with each model plotted as a point and colored Open Source vs. Closed/API. Per-model values are listed in the Full Rankings table below.]
Key Insights
- **Best Quality: Gemini 3 Preview** (ELO 1688, 39.2s). Highest accuracy but slowest; best suited to batch processing.
- **Best Balance: Opus 4.5 (Low)** (ELO 1647, 18.5s). Excellent quality at reasonable speed; a good default choice.
- **Best Open Source: Qwen3-VL-8B** (ELO 1446, 7.2s). Highest ELO among OSS models with a substantial battle count (Iris edges it out, but on only 163 battles); Chandra fine-tunes this model.
- **Fastest: DeepSeek OCR** (ELO 1302, 3.5s). Fastest model but lower accuracy; good for high-volume workloads.
Full Rankings
| Rank | Model | Type | ELO | Win Rate | Latency | Battles |
|---|---|---|---|---|---|---|
| #1 | Gemini 3 Preview | API | 1688 | 72.2% | 39.2s | 1,609 |
| #2 | Opus 4.5 (Low) | API | 1647 | 67.7% | 18.5s | 959 |
| #3 | Gemini 2.5 Pro | API | 1645 | 72.1% | 46.6s | 1,588 |
| #4 | Opus 4.5 (Medium) | API | 1618 | 69.7% | 18.9s | 890 |
| #5 | GPT-5.2 (Medium) | API | 1595 | 67.2% | 35.9s | 137 |
| #6 | GPT-5.1 (Medium) | API | 1574 | 60.3% | 18.8s | 1,589 |
| #7 | Sonnet 4.5 | API | 1571 | 49.0% | 21.0s | 989 |
| #8 | Gemini 2.5 Flash | API | 1549 | 56.7% | 14.5s | 1,674 |
| #9 | GPT-5.2 (None) | API | 1538 | 62.2% | 15.0s | 148 |
| #10 | GPT-5.1 (Low) | API | 1527 | 55.9% | 8.7s | 1,683 |
| #11 | GPT-5 (Low) | API | 1467 | 44.4% | 15.8s | 1,587 |
| #12 | GPT-5 (Medium) | API | 1466 | 46.1% | 35.0s | 1,587 |
| #13 | Iris | OSS | 1465 | 36.8% | 9.8s | 163 |
| #14 | Qwen3-VL-8B | OSS | 1446 | 40.8% | 7.2s | 1,338 |
| #15 | dots.ocr | OSS | 1438 | 36.5% | 3.6s | 1,371 |
| #16 | Nanonets2-3B | OSS | 1376 | 34.1% | 4.9s | 943 |
| #17 | olmOCR 2 | OSS | 1324 | 29.1% | 12.7s | 1,639 |
| #18 | DeepSeek OCR | OSS | 1302 | 19.9% | 3.5s | 1,598 |
About OCR Arena
OCR Arena uses human preference rankings gathered through head-to-head battles: users compare OCR outputs from two anonymous models and select the better result. ELO scores are calculated from these battles, similar to chess rankings.
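To illustrate how ratings like these evolve, here is a minimal sketch of a standard Elo update applied to a single battle. The K-factor of 32 and the 1500 starting rating are illustrative assumptions, not the Arena's actual parameters.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one head-to-head battle.

    k is the update step size (assumed value; the Arena's actual K-factor
    is not stated here).
    """
    expected_a = expected_score(rating_a, rating_b)
    expected_b = 1.0 - expected_a
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - expected_b)
    return new_a, new_b


# Example: both models start at an assumed baseline of 1500; model A wins once.
a, b = update_elo(1500.0, 1500.0, a_won=True)
print(round(a, 1), round(b, 1))  # 1516.0 1484.0
```

Repeating this update over thousands of battles is what separates the models in the table above; a win against a strong opponent moves a rating more than a win against a weak one.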
Note: Latency measurements are from the Arena API, not local inference. Self-hosted open source models can be significantly faster on dedicated hardware.