
OCR Arena: Speed vs Quality

Human preference rankings from head-to-head battles. Lower latency + higher ELO = better.

[Scatter chart: Latency per Page (Speed) on the x-axis vs. ELO Score (Quality) on the y-axis, one point per model, colored Open Source vs. Closed/API. Per-model ELO and latency values are listed in the Full Rankings table below.]
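Because the chart plots a quality/speed trade-off, one way to read it is as a Pareto frontier: a model is worth a closer look only if no other model is both higher-rated and faster. The snippet below is purely illustrative and not part of OCR Arena; the `models` list is copied from the Full Rankings table, and the filtering logic is a minimal sketch of that idea.

```python
# Illustrative helper (not part of OCR Arena): keep only the models that are
# Pareto-optimal on the speed/quality trade-off, i.e. models for which no
# other model has both a higher ELO and a lower latency.

# (model, ELO, latency in seconds) -- values copied from the Full Rankings table.
models = [
    ("Gemini 3 Preview", 1688, 39.2),
    ("Opus 4.5 (Low)", 1647, 18.5),
    ("Gemini 2.5 Pro", 1645, 46.6),
    ("Opus 4.5 (Medium)", 1618, 18.9),
    ("GPT-5.2 (Medium)", 1595, 35.9),
    ("GPT-5.1 (Medium)", 1574, 18.8),
    ("Sonnet 4.5", 1571, 21.0),
    ("Gemini 2.5 Flash", 1549, 14.5),
    ("GPT-5.2 (None)", 1538, 15.0),
    ("GPT-5.1 (Low)", 1527, 8.7),
    ("GPT-5 (Low)", 1467, 15.8),
    ("GPT-5 (Medium)", 1466, 35.0),
    ("Iris", 1465, 9.8),
    ("Qwen3-VL-8B", 1446, 7.2),
    ("dots.ocr", 1438, 3.6),
    ("Nanonets2-3B", 1376, 4.9),
    ("olmOCR 2", 1324, 12.7),
    ("DeepSeek OCR", 1302, 3.5),
]

# A model survives if no other model beats it on both axes at once.
pareto = [
    (name, elo, lat)
    for name, elo, lat in models
    if not any(other_elo > elo and other_lat < lat
               for _, other_elo, other_lat in models)
]

for name, elo, lat in pareto:
    print(f"{name}: ELO {elo}, {lat}s")
```

Running this keeps Gemini 3 Preview, Opus 4.5 (Low), Gemini 2.5 Flash, GPT-5.1 (Low), Qwen3-VL-8B, dots.ocr, and DeepSeek OCR, which is consistent with the picks called out in the Key Insights below.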

Key Insights

Best Quality
Gemini 3 Preview
ELO 1688 | 39.2s

Highest accuracy but slowest. Best for batch processing.

Best Balance
Opus 4.5 (Low)
ELO 1647 | 18.5s

Excellent quality at reasonable speed. Good default choice.

Best Open Source
Qwen3-VL-8B
ELO 1446 | 7.2s

Best quality among OSS. Chandra fine-tunes this model.

Fastest
DeepSeek OCR
ELO 1302 | 3.5s

Fastest model, but lower accuracy. Good for high-volume workloads.

Full Rankings

| Rank | Model | Type | ELO | Win Rate | Latency | Battles |
| --- | --- | --- | --- | --- | --- | --- |
| #1 | Gemini 3 Preview | API | 1688 | 72.2% | 39.2s | 1,609 |
| #2 | Opus 4.5 (Low) | API | 1647 | 67.7% | 18.5s | 959 |
| #3 | Gemini 2.5 Pro | API | 1645 | 72.1% | 46.6s | 1,588 |
| #4 | Opus 4.5 (Medium) | API | 1618 | 69.7% | 18.9s | 890 |
| #5 | GPT-5.2 (Medium) | API | 1595 | 67.2% | 35.9s | 137 |
| #6 | GPT-5.1 (Medium) | API | 1574 | 60.3% | 18.8s | 1,589 |
| #7 | Sonnet 4.5 | API | 1571 | 49% | 21s | 989 |
| #8 | Gemini 2.5 Flash | API | 1549 | 56.7% | 14.5s | 1,674 |
| #9 | GPT-5.2 (None) | API | 1538 | 62.2% | 15s | 148 |
| #10 | GPT-5.1 (Low) | API | 1527 | 55.9% | 8.7s | 1,683 |
| #11 | GPT-5 (Low) | API | 1467 | 44.4% | 15.8s | 1,587 |
| #12 | GPT-5 (Medium) | API | 1466 | 46.1% | 35s | 1,587 |
| #13 | Iris | OSS | 1465 | 36.8% | 9.8s | 163 |
| #14 | Qwen3-VL-8B | OSS | 1446 | 40.8% | 7.2s | 1,338 |
| #15 | dots.ocr | OSS | 1438 | 36.5% | 3.6s | 1,371 |
| #16 | Nanonets2-3B | OSS | 1376 | 34.1% | 4.9s | 943 |
| #17 | olmOCR 2 | OSS | 1324 | 29.1% | 12.7s | 1,639 |
| #18 | DeepSeek OCR | OSS | 1302 | 19.9% | 3.5s | 1,598 |

About OCR Arena

OCR Arena ranks models by human preference through head-to-head battles: users compare OCR outputs from two anonymous models and select the better result. ELO scores are calculated from these battle outcomes, much like chess ratings.
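For readers unfamiliar with ELO, the sketch below shows how a single battle might update two models' ratings, assuming the standard chess-style formula (a logistic expected score with a fixed K-factor). OCR Arena's actual K-factor and update schedule are not documented here, so treat the numbers as illustrative.

```python
# Standard ELO update for one head-to-head battle (illustrative only;
# OCR Arena's actual K-factor and update rule may differ).

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the ELO model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the two new ratings after A and B are compared once."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Example with two ratings from the table: Gemini 3 Preview (1688)
# vs DeepSeek OCR (1302). The higher-rated model is expected to win
# roughly 90% of battles, so an actual win barely moves either rating.
print(expected_score(1688, 1302))          # ~0.90
print(update_elo(1688, 1302, a_won=True))  # ~ (1691.1, 1298.9)
```

An upset (the lower-rated model winning) shifts both ratings much more, which is how repeated battles converge on the scores shown in the rankings above.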

Note: Latency measurements are from the Arena API, not local inference. Self-hosted open source models can be significantly faster on dedicated hardware.