olmOCR-Bench

Allen Institute for AI

PDF content extraction benchmark with 7,010 unit tests across 1,402 PDF documents.

Total Results: 28
Models Tested: 16
Metrics: 9
Last Updated: 2025-12-21

Pass Rate

Percentage of unit tests passed

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | 83.1% | alphaxiv-leaderboard | 7,010 unit tests across 1,402 PDF documents. #1 overall on olmOCR-Bench. |
| 2 | infinity-parser-7b | 82.5% | alphaxiv-leaderboard | |
| 3 | olmocr-v0.4.0 | 82.4% | alphaxiv-leaderboard | |
| 4 | paddleocr-vl | 80% | alphaxiv-leaderboard | |
| 5 | dots-ocr-3b | 79.1% | github-readme | |
| 6 | mistral-ocr-3 | 78% | mistral-announcement | Estimated based on 74% win rate vs OCR 2 |
| 7 | marker-1.10.0 | 76.5% | github-readme | |
| 8 | marker-1.10.1 | 76.1% | alphaxiv-leaderboard | |
| 9 | deepseek-ocr | 75.7% | alphaxiv-leaderboard | |
| 10 | deepseek-ocr | 75.4% | github-readme | Chandra outperforms by 7.7 points |
| 11 | mineru-2.5 | 75.2% | alphaxiv-leaderboard | |
| 12 | mistral-ocr-api | 72% | alphaxiv-leaderboard | |
| 13 | gpt-4o-anchored | 69.9% | github-readme | GPT-4o with anchored prompting |
| 14 | nanonets-ocr2-3b | 69.5% | alphaxiv-leaderboard | |
| 15 | gemini-flash-2 | 63.8% | github-readme | |
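
The point gaps quoted in the notes (for example, "Chandra outperforms by 7.7 points") are plain differences between overall pass rates. A minimal sketch, using a few scores copied from the Pass Rate leaderboard above; the function name `gap_to_leader` and the dictionary labels are illustrative, not part of any benchmark tooling:

```python
# Overall pass rates (percent) copied from the Pass Rate leaderboard.
# The dictionary keys are illustrative labels, not official identifiers.
scores = {
    "chandra-ocr-0.1.0": 83.1,
    "infinity-parser-7b": 82.5,
    "olmocr-v0.4.0": 82.4,
    "deepseek-ocr (github-readme)": 75.4,
}


def gap_to_leader(scores: dict[str, float]) -> dict[str, float]:
    """Return each model's distance, in percentage points, from the top score."""
    leader = max(scores.values())
    # Round to one decimal to hide binary floating-point noise.
    return {model: round(leader - score, 1) for model, score in scores.items()}


gaps = gap_to_leader(scores)
for model, gap in gaps.items():
    print(f"{model}: {gap} points behind the leader")
```

With these inputs, the gap for the github-readme deepseek-ocr entry comes out to 7.7 points, which agrees with the note on that row.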

tables

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | dots-ocr-3b | 88.3 | github-readme | #1 on table recognition |
| 2 | chandra-ocr-0.1.0 | 88 | github-readme | Table recognition category. Near-best (dots.ocr: 88.3) |

old-scans-math

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | 80.3 | github-readme | Mathematical notation in old scans. #1, leads olmocr-v0.3.0 by 0.4 points |
| 2 | olmocr-v0.3.0 | 79.9 | github-readme | #2 on math in old scans |

long-tiny-text

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | 92.3 | github-readme | Long documents with tiny text. #1 in category |

base

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | 99.9 | github-readme | Base clean document parsing. Near-perfect |

headers-footers

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | olmocr-v0.3.0 | 95.1 | github-readme | #1 on headers/footers extraction |
| 2 | chandra-ocr-0.1.0 | 90.8 | github-readme | Header/footer extraction |

multi-column

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | 81.2 | github-readme | Multi-column document parsing |

arxiv

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | marker-1.10.0 | 83.8 | github-readme | #1 on ArXiv paper parsing |
| 2 | chandra-ocr-0.1.0 | 82.2 | github-readme | ArXiv paper parsing. Marker leads (83.8) |

old-scans

Higher is better

| Rank | Model | Score | Source | Notes |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | 50.4 | github-readme | Old scan recognition. #1 (GPT-4o: 40.7) |
| 2 | gpt-4o | 40.7 | github-readme | #2 on old scans. Chandra leads by 9.7 points |
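
The page does not say how the overall Pass Rate is aggregated from the category metrics. As a sanity check under that caveat, the sketch below takes the unweighted mean of the eight category scores listed above for chandra-ocr-0.1.0. The result rounds to 83.1, the same as its overall score, which is consistent with a simple average over categories. This is an observation from the listed numbers, not a documented scoring rule.

```python
from statistics import mean

# Category scores for chandra-ocr-0.1.0, copied from the category tables above.
chandra_category_scores = {
    "tables": 88.0,
    "old-scans-math": 80.3,
    "long-tiny-text": 92.3,
    "base": 99.9,
    "headers-footers": 90.8,
    "multi-column": 81.2,
    "arxiv": 82.2,
    "old-scans": 50.4,
}

# Unweighted (macro) average across the eight categories.
# Assumption: every category counts equally, regardless of how many
# unit tests it contains. The page itself does not confirm this.
macro_average = mean(list(chandra_category_scores.values()))

print(f"Macro average: {macro_average:.1f}")
```

If the benchmark instead weighted categories by their number of unit tests, the overall figure would generally differ from this average, so the match is suggestive rather than conclusive.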
