Codesota · Benchmark · olmOCR-Bench
Allen Institute for AI

olmOCR-Bench.

7,010 unit tests across 1,402 PDF documents. Tests parsing of tables, math, multi-column layouts, old scans, and more.
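With 7,010 tests over 1,402 documents, the benchmark averages five unit tests per document (7,010 / 1,402 = 5). The sketch below shows one plausible way per-category pass rates like the ones on this page could be computed from individual test outcomes. The function names, the `(category, passed)` input shape, and the unweighted mean used for the overall score are all assumptions for illustration; the official olmOCR-Bench aggregation may differ.

```python
from collections import defaultdict


def pass_rates(results):
    """Return {category: percent of unit tests passed}.

    `results` is an iterable of (category, passed) pairs, one per unit test.
    This input shape is hypothetical, not the benchmark's own format.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for category, ok in results:
        total[category] += 1
        passed[category] += bool(ok)
    return {c: 100.0 * passed[c] / total[c] for c in total}


def overall_score(per_category):
    """Unweighted mean of the per-category rates (an assumed aggregation)."""
    return sum(per_category.values()) / len(per_category)


# Toy data, invented for illustration only.
results = [
    ("tables", True), ("tables", False), ("tables", True), ("tables", True),
    ("base", True), ("base", True),
]
rates = pass_rates(results)
print(rates)                 # per-category pass rates in percent
print(overall_score(rates))  # mean across categories
```

An unweighted mean keeps a category with few tests from being drowned out by a category with many, which is one reason a leaderboard might prefer it to a single pooled pass rate.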

§ 02 · Leaderboard

Results by metric.

Base

Base is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Base: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | Chandra v0.1.0 | Base clean document parsing. Near-perfect | unverified | 99.9 | 2025 |
| 02 | chandra-ocr-0.1.0 | Base clean document parsing. Near-perfect | paper | 99.9 | 2025 |
| 03 | olmOCR v0.4.0 | olmOCR 2. Sub-category: base clean documents. | paper | 99.7 | 2025 |
| 04 | olmocr-v0.4.0 | olmOCR 2. Base clean documents sub-category. | paper | 99.7 | 2025 |
| 05 | LightOnOCR-2-1B | LightOnOCR-2-1B. Base clean documents sub-category. | paper | 99.6 | 2026 |
| 06 | Qianfan-OCR | Qianfan-OCR. Base clean documents sub-category. | paper | 99.6 | 2026 |

Headers Footers

Headers Footers is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Headers Footers: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | olmOCR v0.4.0 | olmOCR 2. Sub-category: headers/footers. | paper | 96.1 | 2025 |
| 02 | olmocr-v0.4.0 | olmOCR 2. Headers/footers sub-category. | paper | 96.1 | 2025 |
| 03 | olmOCR v0.3.0 | #1 on headers/footers extraction | unverified | 95.1 | 2025 |
| 04 | olmocr-v0.3.0 | #1 on headers/footers extraction | paper | 95.1 | 2025 |
| 05 | chandra-ocr-0.1.0 | Header/footer extraction | paper | 90.8 | 2025 |
| 06 | Chandra v0.1.0 | Header/footer extraction | unverified | 90.8 | 2025 |

Long Tiny Text

Long Tiny Text is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Long Tiny Text: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | Chandra v0.1.0 | Long documents with tiny text. #1 in category | unverified | 92.3 | 2025 |
| 02 | chandra-ocr-0.1.0 | Long documents with tiny text. #1 in category | paper | 92.3 | 2025 |
| 03 | LightOnOCR-2-1B | LightOnOCR-2-1B. Long tiny text sub-category. | paper | 91.4 | 2026 |
| 04 | olmocr-v0.4.0 | olmOCR 2. Long tiny text sub-category. | paper | 81.9 | 2025 |
| 05 | olmOCR v0.4.0 | olmOCR 2. Sub-category: long tiny text. | paper | 81.9 | 2025 |
| 06 | Qianfan-OCR | Qianfan-OCR. Long tiny text sub-category. | paper | 80.4 | 2026 |

Multi Column

Multi Column is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Multi Column: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | Qianfan-OCR | Qianfan-OCR. Multi-column layout sub-category. | paper | 92.2 | 2026 |
| 02 | LightOnOCR-2-1B | LightOnOCR-2-1B. Multi-column layout sub-category. | paper | 84.8 | 2026 |
| 03 | olmocr-v0.4.0 | olmOCR 2. Multi-column layout sub-category. | paper | 83.7 | 2025 |
| 04 | olmOCR v0.4.0 | olmOCR 2. Sub-category: multi-column layout. | paper | 83.7 | 2025 |
| 05 | Chandra v0.1.0 | Multi-column document parsing | unverified | 81.2 | 2025 |
| 06 | chandra-ocr-0.1.0 | Multi-column document parsing | paper | 81.2 | 2025 |

Arxiv

Arxiv is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Arxiv: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | LightOnOCR-2-1B | LightOnOCR-2-1B. ArXiv math documents sub-category. | paper | 89.6 | 2026 |
| 02 | marker-1.10.0 | #1 on ArXiv paper parsing | paper | 83.8 | 2025 |
| 03 | Marker 1.10.0 | #1 on ArXiv paper parsing | unverified | 83.8 | 2025 |
| 04 | olmOCR v0.4.0 | olmOCR 2 (arxiv:2510.19817). Sub-category: ArXiv math documents. | paper | 83 | 2025 |
| 05 | olmocr-v0.4.0 | olmOCR 2 (arxiv:2510.19817). ArXiv math documents sub-category. | paper | 83 | 2025 |
| 06 | chandra-ocr-0.1.0 | ArXiv paper parsing. Marker leads (83.8) | paper | 82.2 | 2025 |
| 07 | Chandra v0.1.0 | ArXiv paper parsing. Marker leads (83.8) | unverified | 82.2 | 2025 |
| 08 | Qianfan-OCR | Qianfan-OCR. ArXiv math documents sub-category. | paper | 80.1 | 2026 |

Tables

Tables is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Tables: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | LightOnOCR-2-1B | LightOnOCR-2-1B. Table recognition sub-category. | paper | 89 | 2026 |
| 02 | dots.ocr 3B | #1 on table recognition | unverified | 88.3 | 2025 |
| 03 | dots-ocr-3b | #1 on table recognition | paper | 88.3 | 2025 |
| 04 | Chandra v0.1.0 | Table recognition category. Near-best (dots.ocr: 88.3) | unverified | 88 | 2025 |
| 05 | chandra-ocr-0.1.0 | Table recognition category. Near-best (dots.ocr: 88.3) | paper | 88 | 2025 |
| 06 | olmocr-v0.4.0 | olmOCR 2. Table recognition sub-category. | paper | 84.9 | 2025 |
| 07 | olmOCR v0.4.0 | olmOCR 2. Sub-category: table recognition. | paper | 84.9 | 2025 |
| 08 | Qianfan-OCR | Qianfan-OCR. Table recognition sub-category. | paper | 81.6 | 2026 |

Accuracy

Accuracy is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Accuracy: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | Infinity-Parser2-Pro | unverified | 87.6 | 2026 | Paper |
| 02 | Chandra 2 | unverified | 85.9 | 2026 | Paper, Code |
| 03 | dots.mocr | unverified | 83.9 | 2026 | Paper, Code |
| 04 | LightOnOCR-2-1B | unverified | 83.2 | 2026 | Paper, Source |
| 05 | Chandra | unverified | 83.1 | 2025 | Paper |
| 06 | Infinity-Parser 7B | unverified | 82.5 | 2025 | Paper, Code |
| 07 | olmOCR-2-7B-1025 (7B) | unverified | 82.4 | 2025 | Paper |
| 08 | Falcon-OCR | unverified | 80.3 | 2026 | Paper, Code |
| 09 | PaddleOCR-VL | unverified | 80 | 2025 | Paper, Code |
| 10 | Qianfan-OCR | unverified | 79.8 | 2026 | Paper, Code |
| 11 | dots.ocr | unverified | 79.1 | 2025 | Paper, Code |
| 12 | MinerU2.5 | unverified | 77.5 | 2025 | Paper, Code |
| 13 | DeepSeek-OCR-2 | unverified | 76.3 | 2026 | Paper, Code |
| 14 | LightOnOCR-1B-1025 | unverified | 76.1 | 2026 | Paper |

Old Scans Math

Old Scans Math is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Old Scans Math: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | LightOnOCR-2-1B | LightOnOCR-2-1B. Old scans with math sub-category. | paper | 85.6 | 2026 |
| 02 | olmocr-v0.4.0 | olmOCR 2. Old scans with math sub-category. | paper | 82.3 | 2025 |
| 03 | olmOCR v0.4.0 | olmOCR 2. Sub-category: old scans with math. | paper | 82.3 | 2025 |
| 04 | chandra-ocr-0.1.0 | Mathematical notation in old scans. #1, leads by 5.4 points | paper | 80.3 | 2025 |
| 05 | Chandra v0.1.0 | Mathematical notation in old scans. #1, leads by 5.4 points | unverified | 80.3 | 2025 |
| 06 | olmocr-v0.3.0 | #2 on math in old scans | paper | 79.9 | 2025 |
| 07 | olmOCR v0.3.0 | #2 on math in old scans | unverified | 79.9 | 2025 |

Pass Rate

Pass Rate is the reported evaluation metric for olmOCR-Bench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Pass Rate: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | dots.mocr | dots.mocr (arxiv:2603.13032). Current SOTA on olmOCR-Bench (83.9 ± 0.9). 3B multimodal model. | unverified | 83.9 | 2026 |
| 02 | LightOnOCR-2-1B | LightOnOCR-2-1B. SOTA at publication (Jan 2026). 9x smaller than Chandra-9B, 3.3x faster. | paper | 83.2 | 2026 |
| 03 | chandra-ocr-0.1.0 | 7,010 unit tests across 1,402 PDF documents. #1 overall on olmOCR-Bench. | paper | 83.1 | 2025 |
| 04 | Chandra v0.1.0 | 7,010 unit tests across 1,402 PDF documents. #1 overall on olmOCR-Bench. | unverified | 83.1 | 2025 |
| 05 | infinity-parser-7b | | paper | 82.5 | 2025 |
| 06 | Infinity-Parser 7B | | unverified | 82.5 | 2025 |
| 07 | olmOCR v0.4.0 | | unverified | 82.4 | 2025 |
| 08 | olmocr-v0.4.0 | | paper | 82.4 | 2025 |
| 09 | paddleocr-vl | | paper | 80 | 2025 |
| 10 | Qianfan-OCR | Baidu Qianfan-OCR 4B (Qwen3-4B + Qianfan-ViT), Apache 2.0, 192 langs. Layout-as-Thought. | paper | 79.8 | 2026 |
| 11 | Qwen3-VL-4B | Qwen3-VL-4B score from Qianfan-OCR Table 3. General-purpose VLM baseline. | paper | 79.2 | 2026 |
| 12 | PaddleOCR-VL-1.5 | PaddleOCR-VL-1.5 (arxiv:2601.21957). Score from Qianfan-OCR comparison table (Table 3). 0.9B params. | paper | 79.1 | 2026 |
| 13 | dots.ocr 3B | | unverified | 79.1 | 2025 |
| 14 | dots-ocr-3b | | paper | 79.1 | 2025 |
| 15 | mistral-ocr-3 | Estimated based on 74% win rate vs OCR 2 | paper | 78 | 2025 |
| 16 | Mistral OCR 3 | Estimated based on 74% win rate vs OCR 2 | unverified | 78 | 2025 |
| 17 | Marker 1.10.0 | | unverified | 76.5 | 2025 |
| 18 | marker-1.10.0 | | paper | 76.5 | 2025 |
| 19 | marker-1.10.1 | | paper | 76.1 | 2025 |
| 20 | Marker 1.10.1 | | unverified | 76.1 | 2025 |

Lineage

olmOCR-Bench in context.

This benchmark (1): olmOCR-Bench · active · 2025-03

§ 04 · Submit a result

Add to the leaderboard.