Codesota · OCR · Benchmark · OmniDocBench
62 scored runs · 47 distinct models · Updated 2026-05-21
§ 00 · Opening

OmniDocBench leaderboard: PDF parsing SOTA.

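Direct answer: OmniDocBench is a Shanghai AI Laboratory benchmark for end-to-end PDF document parsing. In CodeSOTA's current registry, GLM-OCR holds the highest verified composite score at 94.62 (OmniDocBench v1.5); two vendor-reported v1.6 results, PaddleOCR-VL-1.6 at 96.33 and MinerU2.5-Pro at 95.69, rank above it but are not marked verified. The benchmark grades text extraction, table structure, formula recognition and layout on the same page.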

§ 01 · Leaderboard · Composite score

Composite score, ranked.

The headline OmniDocBench score: ((1 − TextEditDist)·100 + TableTEDS + FormulaCDM) / 3. (higher is better)

| # | Model | Composite score | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | PaddleOCR-VL-1.6 | 96.33 | | vendor model card | OmniDocBench v1.6, vendor self-reported (arXiv 2606.03264) |
| 02 | MinerU2.5-Pro | 95.69 | | paper | OmniDocBench v1.6, vendor (arXiv 2604.04771) |
| 03 | GLM-OCR | 94.62 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; Overall 94.62 on OmniDocBench v1.5 from GLM-OCR arXiv Table 4 (Table 3 rounds to 94.6); metric: Accuracy; PWC evaluation id 4956; paper: GLM-OCR Technical Report |
| 04 | PaddleOCR-VL-1.5 | 94.50 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 05 | Qianfan-OCR | 93.12 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; End-to-end document parsing on OmniDocBench v1.5; reported Overall score 93.12 (TextEdit 0.041, FormulaCDM 92.43, TableTEDs 91.02, TableTEDss 93.85, R-orderEdit 0.049). Image-to-Markdown, no external pipeline; PWC evaluation id 1195; paper: Qianfan-OCR: A Unified End-to-End Model for Document Intelligence |
| 06 | firered-ocr-2b | 92.94 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; Overall 92.94 on OmniDocBench v1.5 from the FireRed-OCR Hugging Face card, GitHub README, and arXiv paper Tables 1-2; metric: Accuracy; PWC evaluation id 4955; paper: FireRed-OCR Technical Report |
| 07 | paddleocr-vl | 92.86 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 08 | paddleocr-vl-0.9b | 92.56 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 09 | paddleocr-vl | 92.56 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; PWC evaluation id 92; paper: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model |
| 10 | deepseek-ocr-2 | 91.09 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; OmniDocBench v1.5 Overall 91.09 from DeepSeek-OCR 2 paper Table 1 main results, V-token max=1120; metric: Accuracy; PWC evaluation id 4957; paper: DeepSeek-OCR 2: Visual Causal Flow |
| 11 | qwen3-5-397b-a17b | 90.80 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; Official Qwen3.5 blog (https://qwen.ai/blog?id=qwen3.5). Vision table row OmniDocBench1.5; linked to OCR task; PWC evaluation id 1059; paper: Qwen3.5: Towards Native Multimodal Agents |
| 12 | mineru-2.5 | 90.67 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; PWC evaluation id 87; paper: MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing |
| 13 | Gemini 3 Pro | 90.33 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 14 | Dolphin-v2 | 89.78 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 15 | qwen3-vl-235b | 89.15 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 16 | MonkeyOCR-pro-3B | 88.85 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; PWC evaluation id 88; paper: MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm |
| 17 | Kimi K2.5 | 88.80 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; OmniDocBench v1.5 score = (1 - normalized Levenshtein distance) x 100; max 64k tokens; avg@3; Thinking mode; PWC evaluation id 1253; paper: Kimi K2.5: Visual Agentic Intelligence |
| 18 | falcon-ocr | 88.64 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; OmniDocBench Overall score (Accuracy %) reported by tiiuae/Falcon-OCR model card (arXiv:2603.27365, Apr 2026), aggregating three sub-metrics: Edit Distance 0.055 (lower is better), CDM 86.8 (formula recognition), TEDS 84.6 (table structure). Inference via Layout + OCR two-stage pipeline (PP-DocLayoutV3 layout detection + Falcon-OCR category-prompted VLM). Mapped to OmniDocBench v1.5 leaderboard based on overall accuracy magnitude and comparator scores; PWC evaluation id 1157; paper: Falcon Perception |
| 19 | ocrverse-4b | 88.56 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 20 | dots-ocr-3b | 88.41 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 21 | gemini-25-pro | 88.03 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 22 | MonkeyOCR-3B | 87.13 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; PWC evaluation id 89; paper: MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm |
| 23 | qwen25-vl | 87.02 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 24 | MonkeyOCR-pro-1.2B | 86.96 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; PWC evaluation id 90; paper: MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm |
| 25 | PP-StructureV3 | 86.73 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 26 | DeepSeek-OCR | 86.46 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 27 | Nanonets-OCR-s | 85.59 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 28 | MinerU2-VLM | 85.56 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 29 | Dolphin-1.5 | 85.06 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 30 | InternVL3.5-241B | 82.67 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 31 | olmOCR-7B | 81.79 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 32 | olmocr | 81.79 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.5 Accuracy; PWC evaluation id 91; paper: olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models |
| 33 | POINTS-Reader | 80.98 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 34 | InternVL3-76B | 80.33 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 35 | mistral-ocr-3 | 79.75 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 36 | mistral-ocr-2512 | 79.75 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 37 | MinerU2-pipeline | 75.51 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 38 | GPT-4o | 75.02 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 39 | OCRFlux-3B | 74.82 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 40 | Dolphin | 74.67 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 41 | Marker 1.8.2 | 71.30 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 42 | clearocr-teamquest | 31.70 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 42 results on Composite score. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 02 · Leaderboard · Table TEDS

Table TEDS, ranked.

Tree-Edit-Distance-based Similarity (TEDS) on table structure recognition. (higher is better)

| # | Model | Table TEDS | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | paddleocr-vl | 93.52 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 02 | Qianfan-OCR | 91.02 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 03 | mistral-ocr-3 | 70.88 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 04 | clearocr-teamquest | 0.800 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 4 results on Table TEDS. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 03 · Leaderboard · Formula CDM

Formula CDM, ranked.

Character-level match score on formula recognition. (higher is better)

| # | Model | Formula CDM | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | Qianfan-OCR | 92.43 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 1 result on Formula CDM. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 04 · Leaderboard · Layout mAP

Layout mAP, ranked.

Mean Average Precision on layout detection. (higher is better)

| # | Model | Layout mAP | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | mineru-2.5 | 97.50 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 1 result on Layout mAP. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 05 · Leaderboard · Reading-order score

Reading-order score, ranked.

Agreement with ground-truth reading order of page regions. (higher is better)

| # | Model | Reading-order score | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | mistral-ocr-3 | 91.63 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 02 | clearocr-teamquest | 86.04 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 2 results on Reading-order score. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 06 · Leaderboard · Text edit distance

Text edit distance, ranked.

Normalised edit distance on full-page text extraction. (lower is better)

| # | Model | Text edit distance | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | mistral-ocr-3 | 0.099 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 02 | minicpm-o-4-5-instruct | 0.109 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better; Instruct mode from the openbmb/MiniCPM-o-4_5 Hugging Face model card (https://huggingface.co/openbmb/MiniCPM-o-4_5); 9B params; results reported in instruct mode/variant; from the 'OmniDocBench' table; metric label in card: OverallEdit↓, EN subset (state-of-the-art for end-to-end English document parsing claim); PWC evaluation id 1178; paper: MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction |
| 03 | deepseek-ocr-gundam-m | 0.123 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better; PWC evaluation id 681; paper: DeepSeek-OCR: Contexts Optical Compression |
| 04 | dots-ocr | 0.125 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better; OmniDocBench v1.0 EN Overall Edit 0.125 from the dots.ocr paper, Hugging Face card, and GitHub blog end-to-end table; metric: Edit Distance (lower is better); PWC evaluation id 4967; paper: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model |
| 05 | PP-StructureV3 | 0.145 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better; Table 1 of the PaddleOCR 3.0 Technical Report reports OmniDocBench document parsing edit distance for PP-StructureV3: EN 0.145 and ZH 0.206. Imported the English split as the leaderboard score following existing OmniDocBench v1.0 convention; Chinese split is recorded here but not imported because there is no separate local split leaderboard. Metric: Edit Distance (lower is better); PWC evaluation id 5637; paper: PaddleOCR 3.0 Technical Report |
| 06 | clearocr-teamquest | 0.154 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 07 | Qwen2.5-VL 72B | 0.226 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better; Table 5, OmniDocBench English edit distance from the reported en/zh pair 0.226/0.324; Chinese split not imported because there is no separate local leaderboard. Source: Qwen2.5-VL Technical Report (arXiv:2502.13923). Model: Qwen2.5-VL-72B; PWC evaluation id 5017; paper: Qwen2.5-VL Technical Report |
| 08 | lightonocr-1b-1025 | 0.234 | yes | paperswithcode-public-api | Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better; OmniDocBench v1.0 EN Overall Edit 0.234 from LightOnOCR arXiv Appendix C.2 / Table 6; metric: Edit Distance (lower is better); PWC evaluation id 4968; paper: LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR |

Fig · 8 results on Text edit distance. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 07 · Leaderboard · OCR edit distance

OCR edit distance, ranked.

Character-level edit distance for raw OCR. (lower is better)

| # | Model | OCR edit distance | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | gpt-4o | 0.020 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 1 result on OCR edit distance. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 08 · Leaderboard · Formula edit distance

Formula edit distance, ranked.

Edit distance on formula recognition. (lower is better)

| # | Model | Formula edit distance | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | mistral-ocr-3 | 0.218 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |
| 02 | clearocr-teamquest | 0.902 | yes | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 2 results on Formula edit distance. Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ 09 · Leaderboard · Text edit (vendor variant)

Text edit (vendor variant), ranked.

Edit-distance variant reported by some vendor submissions. (lower is better)

| # | Model | Text edit (vendor variant) | Verified | Source | Notes |
|---|---|---|---|---|---|
| 01 | Qianfan-OCR | 0.041 | | codesota-api | Fetched from CodeSOTA API on 2026-04-20 |

Fig · 1 result on Text edit (vendor variant). Rows sourced from benchmarks.json; shaded row marks current SOTA.
§ What it measures

Composite = text + table + formula.

The headline OmniDocBench score is a composite defined by its authors as ((1 − TextEditDist) · 100 + TableTEDS + FormulaCDM) / 3. Each component tracks a different axis of parsing accuracy, so a model that aces tables but mangles formulas cannot win on the headline.
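As a worked check of the formula, the sketch below recomputes the composite from the three component values quoted in the Qianfan-OCR row note (TextEdit 0.041, TableTEDs 91.02, FormulaCDM 92.43). The function name `composite_score` is illustrative and not part of any OmniDocBench or CodeSOTA tooling; it assumes the edit distance is on a 0-1 scale and the other two components on a 0-100 scale, as the formula implies.

```python
def composite_score(text_edit_dist: float, table_teds: float, formula_cdm: float) -> float:
    """Composite = ((1 - TextEditDist) * 100 + TableTEDS + FormulaCDM) / 3.

    Assumes text_edit_dist is in [0, 1] and the other two are on a 0-100 scale.
    """
    return ((1.0 - text_edit_dist) * 100.0 + table_teds + formula_cdm) / 3.0


# Component values quoted in the Qianfan-OCR row note on this page.
qianfan = composite_score(text_edit_dist=0.041, table_teds=91.02, formula_cdm=92.43)
print(round(qianfan, 2))  # 93.12
```

The result agrees with the 93.12 composite listed for Qianfan-OCR in the table, which suggests that row was computed with the same three-way average.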

Secondary metrics tracked here: table-teds (Tree-Edit-Distance-based Similarity on table recognition), formula-cdm (formula character-level match), layout-map (mean average precision on layout detection), and text-edit-distance / ocr-edit-distance (edit distances, where lower is better). Every reported score is preserved verbatim from the submission.
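For readers unfamiliar with the edit-distance metrics, here is a minimal sketch of a normalised Levenshtein distance between predicted and ground-truth text. It is an illustration, not the benchmark's scorer: dividing by the longer string's length is an assumed normalisation, and the official evaluation may preprocess and match text differently before scoring. Both function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, truth: str) -> float:
    """Edit distance divided by the longer length (assumed convention); 0.0 is a perfect match."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 0.0
    return levenshtein(pred, truth) / longest


print(levenshtein("kitten", "sitting"))  # 3
```

Under this convention a score of 0.099 would mean roughly one character in ten needs editing, which is why these columns are ranked lowest-first.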

§ Dataset details

A cross-section of real-world documents.

OmniDocBench was released by Shanghai AI Laboratory as a comprehensive benchmark for evaluating PDF document parsing across diverse document types with multi-level annotations. It is the reference benchmark that most open-source document parsers and vendor OCR APIs now report on.

The upstream leaderboard and test split live on alphaXiv. Per-row sources on the tables above link back to the submitting paper or vendor statement.

§ How scores are verified

Reported, then reproduced.

Every row above is imported from the canonical benchmarks.json. Open-weight models are re-executed against the OmniDocBench test split through the CodeSOTA harness; closed APIs are run through the vendor endpoint with the model version and access date recorded. Rows marked “verified” have been independently reproduced, not taken from a press release.
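The sketch below shows one way a reader might rank rows and pick the best verified entry for a metric, respecting whether higher or lower is better. The `Row` type and its fields are hypothetical; the actual schema of benchmarks.json is not shown on this page. The sample values are copied from the Table TEDS and Text edit distance tables above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    # Hypothetical shape; the real benchmarks.json schema may differ.
    model: str
    score: float
    verified: bool


def rank(rows: List[Row], higher_is_better: bool) -> List[Row]:
    """Sort rows best-first according to the metric's direction."""
    return sorted(rows, key=lambda r: r.score, reverse=higher_is_better)


def best_verified(rows: List[Row], higher_is_better: bool) -> Optional[Row]:
    """Return the top-ranked row marked verified, or None if there is none."""
    for row in rank(rows, higher_is_better):
        if row.verified:
            return row
    return None


# Values copied from the Table TEDS table (higher is better).
table_teds = [
    Row("paddleocr-vl", 93.52, False),
    Row("Qianfan-OCR", 91.02, False),
    Row("mistral-ocr-3", 70.88, True),
]
# Values copied from the Text edit distance table (lower is better).
text_edit = [
    Row("minicpm-o-4-5-instruct", 0.109, True),
    Row("mistral-ocr-3", 0.099, True),
    Row("clearocr-teamquest", 0.154, True),
]

print(rank(table_teds, higher_is_better=True)[0].model)         # paddleocr-vl
print(best_verified(table_teds, higher_is_better=True).model)   # mistral-ocr-3
print(best_verified(text_edit, higher_is_better=False).model)   # mistral-ocr-3
```

Separating the top row from the top verified row matters here because several leading entries are vendor-reported and not marked verified.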

For the full reproduction policy, see the Codesota methodology.

§ Final · Related OCR benchmarks

Cross-links, sibling leaderboards.