OmniDocBench leaderboard: PDF parsing SOTA.
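Direct answer: OmniDocBench is a Shanghai AI Laboratory benchmark for end-to-end PDF document parsing. In CodeSOTA's current registry, GLM-OCR is the highest verified entry on the composite score at 94.62 (OmniDocBench v1.5). Two unverified, vendor-reported OmniDocBench v1.6 results rank above it: PaddleOCR-VL-1.6 at 96.33 and MinerU2.5-Pro at 95.69. The benchmark grades text extraction, table structure, formula recognition and layout on the same page.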
Composite score, ranked.
The headline OmniDocBench score: ((1 − TextEditDist)·100 + TableTEDS + FormulaCDM) / 3. (higher is better)
| # | Model | Composite score | Verified | Source |
|---|---|---|---|---|
| 01 | PaddleOCR-VL-1.6 OmniDocBench v1.6, vendor self-reported (arXiv 2606.03264) | 96.33 | — | vendor model card |
| 02 | MinerU2.5-Pro OmniDocBench v1.6, vendor (arXiv 2604.04771) | 95.69 | — | paper |
| 03 | GLM-OCR Mapped from PWC OmniDocBench v1.5 Accuracy.; Overall 94.62 on OmniDocBench v1.5 from GLM-OCR arXiv Table 4 (Table 3 rounds to 94.6); metric: Accuracy.; PWC evaluation id 4956; paper: GLM-OCR Technical Report | 94.62 | yes | paperswithcode-public-api |
| 04 | PaddleOCR-VL-1.5 Fetched from CodeSOTA API on 2026-04-20 | 94.50 | — | codesota-api |
| 05 | Qianfan-OCR Mapped from PWC OmniDocBench v1.5 Accuracy.; End-to-end document parsing on OmniDocBench v1.5; reported Overall score 93.12 (TextEdit 0.041, FormulaCDM 92.43, TableTEDs 91.02, TableTEDss 93.85, R-orderEdit 0.049). Image-to-Markdown, no external pipeline.; PWC evaluation id 1195; paper: Qianfan-OCR: A Unified End-to-End Model for Document Intelligence | 93.12 | yes | paperswithcode-public-api |
| 06 | firered-ocr-2b Mapped from PWC OmniDocBench v1.5 Accuracy.; Overall 92.94 on OmniDocBench v1.5 from the FireRed-OCR Hugging Face card, GitHub README, and arXiv paper Tables 1-2; metric: Accuracy.; PWC evaluation id 4955; paper: FireRed-OCR Technical Report | 92.94 | yes | paperswithcode-public-api |
| 07 | paddleocr-vl Fetched from CodeSOTA API on 2026-04-20 | 92.86 | — | codesota-api |
| 08 | paddleocr-vl-0.9b Fetched from CodeSOTA API on 2026-04-20 | 92.56 | — | codesota-api |
| 09 | paddleocr-vl Mapped from PWC OmniDocBench v1.5 Accuracy.; PWC evaluation id 92; paper: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model | 92.56 | yes | paperswithcode-public-api |
| 10 | deepseek-ocr-2 Mapped from PWC OmniDocBench v1.5 Accuracy.; OmniDocBench v1.5 Overall 91.09 from DeepSeek-OCR 2 paper Table 1 main results, V-token max=1120; metric: Accuracy.; PWC evaluation id 4957; paper: DeepSeek-OCR 2: Visual Causal Flow | 91.09 | yes | paperswithcode-public-api |
| 11 | qwen3-5-397b-a17b Mapped from PWC OmniDocBench v1.5 Accuracy.; Official Qwen3.5 blog (https://qwen.ai/blog?id=qwen3.5). Vision table row OmniDocBench1.5; linked to OCR task.; PWC evaluation id 1059; paper: Qwen3.5: Towards Native Multimodal Agents | 90.80 | yes | paperswithcode-public-api |
| 12 | mineru-2.5 Mapped from PWC OmniDocBench v1.5 Accuracy.; PWC evaluation id 87; paper: MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing | 90.67 | yes | paperswithcode-public-api |
| 13 | Gemini 3 Pro Fetched from CodeSOTA API on 2026-04-20 | 90.33 | — | codesota-api |
| 14 | Dolphin-v2 Fetched from CodeSOTA API on 2026-04-20 | 89.78 | — | codesota-api |
| 15 | qwen3-vl-235b Fetched from CodeSOTA API on 2026-04-20 | 89.15 | — | codesota-api |
| 16 | MonkeyOCR-pro-3B Mapped from PWC OmniDocBench v1.5 Accuracy.; PWC evaluation id 88; paper: MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | 88.85 | yes | paperswithcode-public-api |
| 17 | Kimi K2.5 Mapped from PWC OmniDocBench v1.5 Accuracy.; OmniDocBench v1.5 score = (1 - normalized Levenshtein distance) x 100; max 64k tokens; avg@3; Thinking mode.; PWC evaluation id 1253; paper: Kimi K2.5: Visual Agentic Intelligence | 88.80 | yes | paperswithcode-public-api |
| 18 | falcon-ocr Mapped from PWC OmniDocBench v1.5 Accuracy.; OmniDocBench Overall score (Accuracy %) reported by tiiuae/Falcon-OCR model card (arXiv:2603.27365, Apr 2026), aggregating three sub-metrics: Edit Distance 0.055 (lower is better), CDM 86.8 (formula recognition), TEDS 84.6 (table structure). Inference via Layout + OCR two-stage pipeline (PP-DocLayoutV3 layout detection + Falcon-OCR category-prompted VLM). Mapped to OmniDocBench v1.5 leaderboard based on overall accuracy magnitude and comparator scores.; PWC evaluation id 1157; paper: Falcon Perception | 88.64 | yes | paperswithcode-public-api |
| 19 | ocrverse-4b Fetched from CodeSOTA API on 2026-04-20 | 88.56 | — | codesota-api |
| 20 | dots-ocr-3b Fetched from CodeSOTA API on 2026-04-20 | 88.41 | — | codesota-api |
| 21 | gemini-25-pro Fetched from CodeSOTA API on 2026-04-20 | 88.03 | — | codesota-api |
| 22 | MonkeyOCR-3B Mapped from PWC OmniDocBench v1.5 Accuracy.; PWC evaluation id 89; paper: MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | 87.13 | yes | paperswithcode-public-api |
| 23 | qwen25-vl Fetched from CodeSOTA API on 2026-04-20 | 87.02 | — | codesota-api |
| 24 | MonkeyOCR-pro-1.2B Mapped from PWC OmniDocBench v1.5 Accuracy.; PWC evaluation id 90; paper: MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | 86.96 | yes | paperswithcode-public-api |
| 25 | PP-StructureV3 Fetched from CodeSOTA API on 2026-04-20 | 86.73 | — | codesota-api |
| 26 | DeepSeek-OCR Fetched from CodeSOTA API on 2026-04-20 | 86.46 | — | codesota-api |
| 27 | Nanonets-OCR-s Fetched from CodeSOTA API on 2026-04-20 | 85.59 | — | codesota-api |
| 28 | MinerU2-VLM Fetched from CodeSOTA API on 2026-04-20 | 85.56 | — | codesota-api |
| 29 | Dolphin-1.5 Fetched from CodeSOTA API on 2026-04-20 | 85.06 | — | codesota-api |
| 30 | InternVL3.5-241B Fetched from CodeSOTA API on 2026-04-20 | 82.67 | — | codesota-api |
| 31 | olmOCR-7B Fetched from CodeSOTA API on 2026-04-20 | 81.79 | — | codesota-api |
| 32 | olmocr Mapped from PWC OmniDocBench v1.5 Accuracy.; PWC evaluation id 91; paper: olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | 81.79 | yes | paperswithcode-public-api |
| 33 | POINTS-Reader Fetched from CodeSOTA API on 2026-04-20 | 80.98 | — | codesota-api |
| 34 | InternVL3-76B Fetched from CodeSOTA API on 2026-04-20 | 80.33 | — | codesota-api |
| 35 | mistral-ocr-3 Fetched from CodeSOTA API on 2026-04-20 | 79.75 | yes | codesota-api |
| 36 | mistral-ocr-2512 Fetched from CodeSOTA API on 2026-04-20 | 79.75 | yes | codesota-api |
| 37 | MinerU2-pipeline Fetched from CodeSOTA API on 2026-04-20 | 75.51 | — | codesota-api |
| 38 | GPT-4o Fetched from CodeSOTA API on 2026-04-20 | 75.02 | — | codesota-api |
| 39 | OCRFlux-3B Fetched from CodeSOTA API on 2026-04-20 | 74.82 | — | codesota-api |
| 40 | Dolphin Fetched from CodeSOTA API on 2026-04-20 | 74.67 | — | codesota-api |
| 41 | Marker 1.8.2 Fetched from CodeSOTA API on 2026-04-20 | 71.30 | — | codesota-api |
| 42 | clearocr-teamquest Fetched from CodeSOTA API on 2026-04-20 | 31.70 | yes | codesota-api |
Table TEDS, ranked.
Tree-Edit-Distance-based Similarity (TEDS) on table structure recognition. (higher is better)
| # | Model | Table TEDS | Verified | Source |
|---|---|---|---|---|
| 01 | paddleocr-vl Fetched from CodeSOTA API on 2026-04-20 | 93.52 | — | codesota-api |
| 02 | Qianfan-OCR Fetched from CodeSOTA API on 2026-04-20 | 91.02 | — | codesota-api |
| 03 | mistral-ocr-3 Fetched from CodeSOTA API on 2026-04-20 | 70.88 | yes | codesota-api |
| 04 | clearocr-teamquest Fetched from CodeSOTA API on 2026-04-20 | 0.800 | yes | codesota-api |
Formula CDM, ranked.
Character Detection Matching (CDM) score on formula recognition. (higher is better)
| # | Model | Formula CDM | Verified | Source |
|---|---|---|---|---|
| 01 | Qianfan-OCR Fetched from CodeSOTA API on 2026-04-20 | 92.43 | — | codesota-api |
Layout mAP, ranked.
Mean Average Precision on layout detection. (higher is better)
| # | Model | Layout mAP | Verified | Source |
|---|---|---|---|---|
| 01 | mineru-2.5 Fetched from CodeSOTA API on 2026-04-20 | 97.50 | — | codesota-api |
Reading-order score, ranked.
Agreement with ground-truth reading order of page regions. (higher is better)
| # | Model | Reading-order score | Verified | Source |
|---|---|---|---|---|
| 01 | mistral-ocr-3 Fetched from CodeSOTA API on 2026-04-20 | 91.63 | yes | codesota-api |
| 02 | clearocr-teamquest Fetched from CodeSOTA API on 2026-04-20 | 86.04 | yes | codesota-api |
Text edit distance, ranked.
Normalised edit distance on full-page text extraction. (lower is better)
| # | Model | Text edit distance | Verified | Source |
|---|---|---|---|---|
| 01 | mistral-ocr-3 Fetched from CodeSOTA API on 2026-04-20 | 0.099 | yes | codesota-api |
| 02 | minicpm-o-4-5-instruct Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better.; Instruct mode from the openbmb/MiniCPM-o-4_5 Hugging Face model card (https://huggingface.co/openbmb/MiniCPM-o-4_5); 9B params; results reported in instruct mode/variant; from the 'OmniDocBench' table; metric label in card: OverallEdit↓, EN subset (state-of-the-art for end-to-end English document parsing claim).; PWC evaluation id 1178; paper: MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction | 0.109 | yes | paperswithcode-public-api |
| 03 | deepseek-ocr-gundam-m Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better.; PWC evaluation id 681; paper: DeepSeek-OCR: Contexts Optical Compression | 0.123 | yes | paperswithcode-public-api |
| 04 | dots-ocr Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better.; OmniDocBench v1.0 EN Overall Edit 0.125 from the dots.ocr paper, Hugging Face card, and GitHub blog end-to-end table; metric: Edit Distance (lower is better).; PWC evaluation id 4967; paper: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model | 0.125 | yes | paperswithcode-public-api |
| 05 | PP-StructureV3 Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better.; Table 1 of the PaddleOCR 3.0 Technical Report reports OmniDocBench document parsing edit distance for PP-StructureV3: EN 0.145 and ZH 0.206. Imported the English split as the leaderboard score following existing OmniDocBench v1.0 convention; Chinese split is recorded here but not imported because there is no separate local split leaderboard. Metric: Edit Distance (lower is better).; PWC evaluation id 5637; paper: PaddleOCR 3.0 Technical Report | 0.145 | yes | paperswithcode-public-api |
| 06 | clearocr-teamquest Fetched from CodeSOTA API on 2026-04-20 | 0.154 | yes | codesota-api |
| 07 | Qwen2.5-VL 72B Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better.; Table 5, OmniDocBench English edit distance from the reported en/zh pair 0.226/0.324; Chinese split not imported because there is no separate local leaderboard. Source: Qwen2.5-VL Technical Report (arXiv:2502.13923). Model: Qwen2.5-VL-72B.; PWC evaluation id 5017; paper: Qwen2.5-VL Technical Report | 0.226 | yes | paperswithcode-public-api |
| 08 | lightonocr-1b-1025 Mapped from PWC OmniDocBench v1.0 Edit Distance; lower is better.; OmniDocBench v1.0 EN Overall Edit 0.234 from LightOnOCR arXiv Appendix C.2 / Table 6; metric: Edit Distance (lower is better).; PWC evaluation id 4968; paper: LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR | 0.234 | yes | paperswithcode-public-api |
OCR edit distance, ranked.
Character-level edit distance for raw OCR. (lower is better)
| # | Model | OCR edit distance | Verified | Source |
|---|---|---|---|---|
| 01 | gpt-4o Fetched from CodeSOTA API on 2026-04-20 | 0.020 | — | codesota-api |
Formula edit distance, ranked.
Edit distance on formula recognition. (lower is better)
| # | Model | Formula edit distance | Verified | Source |
|---|---|---|---|---|
| 01 | mistral-ocr-3 Fetched from CodeSOTA API on 2026-04-20 | 0.218 | yes | codesota-api |
| 02 | clearocr-teamquest Fetched from CodeSOTA API on 2026-04-20 | 0.902 | yes | codesota-api |
Text edit (vendor variant), ranked.
Edit-distance variant reported by some vendor submissions. (lower is better)
| # | Model | Text edit (vendor variant) | Verified | Source |
|---|---|---|---|---|
| 01 | Qianfan-OCR Fetched from CodeSOTA API on 2026-04-20 | 0.041 | — | codesota-api |
Composite = text + table + formula.
The headline OmniDocBench score is a composite defined by its authors as ((1 − TextEditDist) · 100 + TableTEDS + FormulaCDM) / 3. Each component tracks a different axis of parsing accuracy, so a model that aces tables but mangles formulas cannot win on the headline.
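As a worked check of the formula, the sketch below recomputes the composite from its three components. The function name `composite_score` is an illustrative assumption and not part of any OmniDocBench tooling. The inputs are the Qianfan-OCR component values quoted in the composite table (TextEdit 0.041, TableTEDS 91.02, FormulaCDM 92.43).

```python
def composite_score(text_edit_dist: float, table_teds: float, formula_cdm: float) -> float:
    """Composite = ((1 - TextEditDist) * 100 + TableTEDS + FormulaCDM) / 3.

    text_edit_dist is on a 0-1 scale (lower is better);
    table_teds and formula_cdm are on a 0-100 scale (higher is better).
    """
    # Flip the edit distance into a 0-100 "higher is better" score first.
    text_score = (1.0 - text_edit_dist) * 100.0
    return (text_score + table_teds + formula_cdm) / 3.0


# Component values quoted for Qianfan-OCR in the composite table.
qianfan = composite_score(text_edit_dist=0.041, table_teds=91.02, formula_cdm=92.43)
print(f"{qianfan:.2f}")  # prints 93.12
```

The result agrees with the 93.12 overall score listed for Qianfan-OCR. Because the three terms are weighted equally, a ten-point drop in any one component costs about 3.3 points on the headline.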
Secondary metrics tracked here: table-teds (tree-edit-distance-based similarity on table recognition), formula-cdm (character-level match on formula recognition), layout-map (mean average precision on layout detection), and text-edit-distance / ocr-edit-distance. Higher is better for the first three; lower is better for the edit distances. Every reported score is preserved verbatim from the submission.
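To make the edit-distance metrics concrete, here is a minimal sketch of a normalised Levenshtein distance between a predicted and a reference string. Dividing by the length of the longer string is an assumption made for illustration; the official OmniDocBench evaluation code may normalise or pre-process text differently, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance divided by the longer string's length (assumed normalisation)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(prediction, reference) / longest


print(round(normalized_edit_distance("kitten", "sitting"), 3))  # prints 0.429
```

A value of 0 means the prediction matches the reference exactly and 1 means nothing lines up, which is why these columns are ranked lowest first.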
A cross-section of real-world document types.
OmniDocBench was released by Shanghai AI Laboratory as a comprehensive benchmark for evaluating PDF document parsing across diverse document types with multi-level annotations. It is the reference benchmark that most open-source document parsers and vendor OCR APIs now report on.
The upstream leaderboard and test split live on alphaXiv. Per-row sources on the tables above link back to the submitting paper or vendor statement.
Reported, then reproduced.
Every row above is imported from the canonical benchmarks.json. Open-weight models are re-executed against the OmniDocBench test split through the CodeSOTA harness; closed APIs are run through the vendor endpoint with the model version and access date recorded. Rows marked “verified” have been independently reproduced, not taken from a press release.
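The distinction between reported and verified rows changes who tops a ranking. The sketch below shows one way to rank rows with and without the verified filter. The `Row` record and `rank` helper are hypothetical: the real layout of benchmarks.json is not shown on this page, so the field names here are assumptions. The sample values are copied from the composite table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    # Hypothetical record shape; not the actual benchmarks.json schema.
    model: str
    score: float
    verified: bool


def rank(rows, higher_is_better=True, verified_only=False):
    """Sort rows best-first, optionally keeping only verified entries."""
    kept = [r for r in rows if r.verified or not verified_only]
    return sorted(kept, key=lambda r: r.score, reverse=higher_is_better)


rows = [
    Row("Qianfan-OCR", 93.12, True),
    Row("PaddleOCR-VL-1.6", 96.33, False),
    Row("GLM-OCR", 94.62, True),
    Row("MinerU2.5-Pro", 95.69, False),
]

print(rank(rows)[0].model)                      # prints PaddleOCR-VL-1.6
print(rank(rows, verified_only=True)[0].model)  # prints GLM-OCR
```

Passing `higher_is_better=False` gives the ordering used for the edit-distance tables, where the smallest value ranks first.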
For the full reproduction policy, see the CodeSOTA methodology.