Codesota · OCR · Vol. II
The most commercially relevant benchmark category on the site
Issue: April 22, 2026
§ 00 · Opening

OCR, measured honestly.
Every model. Every benchmark. Dated.

Document OCR is the modality where a wrong score costs money — where a vendor’s overclaim ends in a re-tendered contract, and where an open-source release can collapse a line item by two orders of magnitude overnight.

163 models on 99 benchmarks, 1008 scored runs. Every number is drawn from benchmarks.json and scored through lib/scoring; nothing is interpolated, every score dated.

§ 01 · Benchmark surface

Open-source OCR, ranked.

Models you can pull, fine-tune and self-host. The shaded row is the current state of the art on OmniDocBench; the Δ column reports change against the previous submission from the same organisation, when one exists.


Models: 110
Benchmarks: OmniDoc · OCRBench · olmOCR
Source: benchmarks.json
Open source · 110 models
Shaded row marks SOTA · sorted by OmniDoc composite
# | Model | Vendor | License | OmniDoc | OCRBench EN | olmOCR
01 | PaddleOCR-VL | Baidu | Apache 2.0 | 92.86 | | 80.0
02 | PaddleOCR-VL 0.9B | Baidu | Apache 2.0 | 92.56 | |
03 | MinerU 2.5 | OpenDataLab | AGPL-3.0 | 90.67 | | 75.2
04 | Qwen3-VL-235B | Alibaba | Qwen License | 89.15 | |
05 | MonkeyOCR-pro-3B | Unknown | Apache 2.0 / MIT | 88.85 | |
06 | OCRVerse 4B | Unknown | Apache 2.0 / MIT | 88.56 | |
07 | dots.ocr 3B | RedNote HILab | Apache 2.0 | 88.41 | | 79.1
08 | Qwen2.5-VL | Alibaba | Apache 2.0 | 87.02 | |
09 | Chandra v0.1.0 | datalab-to | Apache 2.0 | | | 83.1
10 | Infinity-Parser 7B | Unknown | Apache 2.0 / MIT | | | 82.5
11 | olmOCR v0.4.0 | Allen AI | Apache 2.0 | | | 82.4
12 | Marker 1.10.0 | VikParuchuri | Apache 2.0 / MIT | | | 76.5
13 | Marker 1.10.1 | VikParuchuri | Apache 2.0 / MIT | | | 76.1
14 | DeepSeek OCR | DeepSeek | Apache 2.0 / MIT | | | 75.7
15 | GPT-4o (Anchored) | OpenAI | Apache 2.0 / MIT | | | 69.9
16 | Nanonets OCR2 3B | Nanonets | Apache 2.0 / MIT | | | 69.5
17 | Gemini Flash 2 | Google | Apache 2.0 / MIT | | | 63.8
18 | Qwen3-Omni-30B | Alibaba | Qwen License | | 61.3% |
19 | Nemotron Nano V2 VL | NVIDIA | NVIDIA Open Model License | | 61.2% |
20 | GPT-4o Mini | OpenAI | Apache 2.0 / MIT | | 44.1% |
21 | CoCa (finetuned) | Google | Apache 2.0 | | |
22 | ViT-G/14 | Google | Apache 2.0 | | |
23 | ViT-H/14 | Google | Apache 2.0 | | |
24 | ViT-L/16 | Google | Apache 2.0 | | |
25 | ViT-B/16 | Google | Apache 2.0 | | |
26 | ConvNeXt V2 Huge | Meta | MIT | | |
27 | ConvNeXt V2 Base | Meta | MIT | | |
28 | ConvNeXt V2 Tiny | Meta | MIT | | |
29 | Swin Transformer V2 Large | Microsoft | MIT | | |
30 | Swin Transformer Large | Microsoft | MIT | | |
31 | EfficientNetV2-L | Google | Apache 2.0 | | |
32 | EfficientNet-B7 | Google | Apache 2.0 | | |
33 | EfficientNet-B0 | Google | Apache 2.0 | | |
34 | DeiT-B Distilled | Meta | Apache 2.0 | | |
35 | DeiT-B | Meta | Apache 2.0 | | |
36 | ResNet-152 | Microsoft | MIT | | |
37 | ResNet-50 | Microsoft | MIT | | |
38 | ResNet-50 (A3 training) | Timm | Apache 2.0 | | |
39 | Qwen2.5-VL 72B | Alibaba | Apache 2.0 | | |
40 | InternVL2-76B | Shanghai AI Lab | MIT | | |
41 | InternVL3-78B | Shanghai AI Lab | Apache 2.0 / MIT | | |
42 | Tesseract | Google (Open Source) | Apache 2.0 | | |
43 | EasyOCR | JaidedAI | Apache 2.0 | | |
44 | olmOCR v0.3.0 | Allen AI | Apache 2.0 / MIT | | |
45 | Qwen2-VL 72B | Alibaba | Apache 2.0 / MIT | | |
46 | Qwen2.5-VL 32B | Alibaba | Apache 2.0 / MIT | | |
47 | AIN 7B | Research | Apache 2.0 / MIT | | |
48 | Azure OCR | Microsoft | Apache 2.0 / MIT | | |
49 | PaddleOCR | Baidu | Apache 2.0 / MIT | | |
50 | InternVL3 14B | OpenGVLab | Apache 2.0 / MIT | | |
51 | o1-preview | OpenAI | Apache 2.0 / MIT | | |
52 | Llama 3 70B | Meta | Apache 2.0 / MIT | | |
53 | DeepSeek V3 | DeepSeek | Apache 2.0 / MIT | | |
54 | Claude 3.5 Opus | Anthropic | Apache 2.0 / MIT | | |
55 | AL-Negat | Research | Apache 2.0 / MIT | | |
56 | GCN | Research | Apache 2.0 / MIT | | |
57 | Multi-Task Transformer | Research | Apache 2.0 / MIT | | |
58 | Deep Learning (Heinsfeld) | Research | Apache 2.0 / MIT | | |
59 | PHGCL-DDGFormer | Research | Apache 2.0 / MIT | | |
60 | Random Forest | Baseline | Apache 2.0 / MIT | | |
61 | MAACNN | Research | Apache 2.0 / MIT | | |
62 | Multi-Atlas DNN | Research | Apache 2.0 / MIT | | |
63 | Abraham Connectomes | Research | Apache 2.0 / MIT | | |
64 | Go-Explore | Uber AI | Apache 2.0 / MIT | | |
65 | BrainGNN | Research | MIT | | |
66 | MVS-GCN | Research | Apache 2.0 / MIT | | |
67 | BrainGT | Research | Apache 2.0 / MIT | | |
68 | SVM with Connectivity Features | Research | Apache 2.0 / MIT | | |
69 | AE-FCN | Research | Apache 2.0 / MIT | | |
70 | DeepASD | Research | Apache 2.0 / MIT | | |
71 | MCBERT | Research | Apache 2.0 / MIT | | |
72 | ASD-SWNet | Research | Apache 2.0 / MIT | | |
73 | Agent57 | DeepMind | Apache 2.0 / MIT | | |
74 | MuZero | DeepMind | Apache 2.0 / MIT | | |
75 | DreamerV3 | DeepMind | Apache 2.0 / MIT | | |
76 | Rainbow DQN | DeepMind | Apache 2.0 / MIT | | |
77 | DQN (Human-level) | DeepMind | Apache 2.0 / MIT | | |
78 | Human Professional | Biology | Apache 2.0 / MIT | | |
79 | BBOS-1 | Unknown | Apache 2.0 / MIT | | |
80 | GDI-H3 | Research | Apache 2.0 / MIT | | |
81 | Plymouth DL Model | Research | Apache 2.0 / MIT | | |
82 | Co-DETR (Swin-L) | Research | Apache 2.0 / MIT | | |
83 | InternImage-H | Shanghai AI Lab | Apache 2.0 / MIT | | |
84 | DINO (Swin-L) | Research | Apache 2.0 / MIT | | |
85 | YOLOv10-X | Tsinghua | Apache 2.0 / MIT | | |
86 | Mask2Former (Swin-L) | Meta | Apache 2.0 / MIT | | |
87 | EfficientDet-D7x | Google | Apache 2.0 / MIT | | |
88 | CheXNet | Stanford ML Group | MIT | | |
89 | TorchXRayVision | Cohen Lab | Apache 2.0 | | |
90 | CheXzero | Harvard/MIT | MIT | | |
91 | MedCLIP | Research | MIT | | |
92 | GLoRIA | Stanford | MIT | | |
93 | BioViL | Microsoft | MIT | | |
94 | RAD-DINO | Microsoft | MIT | | |
95 | CheXpert AUC Maximizer | Stanford | Apache 2.0 / MIT | | |
96 | DenseNet-121 (Chest X-ray) | Research | MIT | | |
97 | ResNet-50 (Chest X-ray) | Research | MIT | | |
98 | ConVIRT | NYU | Apache 2.0 / MIT | | |
99 | PatchCore | Amazon | Apache 2.0 | | |
100 | PaDiM | Research | Apache 2.0 | | |
101 | FastFlow | Research | MIT | | |
102 | EfficientAD | Research | MIT | | |
103 | SimpleNet | Research | MIT | | |
104 | DRAEM | Research | MIT | | |
105 | CFLOW-AD | Research | Apache 2.0 | | |
106 | Reverse Distillation | Research | MIT | | |
107 | YOLOv8 (Weld Detection) | Ultralytics | AGPL-3.0 | | |
108 | DefectDet (ResNet) | Research | Apache 2.0 / MIT | | |
109 | Llama 3.1 405B | Meta | Apache 2.0 / MIT | | |
110 | Llama 3.1 70B | Meta | Apache 2.0 / MIT | | |
Fig 1 · Open-source OCR models on OmniDocBench (composite), OCRBench v2 (English private split) and olmOCR (pass-rate). Empty cells mean no reproducible score is in the registry yet. License field read verbatim from models.json.
§ 02 · Vendor surface

Paid APIs, ranked.

Enterprises still pay for SLAs, compliance and a single throat to choke. The table below lists vendor endpoints scored on the same three benchmarks as the open-source table, so the two can be read against each other.


List prices vary by region and volume; see each vendor’s billing page. For self-hosted comparisons in dollars-per-thousand-pages see our economics essay.

Vendor APIs · 20 endpoints
Sorted by OmniDoc composite
# | Vendor | Provider | OmniDoc | OCRBench EN | olmOCR | Price / 1K
01 | Gemini 2.5 Pro | Google | 88.03 | 59.3% | | varies
02 | Mistral OCR 3 | Mistral | 79.75 | | 78.0 | varies
03 | Mistral OCR 2 | Mistral | | | 72.0 | varies
04 | Seed1.6-vision | ByteDance | | 62.2% | | varies
05 | GPT-4o | OpenAI | | 55.5% | | varies
06 | Claude Sonnet 4 | Anthropic | | 42.4% | | varies
07 | clearOCR | TeamQuest | 31.70 | | | varies
08 | Gemini 2.0 Flash | Google | | | | varies
09 | Gemini 1.5 Pro | Google | | | | varies
10 | Claude 3.5 Sonnet | Anthropic | | | | varies
11 | o1 | OpenAI | | | | varies
12 | o1-mini | OpenAI | | | | varies
13 | o3 | OpenAI | | | | varies
14 | o3-mini | OpenAI | | | | varies
15 | o4-mini | OpenAI | | | | varies
16 | GPT-4.1 | OpenAI | | | | varies
17 | GPT-4.5 Preview | OpenAI | | | | varies
18 | Grok 2 | xAI | | | | varies
19 | Claude 3 Opus | Anthropic | | | | varies
20 | GPT-4 Turbo | OpenAI | | | | varies
Fig 2 · Same three benchmarks as Fig 1. Endpoint names read from the models.json registry. Price shown as “varies” where the vendor ties it to model tier / volume — see their pricing page.
§ 03 · Consistency

Cross-benchmark champions.

A single high score can be a training-set artefact. The models below place in the top three across multiple benchmarks — a harder, more honest bar.

Ranked by number of top-3 finishes, then by average rank, across 7 OCR benchmarks in the registry.

# | Model | Coverage | Top-3s | Avg rank | Per-benchmark rank
01 | Gemini 2.5 Pro | 4 / 7 | 2 | #5.3 | OmniDoc #13 · OCRBench #5 · VideoOCR #1 · Thai #2
02 | Gemini 1.5 Pro | 2 / 7 | 1 | #3.0 | CC-OCR #1 · VideoOCR #5
03 | Qwen2.5-VL 72B | 2 / 7 | 1 | #3.5 | VideoOCR #2 · Thai #5
04 | Qwen2.5-VL 32B | 2 / 7 | 1 | #4.5 | VideoOCR #6 · Thai #3
05 | GPT-4o | 4 / 7 | 1 | #4.8 | OCRBench #8 · CC-OCR #4 · VideoOCR #4 · Arabic #3
06 | PaddleOCR-VL-1.5 | 2 / 7 | 1 | #5.5 | OmniDoc #2 · olmOCR #9
07 | Qianfan-OCR | 3 / 7 | 1 | #5.7 | OmniDoc #3 · OCRBench #7 · olmOCR #7
08 | Claude Sonnet 4 | 2 / 7 | 1 | #11.0 | OCRBench #21 · Thai #1
Fig 3 · Minimum coverage threshold: 2 benchmarks. Per-benchmark pills are copper when the model is top-3 on that benchmark.
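The ordering rule is simple enough to state in code. Below is a minimal sketch with invented placeholder records, not registry data or the site's actual build code: count top-3 finishes first, then break ties on average rank.

```python
# Sketch of the Fig 3 ordering rule: more top-3 finishes first, then lower
# average rank. The records below are invented placeholders.
models = [
    {"name": "Model A", "ranks": [7, 2, 1]},
    {"name": "Model B", "ranks": [1, 5]},
]

def summary(m):
    top3 = sum(1 for r in m["ranks"] if r <= 3)
    avg = sum(m["ranks"]) / len(m["ranks"])
    return {"name": m["name"], "coverage": len(m["ranks"]), "top3": top3, "avg": avg}

rows = sorted((summary(m) for m in models), key=lambda s: (-s["top3"], s["avg"]))
for s in rows:
    print(f'{s["name"]}: {s["coverage"]} benchmarks, {s["top3"]} top-3s, avg rank #{s["avg"]:.1f}')
```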
§ 04 · Figure

Twelve models, eight benchmarks.

A single grid to read coverage at a glance. PaddleOCR-VL and Gemini 2.5 Pro show the broadest reach; specialist systems light up a single column.

Rendered from the same registry as the tables above; green indicates higher normalised score within the benchmark.

[Figure: coverage heatmap]
Fig 4 · Twelve OCR models × eight benchmarks. Each cell is normalised within its column. Greyed cells: no reproducible score in registry.
[Figure: top-10 bar chart]
Fig 5 · Top-10 by OmniDocBench composite.
[Figure: cost comparison chart]
Fig 6 · Price per 1,000 pages. Self-hosted PaddleOCR-VL at $0.09 vs vendor APIs at $1–$15.
§ 05 · How it works

Three stages, one forward pass.

Classical OCR is a pipeline of three modules. First a detector draws boxes around text regions; then a recogniser reads the pixels inside each box into characters; finally a post-processor corrects the output and resolves reading order. Each module can fail independently, and the errors compound.
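To make the three stages concrete, here is a minimal sketch built on Tesseract via pytesseract (not one of the ranked models, just the classic pipeline in a few lines). Tesseract runs detection and recognition internally; the confidence filter and reading-order sort at the end stand in for post-processing. The file name and threshold are placeholders.

```python
# Classical pipeline sketch via pytesseract (requires the Tesseract binary,
# pytesseract and Pillow; "page.png" and the threshold are placeholders).
# Tesseract does detection + recognition internally; the filtering and
# sorting below stand in for the post-processing stage.
import pytesseract
from PIL import Image

data = pytesseract.image_to_data(
    Image.open("page.png"), output_type=pytesseract.Output.DICT
)

words = []
for i, text in enumerate(data["text"]):
    conf = float(data["conf"][i])        # -1 marks non-text boxes
    if text.strip() and conf >= 60:      # post-processing: confidence filter
        words.append({
            "text": text,
            "conf": conf,
            "box": (data["left"][i], data["top"][i],
                    data["width"][i], data["height"][i]),
            "order": (data["block_num"][i], data["par_num"][i],
                      data["line_num"][i], data["word_num"][i]),
        })

# Post-processing: resolve reading order block -> paragraph -> line -> word.
words.sort(key=lambda w: w["order"])
print(" ".join(w["text"] for w in words))
```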

Detection granularity has shifted from words to lines to whole regions. Word-level detection — the CRAFT / EAST tradition — still dominates scene text. Line-level dominates documents. Region-level detection is where modern vision-language models thrive: they see entire paragraphs as semantic units and preserve layout without a separate analysis step.

Recognition used to be CTC — Connectionist Temporal Classification — which is fast but treats each character as independent. Attention-based decoders, standard since 2018, let the model condition each character on the whole image. That is why modern OCR finally stops confusing “rn” with “m” and “l” with “1”.
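A toy illustration of that independence problem; the per-frame probabilities below are invented rather than taken from any real recogniser. Greedy CTC decoding picks the best class per frame, collapses repeats and drops blanks, and never conditions on what it has already emitted, which is exactly how two narrow strokes become "rn" instead of "m".

```python
# Toy CTC greedy decode; the per-frame probabilities are invented.
# Each frame's class is chosen independently: the decoder never looks at
# what it has already emitted, which is the weakness attention decoders fix.
import numpy as np

alphabet = ["-", "m", "r", "n"]           # index 0 is the CTC blank
probs = np.array([                        # 6 frames of fake probabilities for an image of "m"
    [0.1, 0.2, 0.6, 0.1],                 # frame looks most like "r"
    [0.1, 0.2, 0.6, 0.1],
    [0.6, 0.2, 0.1, 0.1],                 # blank
    [0.1, 0.1, 0.1, 0.7],                 # frame looks most like "n"
    [0.1, 0.1, 0.1, 0.7],
    [0.7, 0.1, 0.1, 0.1],                 # blank
])

best = probs.argmax(axis=1)               # greedy per-frame choice
decoded, prev = [], 0
for idx in best:
    if idx != 0 and idx != prev:          # collapse repeats, drop blanks
        decoded.append(alphabet[idx])
    prev = idx

print("".join(decoded))                   # prints "rn": the classic m/rn confusion
```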

Post-processing is the unsexy part: language-model correction (“teh” to “the”), layout analysis (read left column before right), table structure recognition (scored by TEDS), and confidence filtering. It is also where traditional pipelines most often embarrass themselves.

The 2023–2026 shift is that vision-language models fold all three stages into a single forward pass. They read the document the way a literate human does — as one object, with layout, structure and language considered at once. That is why the top of Fig 1 is dominated by VLM-class open-source models and why Tesseract has quietly slid off the leaderboard.
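What one forward pass looks like in practice: a single request to an OpenAI-compatible vision endpoint. This is a sketch, not a recipe; the base URL, model name and prompt are placeholders, and the same call shape works whether the endpoint is a vendor API or a self-hosted open-source VLM served through something like vLLM.

```python
# One call, whole document. Placeholders throughout: point base_url at the
# endpoint serving your VLM (vendor API or self-hosted, e.g. behind vLLM)
# and swap the model name accordingly.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

with open("page.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="your-vlm-model",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Convert this page to Markdown. Preserve tables, "
                     "formulas and reading order."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)

print(response.choices[0].message.content)  # layout, structure and text, one pass
```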

§ 06 · History

One hundred and fifty years of teaching machines to read.

From a selenium photocell concept to vision-language models that read better than their operators. Every breakthrough that led to today’s 92.86 OmniDoc SOTA.

Era I · 1870 — 1970

Mechanical.

The idea of a machine that could read predates computers by almost a century. Early pioneers built physical devices from selenium cells, spinning disks and vacuum tubes — driven mostly by the ambition of giving blind readers access to printed text.

1870
Carey’s retina-inspired sensor
T.D. Carey proposes a mosaic of selenium photocells that converts an image into electrical signals. Decades ahead of the available technology.
1885
Nipkow’s scanning disk
A rotating disk with spiral holes scans an image point-by-point into a serial electrical signal. The scanning principle persists in every OCR device for 60 years.
1912
Optophone for the blind
d’Albe’s device maps each printed character to a distinct musical chord. A trained reader reaches about one word per minute.
1914
Goldberg’s statistical machine
The first device to recognise printed characters by comparing their photocell signature against stored templates. Ancestor of all template-based OCR.
1929
Tauschek’s template patent
A spinning disk with cut-out letters; maximum light transmission identifies the character. Elegant and impossibly slow.
1931
IBM acquires Goldberg’s patents
The technology sits dormant for twenty years, waiting for electronics to catch up.
1949
RCA reading machine
The US Veterans Administration funds the first prototype that reads printed pages aloud. Accuracy under 50% — but OCR now has serious government funding.
1951
GISMO — first electronic OCR
David Shepard’s GISMO replaces the spinning disk with static photocell arrays. The leap from mechanical to electronic is the most important transition in OCR history.
1955
MICR for banking
The American Bankers Association adopts the E-13B magnetic-ink font for check processing. Not optical — but it proves banks will pay for machine reading.
1957
The perceptron detour
Rosenblatt’s Mark I Perceptron barely distinguishes triangles from squares. Minsky’s 1969 critique kills neural networks for two decades.
1965
First commercial OCR
Reader’s Digest + RCA process 1,500 documents per hour — but only in the purpose-built OCR-A font.
1966
US Postal Service
Machine-sorting mail using OCR scanners. The first industrial-scale deployment.
1968
Kurzweil’s insight
Extract structural features (strokes, apexes, crossbars), then classify. The separation of feature extraction from classification is the architecture every modern OCR system still uses.
Era II · 1974 — 2006

Desktop.

The personal-computer revolution turned OCR from a million-dollar mainframe operation into desktop software. Neural networks arrived quietly, and one Bell Labs researcher changed everything with a 28×28 pixel grid.

1974
Kurzweil Computer Products
The first system to read any typeface. Stevie Wonder is an early customer.
1974
OCR-B — Frutiger
A machine-readable font that is also legible to humans. Still on every machine-readable passport in the world.
1976
Kurzweil reading machine
Flatbed CCD scanner + omni-font OCR + text-to-speech. $50,000. Xerox acquires the company in 1980.
1985
OCR goes desktop
Mac and Windows GUIs. HP Labs begins developing Tesseract internally. Scanners drop below $1,000.
1990
LeCun’s LeNet
A CNN recognises handwritten digits at 99.2% accuracy on MNIST. The first network that learns features from data. Still the architectural ancestor of every modern recogniser.
1995
The “solved problem” illusion
OCR accuracy hits 99% on clean printed text. Industry declares the problem solved. Handwriting, receipts, faded prints remain nearly impossible.
1998
Google begins book scanning
Research that will become Google Books. Eventually 40M+ books digitised — the largest OCR deployment in history.
2000
ABBYY FineReader
Worldwide enterprise standard for document digitisation. For fifteen years, “OCR” in enterprise effectively means ABBYY.
2005
Tesseract open-sourced
HP releases the engine. Google sponsors development. High-quality OCR is free, and still the most-used OSS OCR tool today.
Era III · 2012 — 2022

Deep learning.

AlexNet’s 2012 ImageNet victory ignited the deep-learning revolution. Within five years every OCR pipeline was rebuilt with neural networks. Text recognition shifted from “recognise characters” to “understand documents”.

2012
AlexNet
Deep CNNs learn features humans cannot hand-engineer. Every CV problem, OCR included, is ripe for rethinking.
2013
CRNN — the ten-year king
Convolutional-recurrent network: CNN sees the image, RNN reads it left-to-right like a human. Dominates OCR for nearly a decade.
2015
reCAPTCHA v2
Every “select all traffic lights” trained Google’s Street-View OCR for free. Billions of annotations — the most profitable UX pattern ever designed.
2017
EAST and CRAFT
Scene-text detection at 13 FPS on a single GPU. Localises text so recognition networks can focus on reading it.
2018
Attention replaces CTC
Decoders look back at the image while predicting each character. “rn” versus “m” becomes solvable.
2019
PaddleOCR released
Baidu’s Apache 2.0 toolkit becomes the default for self-hosted OCR and the foundation of the VLM-OCR era.
2020
Transformers enter OCR
TrOCR and Donut prove pure transformer architectures can match or beat CRNN. Donut processes document images end-to-end with no traditional OCR module at all.
Era IV · 2023 — 2026

Vision-language models.

The most disruptive shift in OCR history. Models built to understand images turn out to be better at reading text than models built specifically for OCR. The commercial OCR industry is blindsided.

2023
GPT-4V “accidentally” wins
Nobody optimised it for OCR, yet it immediately outperforms every dedicated OCR system on complex documents. “Understanding” turns out to be a superset of “reading”.
2024
Economics shift
Purpose-built doc-AI tools arrive: Mistral OCR at $1 per 1K pages, Docling (free, IBM), olmOCR (open source). Cost of high-quality OCR drops ten-fold in a year.
2025
Open source overtakes
PaddleOCR-VL tops every benchmark. $0.09 per 1K pages self-hosted versus $15 for GPT-5.4 — a 167x cost reduction.
2026
VLMs make traditional OCR obsolete
SOTA hits 92.86 on OmniDocBench. Traditional detect-recognise-post-process pipelines cannot compete. Tesseract and EasyOCR remain relevant only on the edge.
§ 07 · Decision tools

What are you trying to extract?

Pick the document type. Each link goes to a dedicated page with setup instructions, failure modes and a working code sample.

  1. Scenario · 01

    Invoices & receipts

    Line items, totals, vendor info → structured data. Table-heavy. Receipts fade and crumple.

    PaddleOCR-VL-1.5 · free · local
  2. Scenario · 02

    Handwritten notes

    Forms, signatures, meeting notes, historical documents. Variable slant, irregular spacing.

    TrOCR · free · local
  3. Scenario · 03

    PDFs & reports

    Multi-page documents, multi-column layout, tables, headers, footnotes.

    Chandra / olmOCR · free · local
  4. Scenario · 04

    Photos & screenshots

    Camera captures, screen grabs, social media imagery — often rotated, sometimes warped.

    PaddleOCR-VL-1.5 · free · local
  5. Scenario · 05

    Scanned books & archives

    Digitise printed text, old documents, historical archives with degraded print.

    GLM-OCR / PaddleOCR-VL-1.5 · free · local
  6. Scenario · 06

    ID cards & passports

    KYC verification, identity documents, MRZ code reading. Compliance and audit matter.

    Azure / Google · enterprise
§ 08 · Long form

Deep dives & techniques.

  1. Essay
    167×

    The OCR economics shift

    Self-hosted VLM-OCR is now both better and 167× cheaper than vendor APIs. The October 2025 inflection point, with pricing and accuracy math.

  2. Architecture

    How Docling works

    The architecture of IBM’s document-understanding library and why VLM pipelines outperform traditional OCR.

  3. Engineering

    Interactive OCR correction

    Handling OCR “flicker” (H vs N) and camera drift in mobile apps. Google MLKit plus centroid anchoring.

  4. Reference
    26

    Benchmarks directory

    26 OCR benchmarks across document parsing, handwriting, video OCR, scene text and multilingual tasks.

  5. Case study
    PL

    Rys OCR — Polish SOTA

    71% CER reduction on Polish diacritics. LoRA fine-tune of PaddleOCR-VL. Apache 2.0.

  6. Tutorial

    Ship it

    3 questions, one recommendation, copy-paste code that runs in 10 minutes. For engineers who picked a model and need to wire it in.

§ 09 · Testing priority

What we still need to verify.

Generated by generateTestingPriorityList() in lib/scoring. Ranks outstanding (model, benchmark) pairs by importance weight — coverage gaps first, then benchmark criticality.

If you are planning to run one of these — see the submission note below.
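For the curious, here is an illustrative sketch of how such a list could be ranked. This is emphatically not the real generateTestingPriorityList() from lib/scoring; the numeric multipliers are invented to reproduce the weights in Fig 7. It only shows the stated idea: unscored (model, benchmark) pairs inherit a weight from benchmark criticality and model importance, and gaps on primary benchmarks float to the top.

```python
# Illustrative only: NOT the actual generateTestingPriorityList() from
# lib/scoring. The criticality values and model-importance multipliers are
# invented for the example.
criticality = {"omnidocbench": 3.0, "ocrbench-v2": 3.0, "olmocr-bench": 3.0}
primary = {"omnidocbench", "ocrbench-v2", "olmocr-bench"}

def priority(missing_pairs):
    ranked = []
    for model, bench, model_importance in missing_pairs:
        weight = criticality.get(bench, 1.0) * model_importance
        reason = "Primary benchmark missing" if bench in primary else "Coverage gap"
        ranked.append({"model": model, "benchmark": bench,
                       "reason": reason, "weight": round(weight, 1)})
    return sorted(ranked, key=lambda r: r["weight"], reverse=True)

gaps = [("Qwen2.5-VL 72B", "omnidocbench", 2.4), ("PaddleOCR", "olmocr-bench", 2.0)]
for row in priority(gaps):
    print(row)
```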

# | Model | Benchmark | Reason | Weight | Type
01 | Qwen2.5-VL 72B | omnidocbench | Primary benchmark missing | 7.2 | OSS
02 | Qwen2.5-VL 72B | ocrbench-v2 | Primary benchmark missing | 7.2 | OSS
03 | Qwen2.5-VL 72B | olmocr-bench | Primary benchmark missing | 7.2 | OSS
04 | Qwen2.5-VL 32B | omnidocbench | Primary benchmark missing | 7.2 | OSS
05 | Qwen2.5-VL 32B | ocrbench-v2 | Primary benchmark missing | 7.2 | OSS
06 | Qwen2.5-VL 32B | olmocr-bench | Primary benchmark missing | 7.2 | OSS
07 | PaddleOCR | omnidocbench | Primary benchmark missing | 6.0 | OSS
08 | PaddleOCR | ocrbench-v2 | Primary benchmark missing | 6.0 | OSS
09 | PaddleOCR | olmocr-bench | Primary benchmark missing | 6.0 | OSS
10 | EasyOCR | omnidocbench | Primary benchmark missing | 6.0 | OSS
Fig 7 · Top-10 testing priorities computed from the registry. Each row is a (model, benchmark) pair we have not independently verified yet.

Run any of these OCR models?

Send us your numbers, or flag ones we got wrong. We verify and credit every contribution.

Share results →
§ 10 · Contribute

Know an OCR result we’re missing?

Fresh numbers, stale data, a model we haven’t tested — tell us. Real humans read every message, and every verified result gets attribution in the registry.

Spotted a number that looks wrong? Tell us →
§ 11 · FAQ

Frequently asked, honestly answered.

Questions that arrive in our inbox every week, answered with real numbers drawn from the tables above.

Q01 · What is the best OCR model in 2026?

PaddleOCR-VL 7B leads the OmniDocBench leaderboard with a composite score of 92.86, outperforming GPT-5.4 and Gemini 2.5 Pro on end-to-end document parsing including tables and formulas. It’s open source (Apache 2.0).

Q02 · Which OCR model has the best English text recognition?

On OCRBench v2, Seed1.6-vision leads English OCR at 62.2%, followed by Qwen3-Omni-30B at 61.3%. For pure text extraction, GPT-5.4 achieves the lowest edit distance (0.02) on OmniDocBench.

Q03 · Is open-source OCR better than paid APIs in 2026?

Yes. PaddleOCR-VL (open source) scores 92.86 on OmniDocBench vs GPT-5.4's 85.80. Self-hosted VLM-OCR is 167x cheaper per page than vendor APIs while delivering higher accuracy on document parsing tasks.

Q04 · Which OCR is best for invoices and receipts?

PaddleOCR-VL-1.5 scores 94.50 on the OmniDocBench composite, with GLM-OCR just ahead at 94.62. For PDFs specifically, Chandra v0.1.0 tops the olmOCR bench at 83.1%. Open-source models now decisively outperform commercial APIs.

Q05 · How much does OCR cost per page?

Vendor APIs range from $1-$15 per 1,000 pages (Mistral OCR at $1, GPT-5.4 at ~$15). Self-hosted PaddleOCR-VL costs approximately $0.09 per 1,000 pages on a consumer GPU — a 167x cost reduction.
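The self-hosted figure is back-of-envelope arithmetic you can redo with your own numbers. The GPU rate and throughput below are assumptions for illustration, not measured values from the registry.

```python
# Back-of-envelope cost per 1,000 pages. The GPU rate and throughput are
# assumptions for illustration, not measured figures.
gpu_cost_per_hour = 0.40          # assumed rental rate for a consumer-class GPU
pages_per_second = 1.25           # assumed sustained throughput

pages_per_hour = pages_per_second * 3600
self_hosted_per_1k = round(gpu_cost_per_hour / pages_per_hour * 1000, 2)
vendor_per_1k = 15.0              # high end of the vendor range quoted above

print(f"self-hosted: ${self_hosted_per_1k:.2f} / 1K pages")       # ~$0.09
print(f"vendor:      ${vendor_per_1k:.2f} / 1K pages")
print(f"ratio:       {vendor_per_1k / self_hosted_per_1k:.0f}x")  # ~167x
```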

§ 12 · Methodology

Where these numbers come from.

All benchmark results on this page are sourced from AlphaXiv leaderboards, published papers, and our own independent verification. Each data point in benchmarks.json carries a source URL and an access date; every ranking you see is recomputed on build from that file.
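In spirit, the recompute step is no more than the sketch below. The schema here is simplified and assumed; the real benchmarks.json carries more fields and lib/scoring does considerably more than sort.

```python
# Simplified sketch of the build-time recompute. The record shape is an
# assumption; the real benchmarks.json and lib/scoring are richer.
import json

with open("benchmarks.json") as f:
    runs = json.load(f)

omnidoc = [r for r in runs
           if r["benchmark"] == "omnidocbench" and not r.get("pending_verification")]
omnidoc.sort(key=lambda r: r["score"], reverse=True)

for rank, r in enumerate(omnidoc, start=1):
    print(rank, r["model"], r["score"], r["source_url"], r["accessed"])
```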

Results marked “pending verification” are claims that we have not independently confirmed. We do not include estimated or interpolated values.

Don’t want to pick a model? Drop a PDF at hardparse.com and get clean Markdown back — tables, formulas and layout preserved. It is our sister project running one of the top-ranked models on this page. Rankings stay independent of it.

Read next

Three places to go from here.

Condensed view
OCR Power Ranking
One ranking by average percentile across all OCR benchmarks, plus CodeSOTA-verified scores where we ran our own eval.
Practical guide
Best OCR for handwriting
Frontier VLMs (GPT-5, Claude Opus 4.7, Gemini 3) on IAM. CER, bounding-box support, code samples.
Comparison
PaddleOCR vs Tesseract vs dots.ocr
Three-way benchmark: throughput, edit distance, $/1K pages. When each OCR engine wins.