
All Verified Results

258 benchmark results across 50 datasets. Every data point links to its source.

Total Results: 258
Benchmarks: 50
Models: 121

JSON API: Download raw data at /data/benchmarks.json
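
A minimal sketch of reading the raw data, assuming the file is a JSON array of objects whose keys mirror the table columns (`model`, `dataset`, `metric`, `value`, `source`). The actual schema of /data/benchmarks.json is not documented on this page, so the key names and the helper `results_for` are hypothetical and may need adjusting.

```python
import json


def results_for(raw_json, dataset, metric):
    """Return (model, value) pairs for one dataset/metric, highest value first.

    Assumes raw_json decodes to a list of dicts with the keys
    model, dataset, metric, value, source (an assumed schema).
    """
    records = json.loads(raw_json)
    rows = [
        (r["model"], float(r["value"]))
        for r in records
        if r["dataset"] == dataset and r["metric"] == metric
    ]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


# Sample payload in the assumed shape, using three rows from the results table.
sample = json.dumps([
    {"model": "vit-b-16", "dataset": "imagenet-1k",
     "metric": "top-1-accuracy", "value": 81.2, "source": "google-research"},
    {"model": "coca-finetuned", "dataset": "imagenet-1k",
     "metric": "top-1-accuracy", "value": 91, "source": "google-research"},
    {"model": "gpt-4o", "dataset": "mmlu",
     "metric": "accuracy", "value": 88.7, "source": "openai-blog"},
])

top = results_for(sample, "imagenet-1k", "top-1-accuracy")
print(top)
```

Sorting highest-first only makes sense for accuracy-style metrics; error rates and edit distances would need the opposite order.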

Complete Results Table

Model Dataset Metric Value Source
coca-finetuned imagenet-1k top-1-accuracy 91 google-research
vit-g-14 imagenet-1k top-1-accuracy 90.45 google-research
convnext-v2-huge imagenet-1k top-1-accuracy 88.9 meta-research
vit-h-14 imagenet-1k top-1-accuracy 88.55 google-research
swin-large imagenet-1k top-1-accuracy 87.3 microsoft-research
efficientnet-v2-l imagenet-1k top-1-accuracy 85.7 google-research
deit-b-distilled imagenet-1k top-1-accuracy 85.2 meta-research
efficientnet-b7 imagenet-1k top-1-accuracy 84.4 google-research
deit-b imagenet-1k top-1-accuracy 83.1 meta-research
convnext-v2-tiny imagenet-1k top-1-accuracy 83 meta-research
vit-l-16 imagenet-1k top-1-accuracy 82.7 google-research
vit-b-16 imagenet-1k top-1-accuracy 81.2 google-research
resnet-50-a3 imagenet-1k top-1-accuracy 80.4 timm-research
resnet-152 imagenet-1k top-1-accuracy 78.6 microsoft-research
efficientnet-b0 imagenet-1k top-1-accuracy 77.1 google-research
resnet-50 imagenet-1k top-1-accuracy 76.15 pytorch-vision
swin-v2-large imagenet-v2 top-1-accuracy 84 microsoft-research
convnext-v2-huge imagenet-v2 top-1-accuracy 80.5 meta-research
vit-h-14 cifar-100 accuracy 94.55 google-research
vit-b-16 cifar-100 accuracy 91.48 huggingface
deit-b-distilled cifar-10 accuracy 99.1 meta-research
convnext-v2-base cifar-10 accuracy 98.7 meta-research
resnet-50 cifar-10 accuracy 96.01 cutout-paper
efficientnet-b7 cifar-100 accuracy 91.7 google-research
resnet-50 cifar-100 accuracy 78.04 cutout-paper
paddleocr-vl omnidocbench composite 92.86 alphaxiv-leaderboard
paddleocr-vl-0.9b omnidocbench composite 92.56 alphaxiv-leaderboard
mineru-2.5 omnidocbench composite 90.67 alphaxiv-leaderboard
qwen3-vl-235b omnidocbench composite 89.15 alphaxiv-leaderboard
monkeyocr-pro-3b omnidocbench composite 88.85 alphaxiv-leaderboard
gemini-25-pro omnidocbench composite 88.03 alphaxiv-leaderboard
qwen25-vl omnidocbench composite 87.02 alphaxiv-leaderboard
ocrverse-4b omnidocbench composite 88.56 github-leaderboard
dots-ocr-3b omnidocbench composite 88.41 github-leaderboard
mistral-ocr-3 omnidocbench composite 79.75 codesota-verified
mistral-ocr-3 omnidocbench text-edit-distance 0.099 codesota-verified
mistral-ocr-3 omnidocbench table-teds 70.88 codesota-verified
mistral-ocr-3 omnidocbench formula-edit-distance 0.218 codesota-verified
mistral-ocr-3 omnidocbench reading-order 91.63 codesota-verified
clearocr-teamquest omnidocbench composite 31.7 codesota-verified
clearocr-teamquest omnidocbench text-edit-distance 0.154 codesota-verified
clearocr-teamquest omnidocbench table-teds 0.8 codesota-verified
clearocr-teamquest omnidocbench formula-edit-distance 0.902 codesota-verified
clearocr-teamquest omnidocbench reading-order 86.04 codesota-verified
gpt-4o omnidocbench ocr-edit-distance 0.02 alphaxiv-leaderboard
paddleocr-vl omnidocbench table-teds 93.52 alphaxiv-leaderboard
mineru-2.5 omnidocbench layout-map 97.5 alphaxiv-leaderboard
seed-1.6-vision ocrbench-v2 overall-en-private 62.2 alphaxiv-leaderboard
qwen3-omni-30b ocrbench-v2 overall-en-private 61.3 alphaxiv-leaderboard
nemotron-nano-v2-vl ocrbench-v2 overall-en-private 61.2 alphaxiv-leaderboard
gemini-25-pro ocrbench-v2 overall-en-private 59.3 alphaxiv-leaderboard
gpt-4o ocrbench-v2 overall-en-private 55.5 alphaxiv-leaderboard
gemini-25-pro ocrbench-v2 overall-zh-private 62.2 alphaxiv-leaderboard
chandra-ocr-0.1.0 olmocr-bench pass-rate 83.1 alphaxiv-leaderboard
chandra-ocr-0.1.0 olmocr-bench tables 88 github-readme
chandra-ocr-0.1.0 olmocr-bench old-scans-math 80.3 github-readme
chandra-ocr-0.1.0 olmocr-bench long-tiny-text 92.3 github-readme
chandra-ocr-0.1.0 olmocr-bench base 99.9 github-readme
chandra-ocr-0.1.0 olmocr-bench headers-footers 90.8 github-readme
chandra-ocr-0.1.0 olmocr-bench multi-column 81.2 github-readme
chandra-ocr-0.1.0 olmocr-bench arxiv 82.2 github-readme
chandra-ocr-0.1.0 olmocr-bench old-scans 50.4 github-readme
deepseek-ocr olmocr-bench pass-rate 75.4 github-readme
dots-ocr-3b olmocr-bench pass-rate 79.1 github-readme
marker-1.10.0 olmocr-bench pass-rate 76.5 github-readme
gpt-4o-anchored olmocr-bench pass-rate 69.9 github-readme
gemini-flash-2 olmocr-bench pass-rate 63.8 github-readme
dots-ocr-3b olmocr-bench tables 88.3 github-readme
olmocr-v0.3.0 olmocr-bench old-scans-math 79.9 github-readme
olmocr-v0.3.0 olmocr-bench headers-footers 95.1 github-readme
marker-1.10.0 olmocr-bench arxiv 83.8 github-readme
gpt-4o olmocr-bench old-scans 40.7 github-readme
infinity-parser-7b olmocr-bench pass-rate 82.5 alphaxiv-leaderboard
olmocr-v0.4.0 olmocr-bench pass-rate 82.4 alphaxiv-leaderboard
paddleocr-vl olmocr-bench pass-rate 80 alphaxiv-leaderboard
marker-1.10.1 olmocr-bench pass-rate 76.1 alphaxiv-leaderboard
deepseek-ocr olmocr-bench pass-rate 75.7 alphaxiv-leaderboard
mineru-2.5 olmocr-bench pass-rate 75.2 alphaxiv-leaderboard
mistral-ocr-3 olmocr-bench pass-rate 78 mistral-announcement
mistral-ocr-3 internal-mistral overall-accuracy 94.9 mistral-announcement
mistral-ocr-3 ocr-cer-benchmark cer 3.7 sparkco-benchmark
mistral-ocr-3 ocr-wer-benchmark wer 7.1 sparkco-benchmark
mistral-ocr-api olmocr-bench pass-rate 72 alphaxiv-leaderboard
nanonets-ocr2-3b olmocr-bench pass-rate 69.5 alphaxiv-leaderboard
churro-3b churro-ds handwritten-levenshtein 70.1 alphaxiv-leaderboard
churro-3b churro-ds printed-levenshtein 82.3 alphaxiv-leaderboard
gemini-25-pro churro-ds handwritten-levenshtein 63.6 alphaxiv-leaderboard
gemini-25-pro churro-ds printed-levenshtein 80.9 alphaxiv-leaderboard
gemini-25-flash churro-ds handwritten-levenshtein 58.7 alphaxiv-leaderboard
qwen25-vl-72b churro-ds handwritten-levenshtein 54.5 alphaxiv-leaderboard
claude-sonnet-4 churro-ds handwritten-levenshtein 37.1 alphaxiv-leaderboard
gpt-4o churro-ds handwritten-levenshtein 34.2 alphaxiv-leaderboard
gemini-15-pro cc-ocr multi-scene-f1 83.25 alphaxiv-leaderboard
qwen2-vl-72b cc-ocr multi-scene-f1 77.95 alphaxiv-leaderboard
internvl2-76b cc-ocr multi-scene-f1 76.92 alphaxiv-leaderboard
gpt-4o cc-ocr multi-scene-f1 76.4 alphaxiv-leaderboard
claude-35-sonnet cc-ocr multi-scene-f1 72.87 alphaxiv-leaderboard
qwen2-vl-72b cc-ocr kie-f1 71.76 alphaxiv-leaderboard
gemini-15-pro cc-ocr kie-f1 67.28 alphaxiv-leaderboard
claude-35-sonnet cc-ocr kie-f1 64.58 alphaxiv-leaderboard
gpt-4o cc-ocr kie-f1 63.45 alphaxiv-leaderboard
gemini-15-pro cc-ocr multilingual-f1 78.97 alphaxiv-leaderboard
gpt-4o cc-ocr multilingual-f1 73.44 alphaxiv-leaderboard
gemini-15-pro cc-ocr document-parsing 62.37 alphaxiv-leaderboard
gemini-25-pro mme-videoocr total-accuracy 73.7 alphaxiv-leaderboard
qwen25-vl-72b mme-videoocr total-accuracy 69 alphaxiv-leaderboard
internvl3-78b mme-videoocr total-accuracy 67.2 alphaxiv-leaderboard
gpt-4o mme-videoocr total-accuracy 66.4 alphaxiv-leaderboard
gemini-15-pro mme-videoocr total-accuracy 64.9 alphaxiv-leaderboard
qwen25-vl-32b mme-videoocr total-accuracy 61 alphaxiv-leaderboard
gemini-20-flash kitab-bench cer 0.13 alphaxiv-leaderboard
ain-7b kitab-bench cer 0.2 alphaxiv-leaderboard
gpt-4o kitab-bench cer 0.31 alphaxiv-leaderboard
gpt-4o-mini kitab-bench cer 0.43 alphaxiv-leaderboard
azure-ocr kitab-bench cer 0.52 alphaxiv-leaderboard
tesseract kitab-bench cer 0.54 alphaxiv-leaderboard
easyocr kitab-bench cer 0.58 alphaxiv-leaderboard
paddleocr kitab-bench cer 0.79 alphaxiv-leaderboard
claude-sonnet-4 thaiocrbench ted-score 0.84 alphaxiv-leaderboard
gemini-25-pro thaiocrbench ted-score 0.77 alphaxiv-leaderboard
qwen25-vl-32b thaiocrbench ted-score 0.765 alphaxiv-leaderboard
internvl3-14b thaiocrbench ted-score 0.76 alphaxiv-leaderboard
qwen25-vl-72b thaiocrbench ted-score 0.72 alphaxiv-leaderboard
o1-preview gsm8k accuracy 97.8 openai-blog
gpt-4o gsm8k accuracy 92 openai-blog
claude-35-sonnet gsm8k accuracy 96.4 anthropic-blog
gemini-15-pro gsm8k accuracy 91.7 google-blog
llama-3-70b gsm8k accuracy 93 meta-blog
o1-preview math accuracy 94.8 openai-blog
gpt-4o math accuracy 76.6 openai-blog
claude-35-sonnet math accuracy 71.1 anthropic-blog
gemini-15-pro math accuracy 67.7 google-blog
deepseek-v3 math accuracy 90.2 deepseek-blog
o1-preview aime-2024 accuracy 83.3 openai-blog
gpt-4o aime-2024 accuracy 13.4 openai-blog
claude-35-opus aime-2024 accuracy 16 anthropic-blog
gpt-4o hellaswag accuracy 95.3 openai-blog
claude-35-sonnet hellaswag accuracy 89 anthropic-blog
llama-3-70b hellaswag accuracy 88 meta-blog
gemini-15-pro hellaswag accuracy 92.5 google-blog
gpt-4o winogrande accuracy 87.5 openai-blog
claude-35-sonnet winogrande accuracy 85.4 anthropic-blog
llama-3-70b winogrande accuracy 85.3 meta-blog
gpt-4o arc-challenge accuracy 96.4 openai-blog
claude-35-sonnet arc-challenge accuracy 96.7 anthropic-blog
llama-3-70b arc-challenge accuracy 93 meta-blog
gemini-15-pro arc-challenge accuracy 94.8 google-blog
gpt-4o mmlu accuracy 88.7 openai-blog
o1-preview mmlu accuracy 92.3 openai-blog
claude-35-sonnet mmlu accuracy 88.7 anthropic-blog
gemini-15-pro mmlu accuracy 85.9 google-blog
llama-3-70b mmlu accuracy 82 meta-blog
deepseek-v3 mmlu accuracy 88.5 deepseek-blog
o1-preview gpqa accuracy 78 openai-blog
gpt-4o gpqa accuracy 53.6 openai-blog
claude-35-sonnet gpqa accuracy 59.4 anthropic-blog
gemini-15-pro gpqa accuracy 46.2 google-blog
gpt-4o commonsenseqa accuracy 85.4 openai-blog
claude-35-sonnet commonsenseqa accuracy 83.2 anthropic-blog
llama-3-70b commonsenseqa accuracy 80.9 meta-blog
gpt-4o hotpotqa f1 71.3 arxiv-paper
claude-35-sonnet hotpotqa f1 68.5 arxiv-paper
gpt-4o strategyqa accuracy 82.1 arxiv-paper
claude-35-sonnet strategyqa accuracy 79.8 arxiv-paper
gpt-4o logiqa accuracy 56.3 arxiv-paper
claude-35-sonnet logiqa accuracy 53.8 arxiv-paper
gpt-4o reclor accuracy 72.4 arxiv-paper
claude-35-sonnet reclor accuracy 68.9 arxiv-paper
gpt-4o svamp accuracy 93.7 arxiv-paper
claude-35-sonnet svamp accuracy 91.2 arxiv-paper
llama-3-70b svamp accuracy 89.5 meta-blog
gpt-4o mawps accuracy 97.2 arxiv-paper
claude-35-sonnet mawps accuracy 95.8 arxiv-paper
llama-3-70b mawps accuracy 94.1 meta-blog
plymouth-dl-model abide-i accuracy 98 research-paper
deepasd abide-ii auc 93 research-paper
mcbert abide-i accuracy 93.4 research-paper
ae-fcn abide-i accuracy 85 research-paper
braingt abide-i auc 78.7 research-paper
asd-swnet abide-i accuracy 76.52 research-paper
asd-swnet abide-i auc 81 research-paper
al-negat abide-i accuracy 74.7 research-paper
braingnn abide-i accuracy 73.3 research-paper
gcn abide-i accuracy 72.2 research-paper
gcn abide-i auc 78 research-paper
multi-task-transformer abide-i accuracy 72 research-paper
svm-connectivity abide-i accuracy 70.1 research-paper
svm-connectivity abide-i auc 77 research-paper
deep-learning-heinsfeld abide-i accuracy 70 research-paper
mvs-gcn abide-i accuracy 69.38 research-paper
mvs-gcn abide-i auc 69.01 research-paper
phgcl-ddgformer abide-i accuracy 70.9 research-paper
random-forest abide-i accuracy 63 research-paper
maacnn abide-i accuracy 75.12 research-paper
maacnn abide-ii accuracy 72.88 research-paper
multi-atlas-dnn abide-i accuracy 78.07 research-paper
abraham-connectomes abide-i accuracy 67 research-paper
o1-preview humaneval pass@1 92.4 openai-blog
claude-35-sonnet humaneval pass@1 92 anthropic-blog
gpt-4o humaneval pass@1 90.2 openai-blog
deepseek-v3 humaneval pass@1 82.6 deepseek-blog
llama-3-70b humaneval pass@1 81.7 meta-blog
claude-35-sonnet swe-bench-verified resolve-rate 49 anthropic-blog
gpt-4o swe-bench-verified resolve-rate 41.2 swe-bench-leaderboard
deepseek-v25 swe-bench-verified resolve-rate 37 deepseek-blog
gpt-4o mbpp pass@1 87.8 openai-blog
claude-35-sonnet mbpp pass@1 89.2 anthropic-blog
internimage-h coco mAP 65.4 arxiv-paper
co-detr-swin-l coco mAP 66 arxiv-paper
dino-swin-l coco mAP 63.3 arxiv-paper
yolov10-x coco mAP 57.4 github-readme
efficientdet-d7-x coco mAP 55.1 google-research
internimage-h ade20k mIoU 62.9 arxiv-paper
mask2former-swin-l ade20k mIoU 57.3 arxiv-paper
agent57 atari-2600 human-normalized-score 4731.3 deepmind-research
go-explore atari-2600 human-normalized-score 40000 nature-paper
muzero atari-2600 human-normalized-score 731 nature-paper
dreamerv3 atari-2600 human-normalized-score 840 arxiv-paper
rainbow-dqn atari-2600 human-normalized-score 231 aaai-paper
dqn atari-2600 human-normalized-score 79 nature-paper
human-gamer atari-2600 human-normalized-score 100 baseline
bbos-1 atari-2600 human-normalized-score 1100 research
gdi-h3 atari-2600 human-normalized-score 950 research
chexpert-auc-maximizer chexpert auroc 93 stanford-leaderboard
chexzero chexpert auroc 88.6 research-paper
torchxrayvision chexpert auroc 87.4 github-readme
densenet-121-cxr chexpert auroc 86.5 research-paper
gloria chexpert auroc 88.2 research-paper
medclip chexpert auroc 87.8 research-paper
biovil chexpert auroc 89.1 microsoft-research
chexnet nih-chestxray14 auroc 84.1 research-paper
torchxrayvision nih-chestxray14 auroc 85.8 github-readme
densenet-121-cxr nih-chestxray14 auroc 82.6 research-paper
resnet-50-cxr nih-chestxray14 auroc 80.4 research-paper
chexzero mimic-cxr auroc 89.2 research-paper
torchxrayvision mimic-cxr auroc 86.3 github-readme
convirt mimic-cxr auroc 85.7 research-paper
rad-dino vindr-cxr auroc 91.2 microsoft-research
torchxrayvision vindr-cxr auroc 87.9 research-paper
densenet-121-cxr rsna-pneumonia auroc 88.5 kaggle-competition
chexnet rsna-pneumonia auroc 87.2 research-paper
torchxrayvision padchest auroc 84.6 github-readme
densenet-121-cxr covid-chestxray auroc 94.7 research-paper
torchxrayvision covid-chestxray auroc 93.2 github-readme
patchcore mvtec-ad auroc 99.1 research-paper
efficientad mvtec-ad auroc 99.1 research-paper
simplenet mvtec-ad auroc 99.6 research-paper
padim mvtec-ad auroc 97.9 research-paper
fastflow mvtec-ad auroc 99.4 research-paper
draem mvtec-ad auroc 98 research-paper
cflow-ad mvtec-ad auroc 98.3 research-paper
reverse-distillation mvtec-ad auroc 98.5 research-paper
patchcore visa auroc 92.1 research-paper
simplenet visa auroc 95.5 research-paper
efficientad visa auroc 94.8 research-paper
yolov8-weld weld-defect-xray map 87.3 research
defectdet-resnet neu-det map 78.4 research
yolov8-weld severstal-steel dice 91.2 kaggle
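
Each row above has five whitespace-separated fields, so the table can be parsed line by line. The sketch below does that and picks the best entry per (dataset, metric). The set of lower-is-better metrics is an assumption inferred from the metric names (error rates and edit distances); this page does not state the direction of each metric, and the names `Result`, `parse_row`, and `leaders` are illustrative.

```python
from typing import NamedTuple


class Result(NamedTuple):
    model: str
    dataset: str
    metric: str
    value: float
    source: str


# Assumption: these metrics are error measures where lower is better.
LOWER_IS_BETTER = {
    "cer", "wer", "text-edit-distance",
    "formula-edit-distance", "ocr-edit-distance",
}


def parse_row(line):
    """Split one table row into its five whitespace-separated fields."""
    model, dataset, metric, value, source = line.split()
    return Result(model, dataset, metric, float(value), source)


def leaders(rows):
    """Map (dataset, metric) to the best Result, respecting metric direction."""
    best = {}
    for r in rows:
        key = (r.dataset, r.metric)
        current = best.get(key)
        if current is None:
            best[key] = r
        elif r.metric in LOWER_IS_BETTER:
            if r.value < current.value:
                best[key] = r
        elif r.value > current.value:
            best[key] = r
    return best


# Four rows copied from the table above.
table = """\
gemini-20-flash kitab-bench cer 0.13 alphaxiv-leaderboard
tesseract kitab-bench cer 0.54 alphaxiv-leaderboard
gpt-4o mmlu accuracy 88.7 openai-blog
o1-preview mmlu accuracy 92.3 openai-blog
"""
rows = [parse_row(line) for line in table.splitlines() if line.strip()]
best = leaders(rows)
print(best[("kitab-bench", "cer")].model)
print(best[("mmlu", "accuracy")].model)
```

Grouping by (dataset, metric) rather than by dataset alone matters here, because several datasets (for example omnidocbench and olmocr-bench) report more than one metric.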

Pending Verification

These results are claimed in papers or documentation but have not yet been manually verified against the source.

Model Dataset Claimed Value Status
trocr-large sroie 96.58 needs-pdf-verification
trocr-large iam 2.89 needs-pdf-verification
paddleocr-v4 icdar-2015 Unknown needs-documentation-verification
polish-roberta-ocr poleval-2021-ocr Unknown
polish-t5-ocr poleval-2021-ocr Unknown
herbert poleval-2021-ocr Unknown
abbyy-finereader impact-psnc Unknown
tesseract-polish impact-psnc Unknown
tesseract-polish codesota-polish Unknown
tesseract-polish codesota-polish-wikipedia Unknown
tesseract-polish codesota-polish-real Unknown
tesseract-polish codesota-polish-synth-random Unknown
tesseract-polish codesota-polish-synth-words Unknown

Data Quality

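Benchmark results are sourced from AlphaXiv benchmark leaderboards and from the other sources named in the Source column, including GitHub READMEs, vendor blogs, and research papers. Each data point includes the source URL for verification.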

Results marked as "pending verification" are claimed in papers but have not been independently confirmed. We do not include estimated or interpolated values.
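
Some (model, dataset, metric) combinations in the results table are reported by more than one source, for example deepseek-ocr on olmocr-bench. A simple audit can flag entries whose values differ between sources. This is a sketch, assuming rows are available as (model, dataset, metric, value, source) tuples; the function name `conflicting_entries` is illustrative and not part of the site.

```python
from collections import defaultdict


def conflicting_entries(rows):
    """Return {(model, dataset, metric): {source: value}} where sources disagree.

    If one source lists the same key twice, the later value overwrites
    the earlier one in this sketch.
    """
    by_key = defaultdict(dict)
    for model, dataset, metric, value, source in rows:
        by_key[(model, dataset, metric)][source] = value
    return {
        key: values
        for key, values in by_key.items()
        if len(set(values.values())) > 1
    }


# Three rows taken from the results table.
rows = [
    ("deepseek-ocr", "olmocr-bench", "pass-rate", 75.4, "github-readme"),
    ("deepseek-ocr", "olmocr-bench", "pass-rate", 75.7, "alphaxiv-leaderboard"),
    ("dots-ocr-3b", "olmocr-bench", "pass-rate", 79.1, "github-readme"),
]
conflicts = conflicting_entries(rows)
print(conflicts)
```

A disagreement does not by itself mean either value is wrong; the sources may have evaluated different model versions or benchmark revisions.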