Handwriting Recognition | 2024 | multilingual

Cultural Heritage Understanding Research Repository OCR Dataset

Historical documents in 46 languages, totaling 99K pages. The benchmark tests handwritten and printed text recognition across diverse scripts.

Samples: 99,000
Metrics: handwritten-levenshtein, printed-levenshtein
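The page names the two metrics but does not define them. A minimal sketch of one plausible reading follows, assuming each score is a normalized Levenshtein similarity on a 0 to 100 scale where higher is better. The function names and the choice to normalize by the longer string are assumptions made for illustration, not details confirmed by this page, and the benchmark's own scoring may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop to bound memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity_score(prediction: str, reference: str) -> float:
    """Assumed metric: 100 * (1 - distance / length of the longer string)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 100.0  # two empty strings are treated as a perfect match
    return 100.0 * (1 - levenshtein(prediction, reference) / longest)
```

Under this assumption, a transcription with one wrong character out of four would score 75, and a per-category score such as handwritten-levenshtein would be an average of page-level scores over the handwritten pages.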
Current State of the Art

CHURRO (3B)

Stanford

70.1

handwritten-levenshtein

Top Models Performance Comparison

Top 6 models ranked by handwritten-levenshtein

Rank  Model             handwritten-levenshtein  % of best
1     CHURRO (3B)       70.1                     100.0%
2     Gemini 2.5 Pro    63.6                     90.7%
3     Gemini 2.5 Flash  58.7                     83.7%
4     Qwen2.5-VL 72B    54.5                     77.7%
5     Claude Sonnet 4   37.1                     52.9%
6     GPT-4o            34.2                     48.8%
Best Score: 70.1
Top Model: CHURRO (3B)
Models Compared: 6
Score Range: 35.9
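The summary figures above follow directly from the six handwritten-levenshtein scores. As a small sketch of that arithmetic, assuming "% of best" is each score divided by the top score and "Score Range" is the gap between the highest and lowest scores (the variable names are illustrative):

```python
# handwritten-levenshtein scores as listed on this page
scores = {
    "CHURRO (3B)": 70.1,
    "Gemini 2.5 Pro": 63.6,
    "Gemini 2.5 Flash": 58.7,
    "Qwen2.5-VL 72B": 54.5,
    "Claude Sonnet 4": 37.1,
    "GPT-4o": 34.2,
}

best_model = max(scores, key=scores.get)
best_score = scores[best_model]

# Each model's score as a percentage of the top score, to one decimal place.
percent_of_best = {
    model: round(100 * score / best_score, 1)
    for model, score in scores.items()
}

# Gap between the highest and lowest scores.
score_range = round(best_score - min(scores.values()), 1)

for model, pct in percent_of_best.items():
    print(f"{model}: {scores[model]} ({pct}% of best)")
print(f"Score range: {score_range}")
```

Both assumptions reproduce the percentages and the 35.9 range shown on the page.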

handwritten-levenshtein (Primary)

#  Model             Access       Organization  Score  Date
1  CHURRO (3B)       Open Source  Stanford      70.1   Dec 2025
2  Gemini 2.5 Pro    API          Google        63.6   Dec 2025
3  Gemini 2.5 Flash  -            Google        58.7   Dec 2025
4  Qwen2.5-VL 72B    Open Source  Alibaba       54.5   Dec 2025
5  Claude Sonnet 4   API          Anthropic     37.1   Dec 2025
6  GPT-4o            API          OpenAI        34.2   Dec 2025

printed-levenshtein

#  Model           Access       Organization  Score  Date
1  CHURRO (3B)     Open Source  Stanford      82.3   Dec 2025
2  Gemini 2.5 Pro  API          Google        80.9   Dec 2025

Other Handwriting Recognition Datasets