Union14M.

Name: Union14M Benchmark Results
Creator: Unknown
License: https://creativecommons.org/licenses/by/4.0/

Union14M is a scene text recognition benchmark assembled from 14 datasets (4M labeled and 10M unlabeled images). Reported accuracy drops by 33-48% compared with standard benchmarks, exposing real-world model limitations across 7 challenge categories: Artistic, Multi-Oriented, Salient, Multi-Words, General, Contextless, and Incomplete.


§ 01 · SOTA history

Year over year.

Not enough data to show trend.

§ 02 · Leaderboard

Results by metric.

accuracy

Accuracy is the reported evaluation metric for Union14M. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
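The leaderboard entries describe their scores as word accuracy. As a rough illustration, word accuracy is usually taken to be the percentage of images whose predicted string exactly matches the ground-truth label. The sketch below assumes case-insensitive matching by default; the function name `word_accuracy`, the `case_sensitive` flag, and the sample strings are hypothetical, and the benchmark's official protocol may normalize or filter characters differently.

```python
def word_accuracy(predictions, references, case_sensitive=False):
    """Return the percentage of predictions that exactly match their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")

    def normalize(text):
        # Assumption: lowercase both strings unless case-sensitive matching is requested.
        return text if case_sensitive else text.lower()

    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Hypothetical predictions and labels, for illustration only.
preds = ["Coffee", "exit", "OPEM", "sale"]
refs = ["coffee", "EXIT", "OPEN", "sale"]
score = word_accuracy(preds, refs)
```

A single wrong character makes the whole word count as an error, which is why scores on harder images can fall sharply even when most characters are read correctly.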

Trust tiers for accuracy: verified, paper, vendor, community, unverified

Rank	Model	Notes	Trust	Score	Year	Source
01	CLIP4STR-B	CLIP4STR-B on Union14M-Benchmark. 70.8% word accuracy. Reported in Union14M paper (arXiv 2307.08723, ICCV 2023) and CLIP4STR paper. Best model on Union14M at time of benchmark publication.	paper	70.8	2026	Source ↗
02	PARSeq	PARSeq on Union14M-Benchmark. 67.8% word accuracy. Table 4 in Union14M paper (arXiv 2307.08723, ICCV 2023). A strong ECCV 2022 baseline whose limits are exposed by real-world difficulty.	paper	67.8	2026	Source ↗
03	CLIP4STR	CLIP4STR on Union14M benchmark. Leverages CLIP pretraining.	paper	67.3	2026	Source ↗
04	LPV-S	LPV-S (Language-Guided Progressive Vision, Small) on Union14M-Benchmark. 65.1% word accuracy. Table 4 in Union14M paper (arXiv 2307.08723, ICCV 2023).	paper	65.1	2026	Source ↗
05	PARSeq	PARSeq on Union14M. Permutation autoregressive model. ECCV 2022.	paper	63.8	2026	Source ↗
06	MAERec-S	MAERec-S on Union14M-Benchmark. 62.4% word accuracy. Table 4 in Union14M paper (arXiv 2307.08723, ICCV 2023). MAE pre-training for text recognition.	paper	62.4	2026	Source ↗
07	MATRN	MATRN on Union14M. Multi-granularity attention. From the Union14M paper.	paper	61.2	2026	Source ↗
08	CDistNet	CDistNet on Union14M-Benchmark. 56.2% word accuracy. Table 4 in Union14M paper (arXiv 2307.08723, ICCV 2023). AAAI 2022 baseline.	paper	56.2	2026	Source ↗
