Codesota · Benchmark · Union14M

Union14M

A next-generation scene text recognition benchmark assembled from 14 datasets (4M labeled and 10M unlabeled images). Model accuracy drops 33-48% relative to standard benchmarks, exposing real-world limitations across 7 challenge categories: Artistic, Multi-Oriented, Salient, Multi-Words, General, Contextless, and Incomplete.
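The benchmark's seven challenge categories lend themselves to per-category reporting rather than a single aggregate number. A minimal sketch of such a report, assuming hypothetical `preds`/`labels` lists per category (the category names come from the benchmark; everything else is illustrative):

```python
# Sketch: per-category word accuracy on Union14M-Benchmark.
# The seven category names are from the benchmark; the results dict
# is a hypothetical placeholder for a model's predictions.

CATEGORIES = ["Artistic", "Multi-Oriented", "Salient", "Multi-Words",
              "General", "Contextless", "Incomplete"]

def category_accuracy(preds, labels):
    """Fraction of images whose predicted word exactly matches the
    ground-truth label (case-insensitive exact match, a common choice
    in scene text recognition evaluation)."""
    assert len(preds) == len(labels) and labels
    hits = sum(p.lower() == g.lower() for p, g in zip(preds, labels))
    return hits / len(labels)

def union14m_report(results):
    """results: {category: (preds, labels)} -> (per-category accuracy,
    unweighted mean over the categories present)."""
    per_cat = {c: category_accuracy(*results[c])
               for c in CATEGORIES if c in results}
    mean = sum(per_cat.values()) / len(per_cat)
    return per_cat, mean
```

The unweighted mean over categories is one plausible aggregation; papers may instead weight by per-category image counts.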

§ 01 · SOTA history

Year over year.

Not enough data to show trend.
§ 02 · Leaderboard

Results by metric.

Metric: accuracy

Accuracy is the reported evaluation metric for Union14M. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
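For reference, the word-accuracy scores below are typically computed as the percentage of exact matches after light normalization. A minimal sketch, assuming the common lowercase alphanumeric-only comparison protocol (individual papers may normalize differently):

```python
# Sketch of word accuracy as commonly computed in scene text recognition:
# lowercase, strip non-alphanumeric characters, then exact match.
# The exact normalization varies across papers; this is one common choice.
import re

def normalize(word: str) -> str:
    """Lowercase and keep only [0-9a-z], a widely used STR convention."""
    return re.sub(r"[^0-9a-z]", "", word.lower())

def word_accuracy(preds, labels):
    """Percentage of predictions that exactly match their label
    after normalization."""
    hits = sum(normalize(p) == normalize(g) for p, g in zip(preds, labels))
    return 100.0 * hits / len(labels)

# word_accuracy(["Hello!", "W0rld"], ["hello", "world"])  # -> 50.0
```

Under this protocol, punctuation and case never affect the score, so "CAFE!" and "cafe" count as a match.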

Trust tiers for accuracy: verified · paper · vendor · community · unverified
Rank · Model · Trust · Score · Year · Source

01 · CLIP4STR-B · paper · 70.8 · 2026 · Source ↗
CLIP4STR-B on Union14M-Benchmark. 70.8% word accuracy. Reported in the Union14M paper (arXiv 2307.08723, ICCV 2023) and the CLIP4STR paper. Best model on Union14M at the time of benchmark publication.

02 · PARSeq · paper · 67.8 · 2026 · Source ↗
PARSeq on Union14M-Benchmark. 67.8% word accuracy. Table 4 in the Union14M paper (arXiv 2307.08723, ICCV 2023). Strong ECCV 2022 baseline exposed by real-world difficulty.

03 · CLIP4STR · paper · 67.3 · 2026 · Source ↗
CLIP4STR on the Union14M benchmark. Leverages CLIP pretraining.

04 · LPV-S · paper · 65.1 · 2026 · Source ↗
LPV-S (Language-Guided Progressive Vision, Small) on Union14M-Benchmark. 65.1% word accuracy. Table 4 in the Union14M paper (arXiv 2307.08723, ICCV 2023).

05 · PARSeq · paper · 63.8 · 2026 · Source ↗
PARSeq on Union14M. Permutation autoregressive model. ECCV 2022.

06 · MAERec-S · paper · 62.4 · 2026 · Source ↗
MAERec-S on Union14M-Benchmark. 62.4% word accuracy. Table 4 in the Union14M paper (arXiv 2307.08723, ICCV 2023). MAE pre-training for text recognition.

07 · MATRN · paper · 61.2 · 2026 · Source ↗
MATRN on Union14M. Multi-granularity attention. From the Union14M paper.

08 · CDistNet · paper · 56.2 · 2026 · Source ↗
CDistNet on Union14M-Benchmark. 56.2% word accuracy. Table 4 in the Union14M paper (arXiv 2307.08723, ICCV 2023). AAAI 2022 baseline.
§ 03 · Submit a result

Add to the leaderboard.
