Codesota · Benchmark · coco-text

coco-text.

coco-text is a scene text detection benchmark indexed on Codesota. This page tracks published model results, the top scores per metric, and the SOTA timeline for coco-text.

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.

1 1 Accuracy

1 1 Accuracy is one of the evaluation metrics reported for coco-text. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for 1 1 Accuracy: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|------|-------|-------|-------|-------|------|-------|
| 01 | CLIP4STR-L | From paper: CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 81.9 | 2023 | Paper, Code |
| 02 | MGP-STR | From paper: Multi-Granularity Prediction for Scene Text Recognition | verified | 81.7 | 2022 | Paper, Code |
| 03 | CLIP4STR-B | From paper: CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | verified | 81.1 | 2023 | Paper, Code |

F Measure

F Measure is one of the evaluation metrics reported for coco-text. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for F Measure: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|------|-------|-------|-------|-------|------|-------|
| 01 | TCM | CLIP-based detector with joint-dataset training. IJCAI 2025. | paper | 65.9 | 2026 | Source |
| 02 | PANet (Joint) | Pixel Aggregation Network with joint-dataset fine-tuning. | unverified | 64.5 | 2026 | Source |
| 03 | LRANet | Low-Rank Approximation Network. AAAI 2024 Oral. | unverified | 61.7 | 2026 | Source |
| 04 | DPText-DETR | DETR-based with dynamic point queries. Joint training. AAAI 2023. | unverified | 61.6 | 2026 | Source |
| 05 | MAEDet | MAE-based self-supervised pretraining for text detection. IJCAI 2025. | unverified | 60.6 | 2026 | Source |
| 06 | DBNet | Differentiable Binarization with fine-tuning. AAAI 2020. | unverified | 60.5 | 2026 | Source |
| 07 | DBNet++ | DB with Adaptive Scale Fusion. Joint training. TPAMI 2022. | paper | 59.5 | 2026 | Source |
| 08 | SRFormer | Segmentation+Regression Transformer. AAAI 2024. | unverified | 59.4 | 2026 | Source |
| 09 | Corner-based Region Proposals | From paper: Detecting Multi-Oriented Text with Corner-based Region Proposals | verified | 59.1 | 2018 | Paper, Code |
| 10 | TextBoxes++_MS | From paper: TextBoxes++: A Single-Shot Oriented Scene Text Detector | verified | 58.72 | 2018 | Paper, Code |
| 11 | FCENet | Fourier Contour Embedding. CVPR 2021. | paper | 57.9 | 2026 | Source |
| 12 | PSENet | Progressive Scale Expansion Network. CVPR 2019. | paper | 56 | 2026 | Source |
| 13 | ABCNet v2 | Adaptive Bezier-Curve Network v2. TPAMI 2021. | paper | 53.2 | 2026 | Source |
| 14 | EAST + VGG16 | From paper: EAST: An Efficient and Accurate Scene Text Detector | verified | 39.45 | 2017 | Paper, Code |
| 15 | SSTD | From paper: Single Shot Text Detector with Regional Attention | verified | 37 | 2017 | Paper, Code |
| 16 | WordSup (VGG16-synth-coco) | From paper: WordSup: Exploiting Word Annotations for Character based Text Detection | verified | 36.8 | 2017 | Paper |
| 17 | Yao et al. | From paper: Scene Text Detection via Holistic, Multi-Channel Prediction | verified | 33.31 | 2016 | Paper |
| 18 | DRRG | Deep Relational Reasoning Graph. CVPR 2020. | unverified | 31.9 | 2026 | Source |

Recall

Recall is one of the evaluation metrics reported for coco-text. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Recall: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|------|-------|-------|-------|-------|------|-------|
| 01 | Corner-based Region Proposals | From paper: Detecting Multi-Oriented Text with Corner-based Region Proposals | verified | 63.3 | 2018 | Paper, Code |
| 02 | TextBoxes++_MS | From paper: TextBoxes++: A Single-Shot Oriented Scene Text Detector | verified | 56.7 | 2018 | Paper, Code |
| 03 | EAST + VGG16 | From paper: EAST: An Efficient and Accurate Scene Text Detector | verified | 32.4 | 2017 | Paper, Code |
| 04 | SSTD | From paper: Single Shot Text Detector with Regional Attention | verified | 31 | 2017 | Paper, Code |
| 05 | WordSup (VGG16-synth-coco) | From paper: WordSup: Exploiting Word Annotations for Character based Text Detection | verified | 30.9 | 2017 | Paper |
| 06 | Yao et al. | From paper: Scene Text Detection via Holistic, Multi-Channel Prediction | verified | 27.1 | 2016 | Paper |

Precision

Precision is one of the evaluation metrics reported for coco-text. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Precision: verified, paper, vendor, community, unverified

| Rank | Model | Notes | Trust | Score | Year | Links |
|------|-------|-------|-------|-------|------|-------|
| 01 | TextBoxes++_MS | From paper: TextBoxes++: A Single-Shot Oriented Scene Text Detector | verified | 60.87 | 2018 | Paper, Code |
| 02 | Corner-based Region Proposals | From paper: Detecting Multi-Oriented Text with Corner-based Region Proposals | verified | 55.5 | 2018 | Paper, Code |
| 03 | EAST + VGG16 | From paper: EAST: An Efficient and Accurate Scene Text Detector | verified | 50.39 | 2017 | Paper, Code |
| 04 | SSTD | From paper: Single Shot Text Detector with Regional Attention | verified | 46 | 2017 | Paper, Code |
| 05 | WordSup (VGG16-synth-coco) | From paper: WordSup: Exploiting Word Annotations for Character based Text Detection | verified | 45.2 | 2017 | Paper |
| 06 | Yao et al. | From paper: Scene Text Detection via Holistic, Multi-Channel Prediction | verified | 43.23 | 2016 | Paper |
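
The F Measure, Recall, and Precision scores listed on this page can be cross-checked against each other. The Python sketch below assumes the conventional definition of F-measure as the harmonic mean of precision and recall; the page itself does not state the formula, so treat that as an assumption. The function name `f_measure` and the `rows` dictionary are illustrative; the numbers are copied from the leaderboard entries above.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent).

    Assumption: this is the conventional F1 definition, not a formula
    confirmed by the leaderboard page.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F Measure) copied from the tables on this page.
rows = {
    "Corner-based Region Proposals": (55.5, 63.3, 59.1),
    "TextBoxes++_MS": (60.87, 56.7, 58.72),
    "EAST + VGG16": (50.39, 32.4, 39.45),
    "SSTD": (46.0, 31.0, 37.0),
    "Yao et al.": (43.23, 27.1, 33.31),
}

for model, (p, r, reported) in rows.items():
    computed = f_measure(p, r)
    print(f"{model}: computed {computed:.2f}, reported {reported}")
```

Under that assumption, the recomputed values should land close to the reported F Measure scores, with small gaps attributable to rounding in the published precision and recall. A larger gap for a row would be a reasonable prompt to check the entry against its source.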
§ 04 · Submit a result

Add to the leaderboard.
