Codesota · Tasks · Scene Text Detection

Computer Vision

Scene Text Detection.

Detecting text regions in natural scene images

Datasets: 11

Results: 581

Canonical metric: accuracy

§ 02 · Canonical benchmark

The reference dataset.

coco-text

Dataset from Papers With Code

Primary metric: accuracy

§ 03 · Top 10

Leading models.

Leading models on coco-text.

#	Model	1-1-accuracy	Year	Source
1	CLIP4STR-L✓	81.9	2023	paper ↗
2	MGP-STR✓	81.7	2022	paper ↗
3	CLIP4STR-B✓	81.1	2023	paper ↗
4	TCM	65.9	2026	paper ↗
5	PANet (Joint)	64.5	2026	paper ↗
6	Corner-based Region Proposals✓	63.3	2018	paper ↗
7	LRANet	61.7	2026	paper ↗
8	DPText-DETR	61.6	2026	paper ↗
9	TextBoxes++_MS✓	60.9	2018	paper ↗
10	MAEDet	60.6	2026	paper ↗

§ 04 · All datasets

Tracked datasets.

11 datasets tracked for this task.

33 results · accuracy

Top: CLIP4STR-L — 81.9

188 results · f1

Top: TextFuseNet (ResNeXt-101) — 94.0

126 results · f1

Top: FAST-T-448 — 153

79 results · accuracy

Top: FAST-T-512 — 137

59 results · accuracy

Top: JSTR — 99.2

54 results · accuracy

Top: PMTD* — 84.4

18 results · f1

Top: DBNet++ (ResNet-50) (1024) — 88.5

11 results · accuracy

Top: CLIP4STR-L (DataComp-1B) — 86.4

8 results · accuracy

Top: CLIP4STR-B — 70.8

4 results · f1

Top: pil_maskrcnn — 82.7

1 result · accuracy

Top: BDN — 93.4

§ 05 · Related tasks

Other tasks in Computer Vision.

3D Understanding · Depth estimation · Document Image Classification · Document Layout Analysis · Document Parsing · Document Understanding · General OCR Capabilities · Handwriting Recognition
