
Total-Text.

Curved text benchmark. 1555 images with polygon annotations.

§ 01 · Leaderboard

Results by metric.


FPS

FPS (frames per second, a measure of inference speed) is one of the evaluation metrics reported for Total-Text. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for FPS: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links | Notes |
|---|---|---|---|---|---|---|
| 01 | FAST-T-448 | verified | 152.8 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 02 | FAST-S-512 | verified | 115.5 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 03 | FAST-B-512 | verified | 93.2 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |

Precision

Precision, the share of detected text regions that match a ground-truth annotation, is one of the evaluation metrics reported for Total-Text.

Higher is better

Trust tiers for precision: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links | Notes |
|---|---|---|---|---|---|---|
| 01 | DAT-SEG | verified | 95.04 | 2024 | Paper | Segmentation variant. ICML 2024. Table 1 in arxiv:2405.19765. |
| 02 | DAT-DET | verified | 93.98 | 2024 | Paper | Detection variant. ICML 2024. Table 1 in arxiv:2405.19765. |
| 03 | MixNet | verified | 93 | 2023 | Paper, Code | From paper: MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild |
| 04 | ERRNet (with pre-training) | verified | 92.6 | 2024 | Paper | With pre-training. AAAI 2025. Table 1 in arxiv:2412.14692. |
| 05 | SRFormer (ResNet-50) | verified | 92.2 | 2023 | Paper, Code | From paper: SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression |
| 06 | DPText-DETR (ResNet-50) | verified | 91.8 | 2022 | Paper, Code | From paper: DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer |
| 07 | LRANet | verified | 90.3 | 2023 | Paper, Code | With pre-training. AAAI 2024 Oral. Table 7 in arxiv:2412.14692. |
| 08 | ERRNet | verified | 90.1 | 2024 | Paper | Without pre-training. AAAI 2025. Table 1 in arxiv:2412.14692. |
| 09 | FAST-B-800 | verified | 90 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 10 | FAST-B-640 | verified | 89.9 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 11 | CharNet H-88 | verified | 89.9 | 2019 | Paper, Code | From paper: Convolutional Character Networks |
| 12 | I3CL + SSL (ResNet-50) | verified | 89.8 | 2021 | Paper, Code | From paper: I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection |
| 13 | FAST-B-512 | verified | 89.6 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 14 | TextMamba | verified | 89.5 | 2024 | Paper | ResNet-50 backbone. Table I in paper. arxiv:2512.06657 |
| 15 | PAN-640 | verified | 89.3 | 2019 | Paper, Code | From paper: Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network |
| 16 | TextFuseNet (ResNeXt-101) | verified | 89.2 | 2020 | Paper, Code | From paper: TextFuseNet: Scene Text Detection with Richer Fused Features |
| 17 | DBNet++ (ResNet-50) (800) | verified | 88.9 | 2022 | Paper, Code | From paper: Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion |
| 18 | FAST-S-512 | verified | 88.3 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 19 | CharNet H-88 (multi-scale) | verified | 88 | 2019 | Paper, Code | From paper: Convolutional Character Networks |
| 20 | CRAFT | verified | 87.6 | 2019 | Paper, Code | From paper: Character Region Awareness for Text Detection |
| 21 | DBNet++ (ResNet-18) (800) | verified | 87.4 | 2022 | Paper, Code | From paper: Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion |
| 22 | FAST-T-448 | verified | 86.5 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 23 | FTSN | verified | 84.7 | 2017 | Paper | From paper: Fused Text Segmentation Networks for Multi-oriented Scene Text Detection |
| 24 | PSENet-4s | verified | 84.5 | 2019 | Paper, Code | From paper: Shape Robust Text Detection with Progressive Scale Expansion Network |
| 25 | SPCNET | verified | 83 | 2018 | Paper, Code | From paper: Scene Text Detection with Supervised Pyramid Context Network |
| 26 | TextSnake | verified | 82.7 | 2018 | Paper, Code | From paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes |
| 27 | TextField | verified | 81.2 | 2018 | Paper, Code | From paper: TextField: Learning A Deep Direction Field for Irregular Scene Text Detection |

F Measure

F Measure, the harmonic mean of precision and recall, is one of the evaluation metrics reported for Total-Text.

Higher is better
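Several notes in the F Measure table quote precision (P) and recall (R) next to the score. As a minimal sketch, assuming the leaderboard scores are the standard harmonic mean of P and R rounded to the precision shown, the listed values can be cross-checked as follows. The function name `f_measure` and the `rows` dictionary are illustrative and not part of any Codesota tooling.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, listed F Measure) copied from the leaderboard notes.
rows = {
    "DAT-SEG": (95.04, 89.16, 92.01),
    "DAT-DET": (93.98, 88.17, 90.98),
    "ERRNet (with pre-training)": (92.6, 87.3, 89.9),
    "LRANet": (90.3, 87.8, 89.0),
    "ERRNet": (90.1, 86.1, 88.1),
}

for name, (p, r, listed) in rows.items():
    computed = f_measure(p, r)
    print(f"{name}: computed {computed:.2f}, listed {listed}")
```

A computed value that differs from the listed one by more than rounding error would be a reason to re-check the row against its source table.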

Trust tiers for F Measure: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links | Notes |
|---|---|---|---|---|---|---|
| 01 | DAT-SEG | verified | 92.01 | 2024 | Paper | Segmentation variant, new SOTA on Total-Text. P=95.04, R=89.16. ICML 2024. Table 1 in arxiv:2405.19765. |
| 02 | DAT-DET | verified | 90.98 | 2024 | Paper | Detection variant. P=93.98, R=88.17. ICML 2024. Table 1 in arxiv:2405.19765. |
| 03 | MixNet | verified | 90.5 | 2023 | Paper, Code | From paper: MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild |
| 04 | SRFormer (ResNet-50) | verified | 90 | 2023 | Paper, Code | From paper: SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression |
| 05 | ERRNet (with pre-training) | verified | 89.9 | 2024 | Paper | With pre-training. P=92.6, R=87.3. AAAI 2025. Table 1 in arxiv:2412.14692. |
| 06 | TextMamba | verified | 89.2 | 2024 | Paper | ResNet-50 backbone. Table I in paper. arxiv:2512.06657 |
| 07 | LRANet | verified | 89 | 2023 | Paper, Code | With pre-training. P=90.3, R=87.8. AAAI 2024 Oral. Table 7 in arxiv:2412.14692. |
| 08 | DPText-DETR (ResNet-50) | verified | 89 | 2022 | Paper, Code | From paper: DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer |
| 09 | ERRNet | verified | 88.1 | 2024 | Paper | Without pre-training. P=90.1, R=86.1. AAAI 2025. Table 1 in arxiv:2412.14692. |
| 10 | FAST-B-800 | verified | 87.5 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 11 | TextFuseNet (ResNeXt-101) | verified | 87.5 | 2020 | Paper, Code | From paper: TextFuseNet: Scene Text Detection with Richer Fused Features |
| 12 | I3CL + SSL (ResNet-50) | verified | 86.9 | 2021 | Paper, Code | From paper: I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection |
| 13 | CharNet H-88 (multi-scale) | verified | 86.5 | 2019 | Paper, Code | From paper: Convolutional Character Networks |
| 14 | FAST-B-640 | verified | 86.4 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 15 | DBNet++ (ResNet-50) (800) | verified | 86 | 2022 | Paper, Code | From paper: Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion |
| 16 | FAST-B-512 | verified | 85.8 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 17 | SA-Text | verified | 85.6 | 2019 | Paper | From paper: A method for detecting text of arbitrary shapes in natural scenes that improves text spotting |
| 18 | CharNet H-88 | verified | 85.6 | 2019 | Paper, Code | From paper: Convolutional Character Networks |
| 19 | PAN-640 | verified | 85 | 2019 | Paper, Code | From paper: Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network |
| 20 | FAST-S-512 | verified | 84.9 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 21 | DB-ResNet-50 (800) | verified | 84.7 | 2019 | Paper, Code | From paper: Real-time Scene Text Detection with Differentiable Binarization |
| 22 | TextCohesion | verified | 84.6 | 2019 | Paper | From paper: TextCohesion: Detecting Text for Arbitrary Shapes |
| 23 | CRAFT | verified | 83.6 | 2019 | Paper, Code | From paper: Character Region Awareness for Text Detection |
| 24 | DBNet++ (ResNet-18) (800) | verified | 83.3 | 2022 | Paper, Code | From paper: Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion |
| 25 | SPCNET | verified | 82.9 | 2018 | Paper, Code | From paper: Scene Text Detection with Supervised Pyramid Context Network |
| 26 | FAST-T-448 | verified | 81.6 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 27 | FTSN | verified | 81.3 | 2017 | Paper | From paper: Fused Text Segmentation Networks for Multi-oriented Scene Text Detection |
| 28 | TextField | verified | 80.6 | 2018 | Paper, Code | From paper: TextField: Learning A Deep Direction Field for Irregular Scene Text Detection |
| 29 | PSENet-4s | verified | 79.6 | 2019 | Paper, Code | From paper: Shape Robust Text Detection with Progressive Scale Expansion Network |
| 30 | TextSnake | verified | 78.4 | 2018 | Paper, Code | From paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes |

F Measure Full Lexicon

F Measure Full Lexicon, the end-to-end text spotting F-measure obtained with the full lexicon, is one of the evaluation metrics reported for Total-Text.

Higher is better

Trust tiers for F Measure Full Lexicon: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links | Notes |
|---|---|---|---|---|---|---|
| 01 | DeepSolo (ViTAEv2-S, TextOCR) | verified | 89.6 | 2022 | Paper, Code | From paper: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting |
| 02 | DeepSolo (ResNet-50, TextOCR) | verified | 88.7 | 2022 | Paper, Code | From paper: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting |
| 03 | DeepSolo (ResNet-50) | verified | 87 | 2022 | Paper, Code | From paper: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting |
| 04 | UNITS | verified | 86 | 2023 | Paper, Code | From paper: Towards Unified Scene Text Spotting based on Sequence Generation |
| 05 | A3S | verified | 85.1 | 2023 | Paper | From paper: A3S: Adversarial learning of semantic representations for Scene-Text Spotting |
| 06 | SwinTextSpotter | verified | 84.1 | 2022 | Paper, Code | From paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition |
| 07 | TESTR | verified | 83.9 | 2022 | Paper, Code | From paper: Text Spotting Transformers |
| 08 | MANGO | verified | 83.6 | 2020 | Paper, Code | From paper: MANGO: A Mask Attention Guided One-Stage Scene Text Spotter |
| 09 | DEER | verified | 83.3 | 2022 | Paper | From paper: DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting |
| 10 | GLASS | verified | 83 | 2022 | Paper, Code | From paper: GLASS: Global to Local Attention for Scene-Text Spotting |
| 11 | MaskTextSpotter v3 | verified | 78.4 | 2020 | Paper, Code | From paper: Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting |
| 12 | ABCNet v2 | verified | 78.1 | 2021 | Paper, Code | From paper: ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting |

Recall

Recall, the share of ground-truth text instances that are detected, is one of the evaluation metrics reported for Total-Text.

Higher is better

Trust tiers for recall: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links | Notes |
|---|---|---|---|---|---|---|
| 01 | DAT-SEG | verified | 89.16 | 2024 | Paper | Segmentation variant. ICML 2024. Table 1 in arxiv:2405.19765. |
| 02 | TextMamba | verified | 88.8 | 2024 | Paper | ResNet-50 backbone. Table I in paper. arxiv:2512.06657 |
| 03 | DAT-DET | verified | 88.17 | 2024 | Paper | Detection variant. ICML 2024. Table 1 in arxiv:2405.19765. |
| 04 | MixNet | verified | 88.1 | 2023 | Paper, Code | From paper: MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild |
| 05 | SRFormer (ResNet-50) | verified | 87.9 | 2023 | Paper, Code | From paper: SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression |
| 06 | LRANet | verified | 87.8 | 2023 | Paper, Code | With pre-training. AAAI 2024 Oral. Table 7 in arxiv:2412.14692. |
| 07 | ERRNet (with pre-training) | verified | 87.3 | 2024 | Paper | With pre-training. AAAI 2025. Table 1 in arxiv:2412.14692. |
| 08 | DPText-DETR (ResNet-50) | verified | 86.4 | 2022 | Paper, Code | From paper: DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer |
| 09 | ERRNet | verified | 86.1 | 2024 | Paper | Without pre-training. AAAI 2025. Table 1 in arxiv:2412.14692. |
| 10 | TextFuseNet (ResNeXt-101) | verified | 85.8 | 2020 | Paper, Code | From paper: TextFuseNet: Scene Text Detection with Richer Fused Features |
| 11 | FAST-B-800 | verified | 85.2 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 12 | CharNet H-88 (multi-scale) | verified | 85 | 2019 | Paper, Code | From paper: Convolutional Character Networks |
| 13 | I3CL + SSL (ResNet-50) | verified | 84.2 | 2021 | Paper, Code | From paper: I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection |
| 14 | FAST-B-640 | verified | 83.2 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 15 | DBNet++ (ResNet-50) (800) | verified | 83.2 | 2022 | Paper, Code | From paper: Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion |
| 16 | SPCNET | verified | 82.8 | 2018 | Paper, Code | From paper: Scene Text Detection with Supervised Pyramid Context Network |
| 17 | FAST-B-512 | verified | 82.4 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 18 | CharNet H-88 | verified | 81.7 | 2019 | Paper, Code | From paper: Convolutional Character Networks |
| 19 | FAST-S-512 | verified | 81.7 | 2021 | Paper, Code | From paper: FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation |
| 20 | PAN-640 | verified | 81 | 2019 | Paper, Code | From paper: Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network |
| 21 | TextField | verified | 79.9 | 2018 | Paper, Code | From paper: TextField: Learning A Deep Direction Field for Irregular Scene Text Detection |
| 22 | CRAFT | verified | 79.9 | 2019 | Paper, Code | From paper: Character Region Awareness for Text Detection |
| 23 | DBNet++ (ResNet-18) (800) | verified | 79.6 | 2022 | Paper, Code | From paper: Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion |

F Measure No Lexicon

F Measure No Lexicon, the end-to-end text spotting F-measure obtained without a lexicon, is one of the evaluation metrics reported for Total-Text.

Higher is better

Trust tiers for F Measure No Lexicon: verified, paper, vendor, community, unverified

| Rank | Model | Trust | Score | Year | Links | Notes |
|---|---|---|---|---|---|---|
| 01 | DeepSolo (ViTAEv2-S, TextOCR) | verified | 83.6 | 2022 | Paper, Code | From paper: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting |
| 02 | DeepSolo (ResNet-50, TextOCR) | verified | 82.5 | 2022 | Paper, Code | From paper: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting |
| 03 | DeepSolo (ResNet-50) | verified | 79.7 | 2022 | Paper, Code | From paper: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting |
| 04 | A3S | verified | 79.4 | 2023 | Paper | From paper: A3S: Adversarial learning of semantic representations for Scene-Text Spotting |
| 05 | UNITS | verified | 78.7 | 2023 | Paper, Code | From paper: Towards Unified Scene Text Spotting based on Sequence Generation |