
ICDAR 2015.

ICDAR 2015 contains 1,000 training and 500 test images captured with wearable cameras. It is a widely used benchmark for scene text detection.

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.


Precision

Precision is one of the evaluation metrics reported for ICDAR 2015: the percentage of predicted text regions that match a ground-truth region. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
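A minimal sketch of how detection precision and recall are commonly derived from match counts. It assumes detections have already been matched to ground-truth regions (for ICDAR 2015 this is typically done with an IoU threshold of 0.5, which this page does not state). The function names and the example counts are hypothetical and are not taken from Codesota.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Percentage of predicted text regions that match a ground-truth region."""
    predicted = true_positives + false_positives
    return 100.0 * true_positives / predicted if predicted else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Percentage of ground-truth text regions that were detected."""
    actual = true_positives + false_negatives
    return 100.0 * true_positives / actual if actual else 0.0


# Hypothetical counts, for illustration only.
tp, fp, fn = 90, 10, 30
print(f"precision={precision(tp, fp):.2f} recall={recall(tp, fn):.2f}")
```

Both functions return 0.0 when the denominator is empty, which is one possible convention; evaluation scripts may handle that case differently.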

Higher is better

Trust tiers for precision: verified, paper, vendor, community, unverified

| Rank | Model | Paper / note | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | TextFuseNet (ResNeXt-101) | TextFuseNet: Scene Text Detection with Richer Fused Features | verified | 93.96 | 2020 |
| 02 | CharNet H-88 (multi-scale) | Convolutional Character Networks | verified | 92.65 | 2019 |
| 03 | SBD | Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection | verified | 92.1 | 2019 |
| 04 | EK-Net | EK-Net (Expand Kernel Network), arXiv Jan 2024. ResNet-18 backbone at 35.42 FPS. arxiv:2401.11704. | paper | 92 | 2024 |
| 05 | FOTS MS | FOTS: Fast Oriented Text Spotting with a Unified Network | verified | 91.85 | 2018 |
| 06 | DB-ResNet-50 (1152) | Real-time Scene Text Detection with Differentiable Binarization | verified | 91.8 | 2019 |
| 07 | Mask TextSpotter | Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | verified | 91.6 | 2018 |
| 08 | CharNet H-57 (multi-scale) | Convolutional Character Networks | verified | 91.43 | 2019 |
| 09 | PMTD* | Pyramid Mask Text Detector | verified | 91.3 | 2019 |
| 10 | CharNet H-50 (single-scale) | Convolutional Character Networks | verified | 91.15 | 2019 |
| 11 | FOTS | FOTS: Fast Oriented Text Spotting with a Unified Network | verified | 91 | 2018 |
| 12 | DBNet++ (ResNet-50) (1152) | Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | verified | 90.9 | 2022 |
| 13 | CharNet H-50 (multi-scale) | Convolutional Character Networks | verified | 90.9 | 2019 |
| 14 | GNNets | Geometry Normalization Networks for Accurate Scene Text Detection | verified | 90.41 | 2019 |
| 15 | TESTR | TESTR (Text Spotting Transformers), CVPR 2022. Detection-only F-measure on ICDAR 2015 test set without lexicon. arxiv:2204.01918. | unverified | 90.31 | 2022 |
| 16 | DBNet++ (ResNet-18) (736) | Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | verified | 90.1 | 2022 |
| 17 | CharNet H-88 (single-scale) | Convolutional Character Networks | verified | 89.99 | 2019 |
| 18 | CRAFT | Character Region Awareness for Text Detection | verified | 89.8 | 2019 |
| 19 | FAST-B-1280 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 89.7 | 2021 |
| 20 | Corner Localization (multi-scale) | Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation | verified | 89.5 | 2018 |
| 21 | FAST-B-896 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 89.2 | 2021 |
| 22 | CharNet H-57 (single-scale) | Convolutional Character Networks | verified | 88.88 | 2019 |
| 23 | Corner-based Region Proposals | Detecting Multi-Oriented Text with Corner-based Region Proposals | verified | 88.7 | 2018 |
| 24 | SPCNET | Scene Text Detection with Supervised Pyramid Context Network | verified | 88.7 | 2018 |
| 25 | FTSN + MNMS | Fused Text Segmentation Networks for Multi-oriented Scene Text Detection | verified | 88.6 | 2017 |
| 26 | FAST-B-736 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 88 | 2021 |
| 27 | Quad_MS | TextBoxes++: A Single-Shot Oriented Scene Text Detector | verified | 87.8 | 2018 |
| 28 | PSENet-1s | Shape Robust Text Detection with Progressive Scale Expansion Network | verified | 86.9 | 2019 |
| 29 | SAST | A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning | verified | 86.72 | 2019 |
| 30 | FAST-S-736 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 86.3 | 2021 |
| 31 | FAST-T-736 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 86 | 2021 |
| 32 | SLPR | PixelLink: Detecting Scene Text via Instance Segmentation | verified | 85.5 | 2018 |
| 33 | PixelLink+VGG16 2s | TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes | verified | 85.5 | 2018 |
| 34 | TextSnake | Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network | verified | 84.9 | 2019 |
| 35 | PAN | Mask R-CNN with Pyramid Attention Network for Scene Text Detection | verified | 84 | 2017 |

F Measure

F Measure is one of the evaluation metrics reported for ICDAR 2015: the harmonic mean of precision and recall. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
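As a quick sanity check, the F-measure can be recomputed from a model's precision and recall. The sketch below assumes the standard harmonic-mean definition (F1); the function name `f_measure` is hypothetical, and the input values are the TextFuseNet (ResNeXt-101) precision and recall scores listed on this page.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as percentages)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


# TextFuseNet (ResNeXt-101): precision 93.96, recall 90.56 as listed on this page.
print(round(f_measure(93.96, 90.56), 2))  # → 92.23
```

The result agrees with the 92.23 listed for that model in the F Measure table, which suggests the three scores come from the same evaluation run.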

Higher is better

Trust tiers for F Measure: verified, paper, vendor, community, unverified

| Rank | Model | Paper / note | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | TextFuseNet (ResNeXt-101) | TextFuseNet: Scene Text Detection with Richer Fused Features | verified | 92.23 | 2020 |
| 02 | CharNet H-88 (multi-scale) | Convolutional Character Networks | verified | 91.55 | 2019 |
| 03 | CharNet H-88 (single-scale) | Convolutional Character Networks | verified | 90.97 | 2019 |
| 04 | CharNet H-50 (multi-scale) | Convolutional Character Networks | verified | 90.16 | 2019 |
| 05 | SBD | Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection | verified | 90.1 | 2019 |
| 06 | CharNet H-57 (multi-scale) | Convolutional Character Networks | verified | 90.06 | 2019 |
| 07 | FreeReal+DBNet | FreeReal with DBNet backbone, ECCV 2024. Bridging synthetic and real worlds for pre-training. Achieves 90.0% F-measure on IC15. arxiv:2312.05286. | paper | 90 | 2024 |
| 08 | TESTR | TESTR (Text Spotting Transformers), CVPR 2022. Detection-only F-measure on ICDAR 2015 test set without lexicon. arxiv:2204.01918. | paper | 90 | 2022 |
| 09 | FOTS MS | FOTS: Fast Oriented Text Spotting with a Unified Network | verified | 89.84 | 2018 |
| 10 | CharNet H-50 (single-scale) | Convolutional Character Networks | verified | 89.7 | 2019 |
| 11 | CharNet H-57 (single-scale) | Convolutional Character Networks | verified | 89.66 | 2019 |
| 12 | PMTD* | Pyramid Mask Text Detector | verified | 89.33 | 2019 |
| 13 | GNNets | Geometry Normalization Networks for Accurate Scene Text Detection | verified | 88.52 | 2019 |
| 14 | FOTS | FOTS: Fast Oriented Text Spotting with a Unified Network | verified | 87.99 | 2018 |
| 15 | DBNet++ (ResNet-50) (1152) | Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | verified | 87.3 | 2022 |
| 16 | DB-ResNet-50 (1152) | Real-time Scene Text Detection with Differentiable Binarization | verified | 87.3 | 2019 |
| 17 | SPCNET | Scene Text Detection with Supervised Pyramid Context Network | verified | 87.2 | 2018 |
| 18 | FAST-B-1280 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 87.1 | 2021 |
| 19 | SAST | A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning | verified | 86.91 | 2019 |
| 20 | CRAFT | Character Region Awareness for Text Detection | verified | 86.9 | 2019 |
| 21 | EK-Net++ | EK-Net++ improves EK-Net with Epoch Adaptive Weight algorithm. Expert Systems with Applications 2024. | paper | 86.72 | 2024 |
| 22 | FAST-B-896 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 86.3 | 2021 |
| 23 | Mask TextSpotter | Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | verified | 86 | 2018 |
| 24 | EK-Net | EK-Net (Expand Kernel Network), arXiv Jan 2024. ResNet-18 backbone at 35.42 FPS. arxiv:2401.11704. | paper | 85.72 | 2024 |
| 25 | PSENet-1s | Shape Robust Text Detection with Progressive Scale Expansion Network | verified | 85.7 | 2019 |
| 26 | FAST-B-736 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 84.7 | 2021 |
| 27 | Corner-based Region Proposals | Detecting Multi-Oriented Text with Corner-based Region Proposals | verified | 84.5 | 2018 |
| 28 | SLPR | PixelLink: Detecting Scene Text via Instance Segmentation | verified | 84.5 | 2018 |
| 29 | Corner Localization (multi-scale) | Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation | verified | 84.3 | 2018 |
| 30 | FTSN + MNMS | Fused Text Segmentation Networks for Multi-oriented Scene Text Detection | verified | 84.1 | 2017 |
| 31 | PixelLink+VGG16 2s | TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes | verified | 83.7 | 2018 |

Recall

Recall is one of the evaluation metrics reported for ICDAR 2015: the percentage of ground-truth text regions that are detected. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for recall: verified, paper, vendor, community, unverified

| Rank | Model | Paper / note | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | CharNet H-88 (single-scale) | Convolutional Character Networks | verified | 91.98 | 2019 |
| 02 | TextFuseNet (ResNeXt-101) | TextFuseNet: Scene Text Detection with Richer Fused Features | verified | 90.56 | 2020 |
| 03 | CharNet H-88 (multi-scale) | Convolutional Character Networks | verified | 90.47 | 2019 |
| 04 | CharNet H-57 (single-scale) | Convolutional Character Networks | verified | 90.45 | 2019 |
| 05 | TESTR | TESTR (Text Spotting Transformers), CVPR 2022. Detection-only F-measure on ICDAR 2015 test set without lexicon. arxiv:2204.01918. | unverified | 89.7 | 2022 |
| 06 | CharNet H-50 (multi-scale) | Convolutional Character Networks | verified | 89.44 | 2019 |
| 07 | CharNet H-57 (multi-scale) | Convolutional Character Networks | verified | 88.74 | 2019 |
| 08 | CharNet H-50 (single-scale) | Convolutional Character Networks | verified | 88.3 | 2019 |
| 09 | SBD | Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection | verified | 88.2 | 2019 |
| 10 | FOTS MS | FOTS: Fast Oriented Text Spotting with a Unified Network | verified | 87.92 | 2018 |
| 11 | PMTD* | Pyramid Mask Text Detector | verified | 87.43 | 2019 |
| 12 | SAST | A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning | verified | 87.09 | 2019 |
| 13 | GNNets | Geometry Normalization Networks for Accurate Scene Text Detection | verified | 86.71 | 2019 |
| 14 | SPCNET | Scene Text Detection with Supervised Pyramid Context Network | verified | 85.8 | 2018 |
| 15 | FOTS | FOTS: Fast Oriented Text Spotting with a Unified Network | verified | 85.17 | 2018 |
| 16 | FAST-B-1280 | FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation | verified | 84.6 | 2021 |
| 17 | PSENet-1s | Shape Robust Text Detection with Progressive Scale Expansion Network | verified | 84.5 | 2019 |
| 18 | CRAFT | Character Region Awareness for Text Detection | verified | 84.3 | 2019 |
| 19 | DBNet++ (ResNet-50) (1152) | Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion | verified | 83.9 | 2022 |

F Measure Strong Lexicon

F Measure Strong Lexicon is one of the evaluation metrics reported for ICDAR 2015: the end-to-end text spotting F-measure when recognition is constrained by the strong lexicon. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for F Measure Strong Lexicon: verified, paper, vendor, community, unverified

| Rank | Model | Paper / note | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | UNITS | Towards Unified Scene Text Spotting based on Sequence Generation | verified | 89 | 2023 |
| 02 | DeepSolo (ViTAEv2-S, TextOCR) | DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | verified | 88.1 | 2022 |
| 03 | DeepSolo (ResNet-50, TextOCR) | DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | verified | 88 | 2022 |
| 04 | DeepSolo (ResNet-50) | DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | verified | 86.8 | 2022 |
| 05 | SRTS | Single Shot Self-Reliant Scene Text Spotter by Decoupled yet Collaborative Detection and Recognition | verified | 85.6 | 2022 |
| 06 | TESTR | Text Spotting Transformers | verified | 85.2 | 2022 |
| 07 | A3S | A3S: Adversarial learning of semantic representations for Scene-Text Spotting | verified | 84.8 | 2023 |
| 08 | GLASS | GLASS: Global to Local Attention for Scene-Text Spotting | verified | 84.7 | 2022 |
| 09 | SwinTextSpotter | SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition | verified | 83.9 | 2022 |

F Measure Weak Lexicon

F Measure Weak Lexicon is one of the evaluation metrics reported for ICDAR 2015: the end-to-end text spotting F-measure when recognition is constrained by the weak lexicon. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for F Measure Weak Lexicon: verified, paper, vendor, community, unverified

| Rank | Model | Paper / note | Trust | Score | Year |
|---|---|---|---|---|---|
| 01 | UNITS | Towards Unified Scene Text Spotting based on Sequence Generation | verified | 84.1 | 2023 |
| 02 | DeepSolo (ViTAEv2-S, TextOCR) | DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting | verified | 83.9 | 2022 |