Scene Text Detection · 2015 · en
ICDAR 2015 Incidental Scene Text
1,000 training and 500 test images captured with wearable cameras. A widely used benchmark for scene text detection.
accuracy
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | PGNet-A | 62.3 | | Apr 2021 |
| 2 | PGNet-E | 57.4 | | - |
f-measure
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | TextFuseNet (ResNeXt-101) | 92.23 | TextFuseNet: Scene Text Detection with Richer Fused Features / Code | May 2020 |
| 2 | CharNet H-88 (multi-scale) | 91.55 | | Oct 2019 |
| 3 | CharNet H-88 (single-scale) | 90.97 | | Oct 2019 |
| 4 | CharNet H-50 (multi-scale) | 90.16 | | Oct 2019 |
| 5 | SBD | 90.1 | | Dec 2019 |
| 6 | CharNet H-57 (multi-scale) | 90.06 | | Oct 2019 |
| 7 | FOTS MS | 89.84 | | Jan 2018 |
| 8 | CharNet H-50 (single-scale) | 89.7 | | Oct 2019 |
| 9 | CharNet H-57 (single-scale) | 89.66 | | Oct 2019 |
| 10 | PMTD* | 89.33 | | Mar 2019 |
| 11 | GNNets | 88.52 | | Sep 2019 |
| 12 | FOTS | 87.99 | | Jan 2018 |
| 13 | DB-ResNet-50 (1152) | 87.3 | | Nov 2019 |
| 14 | DBNet++ (ResNet-50) (1152) | 87.3 | | Feb 2022 |
| 15 | SPCNET | 87.2 | | Nov 2018 |
| 16 | FAST-B-1280 | 87.1 | | Nov 2021 |
| 17 | SAST | 86.91 | | Aug 2019 |
| 18 | CRAFT | 86.9 | | Apr 2019 |
| 19 | FAST-B-896 | 86.3 | | Nov 2021 |
| 20 | Mask TextSpotter | 86 | | Jul 2018 |
| 21 | PSENet-1s | 85.7 | | Mar 2019 |
| 22 | FAST-B-736 | 84.7 | | Nov 2021 |
| 23 | SLPR | 84.5 | | Jan 2018 |
| 24 | Corner-based Region Proposals | 84.5 | | Apr 2018 |
| 25 | Corner Localization (multi-scale) | 84.3 | | Feb 2018 |
| 26 | FTSN + MNMS | 84.1 | | Sep 2017 |
| 27 | PixelLink+VGG16 2s | 83.7 | | Jul 2018 |
| 28 | DBNet++ (ResNet-18) (736) | 83.1 | | Feb 2022 |
| 29 | Quad_MS | 82.9 | | Jan 2018 |
| 30 | FAST-S-736 | 82.9 | | Nov 2021 |
| 31 | PAN | 82.9 | | Apr 2017 |
| 32 | TextSnake | 82.6 | | Aug 2019 |
| 33 | FAST-T-736 | 81.7 | | Nov 2021 |
| 34 | EAST + PVANET2x RBOX (multi-scale) | 80.7 | | Sep 2017 |
| 35 | EAST + PVANET2x RBOX (single-scale) | 78.2 | | Apr 2017 |
| 36 | WordSup (VGG16-synth-icdar) | 78.2 | | Mar 2017 |
| 37 | SSTD | 77 | | Aug 2017 |
| 38 | SegLink | 75 | | Apr 2016 |
| 39 | MCLAB_FCN | 53.6 | | Apr 2021 |
f-measure-generic-lexicon
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | UNITS | 80.3 | | Apr 2023 |
| 2 | A3S | 79.6 | | Feb 2023 |
| 3 | DeepSolo (ViTAEv2-S, TextOCR) | 79.5 | | Nov 2022 |
| 4 | DeepSolo (ResNet-50, TextOCR) | 79.1 | | Nov 2022 |
| 5 | DeepSolo (ResNet-50) | 76.9 | | Nov 2022 |
| 6 | GLASS | 76.3 | | Aug 2022 |
| 7 | SRTS | 74.5 | | Jul 2022 |
| 8 | MaskTextSpotter v3 | 74.2 | | Jul 2020 |
| 9 | TESTR | 73.6 | | Apr 2022 |
| 10 | ABCNet v2 | 73 | | May 2021 |
| 11 | SPTS v2 | 72.6 | | Jan 2023 |
| 12 | SwinTextSpotter | 70.5 | | Mar 2022 |
| 13 | MANGO | 67.3 | | Dec 2020 |
| 14 | SPTS | 65.8 | | Dec 2021 |
| 15 | TextDragon | 65.2 | TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting | Oct 2019 |
| 16 | TextPerceptron | 65.1 | | Feb 2020 |
| 17 | PGNet | 63.5 | | Apr 2021 |
| 18 | FOTS | 62.2 | | Jan 2018 |
f-measure-strong-lexicon
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | UNITS | 89 | | Apr 2023 |
| 2 | DeepSolo (ViTAEv2-S, TextOCR) | 88.1 | | Nov 2022 |
| 3 | DeepSolo (ResNet-50, TextOCR) | 88 | | Nov 2022 |
| 4 | DeepSolo (ResNet-50) | 86.8 | | Nov 2022 |
| 5 | SRTS | 85.6 | | Jul 2022 |
| 6 | TESTR | 85.2 | | Apr 2022 |
| 7 | A3S | 84.8 | | Feb 2023 |
| 8 | GLASS | 84.7 | | Aug 2022 |
| 9 | SwinTextSpotter | 83.9 | | Mar 2022 |
| 10 | FOTS | 83.6 | | Jan 2018 |
| 11 | MaskTextSpotter v3 | 83.3 | | Jul 2020 |
| 12 | PGNet | 83.3 | | Apr 2021 |
| 13 | ABCNet v2 | 82.7 | | May 2021 |
| 14 | TextDragon | 82.5 | TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting | Oct 2019 |
| 15 | SPTS v2 | 82.3 | | Jan 2023 |
| 16 | MANGO | 81.8 | | Dec 2020 |
| 17 | TextPerceptron | 80.5 | | Feb 2020 |
| 18 | SPTS | 77.5 | | Dec 2021 |
f-measure-weak-lexicon
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | UNITS | 84.1 | | Apr 2023 |
| 2 | DeepSolo (ViTAEv2-S, TextOCR) | 83.9 | | Nov 2022 |
| 3 | A3S | 83.7 | | Feb 2023 |
| 4 | DeepSolo (ResNet-50, TextOCR) | 83.5 | | Nov 2022 |
| 5 | DeepSolo (ResNet-50) | 81.9 | | Nov 2022 |
| 6 | SRTS | 81.7 | | Jul 2022 |
| 7 | GLASS | 80.1 | | Aug 2022 |
| 8 | TESTR | 79.4 | | Apr 2022 |
| 9 | MANGO | 78.9 | | Dec 2020 |
| 10 | ABCNet v2 | 78.5 | | May 2021 |
| 11 | TextDragon | 78.3 | TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting | Oct 2019 |
| 12 | PGNet | 78.3 | | Apr 2021 |
| 13 | MaskTextSpotter v3 | 78.1 | | Jul 2020 |
| 14 | SPTS v2 | 77.7 | | Jan 2023 |
| 15 | SwinTextSpotter | 77.3 | | Mar 2022 |
| 16 | TextPerceptron | 76.6 | | Feb 2020 |
| 17 | FOTS | 74.5 | | Jan 2018 |
| 18 | SPTS | 70.2 | | Dec 2021 |
fps
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | FAST-T-736 | 60.9 | | Nov 2021 |
| 2 | FAST-S-736 | 53.9 | | Nov 2021 |
| 3 | DBNet++ (ResNet-18) (736) | 44 | | Feb 2022 |
| 4 | FAST-B-736 | 42.7 | | Nov 2021 |
| 5 | FAST-B-896 | 31.8 | | Nov 2021 |
| 6 | FAST-B-1280 | 15.7 | | Nov 2021 |
| 7 | DBNet++ (ResNet-50) (1152) | 10 | | Feb 2022 |
precision
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | TextFuseNet (ResNeXt-101) | 93.96 | TextFuseNet: Scene Text Detection with Richer Fused Features / Code | May 2020 |
| 2 | CharNet H-88 (multi-scale) | 92.65 | | Oct 2019 |
| 3 | SBD | 92.1 | | Dec 2019 |
| 4 | FOTS MS | 91.85 | | Jan 2018 |
| 5 | DB-ResNet-50 (1152) | 91.8 | | Nov 2019 |
| 6 | Mask TextSpotter | 91.6 | | Jul 2018 |
| 7 | CharNet H-57 (multi-scale) | 91.43 | | Oct 2019 |
| 8 | PMTD* | 91.3 | | Mar 2019 |
| 9 | CharNet H-50 (single-scale) | 91.15 | | Oct 2019 |
| 10 | FOTS | 91 | | Jan 2018 |
| 11 | CharNet H-50 (multi-scale) | 90.9 | | Oct 2019 |
| 12 | DBNet++ (ResNet-50) (1152) | 90.9 | | Feb 2022 |
| 13 | GNNets | 90.41 | | Sep 2019 |
| 14 | DBNet++ (ResNet-18) (736) | 90.1 | | Feb 2022 |
| 15 | CharNet H-88 (single-scale) | 89.99 | | Oct 2019 |
| 16 | CRAFT | 89.8 | | Apr 2019 |
| 17 | FAST-B-1280 | 89.7 | | Nov 2021 |
| 18 | Corner Localization (multi-scale) | 89.5 | | Feb 2018 |
| 19 | FAST-B-896 | 89.2 | | Nov 2021 |
| 20 | CharNet H-57 (single-scale) | 88.88 | | Oct 2019 |
| 21 | SPCNET | 88.7 | | Nov 2018 |
| 22 | Corner-based Region Proposals | 88.7 | | Apr 2018 |
| 23 | FTSN + MNMS | 88.6 | | Sep 2017 |
| 24 | FAST-B-736 | 88 | | Nov 2021 |
| 25 | Quad_MS | 87.8 | | Jan 2018 |
| 26 | PSENet-1s | 86.9 | | Mar 2019 |
| 27 | SAST | 86.72 | | Aug 2019 |
| 28 | FAST-S-736 | 86.3 | | Nov 2021 |
| 29 | FAST-T-736 | 86 | | Nov 2021 |
| 30 | SLPR | 85.5 | | Jan 2018 |
| 31 | PixelLink+VGG16 2s | 85.5 | | Jul 2018 |
| 32 | TextSnake | 84.9 | | Aug 2019 |
| 33 | PAN | 84 | | Apr 2017 |
| 34 | EAST + PVANET2x RBOX (single-scale) | 83.6 | | Apr 2017 |
| 35 | EAST + PVANET2x RBOX (multi-scale) | 83.3 | | Sep 2017 |
| 36 | SSTD | 80 | | Aug 2017 |
| 37 | WordSup (VGG16-synth-icdar) | 79.3 | | Mar 2017 |
| 38 | SegLink | 73.1 | | Apr 2016 |
| 39 | MCLAB_FCN | 70.8 | | Apr 2021 |
recall
| # | Model | Score | Paper / Code | Date |
|---|---|---|---|---|
| 1 | CharNet H-88 (single-scale) | 91.98 | | Oct 2019 |
| 2 | TextFuseNet (ResNeXt-101) | 90.56 | TextFuseNet: Scene Text Detection with Richer Fused Features / Code | May 2020 |
| 3 | CharNet H-88 (multi-scale) | 90.47 | | Oct 2019 |
| 4 | CharNet H-57 (single-scale) | 90.45 | | Oct 2019 |
| 5 | CharNet H-50 (multi-scale) | 89.44 | | Oct 2019 |
| 6 | CharNet H-57 (multi-scale) | 88.74 | | Oct 2019 |
| 7 | CharNet H-50 (single-scale) | 88.3 | | Oct 2019 |
| 8 | SBD | 88.2 | | Dec 2019 |
| 9 | FOTS MS | 87.92 | | Jan 2018 |
| 10 | PMTD* | 87.43 | | Mar 2019 |
| 11 | SAST | 87.09 | | Aug 2019 |
| 12 | GNNets | 86.71 | | Sep 2019 |
| 13 | SPCNET | 85.8 | | Nov 2018 |
| 14 | FOTS | 85.17 | | Jan 2018 |
| 15 | FAST-B-1280 | 84.6 | | Nov 2021 |
| 16 | PSENet-1s | 84.5 | | Mar 2019 |
| 17 | CRAFT | 84.3 | | Apr 2019 |
| 18 | DBNet++ (ResNet-50) (1152) | 83.9 | | Feb 2022 |
| 19 | SLPR | 83.6 | | Jan 2018 |
| 20 | FAST-B-896 | 83.6 | | Nov 2021 |
| 21 | DB-ResNet-50 (1152) | 83.2 | | Nov 2019 |
| 22 | PixelLink+VGG16 2s | 82 | | Jul 2018 |
| 23 | PAN | 81.9 | | Apr 2017 |
| 24 | FAST-B-736 | 81.7 | | Nov 2021 |
| 25 | Mask TextSpotter | 81 | | Jul 2018 |
| 26 | Corner-based Region Proposals | 80.7 | | Apr 2018 |
| 27 | TextSnake | 80.4 | | Aug 2019 |
| 28 | FTSN + MNMS | 80 | | Sep 2017 |
| 29 | FAST-S-736 | 79.8 | | Nov 2021 |
| 30 | Corner Localization (multi-scale) | 79.7 | | Feb 2018 |
| 31 | Quad_MS | 78.5 | | Jan 2018 |
| 32 | EAST + PVANET2x RBOX (multi-scale) | 78.3 | | Sep 2017 |
| 33 | FAST-T-736 | 77.9 | | Nov 2021 |
| 34 | DBNet++ (ResNet-18) (736) | 77.2 | | Feb 2022 |
| 35 | WordSup (VGG16-synth-icdar) | 77 | | Mar 2017 |
| 36 | SegLink | 76.8 | | Apr 2016 |
| 37 | EAST + PVANET2x RBOX (single-scale) | 73.5 | | Apr 2017 |
| 38 | SSTD | 73 | | Aug 2017 |
| 39 | MCLAB_FCN | 43 | | Apr 2021 |
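The f-measure, precision, and recall tables above appear to be linked in the usual way: the f-measure is the harmonic mean of precision and recall. The page does not state this, so treat the following as a minimal sketch under that assumption. The helper name `f_measure` is hypothetical; the precision and recall values are copied from the tables.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent).

    Assumption: the leaderboard's f-measure is the standard F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the precision and recall tables.
rows = {
    "TextFuseNet (ResNeXt-101)": (93.96, 90.56),
    "FOTS": (91.0, 85.17),
    "DB-ResNet-50 (1152)": (91.8, 83.2),
}

for model, (p, r) in rows.items():
    print(f"{model}: {f_measure(p, r):.2f}")
```

For these three rows the computed values agree with the listed f-measures (92.23, 87.99, and 87.3) to within rounding. Some other rows differ in the last digit, presumably because the scores were rounded or truncated before being listed.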
Related Papers (39)
Towards Unified Scene Text Spotting based on Sequence Generation
Apr 2023 · Models: UNITS
SPTS v2: Single-Point Scene Text Spotting
Jan 2023 · Models: SPTS v2
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Nov 2022 · Models: DeepSolo (ViTAEv2-S, TextOCR), DeepSolo (ResNet-50, TextOCR), DeepSolo (ResNet-50)
GLASS: Global to Local Attention for Scene-Text Spotting
Aug 2022 · Models: GLASS
Text Spotting Transformers
Apr 2022 · Models: TESTR
SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition
Mar 2022 · Models: SwinTextSpotter
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion
Feb 2022 · Models: DBNet++ (ResNet-50) (1152), DBNet++ (ResNet-18) (736)
SPTS: Single-Point Text Spotting
Dec 2021 · Models: SPTS
FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation
Nov 2021 · Models: FAST-B-1280, FAST-B-896, FAST-B-736 +2 more
ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting
May 2021 · Models: ABCNet v2
PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network
Apr 2021 · Models: PGNet-A, MCLAB_FCN, PGNet
MANGO: A Mask Attention Guided One-Stage Scene Text Spotter
Dec 2020 · Models: MANGO
Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting
Jul 2020 · Models: MaskTextSpotter v3
Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting
Feb 2020 · Models: TextPerceptron
Real-time Scene Text Detection with Differentiable Binarization
Nov 2019 · Models: DB-ResNet-50 (1152)
Convolutional Character Networks
Oct 2019 · Models: CharNet H-88 (multi-scale), CharNet H-88 (single-scale), CharNet H-50 (multi-scale) +3 more
Geometry Normalization Networks for Accurate Scene Text Detection
Sep 2019 · Models: GNNets
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Aug 2019 · Models: TextSnake
Character Region Awareness for Text Detection
Apr 2019 · Models: CRAFT
Pyramid Mask Text Detector
Mar 2019 · Models: PMTD*
Shape Robust Text Detection with Progressive Scale Expansion Network
Mar 2019 · Models: PSENet-1s
Scene Text Detection with Supervised Pyramid Context Network
Nov 2018 · Models: SPCNET
Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
Jul 2018 · Models: Mask TextSpotter
TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
Jul 2018 · Models: PixelLink+VGG16 2s
Detecting Multi-Oriented Text with Corner-based Region Proposals
Apr 2018 · Models: Corner-based Region Proposals
Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation
Feb 2018 · Models: Corner Localization (multi-scale)
TextBoxes++: A Single-Shot Oriented Scene Text Detector
Jan 2018 · Models: Quad_MS
FOTS: Fast Oriented Text Spotting with a Unified Network
Jan 2018 · Models: FOTS MS, FOTS
PixelLink: Detecting Scene Text via Instance Segmentation
Jan 2018 · Models: SLPR
Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
Sep 2017 · Models: FTSN + MNMS
Single Shot Text Detector with Regional Attention
Sep 2017 · Models: EAST + PVANET2x RBOX (multi-scale)
WordSup: Exploiting Word Annotations for Character based Text Detection
Aug 2017 · Models: SSTD
EAST: An Efficient and Accurate Scene Text Detector
Apr 2017 · Models: PAN, EAST + PVANET2x RBOX (single-scale)
Detecting Oriented Text in Natural Images by Linking Segments
Mar 2017 · Models: WordSup (VGG16-synth-icdar)
Multi-Oriented Text Detection with Fully Convolutional Networks
Apr 2016 · Models: SegLink