Optical Character Recognition | 2020 | en

mldoc-zero-shot-english-to-russian

Dataset from Papers With Code

Metrics: accuracy, cer, wer, f1
Current State of the Art

XLMft UDA

Unknown

89.7

accuracy

accuracy Progress Over Time

Showing 3 breakthroughs from May 2018 to Sep 2019

[Chart: accuracy (y-axis, ticks 58.6, 67.1, 75.6, 84.0, 92.5) by Date (x-axis: May 2018, Dec 2018, Sep 2019)]

Key Milestones

May 2018
BiLSTM (UN)
From paper: A Corpus for Multilingual Document Classification in Eight Languages
accuracy: 61.4

Dec 2018
Massively Multilingual Sentence Embeddings
From paper: Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
accuracy: 67.8 (+10.4%)

Sep 2019
XLMft UDA (Current SOTA)
From paper: Bridging the domain gap in cross-lingual document classification
accuracy: 89.7 (+32.3%)
Total Improvement: 46.0%
Time Span: 1y 4m
Breakthroughs: 3
Current SOTA: 89.7
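
The percentage figures in the milestones appear to be relative gains over the previous best score, not differences in accuracy points. A minimal sketch of that arithmetic follows, assuming the formula (new - old) / old; the page does not state how it computes these values, and the names `milestones` and `relative_gain` are introduced here for illustration only. Under this assumption the total comes out at roughly 46.09%, which suggests the displayed 46.0% is truncated rather than rounded.

```python
# Milestone scores as listed on the page (accuracy, in percent).
milestones = [
    ("May 2018", "BiLSTM (UN)", 61.4),
    ("Dec 2018", "Massively Multilingual Sentence Embeddings", 67.8),
    ("Sep 2019", "XLMft UDA", 89.7),
]


def relative_gain(old, new):
    """Relative improvement of `new` over `old`, in percent (assumed formula)."""
    return (new - old) / old * 100.0


# Gain of each milestone over the one before it, rounded to one decimal.
step_gains = [
    round(relative_gain(prev[2], cur[2]), 1)
    for prev, cur in zip(milestones, milestones[1:])
]

# Gain of the latest milestone over the first one, left unrounded.
total_gain = relative_gain(milestones[0][2], milestones[-1][2])

print(step_gains)
print(total_gain)
```

The step gains reproduce the +10.4% and +32.3% shown next to the Dec 2018 and Sep 2019 entries.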

Top Models Performance Comparison

Top 5 models ranked by accuracy

Rank  Model                         accuracy  % of best
1     XLMft UDA                     89.7      100.0%
2     MultiFiT, pseudo              67.8      75.6%
3     Massively Multilingual Se...  67.8      75.6%
4     BiLSTM (UN)                   61.4      68.5%
5     MultiCCA + CNN                60.8      67.8%
Best Score: 89.7
Top Model: XLMft UDA
Models Compared: 5
Score Range: 28.9
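
The "% of best" and "Score Range" values can be checked from the five listed scores. The sketch below assumes "% of best" means score divided by the top score, and "Score Range" means top score minus lowest score; neither definition is stated on the page, and the variable names are illustrative.

```python
# Scores of the five compared models, as listed on the page.
# The third model name is truncated in the source and kept as-is.
scores = {
    "XLMft UDA": 89.7,
    "MultiFiT, pseudo": 67.8,
    "Massively Multilingual Se...": 67.8,
    "BiLSTM (UN)": 61.4,
    "MultiCCA + CNN": 60.8,
}

best = max(scores.values())
worst = min(scores.values())

# Each score as a percentage of the best score, rounded to one decimal.
pct_of_best = {name: round(s / best * 100.0, 1) for name, s in scores.items()}

# Spread between the best and the worst score among the compared models.
score_range = round(best - worst, 1)

for name, pct in pct_of_best.items():
    print(f"{name}: {pct}%")
print("Score range:", score_range)
```

Under these assumptions the computed values match the ones shown in the comparison.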

accuracy (Primary)

Related Papers: 4
