Benchmarks

326 benchmarks across 82 tasks in 14 research areas.

75 Active
2469 Results
82 Tasks
14 Areas

Top by area

Ranked by results, recency, and community interest.

5 results · accuracy · Est. 2019 · Latest: Jun 2025
6 results · success-rate · Est. 2024 · Latest: Apr 2025
5 results · normalized-score · Est. 2024 · Latest: Apr 2025
5 results · task-horizon-minutes · Est. 2024 · Latest: Apr 2025
6 results · auroc · Est. 2021 · Latest: Mar 2025
15 results · resolve-rate · Est. 2024 · Latest: Feb 2025
11 results · accuracy · Est. 2024 · Latest: Feb 2025
9 results · accuracy · Est. 2019 · Latest: Feb 2025
15 results · accuracy · Est. 2020 · Latest: Feb 2025
12 results · mse · Est. 2021 · Latest: Feb 2025

All benchmarks

82 results · accuracy · Est. 2020 · Latest: Dec 2024
80 results · accuracy · Est. 2020 · Latest: May 2023
icdar2013 · Legacy
39 results · accuracy · Est. 2020 · Latest: Aug 2023
dart · Needs Research
32 results · accuracy · Est. 2020 · Latest: Oct 2023
tabfact · Needs Research
23 results · accuracy · Est. 2020 · Latest: Dec 2024
icdar2015 · Needs Research
26 results · accuracy · Est. 2020 · Latest: Aug 2023
inverse-text · Needs Research
18 results · accuracy · Est. 2020 · Latest: May 2023
15 results · accuracy · Est. 2020 · Latest: Feb 2025
sun-rgb-d · Needs Research
19 results · accuracy · Est. 2020 · Latest: Jun 2021
CodeSearchNet · Needs Research
14 results · accuracy · Est. 2020 · Latest: Sep 2024
pendigits · Needs Research
15 results · accuracy · Est. 2020 · Latest: May 2021
lam(line-level) · Needs Research
12 results · accuracy · Est. 2020 · Latest: Sep 2024
read2016(line-level) · Needs Research
9 results · accuracy · Est. 2020 · Latest: Sep 2024
iam(line-level) · Needs Research
9 results · accuracy · Est. 2020 · Latest: Sep 2024
howsumm-step · Needs Research
11 results · accuracy · Est. 2020 · Latest: Oct 2021
e2e · Needs Research
10 results · accuracy · Est. 2020 · Latest: Jul 2021
howsumm-method · Needs Research
9 results · accuracy · Est. 2020 · Latest: Oct 2021
urdudoc · Needs Research
9 results · accuracy · Est. 2020 · Latest: Jun 2023
wikibio · Needs Research
8 results · accuracy · Est. 2020 · Latest: Feb 2021
8 results · cer · Est. 2024
codesearchnet---php · Needs Research
8 results · accuracy · Est. 2020 · Latest: Apr 2021
8 results · accuracy · Est. 2020 · Latest: Apr 2021
belfort · Needs Research
8 results · accuracy · Est. 2020 · Latest: Jun 2023
reuters-21578 · Needs Research
8 results · accuracy · Est. 2020 · Latest: Mar 2020
codesearchnet---java · Needs Research
8 results · accuracy · Est. 2020 · Latest: Apr 2021
read-2016 · Needs Research
4 results · accuracy · Est. 2020 · Latest: Sep 2024
codesearchnet---go · Needs Research
7 results · accuracy · Est. 2020 · Latest: Apr 2021
7 results · accuracy · Est. 2020 · Latest: Aug 2023
7 results · accuracy · Est. 2020 · Latest: Apr 2021
codesearchnet---ruby · Needs Research
7 results · accuracy · Est. 2020 · Latest: Apr 2021
webnlg-(all) · Needs Research
6 results · accuracy · Est. 2020 · Latest: Jul 2021
6 results · accuracy · Est. 2020 · Latest: Sep 2019
hoc · Needs Research
6 results · accuracy · Est. 2020 · Latest: Oct 2022
tobacco-small-3482 · Needs Research
6 results · accuracy · Est. 2020 · Latest: Apr 2020
webnlg-(seen) · Needs Research
6 results · accuracy · Est. 2020 · Latest: Jul 2021
6 results · accuracy · Est. 2020 · Latest: Sep 2019
webnlg-(unseen) · Needs Research
6 results · accuracy · Est. 2020 · Latest: Jul 2021
Thai OCR Benchmark · Needs Research
5 results · ted-score · Est. 2024
5 results · accuracy · Est. 2020 · Latest: Sep 2019
5 results · accuracy · Est. 2020 · Latest: Feb 2020
5 results · accuracy · Est. 2020 · Latest: Sep 2019
5 results · accuracy · Est. 2020 · Latest: Sep 2019
dwie · Needs Research
1 result · accuracy · Est. 2020 · Latest: Dec 2024
re-docred · Needs Research
1 result · accuracy · Est. 2020 · Latest: Dec 2024
bc8 · Needs Research
1 result · accuracy · Est. 2020 · Latest: Jan 2025
1 result · accuracy · Est. 2020 · Latest: Oct 2024
docred-ie · Needs Research
1 result · accuracy · Est. 2020 · Latest: Apr 2024
lun · Needs Research
1 result · accuracy · Est. 2020 · Latest: Oct 2024
4 results · accuracy · Est. 2020 · Latest: Sep 2019
bbcsport · Needs Research
4 results · accuracy · Est. 2020 · Latest: Mar 2020
stdw · Needs Research
4 results · accuracy · Est. 2020 · Latest: Aug 2022
twitter · Needs Research
3 results · accuracy · Est. 2020 · Latest: Mar 2020
cub-200-2011 · Needs Research
3 results · accuracy · Est. 2020 · Latest: Dec 2023
sut · Needs Research
3 results · accuracy · Est. 2020 · Latest: Nov 2023
rotowire · Needs Research
3 results · accuracy · Est. 2020 · Latest: Aug 2021
fsns---test · Needs Research
3 results · accuracy · Est. 2020 · Latest: Dec 2017
bbc-xsum · Needs Research
3 results · accuracy · Est. 2020 · Latest: Jul 2020
dareczech · Needs Research
3 results · accuracy · Est. 2020 · Latest: Dec 2021
3 results · accuracy · Est. 2020 · Latest: Dec 2014
amazon · Needs Research
3 results · accuracy · Est. 2020 · Latest: Mar 2020
3 results · accuracy · Est. 2020 · Latest: Sep 2019
3 results · accuracy · Est. 2020 · Latest: Dec 2014
recipe · Needs Research
2 results · accuracy · Est. 2020 · Latest: Dec 2019
simara · Needs Research
2 results · accuracy · Est. 2020 · Latest: Apr 2023
wos-5736 · Needs Research
2 results · accuracy · Est. 2020 · Latest: Sep 2017
i2l-140k · Needs Research
2 results · accuracy · Est. 2020 · Latest: Feb 2018
classic · Needs Research
2 results · accuracy · Est. 2020 · Latest: Dec 2019
scidocs-(mag) · Needs Research
2 results · accuracy · Est. 2020 · Latest: Feb 2022
aapd · Needs Research
2 results · accuracy · Est. 2020 · Latest: Feb 2020
icdar-2019 · Needs Research
2 results · accuracy · Est. 2020 · Latest: Mar 2022
scidocs-(mesh) · Needs Research
2 results · accuracy · Est. 2020 · Latest: Feb 2022
dise-2021-dataset · Needs Research
2 results · accuracy · Est. 2020 · Latest: Oct 2022
imdb-m · Needs Research
2 results · accuracy · Est. 2020 · Latest: Mar 2021
cedar-signature · Needs Research
2 results · accuracy · Est. 2020 · Latest: Sep 2020
textzoom · Needs Research
2 results · accuracy · Est. 2020 · Latest: Nov 2022
clueweb09-b · Needs Research
2 results · accuracy · Est. 2020 · Latest: Jun 2019
australian · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
warppie10p · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
wos-11967 · Needs Research
1 result · accuracy · Est. 2020 · Latest: Sep 2017
digital-peter · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2021
hkr · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2021
food-101 · Needs Research
1 result · accuracy · Est. 2020 · Latest: Dec 2020
ba · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
1 result · accuracy · Est. 2020 · Latest: Nov 2021
iam-b · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2021
1 result · accuracy · Est. 2020 · Latest: May 2018
iam-d · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2021
saint-gall · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2021
reuters-de-en · Needs Research
1 result · accuracy · Est. 2020 · Latest: Oct 2014
mpqa · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2019
wine · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
textseg · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2022
pixraw10p · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
im2latex-100k · Needs Research
1 result · accuracy · Est. 2020 · Latest: Feb 2018
jaffe · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
1 result · accuracy · Est. 2020 · Latest: Nov 2021
iris · Needs Research
1 result · accuracy · Est. 2020 · Latest: Nov 2020
and-dataset · Needs Research
1 result · accuracy · Est. 2020 · Latest: Sep 2020
reuters-en-de · Needs Research
1 result · accuracy · Est. 2020 · Latest: Oct 2014
yelp-14 · Needs Research
1 result · accuracy · Est. 2020 · Latest: Apr 2019
wikilingua-(tr->en) · Needs Research
1 result · accuracy · Est. 2020 · Latest: Dec 2021
wos-46985 · Needs Research
1 result · accuracy · Est. 2020 · Latest: Sep 2017
bentham · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2021
1 result · accuracy · Est. 2020 · Latest: Nov 2022
ephoie · Needs Research
1 result · accuracy · Est. 2020 · Latest: Apr 2022
cl-scisumm · Needs Research
1 result · accuracy · Est. 2020 · Latest: Sep 2019
cer · Est. 2025
cer · Est. 2021
188 results · f1 · Est. 2015 · Latest: Apr 2023
Total-Text · Needs Research
108 results · f1 · Est. 2017 · Latest: Aug 2023
msra-td500 · Needs Research
61 results · accuracy · Est. 2020 · Latest: Aug 2023
49 results · accuracy · Est. 2020 · Latest: Jul 2022
icdar-2017-mlt · Needs Research
42 results · accuracy · Est. 2020 · Latest: Dec 2019
coco-text · Needs Research
33 results · accuracy · Est. 2020 · Latest: May 2023
18 results · f1 · Est. 2019 · Latest: Feb 2022
ic19-art · Needs Research
8 results · accuracy · Est. 2020 · Latest: Aug 2023
4 results · f1 · Est. 2019 · Latest: Sep 2019
ic19-rects · Needs Research
1 result · accuracy · Est. 2020 · Latest: Jun 2019
85 results · accuracy · Est. 2020 · Latest: Dec 2024
18 results · accuracy · Est. 2020
12 results · accuracy · Est. 2020
u-diads-bib · Needs Research
8 results · accuracy · Est. 2020 · Latest: Sep 2024
d4la · Needs Research
3 results · accuracy · Est. 2020 · Latest: Dec 2024
svt · Active
40 results · accuracy · Est. 2020 · Latest: Aug 2023
iiit5k · Needs Research
21 results · accuracy · Est. 2020 · Latest: Aug 2023
cute80 · Needs Research
20 results · accuracy · Est. 2020 · Latest: Aug 2023
svtp · Needs Research
19 results · accuracy · Est. 2020 · Latest: Aug 2023
icdar-2003 · Needs Research
12 results · accuracy · Est. 2020 · Latest: Mar 2022
wost · Needs Research
5 results · accuracy · Est. 2020 · Latest: May 2023
uber-text · Needs Research
3 results · accuracy · Est. 2020 · Latest: May 2023
host · Needs Research
3 results · accuracy · Est. 2020 · Latest: May 2023
msda · Needs Research
2 results · accuracy · Est. 2020 · Latest: Aug 2021
svt-p · Needs Research
1 result · accuracy · Est. 2020 · Latest: May 2023
ic13 · Needs Research
1 result · accuracy · Est. 2020 · Latest: May 2023
OmniDocBench v1.5 · Needs Research
28 results · composite · Est. 2024
28 results · pass-rate · Est. 2024
rvl-cdip · Active
33 results · accuracy · Est. 2020 · Latest: Dec 2024
tobacco-3482 · Needs Research
14 results · accuracy · Est. 2020 · Latest: Jan 2023
noisy-bangla-numeral · Needs Research
2 results · accuracy · Est. 2020 · Latest: Aug 2019
2 results · accuracy · Est. 2020 · Latest: Aug 2019
noisy-mnist · Needs Research
1 result · accuracy · Est. 2020 · Latest: Aug 2019
n-mnist · Needs Research
1 result · accuracy · Est. 2020 · Latest: Jun 2018
aip · Needs Research
1 result · accuracy · Est. 2020 · Latest: Mar 2021
Object Detection (3) · object-detection
13 results · mAP · Est. 2014
16 results · mask-ap · Est. 2019 · Latest: Nov 2024
6 results · mAP · Est. 2012 · Latest: Dec 2015
32 results · overall-en-private · Est. 2024
12 results · multi-scene-f1 · Est. 2024
6 results · total-accuracy · Est. 2024
accuracy · Est. 2025
22 results · cer · Est. 1999 · Latest: Sep 2024
8 results · handwritten-levenshtein · Est. 2024
kohtd · Needs Research
4 results · accuracy · Est. 2020 · Latest: Sep 2021
3 results · accuracy · Est. 2020 · Latest: Aug 2020
1 result · accuracy · Est. 2020 · Latest: Oct 2022
accuracy · Est. 2020
pubtabnet · Active
18 results · accuracy · Est. 2020 · Latest: Apr 2024
12 results · accuracy · Est. 2020
6 results · accuracy · Est. 2020
wtw · Needs Research
1 result · accuracy · Est. 2020 · Latest: Mar 2023
1 result · accuracy · Est. 2020 · Latest: Apr 2021
Image Classification (4) · image-classification
16 results · top-1-accuracy · Est. 2012
4 results · accuracy · Est. 2009
3 results · accuracy · Est. 2009
2 results · top-1-accuracy · Est. 2019
Semantic Segmentation (2) · image-segmentation
6 results · mIoU · Est. 2016
Cityscapes Dataset · Needs Research
mIoU · Est. 2016
Depth Estimation (2) · depth-estimation
NYU Depth V2 · Needs Research
abs-rel · Est. 2012
KITTI Depth · Needs Research
abs-rel · Est. 2012
Text-to-Video (2) · text-to-video
VBench · Needs Research
composite · Est. 2023
EvalCrafter · Needs Research
composite · Est. 2023
Image-to-Image (2) · image-to-image
Set5 · Needs Research
psnr · Est. 2012
Urban100 · Needs Research
psnr · Est. 2015
Zero-Shot Object Detection (2) · zero-shot-object-detection
LVIS Zero-Shot · Needs Research
map · Est. 2019
OmniLabel · Needs Research
map · Est. 2023
Zero-Shot Image Classification (1) · zero-shot-image-classification
ImageNet Zero-Shot · Needs Research
top-1-accuracy · Est. 2009
Video Classification (3) · video-classification
top-1-accuracy · Est. 2017
UCF-101 · Needs Research
top-1-accuracy · Est. 2012
Kinetics-400 · Needs Research
top-1-accuracy · Est. 2017
Mask Generation (1) · mask-generation
SA-1B · Needs Research
iou · Est. 2023
Keypoint Detection (2) · keypoint-detection
MPII Human Pose · Needs Research
accuracy · Est. 2014
COCO Keypoints · Needs Research
map · Est. 2014
Unconditional Image Generation (2) · unconditional-image-generation
LSUN Bedroom FID · Needs Research
fid · Est. 2015
CIFAR-10 FID · Needs Research
fid · Est. 2009
Image Feature Extraction (1) · image-feature-extraction
ImageNet Linear Probe · Needs Research
top-1-accuracy · Est. 2009
Image-to-3D (1) · image-to-3d
composite · Est. 2022
Image-to-Video (1) · image-to-video
I2VBench · Needs Research
composite · Est. 2024
Video-to-Video (1) · video-to-video
DAVIS · Needs Research
j-and-f · Est. 2016
Text-to-3D (1) · text-to-3d
T3Bench · Needs Research
composite · Est. 2023
34 results · accuracy · Est. 2021
15 results · accuracy · Est. 2021
8 results · accuracy · Est. 2024
24 results · accuracy · Est. 2021
10 results · accuracy · Est. 2018
HellaSwag · Needs Research
5 results · accuracy · Est. 2019
CommonsenseQA · Needs Research
3 results · accuracy · Est. 2019
WinoGrande · Needs Research
3 results · accuracy · Est. 2019
24 results · accuracy · Est. 2024
5 results · accuracy · Est. 2022
StrategyQA · Needs Research
2 results · accuracy · Est. 2021
HotpotQA · Needs Research
2 results · f1 · Est. 2018
5 results · accuracy · Est. 2024
3 results · accuracy · Est. 2025
LogiQA · Active
2 results · accuracy · Est. 2020
2 results · accuracy · Est. 2020
3 results · accuracy · Est. 2016
3 results · accuracy · Est. 2021
30 results · pass@1 · Est. 2021 · Latest: Sep 2024
38 results · resolve-rate · Est. 2024
25 results · pass@1 · Est. 2024 · Latest: Mar 2024
19 results · pass@1 · Est. 2021 · Latest: Sep 2024
resolve-rate · Est. 2023
pass@1 · Est. 2023
pass@1 · Est. 2021
pass@1 · Est. 2022
pass@1 · Est. 2023
7 results · computational-accuracy · Est. 2020 · Latest: Sep 2024
6 results · accuracy · Est. 2019 · Latest: Sep 2024
6 results · exact-match · Est. 2023 · Latest: Sep 2024
5 results · correct-patches · Est. 2014 · Latest: Apr 2024
Time Series Forecasting (6) · time-series-forecasting
39 results · smapi · Est. 2018 · Latest: Dec 2024
12 results · mse · Est. 2021 · Latest: Feb 2025
6 results · mse · Est. 2021 · Latest: Feb 2025
6 results · mse · Est. 2021 · Latest: Feb 2025
6 results · mse · Est. 2021 · Latest: Feb 2025
6 results · mse · Est. 2021 · Latest: Feb 2025
Tabular Classification (1) · tabular-classification
OpenML-CC18 · Needs Research
5 results · accuracy · Est. 2019 · Latest: Jun 2025
Tabular Regression (1) · tabular-regression
California Housing · Needs Research
2 results · rmse · Est. 1997
Text Classification (2) · text-classification
7 results · average-score · Est. 2018 · Latest: Jul 2024
SuperGLUE · Needs Research
7 results · average-score · Est. 2019 · Latest: Jul 2024
Question Answering (1) · question-answering
9 results · f1 · Est. 2018 · Latest: Jul 2024
Text Summarization (1) · summarization
15 results · rouge-1 · Est. 2015 · Latest: Jul 2024
8 results · accuracy · Est. 2015 · Latest: Jul 2024
Named Entity Recognition (1) · token-classification
7 results · f1 · Est. 2003 · Latest: Jul 2024
Text Ranking (2) · text-ranking
BEIR · Needs Research
4 results · ndcg · Est. 2021 · Latest: Sep 2024
MS MARCO · Needs Research
4 results · mrr · Est. 2016 · Latest: Oct 2023
Feature Extraction (1) · feature-extraction
MTEB Leaderboard · Needs Research
6 results · accuracy · Est. 2022 · Latest: Sep 2024
WMT'23 · Needs Research
4 results · bleu · Est. 2023
FLORES-200 · Needs Research
bleu · Est. 2022
Fill-Mask (1) · fill-mask
GLUE · Needs Research
3 results · accuracy · Est. 2018 · Latest: Jan 2023
Zero-Shot Classification (1) · zero-shot-classification
XNLI · Needs Research
3 results · accuracy · Est. 2018 · Latest: Jan 2023
Table Question Answering (2) · table-question-answering
WikiTableQuestions · Needs Research
3 results · accuracy · Est. 2015 · Latest: Apr 2020
SQA · Needs Research
accuracy · Est. 2017
Semantic Textual Similarity (1) · sentence-similarity
STS Benchmark · Needs Research
3 results · spearman · Est. 2017 · Latest: Jan 2024
Language Modeling (1) · text-generation
WikiText Perplexity · Needs Research
perplexity · Est. 2016
15 results · resolve-rate · Est. 2024 · Latest: Feb 2025
6 results · success-rate · Est. 2024 · Latest: Apr 2025
5 results · normalized-score · Est. 2024 · Latest: Apr 2025
5 results · task-horizon-minutes · Est. 2024 · Latest: Apr 2025
11 results · mean-dsc · Est. 2015 · Latest: Jan 2024
3 results · mean-dice-wt-tc-et · Est. 2023 · Latest: Jun 2024
6 results · mean-dsc · Est. 2015 · Latest: Aug 2023
6 results · mean-dsc · Est. 2017 · Latest: Mar 2024
21 results · accuracy · Est. 2012
4 results · auroc · Est. 2017
3 results · map · Est. 2018 · Latest: Jan 2024
2 results · accuracy · Est. 2017
2 results · auroc · Est. 2022
2 results · auroc · Est. 2020
1 result · auroc · Est. 2020
Visual Question Answering (6) · visual-question-answering
11 results · accuracy · Est. 2024 · Latest: Feb 2025
9 results · accuracy · Est. 2019 · Latest: Feb 2025
8 results · accuracy · Est. 2023 · Latest: Feb 2025
7 results · accuracy · Est. 2017 · Latest: Oct 2024
Image Captioning (2) · image-to-text
2 results · cider · Est. 2015 · Latest: Jan 2023
Audio-Text-to-Text (2) · audio-text-to-text
AudioBench · Needs Research
accuracy · Est. 2024
Any-to-Any (1) · any-to-any
DEMON Bench · Needs Research
accuracy · Est. 2024
DPG-Bench · Needs Research
composite · Est. 2024
GenEval · Needs Research
accuracy · Est. 2023
MJHQ-30K FID · Needs Research
fid · Est. 2024
Image-Text-to-Image (2) · image-text-to-image
MagicBrush · Needs Research
clip-score · Est. 2023
InstructPix2Pix · Needs Research
clip-score · Est. 2023
Image-Text-to-Text (3) · image-text-to-text
MMMU · Needs Research
accuracy · Est. 2023
MMStar · Needs Research
accuracy · Est. 2024
MMBench · Needs Research
accuracy · Est. 2023
Cross-Modal Retrieval (1) · visual-document-retrieval
ViDoRe · Needs Research
ndcg-at-5 · Est. 2024
Image-Text-to-Video (1) · image-text-to-video
VideoBench · Needs Research
composite · Est. 2024
Video Understanding (2) · video-text-to-text
Video-MME · Needs Research
accuracy · Est. 2024
MVBench · Needs Research
accuracy · Est. 2024
Speech Recognition (4) · automatic-speech-recognition
17 results · wer · Est. 2015 · Latest: Apr 2024
Mozilla Common Voice · Needs Research
3 results · wer · Est. 2019 · Latest: Dec 2022
Text-to-Speech (2) · text-to-speech
6 results · mos · Est. 2019 · Latest: Jun 2024
The LJ Speech Dataset · Needs Research
5 results · mos · Est. 2017 · Latest: Jun 2024
9 results · auroc · Est. 2019 · Latest: Aug 2023
6 results · auroc · Est. 2021 · Latest: Aug 2024
6 results · auroc · Est. 2021 · Latest: Mar 2025
1 result · map · Est. 2021
3 results · auroc · Est. 2022
1 result · map · Est. 2013
1 result · dice · Est. 2019
Atari Games (1) · reinforcement-learning
9 results · human-normalized-score · Est. 2013
9 results · average-return · Est. 2012
6 results · accuracy · Est. 2000 · Latest: Apr 2019
Open Graph Benchmark · Needs Research
accuracy · Est. 2020
accuracy · Est. 2020
Audio Classification (2) · audio-classification
AudioSet · Active
map · Est. 2017
accuracy · Est. 2015
Text-to-Audio (2) · text-to-audio
MusicCaps · Needs Research
fad · Est. 2023
AudioCaps · Needs Research
fad · Est. 2019
Voice Activity Detection (2) · voice-activity-detection
DIHARD · Needs Research
der · Est. 2018
AVA-Speech · Needs Research
accuracy · Est. 2018
Audio-to-Audio (2) · audio-to-audio
pesq · Est. 2019
DNS Challenge · Needs Research
si-snr · Est. 2020
Robotics (2) · robotics
SIMPLER · Needs Research
success-rate · Est. 2024
RLBench · Needs Research
success-rate · Est. 2020