HuggingFace Task Mapping - CodeSOTA

Multimodal

| HF Pipeline Tag | CodeSOTA Task | Area | Benchmarks | Results |
|---|---|---|---|---|
| any-to-any | Any-to-Any | Multimodal | 1 | 0 |
| audio-text-to-text | Audio-Text-to-Text | Multimodal | 3 | 4 |
| document-question-answering | Document Understanding | Computer Vision | 3 | 28 |
| image-text-to-text | Image-Text-to-Text | Multimodal | 3 | 57 |
| image-text-to-image | Image-Text-to-Image | Multimodal | 2 | 0 |
| image-text-to-video | Image-Text-to-Video | Multimodal | 1 | 0 |
| image-to-text | Image Captioning | Multimodal | 2 | 7 |
| text-to-image | Text-to-Image Generation | Multimodal | 3 | 8 |
| video-text-to-text | Video Understanding | Multimodal | 2 | 44 |
| visual-document-retrieval | Cross-Modal Retrieval | Multimodal | 1 | 0 |
| visual-question-answering | Visual Question Answering | Multimodal | 8 | 147 |

Computer Vision

| HF Pipeline Tag | CodeSOTA Task | Area | Benchmarks | Results |
|---|---|---|---|---|
| depth-estimation | Depth estimation | Computer Vision | 17 | 0 |
| image-classification | Image Classification | Computer Vision | 24 | 87 |
| image-feature-extraction | Image Feature Extraction | Computer Vision | 1 | 0 |
| image-segmentation | Semantic Segmentation | Computer Vision | 2 | 24 |
| image-to-image | Image-to-Image | Computer Vision | 2 | 0 |
| image-to-video | Image-to-Video | Computer Vision | 1 | 0 |
| image-to-3d | Image-to-3D | Computer Vision | 1 | 0 |
| keypoint-detection | Keypoint Detection | Computer Vision | 2 | 1 |
| mask-generation | Mask Generation | Computer Vision | 1 | 0 |
| object-detection | Object Detection | Computer Vision | 11 | 104 |
| text-to-3d | Text-to-3D | Computer Vision | 1 | 0 |
| text-to-video | Text-to-Video | Computer Vision | 2 | 0 |
| unconditional-image-generation | Unconditional Image Generation | Computer Vision | 2 | 0 |
| video-classification | Video classification | Computer Vision | 6 | 13 |
| video-to-video | Video-to-Video | Computer Vision | 1 | 2 |
| zero-shot-image-classification | Zero-Shot Image Classification | Computer Vision | 1 | 0 |
| zero-shot-object-detection | Zero-Shot Object Detection | Computer Vision | 2 | 0 |

Natural Language Processing

Audio

| HF Pipeline Tag | CodeSOTA Task | Area | Benchmarks | Results |
|---|---|---|---|---|
| audio-classification | Audio Classification | Audio | 6 | 5 |
| audio-to-audio | Audio-to-Audio | Audio | 2 | 0 |
| automatic-speech-recognition | Speech Recognition | Speech | 11 | 526 |
| text-to-audio | Text-to-Audio | Audio | 1 | 0 |
| text-to-speech | Text-to-speech | Audio | 2 | 11 |
| voice-activity-detection | Voice Activity Detection | Audio | 2 | 0 |

Tabular

| HF Pipeline Tag | CodeSOTA Task | Area | Benchmarks | Results |
|---|---|---|---|---|
| tabular-classification | Tabular Classification | Time Series | 1 | 5 |
| tabular-regression | Tabular Regression | Time Series | 1 | 2 |
| time-series-forecasting | Time-series forecasting | Time-series | 6 | 75 |

Reinforcement Learning

| HF Pipeline Tag | CodeSOTA Task | Area | Benchmarks | Results |
|---|---|---|---|---|
| reinforcement-learning | Atari Games | Reinforcement Learning | 1 | 12 |
| robotics | Robotics | Other | 5 | 0 |

Other

| HF Pipeline Tag | CodeSOTA Task | Area | Benchmarks | Results |
|---|---|---|---|---|
| graph-ml | Node Classification | Graphs | 2 | 6 |
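The mapping tables above amount to a lookup from a HuggingFace pipeline tag to a CodeSOTA task record. A minimal Python sketch of that idea follows, using a handful of rows copied from the tables. The names `TaskMapping`, `HF_TO_CODESOTA`, `lookup`, and `tags_without_results` are hypothetical and chosen only for illustration; they are not part of any CodeSOTA or HuggingFace API.

```python
from typing import Dict, List, NamedTuple, Optional


class TaskMapping(NamedTuple):
    """One table row: the CodeSOTA side of a pipeline-tag mapping."""
    codesota_task: str
    area: str
    benchmarks: int
    results: int


# Illustrative subset of the rows listed in the tables (hypothetical structure).
HF_TO_CODESOTA: Dict[str, TaskMapping] = {
    "image-to-text": TaskMapping("Image Captioning", "Multimodal", 2, 7),
    "image-segmentation": TaskMapping("Semantic Segmentation", "Computer Vision", 2, 24),
    "automatic-speech-recognition": TaskMapping("Speech Recognition", "Speech", 11, 526),
    "reinforcement-learning": TaskMapping("Atari Games", "Reinforcement Learning", 1, 12),
    "robotics": TaskMapping("Robotics", "Other", 5, 0),
    "graph-ml": TaskMapping("Node Classification", "Graphs", 2, 6),
}


def lookup(pipeline_tag: str) -> Optional[TaskMapping]:
    """Return the mapped row for a pipeline tag, or None if it is not mapped."""
    return HF_TO_CODESOTA.get(pipeline_tag.strip().lower())


def tags_without_results(mapping: Dict[str, TaskMapping]) -> List[str]:
    """List the pipeline tags whose mapped task has no recorded results yet."""
    return sorted(tag for tag, row in mapping.items() if row.results == 0)


if __name__ == "__main__":
    print(lookup("image-to-text"))
    print(tags_without_results(HF_TO_CODESOTA))
```

Note that the pipeline tag and the CodeSOTA task name do not always match (for example, `image-to-text` maps to Image Captioning), which is why an explicit table is used rather than a string transformation of the tag.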

CodeSOTA-only tasks

Tasks tracked by CodeSOTA that don't have a direct HuggingFace pipeline equivalent.

Adversarial

Adversarial Attacks: 1 benchmark
Adversarial Robustness: 1 benchmark

Agentic AI

Agent Memory: 2 benchmarks
Autonomous Coding: 2 benchmarks, 23 results
Bioinformatics Agents: 1 benchmark, 2 results
HCAST: 1 benchmark, 6 results
RE-Bench: 1 benchmark, 5 results
SWE-bench: 1 benchmark, 81 results
Task agents: 9 benchmarks, 45 results
Time Horizon: 1 benchmark, 5 results
Tool Use: 1 benchmark, 19 results
Web & Desktop Agents: 2 benchmarks, 39 results

Audio

Audio Captioning: 1 benchmark, 7 results
Audio-Language Models: 18 benchmarks
Automatic Speech Recognition: 25 benchmarks
Music Generation: 1 benchmark, 3 results
Sound Event Detection: 1 benchmark, 3 results
Voice cloning: 1 benchmark, 3 results

Computer Code

Bug Detection: 1 benchmark, 6 results
Code Completion: 1 benchmark, 6 results
Code Generation: 10 benchmarks, 270 results
Code Summarization: 1 benchmark, 3 results
Code Translation: 1 benchmark, 7 results
Program Repair: 1 benchmark, 5 results

Computer Vision

3D Understanding: 4 benchmarks
3D generation: 0
Document Image Classification: 7 benchmarks, 63 results
Document Layout Analysis: 5 benchmarks, 133 results
Document Parsing: 3 benchmarks, 149 results
Few-Shot Image Classification: 97 benchmarks
General OCR Capabilities: 4 benchmarks, 70 results
Handwriting Recognition: 6 benchmarks, 40 results
Image editing: 5 benchmarks
Image generation: 11 benchmarks
Image segmentation: 9 benchmarks, 3 results
OCR: 5 benchmarks, 1 result
Object counting: 1 benchmark
Open-Vocabulary Object Detection: 2 benchmarks
Optical Character Recognition: 114 benchmarks, 831 results
Scene Text Detection: 11 benchmarks, 581 results
Scene Text Recognition: 11 benchmarks, 127 results
Table Recognition: 5 benchmarks, 71 results
Video generation: 0
Video segmentation: 3 benchmarks

General

Coding Agents: 7 benchmarks, 4 results
Computer Use Agents: 11 benchmarks
Embedding models: 0
General: 1 benchmark
Omni models: 2 benchmarks
Reasoning: 0
Reinforcement Learning: 0
Retrieval: 7 benchmarks
Video-Language Models: 19 benchmarks, 4 results
Vision-Language Models: 40 benchmarks
World Models: 0

Graphs

Link Prediction: 1 benchmark, 3 results
Molecular Property Prediction: 1 benchmark, 3 results

Industrial Inspection

Anomaly Detection: 7 benchmarks, 27 results
Steel Defect Detection: 1 benchmark
Surface Defect Detection: 1 benchmark
Weld Inspection: 1 benchmark

Knowledge Base

Entity Linking: 1 benchmark, 3 results
Knowledge Graph Completion: 1 benchmark, 3 results
Relation Extraction: 1 benchmark, 3 results
Polish LLM General: 1 benchmark, 5100 results
Polish Text Understanding: 1 benchmark, 465 results
Reading Comprehension: 1 benchmark, 2 results

Other

Other: 4 benchmarks

Reasoning

Arithmetic Reasoning: 2 benchmarks, 6 results
Commonsense Reasoning: 6 benchmarks, 109 results
Logical Reasoning: 4 benchmarks, 12 results
Mathematical Reasoning: 4 benchmarks, 127 results
Multi-step Reasoning: 4 benchmarks, 161 results

Reinforcement Learning

Continuous Control: 1 benchmark, 9 results
Offline RL: 1 benchmark

Robots

Robot Manipulation: 1 benchmark, 5 results
Robot Navigation: 1 benchmark
Sim-to-Real Transfer: 1 benchmark

Speech

Speaker Verification: 1 benchmark, 3 results
Speech Enhancement: 0
Speech Translation: 1 benchmark, 3 results

Time Series

Tabular Machine Learning: 0

Time-series

Time-series classification: 1 benchmark
