Natural Language Processing

Processing and understanding text? Evaluate your models on language understanding, generation, translation, and information extraction benchmarks.

9 tasks · 6 datasets

Tasks in Natural Language Processing

Language Modeling

Predicting the next word or token in a sequence. This is the core task for GPT-style models.

0 datasets

Machine Translation

Translating text from one language to another (WMT benchmarks).

0 datasets

Question Answering

Answering questions based on context (SQuAD, Natural Questions).

1 dataset

Text Classification

Categorizing text into predefined classes (sentiment, topic).

2 datasets

Named Entity Recognition

Identifying and classifying named entities in text (CoNLL).

1 dataset

Text Summarization

Generating concise summaries of longer documents (CNN/DailyMail, XSum).

1 dataset

Natural Language Inference

Determining entailment relationships between sentences (SNLI, MNLI).

1 dataset

Semantic Textual Similarity

Measuring similarity between text pairs (STS Benchmark).

0 datasets

Reading Comprehension

Understanding and answering questions about passages.

0 datasets

Explore Other Areas

Computer Vision

Building systems that understand images and video? Find benchmarks for recognition, detection, segmentation, and document analysis tasks.

Reasoning

Testing whether your model can think logically? Benchmark math problem solving, commonsense understanding, and multi-step reasoning capabilities.

Computer Code

Developing AI coding assistants? Test code generation, completion, translation, bug detection, and repair capabilities.

Speech

Working with voice and audio? Evaluate speech-to-text accuracy, voice synthesis quality, and speaker identification performance.