
Fill-Mask.

Fill-mask (masked language modeling, MLM) is the core BERT pretraining objective: mask roughly 15% of the input tokens and train the model to predict the originals from the surrounding context. It powered the encoder revolution that dominated NLP from 2018 to 2022 and remains the training signal behind models like RoBERTa, DeBERTa, and XLM-RoBERTa that still run most production classification and NER systems. As a standalone task it has limited direct applications, but probing what a model predicts for masked slots became a key technique for analyzing the bias, factual knowledge, and linguistic competence stored in model weights. The task has faded from the research spotlight as decoder-only (GPT-style) pretraining proved more scalable, but encoder models trained with MLM remain the most cost-efficient option for structured prediction tasks that need fast inference.
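The masking step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not BERT's actual data pipeline: the function name `mask_tokens`, the `[MASK]` string, and the use of `None` for "no loss at this position" are assumptions made for the example. It also masks an exact 15% of positions for reproducibility (BERT selects each token independently) and omits BERT's rule of sometimes substituting a random token or leaving the selected token unchanged.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder; real tokenizers define their own


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for a masked-language-modeling example.

    labels[i] holds the original token at each masked position and None
    elsewhere, so a training loss would only be computed on masked slots.
    """
    if not tokens:
        return [], []

    rng = random.Random(seed)
    # Simplification: mask a fixed share of positions (at least one).
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = rng.sample(range(len(tokens)), n_to_mask)

    masked = list(tokens)
    labels = [None] * len(tokens)
    for i in positions:
        labels[i] = tokens[i]   # what the model should predict
        masked[i] = MASK_TOKEN  # what the model actually sees
    return masked, labels


if __name__ == "__main__":
    sentence = "the model predicts each hidden word from its context".split()
    masked, labels = mask_tokens(sentence, seed=0)
    print(masked)
    print(labels)
```

A model trained on such pairs is scored only on the positions where `labels` is not `None`, which is why fill-mask accuracy is reported per masked slot rather than per sentence.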

Datasets: 1 · Results: 3 · Canonical metric: accuracy
§ 02 · Canonical benchmark

The reference dataset.

GLUE

General Language Understanding Evaluation for masked language models

Primary metric: accuracy
§ 03 · Top 10

Leading models.

Leading models on GLUE.

#  Model              avg-score  Year  Source
1  DeBERTa-v3-large   91.4       2023  paper
2  ALBERT-xxlarge-v2  89.4       2020  paper
3  RoBERTa-large      88.5       2019  paper


§ 04 · All datasets

Tracked datasets.

1 dataset tracked for this task.

GLUE
CANONICAL
3 results · accuracy
Top: DeBERTa-v3-large 91.4
§ 05 · Related tasks

Other tasks in Natural Language Processing.

Feature Extraction · Named Entity Recognition · Natural Language Inference · Polish Conversation Quality · Polish Cultural Competency · Polish Emotional Intelligence · Polish LLM General · Polish Text Understanding