Model card
DeBERTa-v3-large
Microsoft · open-source · 304M params
DeBERTaV3 (ICLR 2023). GLUE average: 91.37.
§ 01 · Benchmarks
All benchmarks with a recorded score for DeBERTa-v3-large.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | GLUE | Natural Language Processing · Text Classification | avg-score | 91.4% | #1 | 2023-01-01 | source ↗ |
| 02 | SQuAD v2.0 | Natural Language Processing · Question Answering | em | 88.4% | #1 | 2021-11-18 | source ↗ |
| 03 | SQuAD v2.0 | Natural Language Processing · Question Answering | f1 | 91.4% | #1 | 2021-11-18 | source ↗ |
| 04 | SuperGLUE | Natural Language Processing · Text Classification | average-score | 91.4% | #1 | 2021-11-18 | source ↗ |
| 05 | CoNLL-2003 | Natural Language Processing · Named Entity Recognition | f1 | 93.4% | #2 | 2021-11-18 | source ↗ |
| 06 | SNLI | Natural Language Processing · Natural Language Inference | accuracy | 92.2% | #2 | 2021-11-18 | source ↗ |
The Rank column shows this model’s position among all models scored on the same benchmark and metric; #1 marks the current state of the art. Rows are sorted by rank, then by newest result.
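The SQuAD v2.0 rows above report two metrics, em and f1. As an illustrative sketch (not the official SQuAD evaluation script): exact match (EM) is 1 when the normalized prediction equals the gold answer, and F1 is token-level overlap between the two.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace
    (the usual SQuAD answer-normalization convention)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(pred: str, gold: str) -> float:
    """EM: 1.0 iff normalized strings are identical."""
    return float(normalize(pred) == normalize(gold))

def token_f1(pred: str, gold: str) -> float:
    """Token-level F1 between prediction and gold answer."""
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    if not pred_toks or not gold_toks:
        # SQuAD v2.0 unanswerable case: both empty counts as a match.
        return float(pred_toks == gold_toks)
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))  # -> 1.0
print(token_f1("the tall Eiffel Tower", "eiffel tower"))  # -> 0.8
```

The benchmark scores in the table are averages of these per-question values over the full dev set.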
§ 02 · Strengths by area
Where DeBERTa-v3-large performs best, by area.
§ 03 · Papers
2 paper entries with results for DeBERTa-v3-large (both refer to the same DeBERTaV3 paper):
- 2023-01-01 · Natural Language Processing · 1 result
  DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
- 2021-11-18 · Natural Language Processing · 5 results
  DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
§ 04 · Related models
Other Microsoft models scored on Codesota.
§ 05 · Sources & freshness
Where these numbers come from.
Source: arXiv · 6 results · 6 of 6 rows marked verified. First result 2021-11-18; latest 2023-01-01.