webnlg-(seen) is a machine learning benchmark indexed on Codesota. This page tracks published model results, the top score per metric, and the state-of-the-art (SOTA) timeline for webnlg-(seen).
BLEU is one of the evaluation metrics reported for webnlg-(seen). Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | HTLM (fine-tuning) | verified | 65.4 | 2021 | Paper |
| 02 | GPT-2-Large (fine-tuning) | verified | 65.3 | 2021 | Paper |
METEOR is one of the evaluation metrics reported for webnlg-(seen). Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | HTLM (fine-tuning) | verified | 0.46 | 2021 | Paper |
| 02 | GPT-2-Large (fine-tuning) | verified | 0.46 | 2021 | Paper |
TER (Translation Edit Rate) is one of the evaluation metrics reported for webnlg-(seen). Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Lower is better
| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | HTLM (fine-tuning) | verified | 0.33 | 2021 | Paper |
| 02 | GPT-2-Large (fine-tuning) | verified | 0.33 | 2021 | Paper |
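The tables above assign separate ranks even when two models report the same score (both entries list 0.46 on METEOR). The following is a minimal sketch of how a reader might re-rank the published scores so that ties share a rank. The names `Entry` and `rank_entries` are hypothetical and chosen for illustration; they are not part of Codesota, and the site's own ranking logic is not documented here.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One leaderboard row: a model name and its published score."""
    model: str
    score: float


def rank_entries(entries, higher_is_better=True):
    """Return (rank, model, score) tuples; equal scores share a rank.

    Set higher_is_better=False for metrics where a smaller value wins.
    """
    # sorted() is stable, so tied entries keep their input order.
    ordered = sorted(entries, key=lambda e: e.score, reverse=higher_is_better)
    ranked = []
    for position, entry in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == entry.score:
            rank = ranked[-1][0]  # tie: reuse the previous rank
        else:
            rank = position
        ranked.append((rank, entry.model, entry.score))
    return ranked


# Scores copied from the BLEU and METEOR tables on this page.
bleu = [
    Entry("GPT-2-Large (fine-tuning)", 65.3),
    Entry("HTLM (fine-tuning)", 65.4),
]
meteor = [
    Entry("HTLM (fine-tuning)", 0.46),
    Entry("GPT-2-Large (fine-tuning)", 0.46),
]

bleu_ranked = rank_entries(bleu)
meteor_ranked = rank_entries(meteor)

for rank, model, score in bleu_ranked + meteor_ranked:
    print(f"{rank:02d}  {model}  {score}")
```

With this scheme the two BLEU entries keep distinct ranks, while the two METEOR entries share rank 1 because their published scores are identical at the reported precision.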