hoc.

Dataset from Papers With Code


§ 01 · Leaderboard

Best published scores.

6 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.

Primary: accuracy · higher is better
All metrics: f1, micro-f1

f1

3 rows

#	Model	Org	Submitted	Paper / code	f1
01	BioLinkBERT (large)	—	Mar 2022	LinkBERT: Pretraining Language Models with Document Link… · code	88.10
02	NCBI_BERT(large) (P)	—	Jun 2019	Transfer Learning in Biomedical Natural Language Process… · code	87.30
03	SciFive-large	—	May 2021	SciFive: a text-to-text transformer model for biomedical… · code	86.08

micro-f1

3 rows

#	Model	Org	Submitted	Paper / code	micro-f1
01	BioGPT	—	Oct 2022	BioGPT: Generative Pre-trained Transformer for Biomedica… · code	85.12
02	BioLinkBERT (large)	—	Mar 2022	LinkBERT: Pretraining Language Models with Document Link… · code	84.87
03	PubMedBERT uncased	—	Jul 2020	Domain-Specific Language Model Pretraining for Biomedica… · code	82.32

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
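
The ordering rule described above (rows sorted by score within each metric, ties broken by submission date) can be sketched as follows. This is only an illustration, not Codesota's actual code: the function name `rank_rows` is hypothetical, the page does not say which direction the tie-break runs (earlier-first is assumed here), and the day of the month is a placeholder because the tables list only month and year.

```python
from datetime import date

# Rows copied from the two tables above: (metric, model, submitted, score).
# The page gives month and year only, so day=1 is a placeholder.
ROWS = [
    ("f1", "SciFive-large", date(2021, 5, 1), 86.08),
    ("f1", "BioLinkBERT (large)", date(2022, 3, 1), 88.10),
    ("f1", "NCBI_BERT(large) (P)", date(2019, 6, 1), 87.30),
    ("micro-f1", "PubMedBERT uncased", date(2020, 7, 1), 82.32),
    ("micro-f1", "BioGPT", date(2022, 10, 1), 85.12),
    ("micro-f1", "BioLinkBERT (large)", date(2022, 3, 1), 84.87),
]


def rank_rows(rows):
    """Group rows by metric and sort each group by score, highest first.

    Ties are broken by submission date, earlier first. The direction of
    the tie-break is an assumption; the page does not specify it.
    """
    by_metric = {}
    for metric, model, submitted, score in rows:
        by_metric.setdefault(metric, []).append((model, submitted, score))
    for group in by_metric.values():
        group.sort(key=lambda row: (-row[2], row[1]))
    return by_metric


ranked = rank_rows(ROWS)
# The first row of each group is the one the page would shade as SOTA.
sota = {metric: group[0][0] for metric, group in ranked.items()}
print(sota)
```

Sorting on the tuple `(-score, date)` keeps the whole rule in a single key, so changing the tie-break direction only means changing the second element.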

§ 04 · Literature

5 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Oct 2022 · BioGPT
arXiv ↗ · Code

LinkBERT: Pretraining Language Models with Document Links
Mar 2022 · BioLinkBERT (large)
arXiv ↗ · Code

SciFive: a text-to-text transformer model for biomedical literature
May 2021 · SciFive-large
arXiv ↗ · Code

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Jul 2020 · PubMedBERT uncased
arXiv ↗ · Code

Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets
Jun 2019 · NCBI_BERT(large) (P)
arXiv ↗ · Code

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run the script, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

Submit a result ↵ · Read submission guide

What a submission needs

01 · A public checkpoint or API endpoint
02 · A reproduction script with frozen commit + seed
03 · Declared evaluation environment (Python, deps)
04 · One row per metric declared by this dataset
05 · A contact so we can follow up on discrepancies
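
As a rough pre-submission self-check, the five requirements above could be expressed as a small validation function. Everything here is hypothetical: the field names, the `missing_items` helper, and the dictionary layout are assumptions made for illustration and do not describe Codesota's real submission format. The metric list is taken from the "All metrics" line on this page.

```python
# Hypothetical field names, one group per checklist item above.
REQUIRED_FIELDS = (
    "checkpoint_or_endpoint",  # 01: public checkpoint or API endpoint
    "repro_script",            # 02: reproduction script
    "commit",                  # 02: frozen commit
    "seed",                    # 02: seed
    "environment",             # 03: declared Python version and deps
    "scores",                  # 04: one row per declared metric
    "contact",                 # 05: contact for follow-up
)

# Metrics declared for this dataset, per the "All metrics" line.
DATASET_METRICS = ("f1", "micro-f1")

_EMPTY = (None, "", {}, [])


def missing_items(submission):
    """Return the names of checklist items the submission still lacks."""
    problems = [
        field
        for field in REQUIRED_FIELDS
        if field not in submission or submission[field] in _EMPTY
    ]
    scores = submission.get("scores") or {}
    for metric in DATASET_METRICS:
        if metric not in scores:
            problems.append(f"scores[{metric}]")
    return problems


example = {
    "checkpoint_or_endpoint": "https://example.org/my-model",  # placeholder
    "repro_script": "eval_hoc.py",                             # placeholder
    "commit": "0123abc",                                       # placeholder
    "seed": 0,
    "environment": {"python": "3.11", "deps": "requirements.txt"},
    "scores": {"f1": 0.0},  # dummy value, not a real result
    "contact": "me@example.org",
}
print(missing_items(example))  # ['scores[micro-f1]']
```

The empty-value check deliberately treats a seed of `0` as present, since zero is a legitimate seed.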
