Benchmarks for clinical AI.
Building healthcare AI? Find benchmarks for medical imaging, disease diagnosis, clinical text processing, and drug discovery.
Medical benchmarks are where the score translates directly into patient outcomes. We track the ones that are public, reproducible, and hold up under re-scoring.
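What "hold up under re-scoring" can mean in practice: recompute the metric from the released labels and predictions, then check that it lands close to the reported number. The sketch below is illustrative only. The choice of accuracy as the metric, the function names, and the 0.005 tolerance are assumptions made for the example, not a description of how any listed benchmark is verified.

```python
from typing import Sequence


def rescore_accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Recompute accuracy from released labels and predictions."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot score an empty set")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def holds_up(reported: float, rescored: float, tolerance: float = 0.005) -> bool:
    """True when the recomputed score is within `tolerance` of the reported one.

    The default tolerance is a hypothetical value chosen for illustration.
    """
    return abs(reported - rescored) <= tolerance


# Toy data standing in for a public test split and a model's released outputs.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]

rescored = rescore_accuracy(labels, predictions)
print(f"rescored accuracy: {rescored:.3f}")
print("matches a reported 0.75:", holds_up(0.75, rescored))
print("matches a reported 0.80:", holds_up(0.80, rescored))
```

A real check would swap in the benchmark's own metric (AUROC for chest X-ray classification, Dice for segmentation) and its official test split. The comparison step stays the same.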
Live and coming.
ABIDE (Autism)→
The standard benchmark for autism classification from resting-state brain imaging (fMRI). Compare MCBERT, DeepASD, and other state-of-the-art models.
Chest X-Ray AI→
CheXpert, MIMIC-CXR, and NIH ChestX-ray14. Compare CheXNet, CheXzero, and vision-language models.
Medical Segmentation
Benchmarks for organ and tumor segmentation: BraTS for brain tumors, LiTS for liver tumors, and the Medical Segmentation Decathlon (MSD) suite.
Clinical NLP
Benchmarks for clinical notes, radiology reports, and medical Q&A: MedQA, PubMedQA, and de-identified MIMIC notes.
The full register.
Medical Image Segmentation
Segmenting organs and abnormalities in medical images.
Disease Classification
Diagnosing diseases from medical images or other clinical data.
Drug Discovery
Predicting molecular properties and drug interactions.
Clinical NLP
Processing clinical notes and medical text.
Know a benchmark we’re missing?
Medical ML is scattered across MICCAI, specialty journals, and Kaggle. If there’s a public score we should be tracking, submit it. We verify and append within 48 hours.