Codesota · Domains · Medical · 4 tasks tracked · Updated 2026-05-04
§ 00 · Medical AI

Benchmarks for clinical AI.

Building healthcare AI? Find benchmarks for medical imaging, disease diagnosis, clinical text processing, and drug discovery.

Medical benchmarks are where a score translates most directly into patient outcomes. We track the ones that are public, reproducible, and that hold up under re-scoring.

§ 01 · Featured benchmarks

Live and coming.

Disease classification · Live

ABIDE (Autism)

The standard benchmark for autism classification from brain imaging (fMRI). Compare MCBERT, DeepASD, and other SOTA models.

2 datasets · 20+ results · fMRI & MRI

Radiology · Live

Chest X-Ray AI

CheXpert, MIMIC-CXR, and NIH ChestX-ray14. Compare CheXNet, CheXzero, and vision-language models.

7 datasets · 20+ results · 900K+ images

Image segmentation · Soon

Medical Segmentation

Benchmarks for organ and tumor segmentation — BraTS for brain tumors, LiTS for liver, and the MSD suite.

In queue

Text processing · Soon

Clinical NLP

Clinical notes, radiology reports, and medical Q&A — MedQA, PubMedQA, and de-identified MIMIC notes.

In queue

§ 02 · Task categories

The full register.

Medical Image Segmentation

Segmenting organs and abnormalities in medical images.

Disease Classification

Diagnosing diseases from medical images or data.

Drug Discovery

Predicting molecular properties and drug interactions.

Clinical NLP

Processing clinical notes and medical text.

Know a benchmark we’re missing?

Medical ML is scattered across MICCAI, specialty journals, and Kaggle. If there’s a public score we should be tracking, submit it — we verify and append within 48h.

Submit a result · Browse the register