Disease Classification
Diagnosing diseases from medical images or data.
Disease classification uses machine learning to diagnose medical conditions from images (radiology, pathology, dermatology), lab results, and clinical text. Models such as CheXNet and Med-PaLM report specialist-level accuracy on narrow tasks, but clinical deployment also requires regulatory clearance (FDA in the US), bias auditing, and integration with existing workflows.
History
CheXNet (Rajpurkar et al.) matches radiologist performance on pneumonia detection from chest X-rays
Esteva et al. demonstrate dermatologist-level skin cancer classification from images
Google detects diabetic retinopathy from fundus images at specialist level
COVID-19 accelerates deployment of AI diagnostic tools for chest CT classification
REMEDIS applies self-supervised pretraining to improve medical image classification
BiomedCLIP aligns medical images with clinical text for zero-shot disease classification
Med-PaLM 2 achieves expert-level performance on medical question answering
FDA's cumulative count of cleared AI medical devices passes 900, most of them for radiology
Foundation models (BiomedGPT, Med-Gemini) show broad medical classification capabilities
Multimodal medical AI combines imaging, labs, clinical notes, and genomics for diagnosis
How Disease Classification Works
Data Acquisition
Medical images (X-ray, CT, MRI, pathology slides), lab results, or clinical notes are collected and de-identified.
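As a rough illustration of the de-identification part of this step, the sketch below drops direct identifiers from a record and replaces the patient ID with a salted hash. The field names, the `DIRECT_IDENTIFIERS` set, and the `deidentify` helper are assumptions made for this example, not part of any standard or library. Hashing an ID is pseudonymization rather than full de-identification, so a real pipeline would need a much broader treatment (image headers, burned-in text, dates).

```python
import hashlib

# Hypothetical field names; real records (for example DICOM headers) carry many more identifiers.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "date_of_birth"}

def deidentify(record, salt):
    """Drop direct identifiers and replace the patient ID with a salted hash."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hashlib.sha256((salt + str(record["patient_id"])).encode("utf-8")).hexdigest()
    cleaned["patient_id"] = digest[:16]  # shortened pseudonym, stable for a given salt
    return cleaned

record = {
    "patient_id": 1234,
    "name": "Jane Doe",
    "date_of_birth": "1970-01-01",
    "modality": "CR",
    "finding": "cardiomegaly",
}
safe = deidentify(record, salt="site-secret")
```

Using the same salt across a site keeps one patient's studies linked under the same pseudonym, which matters later when splitting data by patient.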
Preprocessing
Images are normalized, augmented, and standardized. Clinical text is tokenized and structured. Missing data is handled.
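A minimal sketch of this step for the simplest cases, assuming a grayscale image stored as a nested list of pixel intensities and lab results stored as dictionaries with `None` for missing values. The helper names (`min_max_normalize`, `horizontal_flip`, `impute_missing`) are illustrative and do not come from a specific library; production pipelines typically rely on dedicated imaging tooling and more careful imputation.

```python
from statistics import mean

def min_max_normalize(image):
    """Rescale pixel intensities to [0, 1]; a constant image maps to all zeros."""
    pixels = [p for row in image for p in row]
    lo, hi = min(pixels), max(pixels)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / span for p in row] for row in image]

def horizontal_flip(image):
    """Simple augmentation: mirror each row. May be inappropriate where laterality matters."""
    return [list(reversed(row)) for row in image]

def impute_missing(records, field):
    """Replace None in `field` with the mean of the observed values (a simple baseline)."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

image = [[0, 128], [255, 64]]
normalized = min_max_normalize(image)
flipped = horizontal_flip(normalized)
labs = impute_missing([{"crp": 4.0}, {"crp": None}, {"crp": 8.0}], "crp")
```

Mean imputation is only a baseline here. Whether a value is missing can itself be informative in clinical data, so some pipelines also keep a separate missing-value indicator.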
Feature Extraction
A pretrained backbone (ResNet, ViT, BioClinicalBERT) extracts discriminative features from the input modality.
Classification
Extracted features are mapped to disease categories through classification heads, often with multi-label output for comorbidities.
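The multi-label idea can be sketched as one independent sigmoid per disease label on top of the extracted feature vector, so that several findings can be positive for the same patient. The feature values, weights, and the 0.5 threshold below are made-up numbers for illustration only; in practice the weights are learned and thresholds are tuned per label.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_head(features, weights, biases):
    """One linear unit plus sigmoid per label; outputs are independent probabilities."""
    probs = {}
    for label, w in weights.items():
        logit = sum(wi * xi for wi, xi in zip(w, features)) + biases[label]
        probs[label] = sigmoid(logit)
    return probs

# Hypothetical 3-dimensional feature vector and hand-picked weights.
features = [0.5, -1.0, 2.0]
weights = {"pneumonia": [1.0, 0.0, 1.0], "cardiomegaly": [0.0, 1.0, 0.0]}
biases = {"pneumonia": -0.5, "cardiomegaly": 0.0}

probs = multilabel_head(features, weights, biases)
predicted = [label for label, p in probs.items() if p >= 0.5]
```

Unlike a softmax head, these probabilities are not forced to sum to one, which is what allows comorbid conditions to be reported together.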
Calibration and Uncertainty
Prediction probabilities are calibrated, and uncertainty estimates flag cases for human review — critical for clinical safety.
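One common way to calibrate is temperature scaling: divide the logits by a single scalar fitted on held-out data. The sketch below fits that scalar with a coarse grid search and then flags predictions in an ambiguous probability band for human review. The validation logits, the grid, and the 0.3 to 0.7 review band are assumptions chosen for illustration, not recommended clinical settings.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nll(logits, labels, temperature):
    """Mean binary cross-entropy of temperature-scaled logits."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z / temperature)
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(logits)

def fit_temperature(logits, labels, candidates=None):
    """Pick the temperature with the lowest validation NLL from a coarse grid."""
    candidates = candidates or [0.5 + 0.1 * i for i in range(46)]  # 0.5 .. 5.0
    return min(candidates, key=lambda t: nll(logits, labels, t))

def needs_review(prob, low=0.3, high=0.7):
    """Flag predictions whose calibrated probability falls in an ambiguous band."""
    return low <= prob <= high

# Hypothetical overconfident validation logits: large magnitudes, one of them wrong.
val_logits = [4.0, 3.5, -4.0, 3.0, -3.5, -3.0, 2.5, -2.5]
val_labels = [1, 0, 0, 1, 0, 0, 1, 0]

T = fit_temperature(val_logits, val_labels)
calibrated = sigmoid(1.2 / T)  # a new case with raw logit 1.2
```

A fitted temperature above 1 softens overconfident outputs, so borderline cases move toward the review band instead of being reported with unwarranted certainty.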
Current Landscape
Disease classification in 2025 is the most commercially mature area of medical AI: the FDA has cleared more than 900 AI-enabled devices, most of them in radiology. Radiology leads (chest X-ray, mammography, CT triage), followed by pathology (cancer grading) and dermatology (skin lesion classification). Foundation models are beginning to enable zero-shot classification of conditions not seen in training. The key tension is between research benchmarks, where models match specialists, and real-world deployment, where distribution shift, workflow integration, and regulatory requirements create significant barriers.
Key Challenges
Data imbalance — rare diseases have very few labeled examples, leading to poor sensitivity on important classes
Distribution shift — models trained at one hospital often perform poorly at others due to equipment and population differences
Regulatory burden — FDA/CE clearance requires extensive clinical validation, adding years to deployment timelines
Demographic bias — models may perform worse on underrepresented populations (race, age, sex) in training data
Clinical integration — fitting AI predictions into physician workflows without disrupting care is a UX and systems challenge
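Several of the challenges above can be made measurable with simple audits. The sketch below computes sensitivity separately per group (a group could be a hospital site or a demographic category) and derives inverse-frequency class weights as a first response to imbalance. The record layout and both helper names are assumptions for this example.

```python
from collections import defaultdict

def sensitivity_by_group(records):
    """Per-group sensitivity (true positive rate) over records with keys
    'group', 'label' (1 = disease present), and 'pred' (1 = model positive)."""
    tp = defaultdict(int)
    positives = defaultdict(int)
    for r in records:
        if r["label"] == 1:
            positives[r["group"]] += 1
            tp[r["group"]] += int(r["pred"] == 1)
    return {g: tp[g] / positives[g] for g in positives}

def inverse_frequency_weights(labels):
    """Class weights n / (k * count): rarer classes get proportionally larger weights."""
    counts = defaultdict(int)
    for y in labels:
        counts[y] += 1
    n, k = len(labels), len(counts)
    return {y: n / (k * c) for y, c in counts.items()}

records = [
    {"group": "site_A", "label": 1, "pred": 1},
    {"group": "site_A", "label": 1, "pred": 1},
    {"group": "site_A", "label": 0, "pred": 0},
    {"group": "site_B", "label": 1, "pred": 0},
    {"group": "site_B", "label": 1, "pred": 1},
    {"group": "site_B", "label": 0, "pred": 0},
]
per_site = sensitivity_by_group(records)
gap = max(per_site.values()) - min(per_site.values())
weights = inverse_frequency_weights([0, 0, 0, 1])
```

A large sensitivity gap between groups is a signal to investigate before deployment; it does not by itself say whether the cause is equipment, population, or labeling differences.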
Quick Recommendations
Chest X-ray classification
CheXNet / TorchXRayVision
Well-validated models for pneumonia, cardiomegaly, and other common chest findings (the ChestX-ray14 label set covers 14 conditions in total)
General medical image classification
BiomedCLIP / Med-Gemini
Foundation models with broad medical image understanding
Clinical NLP classification
Med-PaLM 2 / BioClinicalBERT
Strong options for classifying diseases from clinical text and notes
Production deployment
FDA-cleared tools (Aidoc, Viz.ai)
Clinical use requires regulatory clearance, which these commercial tools already hold
What's Next
The frontier is multimodal disease classification — combining imaging, genomics, lab results, clinical history, and social determinants into unified diagnostic models. Expect federated learning to enable training across hospitals without sharing patient data, and increasingly automated clinical trial matching based on AI-classified patient characteristics.
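To make the federated learning idea concrete, here is a sketch of the aggregation step used in federated averaging: each hospital trains locally and sends only model parameters, and a coordinator averages them weighted by local sample count. The parameter vectors and dataset sizes are invented for illustration, and the sketch leaves out local training, secure aggregation, and any privacy mechanism.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Hypothetical parameter vectors after one round of local training at three hospitals.
hospital_updates = [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]]
hospital_sizes = [100, 300, 100]
global_weights = federated_average(hospital_updates, hospital_sizes)
```

Only the parameter vectors cross institutional boundaries in this scheme; the patient records themselves stay at each hospital.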
Benchmarks & SOTA
ABIDE I
Autism Brain Imaging Data Exchange I
1,112 resting-state fMRI datasets from 539 individuals with autism spectrum disorder (ASD) and 573 typically developing controls across 17 international sites. Multi-site neuroimaging data for autism classification and biomarker discovery.
State of the Art: Plymouth DL Model, Research, 98% accuracy
CheXpert
CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels
224,316 chest radiographs from 65,240 patients with 14 pathology labels. Includes uncertainty labels and expert radiologist annotations for the validation set. A widely used reference benchmark for chest X-ray classification.
State of the Art: CheXpert AUC Maximizer, Stanford, AUROC 93%
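Uncertainty labels have to be resolved somehow before training. The sketch below shows three simple policies (treat uncertain as negative, treat it as positive, or mask it out), assuming uncertain entries are encoded as -1 alongside 1 for positive and 0 for negative. That encoding and the `apply_uncertainty_policy` helper are assumptions for this example and should be checked against the dataset's own documentation.

```python
def apply_uncertainty_policy(labels, policy):
    """Map uncertain labels (-1) according to a simple policy.

    'zeros'  -> treat uncertain as negative
    'ones'   -> treat uncertain as positive
    'ignore' -> return None so the training loss can mask that entry
    """
    mapping = {"zeros": 0, "ones": 1, "ignore": None}
    if policy not in mapping:
        raise ValueError(f"unknown policy: {policy}")
    return [mapping[policy] if y == -1 else y for y in labels]

row = [1, -1, 0, -1]  # hypothetical labels for four findings on one study
as_negative = apply_uncertainty_policy(row, "zeros")
as_positive = apply_uncertainty_policy(row, "ones")
masked = apply_uncertainty_policy(row, "ignore")
```

Which policy works best can differ by finding, so it is usually treated as a per-label choice rather than a global one.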
NIH ChestX-ray14
NIH Clinical Center Chest X-ray Dataset
112,120 frontal-view chest X-ray images from 30,805 unique patients with 14 disease labels extracted using NLP from radiology reports. Foundational benchmark for chest X-ray AI.
State of the Art: TorchXRayVision, Cohen Lab, AUROC 85.8%
MIMIC-CXR
MIMIC-CXR: Medical Information Mart for Intensive Care - Chest X-ray
377,110 chest X-ray images from 227,835 studies of 65,379 patients with free-text radiology reports. Largest publicly available chest X-ray dataset with paired image-text data.
State of the Art: CheXzero, Harvard/MIT, AUROC 89.2%
RSNA Pneumonia Detection
RSNA Pneumonia Detection Challenge
About 30,000 frontal chest radiographs with bounding boxes for pneumonia detection, from the 2018 RSNA Kaggle competition. Tests both classification and localization.
State of the Art: DenseNet-121 (Chest X-ray), Research, AUROC 88.5%
ABIDE II
Autism Brain Imaging Data Exchange II
1,114 datasets from 521 individuals with autism spectrum disorder (ASD) and 593 typically developing controls across 19 sites. Second large-scale release complementing ABIDE I with additional multi-site neuroimaging data.
State of the Art: DeepASD, Research, AUC 93%
VinDr-CXR
VinDr-CXR: Vietnamese Dataset for Chest Radiograph
18,000 chest X-ray scans with radiologist annotations for 22 local labels and 6 global labels. Each image annotated by 3 radiologists with bounding box localization.
State of the Art: RAD-DINO, Microsoft, AUROC 91.2%
COVID-19 Image Data Collection
Curated dataset of COVID-19 chest X-ray and CT images with clinical metadata. Critical resource during the pandemic for developing AI diagnostic tools.
State of the Art: DenseNet-121 (Chest X-ray), Research, AUROC 94.7%
PadChest
PadChest: A Large Chest X-ray Image Dataset
160,868 images from 67,625 patients with 174 radiographic findings, 19 diagnoses, and 104 anatomic locations. Multi-label classification with hierarchical taxonomy.
State of the Art: TorchXRayVision, Cohen Lab, AUROC 84.6%
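Most of the scores above are AUROC values. As a reminder of what that metric measures, the sketch below computes it directly from its definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one) and averages it across labels for a multi-label task. The macro averaging is an assumption about how multi-label results are summarized; individual benchmarks may average differently or report a subset of labels.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auroc(label_matrix, score_matrix):
    """Unweighted mean of per-label AUROC; each row holds one label across all cases."""
    per_label = [auroc(y, s) for y, s in zip(label_matrix, score_matrix)]
    return sum(per_label) / len(per_label)

labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.1]
single = auroc(labels, scores)
macro = macro_auroc([labels, [1, 1, 0, 0]], [scores, scores])
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; library implementations use a sort-based computation that gives the same value.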