Disease Classification | 2017 | en

NIH Clinical Center Chest X-ray Dataset

112,120 frontal-view chest X-ray images from 30,805 unique patients with 14 disease labels extracted using NLP from radiology reports. Foundational benchmark for chest X-ray AI.

Metrics: AUROC, accuracy
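Because each image can carry several of the 14 labels, AUROC on this kind of dataset is usually computed per label and then averaged. The following is a minimal sketch of that macro-averaged AUROC, assuming labels and scores are given as rows with one column per disease. The function names `binary_auroc` and `macro_auroc` are illustrative, and whether this page's scores were averaged in exactly this way is an assumption. The scores on this page appear to be on a 0-100 scale, so 85.8 would correspond to 0.858 here.

```python
from typing import List, Sequence, Tuple


def binary_auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) formulation; ties get the average rank."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC is undefined unless both classes are present")

    # Sort indices by score, then assign 1-based average ranks to tied groups.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u_statistic = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def macro_auroc(
    labels: Sequence[Sequence[int]], scores: Sequence[Sequence[float]]
) -> Tuple[float, List[float]]:
    """Return (mean AUROC, per-label AUROCs) for multi-label predictions."""
    n_labels = len(labels[0])
    per_label = [
        binary_auroc([row[c] for row in labels], [row[c] for row in scores])
        for c in range(n_labels)
    ]
    return sum(per_label) / n_labels, per_label


# Toy example with 4 images and 2 labels (made-up numbers, not dataset values).
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.4], [0.1, 0.8], [0.8, 0.3], [0.3, 0.1]]
mean_auroc, per_label_auroc = macro_auroc(toy_labels, toy_scores)
print(mean_auroc, per_label_auroc)
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used; the hand-written version is only meant to make the averaging step explicit.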
Current State of the Art

TorchXRayVision (Cohen Lab): 85.8 AUROC

AUROC Progress Over Time

Showing 3 breakthroughs from May 2017 to Dec 2025

[Chart: AUROC (y-axis, 82.3 to 86.1) versus date (x-axis, May 2017 to Dec 2025)]

Key Milestones

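May 2017: DenseNet-121 (Chest X-ray), 82.6 AUROC
Original NIH baseline model.

Nov 2017: CheXNet, 84.1 AUROC (+1.8% relative to the previous milestone)
Original CheXNet on ChestX-ray14. Reported a pneumonia AUROC of 0.768 versus 0.633 for the earlier Wang et al. (2017) baseline. The paper's comparison with radiologists used F1 score (0.435 vs 0.387), not AUROC.

Dec 2025: TorchXRayVision (current SOTA), 85.8 AUROC (+2.0% relative to the previous milestone)
Multi-dataset pre-training improves on single-dataset training.
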
Total Improvement: 3.9% (relative)
Time Span: 8y 9m
Breakthroughs: 3
Current SOTA: 85.8
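
The improvement percentages above are consistent with relative gains over the previous milestone, not absolute AUROC points. A small sketch of that arithmetic, using only the milestone scores listed on this page (the helper name `relative_gain` is illustrative):

```python
# Milestone scores as listed on this page (AUROC on a 0-100 scale).
milestones = [
    ("DenseNet-121 (Chest X-ray)", 82.6),
    ("CheXNet", 84.1),
    ("TorchXRayVision", 85.8),
]


def relative_gain(previous: float, current: float) -> float:
    """Percentage improvement of `current` over `previous`."""
    return (current - previous) / previous * 100.0


# Gain of each milestone over the one before it.
step_gains = [
    round(relative_gain(prev[1], curr[1]), 1)
    for prev, curr in zip(milestones, milestones[1:])
]

# Gain of the latest milestone over the first, relative and in absolute points.
total_gain = round(relative_gain(milestones[0][1], milestones[-1][1]), 1)
absolute_gain = round(milestones[-1][1] - milestones[0][1], 1)

print(step_gains, total_gain, absolute_gain)
```

Read this way, the 3.9% total corresponds to an absolute gain of 3.2 AUROC points (82.6 to 85.8).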

Top Models Performance Comparison

Top 4 models ranked by AUROC

Rank  Model                        AUROC  % of best
1     TorchXRayVision              85.8   100.0%
2     CheXNet                      84.1   98.0%
3     DenseNet-121 (Chest X-ray)   82.6   96.3%
4     ResNet-50 (Chest X-ray)      80.4   93.7%

Best Score: 85.8
Top Model: TorchXRayVision
Models Compared: 4
Score Range: 5.4
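
The "% of best" column and the score range can be reproduced from the four listed scores. A minimal sketch, assuming "% of best" means each score divided by the top score and the range is the best score minus the lowest:

```python
# Scores as listed in the comparison above (AUROC on a 0-100 scale).
scores = {
    "TorchXRayVision": 85.8,
    "CheXNet": 84.1,
    "DenseNet-121 (Chest X-ray)": 82.6,
    "ResNet-50 (Chest X-ray)": 80.4,
}

best_model = max(scores, key=scores.get)
best_score = scores[best_model]

# Each model's score as a percentage of the best score.
percent_of_best = {
    name: round(score / best_score * 100.0, 1) for name, score in scores.items()
}

# Spread between the best and worst listed scores.
score_range = round(best_score - min(scores.values()), 1)

print(best_model, percent_of_best, score_range)
```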

AUROC (primary metric)

#  Model                        Organization       Availability  Score  Date
1  TorchXRayVision              Cohen Lab          Open Source   85.8   Dec 2025
2  CheXNet                      Stanford ML Group  Open Source   84.1   Dec 2025
3  DenseNet-121 (Chest X-ray)   Research           Open Source   82.6   Dec 2025
4  ResNet-50 (Chest X-ray)      Research           Open Source   80.4   Dec 2025

Other Disease Classification Datasets