| 01 | ViT-H/14 (JFT-300M): Vision Transformer ViT-H/14, pre-trained on JFT-300M and fine-tuned on CIFAR-10. 99.50 ± 0.06% reported in ViT paper Table 2. This is the published state of the art on CIFAR-10 as of 2025 (PapersWithCode SOTA). Paper: Dosovitskiy et al. 2021, ICLR 2021, arxiv:2010.11929. | verified | 99.5 | 2026 | Source ↗ |
| 02 | AIMv2 ViT-3B/14 448px | unverified | 99.5 | 2024 | Paper ↗ Code ↗ |
| 03 | Vision Transformer (ViT-H/14) | unverified | 99.5 | 2020 | Paper ↗ Code ↗ |
| 04 | ViT-L/16 (JFT-300M): Vision Transformer ViT-L/16, pre-trained on JFT-300M and fine-tuned on CIFAR-10. 99.42% reported in ViT paper Table 2. Paper: Dosovitskiy et al. 2021, ICLR 2021, arxiv:2010.11929. | verified | 99.42 | 2026 | Source ↗ |
| 05 | BiT-L | unverified | 99.37 | 2019 | Paper ↗ Code ↗ |
| 06 | BiT-L (ResNet152x4): Big Transfer (BiT) ResNet-152x4 large upstream variant, pre-trained on JFT-300M and fine-tuned on CIFAR-10. 99.37% reported in BiT paper Table 1. Paper: Kolesnikov et al., ECCV 2020, arxiv:1912.11370. | verified | 99.37 | 2026 | Source ↗ |
| 07 | ViT-H/14 (IN-21K): Vision Transformer ViT-H/14, pre-trained on ImageNet-21K and fine-tuned on CIFAR-10. 99.27% reported in ViT paper Table B1 (appendix). Paper: Dosovitskiy et al. 2021, ICLR 2021, arxiv:2010.11929. | verified | 99.27 | 2026 | Source ↗ |
| 08 | DeiT-B Distilled: Near-SOTA on CIFAR-10 with transfer learning. | unverified | 99.1 | 2025 | Source ↗ |
| 09 | deit-b-distilled: Near-SOTA on CIFAR-10 with transfer learning. | paper | 99.1 | 2025 | Source ↗ |
| 10 | ViT-L/16 (IN-21K): Vision Transformer ViT-L/16, pre-trained on ImageNet-21K and fine-tuned on CIFAR-10. 99.0% reported in ViT paper. Paper: Dosovitskiy et al. 2021, arxiv:2010.11929. | verified | 99 | 2026 | Source ↗ |
| 11 | ConvNeXt V2 Base: Strong CNN performance on small-scale benchmark. | unverified | 98.7 | 2025 | Source ↗ |
| 12 | convnext-v2-base: Strong CNN performance on small-scale benchmark. | paper | 98.7 | 2025 | Source ↗ |
| 13 | EfficientNet-B8 (NoisyStudent): NoisyStudent EfficientNet-B8 trained with self-training and noise. 98.7% on CIFAR-10. Paper: Xie et al. 2020, arxiv:1911.04252. | verified | 98.7 | 2026 | Source ↗ |
| 14 | ViT-B/16 (IN-21K): Vision Transformer ViT-B/16, pre-trained on ImageNet-21K and fine-tuned on CIFAR-10. 98.13% reported in ViT paper. Paper: Dosovitskiy et al. 2021, arxiv:2010.11929. | verified | 98.13 | 2026 | Source ↗ |
| 15 | Swin-B: Swin Transformer Base, pre-trained on ImageNet-21K and fine-tuned on CIFAR-10. Paper: Liu et al. 2021, arxiv:2103.14030. | verified | 98 | 2026 | Source ↗ |
| 16 | LeJEPA ViT-L (304M) | unverified | 96.5 | 2025 | Paper ↗ Code ↗ |
| 17 | ResNet-50: With Cutout augmentation. | paper | 96.01 | 2025 | Source ↗ |
| 18 | CN-CLIP | unverified | 96 | 2022 | Paper ↗ Code ↗ |
| 19 | ResNet-110 | unverified | 93.57 | 2015 | Paper ↗ Code ↗ |