| 01 | EVA-02-L: EVA-02-L/14+ fine-tuned on CIFAR-100. Pre-trained with EVA-CLIP on Objects365 + ImageNet-21K. State-of-the-art as of 2023. arXiv Mar 2023. | paper | 97.15 | 2023 |
| 02 | CoAtNet-7: CoAtNet-7 (2.4B params) fine-tuned on CIFAR-100. Pre-trained on ImageNet-21K. arXiv Jun 2021, NeurIPS 2021. | paper | 96.38 | 2021 |
| 03 | ConvNeXt V2-H: ConvNeXt V2-H fine-tuned on CIFAR-100 after FCMAE pre-training on ImageNet-22K. arXiv Jan 2023, CVPR 2023. | paper | 96.17 | 2023 |
| 04 | MAE ViT-H/14: ViT-H/14 fine-tuned on CIFAR-100 after MAE pre-training on ImageNet-1K. arXiv Nov 2021, CVPR 2022. | paper | 96.08 | 2021 |
| 05 | SwinV2-G: SwinV2-G (3B params) fine-tuned on CIFAR-100. Pre-trained on ImageNet-21K at 192×192 resolution. arXiv Nov 2021, CVPR 2022. | paper | 96.01 | 2021 |
| 06 | DeiT III-H/14: DeiT III ViT-H/14 fine-tuned on CIFAR-100. Improved training recipe for ViTs. arXiv Apr 2022, ECCV 2022. | paper | 95.94 | 2022 |
| 07 | InternImage-XL: InternImage-XL fine-tuned on CIFAR-100. Uses deformable convolutions as core operator. arXiv Nov 2022, CVPR 2023. | paper | 95.77 | 2022 |
| 08 | FasterViT-6: FasterViT-6 fine-tuned on CIFAR-100. Hierarchical ViT with carrier tokens for high-resolution efficiency. arXiv Jun 2023, ICLR 2024. | paper | 95.72 | 2023 |
| 09 | Vision Transformer (ViT-H/14) | unverified | 94.55 | 2020 |
| 10 | vit-h-14: Fine-tuned from ImageNet pretraining. | paper | 94.55 | 2025 |
| 11 | ViT-H/14: Fine-tuned from ImageNet pretraining. | unverified | 94.55 | 2025 |
| 12 | AIMv2 ViT-3B/14 448px | unverified | 94.5 | 2024 |
| 13 | AIMv2-3B: AIMv2-3B (2.7B params), multimodal autoregressive pre-training, patch14 448px. 94.5% on CIFAR-100 using attentive probing (frozen backbone). Apple, presented at CVPR 2025. Source: official HF model card. Paper: arXiv:2411.14402, Nov 2024. | verified | 94.5 | 2026 |
| 14 | AIMv2-1B: AIMv2-1B, multimodal autoregressive pre-training, patch14 224px. 94.1% on CIFAR-100 using attentive probing (frozen backbone). Apple, presented at CVPR 2025. Source: official HF model card. Paper: arXiv:2411.14402, Nov 2024. | verified | 94.1 | 2026 |
| 15 | BiT-L | unverified | 93.51 | 2019 |
| 16 | ViT-L/16 (IN-21K): Vision Transformer ViT-L/16, pre-trained on ImageNet-21K and fine-tuned on CIFAR-100. 93.25% reported in ViT paper (Table 5). Paper: Dosovitskiy et al. 2021, arXiv:2010.11929. | verified | 93.25 | 2026 |
| 17 | BEiT | unverified | 91.8 | 2021 |
| 18 | efficientnet-b7: Transfer learning from ImageNet. | paper | 91.7 | 2025 |
| 19 | ViT-B/16: Fine-tuned from ImageNet-21K. | unverified | 91.48 | 2025 |
| 20 | vit-b-16: Fine-tuned from ImageNet-21K. | paper | 91.48 | 2025 |
| 21 | LeJEPA ViT-L (304M) | unverified | 83.71 | 2025 |
| 22 | CN-CLIP | unverified | 79.7 | 2022 |
| 23 | ResNet-50: With Cutout augmentation. | paper | 78.04 | 2025 |