COCO Captions: about 330K images with 5 captions each. Standard benchmark for image captioning.
CIDEr
Higher is better
| Rank | Model | Notes | Source | Score | Year | Paper |
|---|---|---|---|---|---|---|
| 1 | PaLI-X-55B | PaLI-X 55B (scaling up multilingual vision-language). Google, 2023. CIDEr on Karpathy test split. | Community | 149.2 | 2023 | Source |
| 2 | PaLI-17B | PaLI (Pathways Language and Image model) 17B. Google Research, ICLR 2023. CIDEr on Karpathy test split without CIDEr optimization. | Community | 149.1 | 2022 | Source |
| 3 | BEiT-3 | BEiT-3 (Image as a Foreign Language). Microsoft, CVPR 2023. CIDEr on Karpathy test split. | Community | 147.6 | 2022 | Source |
| 4 | BLIP-2 (OPT 2.7B) | BLIP-2 with frozen OPT-2.7B. Salesforce, ICML 2023. CIDEr on Karpathy test split. | Community | 145.8 | 2023 | Source |
| 5 | OFA | OFA-Huge (Unifying Architectures, Tasks, and Modalities). Alibaba DAMO, ICML 2022. CIDEr on Karpathy test split. | Community | 145.3 | 2022 | Source |
| 6 | GIT2 | GIT2 (5.1B parameters). Microsoft, 2022. CIDEr on Karpathy test split. | Community | 145.0 | 2022 | Source |
| 7 | GIT | GIT (Generative Image-to-text Transformer). Microsoft, 2022. CIDEr on Karpathy test split. | Community | 144.8 | 2022 | Source |
| 8 | CoCa | CoCa (Contrastive Captioners). Google Brain, 2022. CIDEr on Karpathy test split without CIDEr optimization. | Community | 143.6 | 2022 | Source |
| 9 | SimVLM | SimVLM large. ICLR 2022. CIDEr on Karpathy test split. | Community | 143.3 | 2022 | Source |
| 10 | VinVL | VinVL large model. CVPR 2021. CIDEr on Karpathy test split. | Community | 140.9 | 2022 | Source |
| 11 | BLIP | BLIP (Bootstrapping Language-Image Pre-training). ICML 2022. CIDEr on Karpathy test split. | Community | 136.7 | 2022 | Source |
| 12 | CogVLM | CogVLM-17B, zero-shot. Tsinghua KEG, Nov 2023. CIDEr on COCO Karpathy test split. | Community | 126.4 | 2023 | Source |
BLEU-4
Higher is better
| Rank | Model | Notes | Source | Score | Year | Paper |
|---|---|---|---|---|---|---|
| 1 | GIT | GIT. BLEU-4 on Karpathy test split. | Community | 44.1 | 2022 | Source |
| 2 | GIT2 | GIT2. BLEU-4 on Karpathy test split. | Community | 44.1 | 2022 | Source |
| 3 | OFA | OFA-Huge. BLEU-4 on Karpathy test split. | Community | 43.9 | 2022 | Source |
| 4 | BLIP-2 (OPT 2.7B) | BLIP-2 with frozen OPT-2.7B. BLEU-4 on Karpathy test split. | Community | 43.7 | 2023 | Source |
| 5 | VinVL | VinVL large model. CVPR 2021. BLEU-4 on Karpathy test split. | Community | 41.0 | 2022 | Source |
| 6 | CoCa | CoCa. BLEU-4 on Karpathy test split. | Community | 40.9 | 2022 | Source |
| 7 | SimVLM | SimVLM large. ICLR 2022. BLEU-4 on Karpathy test split. | Community | 40.6 | 2022 | Source |
| 8 | BLIP | BLIP. ICML 2022. BLEU-4 on Karpathy test split. | Community | 40.4 | 2022 | Source |
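
For readers unfamiliar with the metric, the following is a minimal sketch of how a BLEU-4 score on the 0-100 scale used in the table can be computed for a single caption against its reference captions. It is illustrative only: the function names `ngrams` and `bleu4` are hypothetical, tokenization is a plain lowercase whitespace split, no smoothing is applied, and the score is sentence-level. The scores reported above are presumably corpus-level values from each paper's own evaluation pipeline, so this sketch should not be expected to reproduce them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, references):
    """Sentence-level BLEU-4 on a 0-100 scale, without smoothing (illustrative)."""
    # Assumption: lowercase whitespace tokenization, not a benchmark tokenizer.
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(c, max_ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or clipped == 0:
            # Any zero n-gram precision makes the geometric mean zero.
            return 0.0
        log_precision_sum += math.log(clipped / total)

    # Brevity penalty uses the reference length closest to the candidate length
    # (ties resolved toward the shorter reference).
    ref_len = min((len(r) for r in refs),
                  key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return 100.0 * bp * math.exp(log_precision_sum / 4)


if __name__ == "__main__":
    refs = ["a dog runs on the grass", "a dog is running across a lawn"]
    print(bleu4("a dog runs on the grass today", refs))
```

Because the four n-gram precisions are combined with a geometric mean, a caption with no matching 4-gram scores zero in this unsmoothed form, which is one reason benchmark results are aggregated over the whole test split rather than averaged per caption.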
SPICE
Higher is better