COCO Captions

330K images with five human-written captions each. The standard benchmark for image captioning.
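To make the "several captions per image" layout concrete, here is a minimal sketch of grouping captions by image. It assumes the usual COCO captions annotation JSON layout with an `images` list and an `annotations` list whose records carry `image_id` and `caption`. The sample records, ids, file names, and the helper name `captions_by_image` are invented for illustration.

```python
from collections import defaultdict

# Tiny stand-in for a captions annotation file (hypothetical ids and text).
sample = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg"},
        {"id": 2, "file_name": "img_0002.jpg"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "caption": "A dog runs on the beach."},
        {"id": 11, "image_id": 1, "caption": "A brown dog running along the shore. "},
        {"id": 12, "image_id": 2, "caption": "Two people ride bicycles."},
    ],
}


def captions_by_image(data):
    """Map each image id to the list of its reference captions."""
    grouped = defaultdict(list)
    for ann in data["annotations"]:
        # Strip stray whitespace; captions are free-form crowd-written text.
        grouped[ann["image_id"]].append(ann["caption"].strip())
    return dict(grouped)


references = captions_by_image(sample)
for image in sample["images"]:
    print(image["file_name"], references.get(image["id"], []))
```

In the real files each training or validation image would normally map to about five captions; the grouped lists are what the metrics below use as references.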

Benchmark Stats

Models: 13
Papers: 25
Metrics: 4

SOTA History

cider

Higher is better

| Rank | Model | Source | Score | Year | Paper | Notes |
|---|---|---|---|---|---|---|
| 1 | PaLI-X-55B | Community | 149.2 | 2023 | Source | PaLI-X 55B (scaling up multilingual vision-language). Google, 2023. CIDEr on Karpathy test split. |
| 2 | PaLI-17B | Community | 149.1 | 2022 | Source | PaLI (Pathways Language and Image model) 17B. Google Research, ICLR 2023. CIDEr on Karpathy test split without CIDEr optimization. |
| 3 | BEiT-3 | Community | 147.6 | 2022 | Source | BEiT-3 (Image as a Foreign Language). Microsoft, CVPR 2023. CIDEr on Karpathy test split. |
| 4 | BLIP-2 (OPT 2.7B) | Community | 145.8 | 2023 | Source | BLIP-2 with frozen OPT-2.7B. Salesforce, ICML 2023. CIDEr on Karpathy test split. |
| 5 | OFA | Community | 145.3 | 2022 | Source | OFA-Huge (Unifying Architectures, Tasks, and Modalities). Alibaba DAMO, ICML 2022. CIDEr on Karpathy test split. |
| 6 | GIT2 | Community | 145 | 2022 | Source | GIT2 (5.1B parameters). Microsoft, 2022. CIDEr on Karpathy test split. |
| 7 | GIT | Community | 144.8 | 2022 | Source | GIT (Generative Image-to-text Transformer). Microsoft, 2022. CIDEr on Karpathy test split. |
| 8 | CoCa | Community | 143.6 | 2022 | Source | CoCa (Contrastive Captioners). Google Brain, 2022. CIDEr on Karpathy test split without CIDEr optimization. |
| 9 | SimVLM | Community | 143.3 | 2022 | Source | SimVLM large. ICLR 2022. CIDEr on Karpathy test split. |
| 10 | VinVL | Community | 140.9 | 2022 | Source | VinVL large model. CVPR 2021. CIDEr on Karpathy test split. |
| 11 | BLIP | Community | 136.7 | 2022 | Source | BLIP (Bootstrapping Language-Image Pre-training). ICML 2022. CIDEr on Karpathy test split. |
| 12 | CogVLM | Community | 126.4 | 2023 | Source | CogVLM-17B zero-shot. Tsinghua KEG, Nov 2023. CIDEr on COCO Karpathy test split. Zero-shot result. |

CIDEr

Higher is better

| Rank | Model | Source | Score | Year | Paper | Notes |
|---|---|---|---|---|---|---|
| 1 | BLIP-2 | Community | 145.8 | 2026 | Source | COCO Karpathy test split. FlanT5-XXL backbone. Table 12. arXiv:2301.12597 |
| 2 | CoCa | Community | 143.6 | 2026 | Source | COCO Karpathy test split. Single-model fine-tune. Table 4. arXiv:2205.01917 |

bleu-4

Higher is better

| Rank | Model | Source | Score | Year | Paper | Notes |
|---|---|---|---|---|---|---|
| 1 | GIT | Community | 44.1 | 2022 | Source | GIT. BLEU-4 on Karpathy test split. |
| 2 | GIT2 | Community | 44.1 | 2022 | Source | GIT2. BLEU-4 on Karpathy test split. |
| 3 | OFA | Community | 43.9 | 2022 | Source | OFA-Huge. BLEU-4 on Karpathy test split. |
| 4 | BLIP-2 (OPT 2.7B) | Community | 43.7 | 2023 | Source | BLIP-2 with frozen OPT-2.7B. BLEU-4 on Karpathy test split. |
| 5 | VinVL | Community | 41 | 2022 | Source | VinVL large model. CVPR 2021. BLEU-4 on Karpathy test split. |
| 6 | CoCa | Community | 40.9 | 2022 | Source | CoCa. BLEU-4 on Karpathy test split. |
| 7 | SimVLM | Community | 40.6 | 2022 | Source | SimVLM large. ICLR 2022. BLEU-4 on Karpathy test split. |
| 8 | BLIP | Community | 40.4 | 2022 | Source | BLIP. ICML 2022. BLEU-4 on Karpathy test split. |
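For readers unfamiliar with the metric, the following is a rough sketch of corpus-level BLEU-4 with multiple references per image: clipped n-gram precision for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It is an illustration, not the evaluation code behind the scores above. It assumes plain whitespace tokenization and no smoothing, so its numbers will not match official COCO evaluation output; the function names are hypothetical. Leaderboard values such as 44.1 correspond to the 0 to 1 score scaled by 100.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(candidates, references):
    """candidates: list of token lists.
    references: list (one per candidate) of lists of reference token lists.
    Returns unsmoothed corpus-level BLEU-4 in [0, 1]."""
    matched = [0] * 4   # clipped n-gram matches, summed over the corpus
    total = [0] * 4     # candidate n-gram counts, summed over the corpus
    cand_len = 0
    ref_len = 0
    for cand, refs in zip(candidates, references):
        cand_len += len(cand)
        # Effective reference length: closest to the candidate, ties to shorter.
        ref_len += min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
        for n in range(1, 5):
            cand_counts = ngrams(cand, n)
            # Each n-gram may be credited at most as often as it appears
            # in the single reference where it is most frequent.
            max_ref = Counter()
            for r in refs:
                for gram, count in ngrams(r, n).items():
                    max_ref[gram] = max(max_ref[gram], count)
            matched[n - 1] += sum(min(c, max_ref[g]) for g, c in cand_counts.items())
            total[n - 1] += sum(cand_counts.values())
    if min(matched) == 0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / 4
    brevity = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity * math.exp(log_precision)


hypothesis = "a man rides a horse on the beach".split()
refs = ["a man rides a horse on the beach".split(),
        "a person is riding a horse along the shore".split()]
print(round(100 * corpus_bleu4([hypothesis], [refs]), 1))
```

Statistics are accumulated over the whole test set before the precisions are combined, which is why this is a corpus-level score rather than an average of per-caption scores.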

spice

Higher is better

| Rank | Model | Source | Score | Year | Paper | Notes |
|---|---|---|---|---|---|---|
| 1 | SimVLM | Community | 25.4 | 2022 | Source | SimVLM large. ICLR 2022. SPICE on Karpathy test split. |
| 2 | OFA | Community | 24.8 | 2022 | Source | OFA-Huge. SPICE on Karpathy test split. |
| 3 | CoCa | Community | 24.7 | 2022 | Source | CoCa. SPICE on Karpathy test split. |
