Codesota · Benchmark · OK-VQA

OK-VQA.

OK-VQA (Outside Knowledge Visual Question Answering) contains 14,055 open-ended questions whose answers require knowledge beyond what is visible in the image, so models must draw on external knowledge sources rather than visual content alone.

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.

Metric: accuracy (higher is better)
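OK-VQA results are typically scored with the soft VQA accuracy inherited from the VQA benchmark: each question carries (usually ten) human answers, a prediction earns min(#matching answers / 3, 1), and that value is averaged over the leave-one-annotator-out subsets. A minimal sketch, with answer normalization (lowercasing, article and punctuation stripping) omitted:

```python
def vqa_accuracy(prediction: str, human_answers: list[str]) -> float:
    """Soft VQA accuracy: min(#matches / 3, 1), averaged over
    leave-one-annotator-out subsets of the human answers."""
    scores = []
    for i in range(len(human_answers)):
        # Drop annotator i, count exact matches among the rest.
        others = human_answers[:i] + human_answers[i + 1:]
        matches = sum(ans == prediction for ans in others)
        scores.append(min(matches / 3.0, 1.0))
    return sum(scores) / len(scores)
```

For example, if 3 of 10 annotators answered "red" and the model predicts "red", the score is 0.9; a prediction matching no annotator scores 0. Full evaluation scripts also normalize answers before matching.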

1. PaLI-X-55B: 66.1 (2023, Community). PaLI-X 55B fine-tuned on OK-VQA. Google Research, 2023.
2. PaLI-17B: 64.5 (2022, Community). PaLI-17B fine-tuned on OK-VQA. Google Research, ICLR 2023.
3. GPT-4V: 64.28 (2023, Community). Zero-shot on the OK-VQA test set (commonsense knowledge subset). OpenAI, Nov 2023.
4. Flamingo-80B: 57.8 (2022, Community). 32-shot on the OK-VQA test set. DeepMind, NeurIPS 2022.
5. BLIP-2 (FlanT5XXL): 44.7 (2023, Community). Zero-shot on the OK-VQA test set with a FlanT5XXL backbone. Salesforce, ICML 2023.
§ 03 · Lineage

OK-VQA in context.

See full visual question answering lineage →
This benchmark (1): OK-VQA (active, 2019-06).
Successors (1): A-OKVQA (active, 2022-06). Broader knowledge types and better annotation.
§ 04 · Submit a result

Add to the leaderboard.
