
MMBench.

Multimodal capability benchmark for vision-language models, covering perception and reasoning abilities across multiple dimensions.

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.


Accuracy

Accuracy is the reported evaluation metric for MMBench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Accuracy: verified, paper, vendor, community, unverified
| Rank | Model | Trust | Score | Year | Source note |
|------|-------|-------|-------|------|-------------|
| 01 | SenseNova-U1-A3B-MoT | unverified | 91.59 | 2026 | |
| 02 | Qwen2.5-VL 72B | verified | 90.5 | 2026 | MMBench EN test. Table 2. arxiv:2502.13923 |
| 03 | InternVL3-78B | verified | 90.1 | 2025 | MMBench EN test. Table 2. arxiv:2501.12891 |
| 04 | LongCat-Flash-Omni | unverified | 89.8 | 2025 | |
| 05 | Qwen3-VL-235B-A22B-Instruct | unverified | 89.3 | 2025 | |
| 06 | Qwen3-VL-235B-A22B-Thinking | unverified | 88.8 | 2025 | |
| 07 | Qwen2.5-VL-72B | unverified | 88.6 | 2025 | |
| 08 | Qwen2-VL 72B | verified | 88 | 2024 | MMBench EN test. Table 6. arxiv:2409.12191 |
| 09 | Infinity-Parser2-Pro | unverified | 87.54 | 2026 | |
| 10 | InternVL2-76B | verified | 86.5 | 2024 | MMBench EN test. Table 12. arxiv:2404.16821 |
| 11 | BAGEL (7B MoT) | unverified | 85 | 2025 | |
| 12 | GPT-4o | verified | 83.4 | 2026 | MMBench EN test. System card Table 1. arxiv:2410.21276 |
| 13 | MiniCPM-V 4.6-Thinking (16x) | unverified | 83.1 | 2026 | |
| 14 | Qwen2-VL 7B | unverified | 83 | 2024 | |
| 15 | MiniCPM-Llama3-V 2.5 | unverified | 77.2 | 2024 | |
| 16 | GPT-4V | verified | 75.8 | 2026 | MMBench EN test. Reported in multiple comparison papers incl. InternVL2 Table 12. |
| 17 | Qwen2-VL-2B | unverified | 74.9 | 2024 | |
| 18 | Gemini 1.5 Pro | verified | 73.9 | 2026 | MMBench EN dev. Table 5. arxiv:2403.05530 |
| 19 | LLaVA-1.5 | verified | 67.7 | 2023 | MMBench EN dev. 13B. Table 1. arxiv:2310.03744 |
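
The ordering above follows from "higher is better": rows are sorted by accuracy in descending order. As a minimal sketch of how a reader might reproduce that ranking, or restrict it to one trust tier, the Python below uses a handful of rows copied from the leaderboard. The `Result` class and the `rank` function are hypothetical names chosen for illustration; they are not part of any Codesota API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Result:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    model: str
    trust: str    # one of the trust tiers, e.g. "verified" or "unverified"
    score: float  # MMBench accuracy in percent
    year: int


# A few rows copied from the leaderboard, deliberately out of order.
RESULTS = [
    Result("GPT-4o", "verified", 83.4, 2026),
    Result("SenseNova-U1-A3B-MoT", "unverified", 91.59, 2026),
    Result("Qwen2.5-VL 72B", "verified", 90.5, 2026),
    Result("LLaVA-1.5", "verified", 67.7, 2023),
]


def rank(results: List[Result], trust: Optional[str] = None) -> List[Tuple[int, Result]]:
    """Return (position, result) pairs sorted by score, highest first.

    If `trust` is given, only rows with that trust tier are kept.
    """
    rows = [r for r in results if trust is None or r.trust == trust]
    rows.sort(key=lambda r: r.score, reverse=True)  # higher is better
    return list(enumerate(rows, start=1))


if __name__ == "__main__":
    for position, r in rank(RESULTS, trust="verified"):
        print(f"{position:02d} {r.model} {r.score}")
```

Note that the source notes mix MMBench EN test and EN dev results, so a stricter comparison would probably also filter by split; that field is left out of the sketch for brevity.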
