General OCR Capabilities | 2024 | en

MME Video OCR Benchmark

The benchmark comprises 1,464 videos with 2,000 QA pairs across 25 tasks and tests OCR capabilities in video content.

Metrics: total-accuracy
Current State of the Art

Gemini 2.5 Pro (Google): 73.7 total-accuracy

Top Models Performance Comparison

Top 6 models ranked by total-accuracy

Rank | Model | total-accuracy | % of best
1 | Gemini 2.5 Pro | 73.7 | 100.0%
2 | Qwen2.5-VL 72B | 69.0 | 93.6%
3 | InternVL3-78B | 67.2 | 91.2%
4 | GPT-4o | 66.4 | 90.1%
5 | Gemini 1.5 Pro | 64.9 | 88.1%
6 | Qwen2.5-VL 32B | 61.0 | 82.8%

Best Score: 73.7
Top Model: Gemini 2.5 Pro
Models Compared: 6
Score Range: 12.7
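
The summary figures and the "% of best" column can be reproduced from the six scores alone. The sketch below is illustrative rather than the site's actual code: the function name `summarize` and the dictionary layout are assumptions, and it assumes "% of best" means each score divided by the top score, and "Score Range" means the top score minus the lowest score, both rounded to one decimal place.

```python
# Scores (total-accuracy) copied from the leaderboard on this page.
scores = {
    "Gemini 2.5 Pro": 73.7,
    "Qwen2.5-VL 72B": 69.0,
    "InternVL3-78B": 67.2,
    "GPT-4o": 66.4,
    "Gemini 1.5 Pro": 64.9,
    "Qwen2.5-VL 32B": 61.0,
}


def summarize(scores):
    """Hypothetical helper: derive the summary statistics from raw scores."""
    top_model = max(scores, key=scores.get)
    best = scores[top_model]
    worst = min(scores.values())
    return {
        "top_model": top_model,
        "best_score": best,
        "models_compared": len(scores),
        # Assumed definition: spread between the best and worst score.
        "score_range": round(best - worst, 1),
        # Assumed definition: each score as a percentage of the best score.
        "pct_of_best": {m: round(100 * s / best, 1) for m, s in scores.items()},
    }


summary = summarize(scores)
for model, pct in summary["pct_of_best"].items():
    print(f"{model}: {scores[model]} ({pct}% of best)")
print("Score range:", summary["score_range"])
```

Under these assumptions the computed values match the ones shown on the page, which suggests the chart is a simple normalization against the top model rather than a separate metric.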

total-accuracy (Primary)

# | Model | Access | Organization | Score | Date
1 | Gemini 2.5 Pro | API | Google | 73.7 | Dec 2025
2 | Qwen2.5-VL 72B | Open Source | Alibaba | 69.0 | Dec 2025
3 | InternVL3-78B | Open Source | Shanghai AI Lab | 67.2 | Dec 2025
4 | GPT-4o | API | OpenAI | 66.4 | Dec 2025
5 | Gemini 1.5 Pro | API | Google | 64.9 | Dec 2025
6 | Qwen2.5-VL 32B | Open Source | Alibaba | 61.0 | Dec 2025
