# MME-VideoOCR

A video OCR benchmark from NTU with 1,464 videos and 2,000 QA pairs across 25 tasks.

- Total results: 6
- Models tested: 6
- Metrics: 1
- Last updated: 2025-12-19
## Total Accuracy

Overall accuracy across all video OCR tasks. Higher is better.
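As a minimal sketch of how a pooled accuracy like this is typically computed, the score can be read as the percentage of correctly answered QA pairs across all tasks. The function and data shape below are illustrative assumptions, not the benchmark's actual evaluation code.

```python
def total_accuracy(qa_results):
    """qa_results: list of booleans, one per QA pair (True = correct).

    Returns pooled accuracy as a percentage, i.e. correct pairs
    divided by total pairs, not an average of per-task accuracies.
    """
    return 100.0 * sum(qa_results) / len(qa_results)

# Example: 3 of 4 QA pairs answered correctly -> 75.0
print(total_accuracy([True, True, False, True]))
```

Note that pooling over all 2,000 QA pairs weights tasks by their number of questions; a per-task macro average would weight all 25 tasks equally.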
| Rank | Model | Score | Source |
|---|---|---|---|
| 1 | gemini-25-pro | 73.7% | alphaxiv-leaderboard |
| 2 | qwen25-vl-72b | 69.0% | alphaxiv-leaderboard |
| 3 | internvl3-78b | 67.2% | alphaxiv-leaderboard |
| 4 | gpt-4o | 66.4% | alphaxiv-leaderboard |
| 5 | gemini-15-pro | 64.9% | alphaxiv-leaderboard |
| 6 | qwen25-vl-32b | 61.0% | alphaxiv-leaderboard |