CC-OCR

South China University of Technology

Benchmark for OCR across multi-scene, multilingual, and document parsing tasks.

Total Results: 12
Models Tested: 5
Metrics: 4
Last Updated: 2025-12-21

Multi-Scene F1

F1 score on multi-scene text reading

Higher is better

Multi-Scene Text Reading - Overall F1 score

Rank  Model             Score   Source
1     gemini-15-pro     83.25%  alphaxiv-leaderboard
2     qwen2-vl-72b      77.95%  alphaxiv-leaderboard
3     internvl2-76b     76.92%  alphaxiv-leaderboard
4     gpt-4o            76.4%   alphaxiv-leaderboard
5     claude-35-sonnet  72.87%  alphaxiv-leaderboard
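
For reference, F1 is the harmonic mean of precision and recall. The sketch below illustrates the general idea on a single prediction/reference pair. It assumes whitespace-split tokens matched as multisets; the function name `token_f1` is hypothetical, and the benchmark's actual tokenization, normalization, and aggregation rules are not stated on this page and may differ.

```python
from collections import Counter


def token_f1(predicted: str, reference: str) -> float:
    """F1 over whitespace-split tokens, counted as multisets.

    Illustrative assumption only: not the benchmark's official scorer.
    """
    pred_tokens = Counter(predicted.split())
    ref_tokens = Counter(reference.split())

    # Multiset intersection: a token counts as matched at most as many
    # times as it appears in both the prediction and the reference.
    matched = sum((pred_tokens & ref_tokens).values())
    if matched == 0:
        return 0.0

    precision = matched / sum(pred_tokens.values())
    recall = matched / sum(ref_tokens.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Prediction misses two of the five reference tokens.
    print(token_f1("open daily 9am", "open daily 9am to 5pm"))
```

In this example all three predicted tokens are correct (precision 1.0) but only three of five reference tokens are recovered (recall 0.6), so the F1 lands between the two and closer to the lower value.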

KIE F1

F1 score on key information extraction

Higher is better

Key Information Extraction - Overall F1 score

Rank  Model             Score   Source
1     qwen2-vl-72b      71.76%  alphaxiv-leaderboard
2     gemini-15-pro     67.28%  alphaxiv-leaderboard
3     claude-35-sonnet  64.58%  alphaxiv-leaderboard
4     gpt-4o            63.45%  alphaxiv-leaderboard

Multilingual F1

F1 score on multilingual text (10 languages)

Higher is better

Multilingual Text Reading - 10 languages

Rank  Model          Score   Source
1     gemini-15-pro  78.97%  alphaxiv-leaderboard
2     gpt-4o         73.44%  alphaxiv-leaderboard

Document Parsing

Average score on document parsing

Higher is better

Document Parsing - Average Score

Rank  Model          Score  Source
1     gemini-15-pro  62.37  alphaxiv-leaderboard
