CC-OCR

South China University of Technology

A benchmark covering multi-scene text reading, multilingual text reading, key information extraction, and document parsing.

Benchmark Stats

Models: 5
Papers: 12
Metrics: 4

Multi-Scene F1

F1 score on multi-scene text reading

Higher is better

Rank  Model             Source     Score  Year  Paper
1     gemini-15-pro     Editorial  83.25  2025  Source
2     qwen2-vl-72b      Editorial  77.95  2025  Source
3     internvl2-76b     Editorial  76.92  2025  Source
4     gpt-4o            Editorial  76.4   2025  Source
5     claude-35-sonnet  Editorial  72.87  2025  Source
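
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for text reading. It assumes whitespace tokenization and multiset token matching; the function name `token_f1` is hypothetical, and the benchmark's actual matching and normalization rules are not given on this page and may differ.

```python
from collections import Counter


def token_f1(predicted: str, reference: str) -> float:
    """Multiset token-overlap F1 between a predicted and a reference transcription.

    Assumption: tokens are whitespace-separated words. This is an
    illustrative scoring rule, not the benchmark's official one.
    """
    pred_tokens = Counter(predicted.split())
    ref_tokens = Counter(reference.split())
    # A token counts as matched up to the smaller of its two frequencies.
    matched = sum((pred_tokens & ref_tokens).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred_tokens.values())  # share of predicted tokens that are correct
    recall = matched / sum(ref_tokens.values())      # share of reference tokens that were recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # One wrong token out of five on each side.
    print(token_f1("open daily 9 to 5", "open daily 9 to 6"))
```

Under this sketch the value lies between 0 and 1; multiplying by 100 gives a figure on the same scale as the scores listed in the table.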

Multilingual F1

F1 score on multilingual text (10 languages)

Higher is better

Rank  Model          Source     Score  Year  Paper
1     gemini-15-pro  Editorial  78.97  2025  Source
2     gpt-4o         Editorial  73.44  2025  Source

KIE F1

F1 score on key information extraction

Higher is better

Rank  Model             Source     Score  Year  Paper
1     qwen2-vl-72b      Editorial  71.76  2025  Source
2     gemini-15-pro     Editorial  67.28  2025  Source
3     claude-35-sonnet  Editorial  64.58  2025  Source
4     gpt-4o            Editorial  63.45  2025  Source

Document Parsing

Average score on document parsing

Higher is better

Rank  Model          Source     Score  Year  Paper
1     gemini-15-pro  Editorial  62.37  2025  Source
