South China University of Technology

CC-OCR.

Benchmark for OCR across multi-scene, multilingual, and document parsing tasks.

§ 01 · Multi-Scene F1

Multi-Scene F1.

F1 score on multi-scene text reading

Higher is better

#  Model             Score   Source
1  gemini-15-pro     83.25%  Non-API entry from src
2  qwen2-vl-72b      77.95%  Non-API entry from src
3  internvl2-76b     76.92%  Non-API entry from src
4  gpt-4o            76.4%   Non-API entry from src
5  claude-35-sonnet  72.87%  Non-API entry from src
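
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below shows one way it could be computed for a single image's text. It assumes matching on whitespace-separated tokens counted as multisets. This page does not state the matching protocol CC-OCR uses, and the function name `token_f1` is hypothetical.

```python
from collections import Counter


def token_f1(predicted: str, reference: str) -> float:
    """F1 between predicted and reference text.

    Assumption: tokens are whitespace-separated and compared as multisets.
    The benchmark's real tokenization and matching rules may differ.
    """
    pred = Counter(predicted.split())
    ref = Counter(reference.split())

    # Multiset intersection: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred.values())  # share of predicted tokens that are correct
    recall = overlap / sum(ref.values())      # share of reference tokens that were recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up strings, for illustration only.
    print(token_f1("open daily 9 to 5", "open daily 9 to 6"))
```

A leaderboard figure such as those above would then be an aggregate of per-sample scores over the test set. How that aggregation is done is also not stated here.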
§ 02 · Multilingual F1

Multilingual F1.

F1 score on multilingual text (10 languages)

Higher is better

#  Model          Score   Source
1  gemini-15-pro  78.97%  Non-API entry from src
2  gpt-4o         73.44%  Non-API entry from src
§ 03 · KIE F1

KIE F1.

F1 score on key information extraction

Higher is better

#  Model             Score   Source
1  qwen2-vl-72b      71.76%  Non-API entry from src
2  gemini-15-pro     67.28%  Non-API entry from src
3  claude-35-sonnet  64.58%  Non-API entry from src
4  gpt-4o            63.45%  Non-API entry from src
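
For key information extraction, F1 is commonly computed over extracted fields rather than raw text. The sketch below assumes a field counts as correct only when both its key and its value match the reference exactly. This page does not specify the benchmark's actual matching or normalization rules, and `kie_f1` and the sample fields are hypothetical.

```python
def kie_f1(predicted: dict[str, str], reference: dict[str, str]) -> float:
    """F1 over (key, value) pairs.

    Assumption: exact string match on both key and value, with no
    normalization. The benchmark's real rules may be more lenient.
    """
    pred_pairs = set(predicted.items())
    ref_pairs = set(reference.items())

    correct = len(pred_pairs & ref_pairs)
    if correct == 0:
        return 0.0

    precision = correct / len(pred_pairs)  # share of predicted fields that are right
    recall = correct / len(ref_pairs)      # share of reference fields that were found
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up receipt fields, for illustration only.
    predicted = {"total": "12.50", "date": "2024-01-05"}
    reference = {"total": "12.50", "date": "2024-01-06", "vendor": "ACME"}
    print(kie_f1(predicted, reference))
```

Under this assumption, a wrong value counts against both precision and recall, while a missing field lowers only recall.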
§ 04 · Document Parsing

Document Parsing.

Average score on document parsing

Higher is better

#  Model          Score  Source
1  gemini-15-pro  62.37  Non-API entry from src