GOT-OCR2.0

Alibaba · multimodal · apache-2.0

General OCR Theory model v2, an end-to-end OCR system that supports scene text, formatted documents, and fine-grained OCR results.

GitHub ↗

§ 02 · Benchmarks

Every benchmark GOT-OCR2.0 has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	CC-OCR	Computer Vision · General OCR Capabilities	document-parsing	39.2%	#5/6	—	source ↗
02	CC-OCR	Computer Vision · General OCR Capabilities	multi-scene-f1	61.0%	#6/9	—	source ↗
03	CC-OCR	Computer Vision · General OCR Capabilities	multilingual-f1	24.9%	#8/8	—	source ↗

The Rank column shows this model’s position against all other models scored on the same benchmark and metric; the number after the slash is the competitor count. A rank of #1, shown in red, marks the current state of the art (SOTA). Rows are sorted by rank, then by newest result.
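As a minimal sketch of how the Rank cells could be read programmatically, the Python below splits a cell such as `#5/6` into its two numbers and reproduces the sort-by-rank ordering. The names `ROWS`, `parse_rank`, and `is_sota` are hypothetical and not part of any published API for this page. The rows are copied from the table above, and the second sort key (newest result) is omitted because no dates are recorded.

```python
import re

# Rows copied from the table above: (benchmark, metric, value, rank cell).
ROWS = [
    ("CC-OCR", "document-parsing", "39.2%", "#5/6"),
    ("CC-OCR", "multi-scene-f1", "61.0%", "#6/9"),
    ("CC-OCR", "multilingual-f1", "24.9%", "#8/8"),
]

# Assumed cell format: '#', the model's position, '/', the number after the slash.
RANK_RE = re.compile(r"#(\d+)/(\d+)")


def parse_rank(cell):
    """Split a rank cell such as '#5/6' into (position, number_after_slash)."""
    match = RANK_RE.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised rank cell: {cell!r}")
    return int(match.group(1)), int(match.group(2))


def is_sota(cell):
    """True when the model holds position #1 for that benchmark and metric."""
    return parse_rank(cell)[0] == 1


# Sort by position only; dates are not available in the rows above.
ranked = sorted(ROWS, key=lambda row: parse_rank(row[3])[0])

for benchmark, metric, value, rank in ranked:
    marker = " (SOTA)" if is_sota(rank) else ""
    print(f"{benchmark}\t{metric}\t{value}\t{rank}{marker}")
```

None of the three recorded results holds position #1, so no row is marked as SOTA.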

§ 03 · Strengths by area