Gemini 2.5 Pro · Google · 16 results · 15 benchmarks

Model card

Gemini 2.5 Pro.

Google · API · Multimodal LLM · Proprietary · 3 current SOTA

#1 on OCRBench v2 Chinese, MME-VideoOCR

§ 02 · Benchmarks

Every benchmark Gemini 2.5 Pro has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	OCRBench v2	Computer Vision · General OCR Capabilities	overall-zh-private	62.2%	#1/5	2025-03-25	source ↗
02	ARC-AGI-2	Reasoning · Logical Reasoning	accuracy	5.0%	#1/3	—	source ↗
03	MME-VideoOCR	Computer Vision · General OCR Capabilities	total-accuracy	73.7%	#1/6	—	source ↗
04	ARC-Challenge	Reasoning · Commonsense Reasoning	accuracy	97.8%	#2/10	—	source ↗
05	ThaiOCRBench	Computer Vision · Optical Character Recognition	ted-score	0.8%	#2/5	—	source ↗
06	AIME 2024	Reasoning · Mathematical Reasoning	accuracy	92.0%	#3/11	—	source ↗
07	OCRBench v2	Computer Vision · General OCR Capabilities	overall-en-private	59.3%	#4/27	2025-03-25	source ↗
08	ARC-AGI-1	Reasoning · Logical Reasoning	accuracy	56.1%	#4/5	—	source ↗
09	GSM8K	Reasoning · Mathematical Reasoning	accuracy	99.0%	#4/48	—	source ↗
10	MATH	Reasoning · Mathematical Reasoning	accuracy	97.3%	#6/46	—	source ↗
11	AIME 2025	Reasoning · Mathematical Reasoning	accuracy	86.7%	#12/22	—	source ↗
12	OmniDocBench	Computer Vision · Document Parsing	composite	88.0%	#14/34	—	source ↗
13	MMLU	Reasoning · Commonsense Reasoning	accuracy	89.8%	#17/64	2025-06-17	source ↗
14	GPQA Diamond	Reasoning · Multi-step Reasoning	accuracy	84.0%	#25/74	—	source ↗
15	SWE-Bench Verified	Computer Code · Code Generation	resolve-rate	63.8%	#25/39	—	source ↗
16	SWE-bench Verified	Agentic AI · SWE-bench	resolve-rate	63.2%	#53/81	—	source ↗

The Rank column shows this model's position among all models scored on the same benchmark and metric, with the number of competitors after the slash. #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
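The sort order described in this note can be sketched in a few lines of Python. This is only an illustration, not Codesota's implementation: the helper names `parse_rank` and `sort_key` are hypothetical, rows without a date are modelled as `None`, and placing undated rows after dated ones within the same rank is an assumption that happens to match the table.

```python
from datetime import date

def parse_rank(cell):
    """Split a Rank cell such as '#4/27' into (position, number after the slash)."""
    position, after_slash = cell.lstrip("#").split("/")
    return int(position), int(after_slash)

def sort_key(row):
    """Order by rank position first, then by newest result date."""
    _name, rank_cell, result_date = row
    position, _ = parse_rank(rank_cell)
    # Assumption: undated rows (None) sort after dated rows of the same rank.
    newest_first = -result_date.toordinal() if result_date else 0
    return (position, newest_first)

# A few rows taken from the table above: (benchmark, rank cell, date or None).
rows = [
    ("ARC-AGI-2", "#1/3", None),
    ("OCRBench v2 (overall-en-private)", "#4/27", date(2025, 3, 25)),
    ("MME-VideoOCR", "#1/6", None),
    ("ARC-AGI-1", "#4/5", None),
    ("OCRBench v2 (overall-zh-private)", "#1/5", date(2025, 3, 25)),
]

# sorted() is stable, so rows with identical keys keep their input order.
ordered = sorted(rows, key=sort_key)
for name, rank_cell, _ in ordered:
    print(rank_cell, name)
```

Because Python's sort is stable, ties between undated rows of equal rank fall back to input order; the page does not say how it breaks such ties.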

§ 03 · Strengths by area

Where Gemini 2.5 Pro actually performs.

Computer Vision benchmarks · avg rank #4.4 · 2 SOTA

Reasoning benchmarks · avg rank #8.2 · 1 SOTA
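The per-area figures can be reproduced from the Rank column of the benchmark table. The sketch below assumes that "avg rank" is the plain arithmetic mean of the rank positions in an area, rounded to one decimal, and that "SOTA" counts the #1 rows. The page does not state this, but the assumption matches both published figures. The function name `summarize` is hypothetical.

```python
from collections import defaultdict

# (area, rank position) for the Computer Vision and Reasoning rows of the table.
ranks = [
    ("Computer Vision", 1),   # OCRBench v2, overall-zh-private
    ("Computer Vision", 1),   # MME-VideoOCR
    ("Computer Vision", 2),   # ThaiOCRBench
    ("Computer Vision", 4),   # OCRBench v2, overall-en-private
    ("Computer Vision", 14),  # OmniDocBench
    ("Reasoning", 1),         # ARC-AGI-2
    ("Reasoning", 2),         # ARC-Challenge
    ("Reasoning", 3),         # AIME 2024
    ("Reasoning", 4),         # ARC-AGI-1
    ("Reasoning", 4),         # GSM8K
    ("Reasoning", 6),         # MATH
    ("Reasoning", 12),        # AIME 2025
    ("Reasoning", 17),        # MMLU
    ("Reasoning", 25),        # GPQA Diamond
]

def summarize(pairs):
    """Return the mean rank (one decimal) and the number of #1 rows per area."""
    by_area = defaultdict(list)
    for area, position in pairs:
        by_area[area].append(position)
    return {
        area: {
            "avg_rank": round(sum(positions) / len(positions), 1),
            "sota": sum(1 for p in positions if p == 1),
        }
        for area, positions in by_area.items()
    }

summary = summarize(ranks)
for area, stats in summary.items():
    print(f"{area}: avg rank #{stats['avg_rank']} · {stats['sota']} SOTA")
```

Under these assumptions the Computer Vision mean is 22 / 5 = 4.4 and the Reasoning mean is 74 / 9 ≈ 8.2, in line with the values shown above.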

§ 05 · Related models

Other Google models scored on Codesota.

Undisclosed params · 12 results · 1 SOTA

ViT-H/14

632M params · 2 results · 1 SOTA

CoCa (finetuned)

2.1B params · 1 result · 1 SOTA

Gemini 2.0 Flash

1 result · 1 SOTA

Noise2Music

Unknown params · 1 result · 1 SOTA

Gemini 3 Flash

Undisclosed params · 6 results

§ 06 · Sources & freshness

Where these numbers come from.

google-technical-report

results

alphaxiv-leaderboard

results

arcprize-leaderboard

result

artificialanalysis

result

AlphaXiv

result

google-blog

result

editorial

result

9 of 16 rows marked verified · first result 2025-03-25, latest 2025-06-17.