Gemini Ultra.

Google DeepMind · proprietary · Unknown params · Transformer (decoder-only)

Largest Gemini 1.0 model. Released December 2023.

§ 02 · Benchmarks

Every benchmark Gemini Ultra has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	SNLI	Natural Language Processing · Natural Language Inference	accuracy	91.9%	#3/8	2023-12-19	source ↗
02	SuperGLUE	Natural Language Processing · Text classification	average-score	90.0%	#4/7	2023-12-19	source ↗
03	GSM8K	Reasoning · Mathematical Reasoning	accuracy	94.4%	#28/48	2024-02-01	source ↗

The Rank column shows this model's position among all models scored on the same benchmark and metric; the number after the slash is the competitor count. A #1 shown in red marks the current SOTA. Rows are sorted by rank, then by newest result.
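As a rough illustration of the ordering rule above (rank first, then newest result), here is a minimal Python sketch. The tuple layout and the `sort_rows` helper are assumptions made for this example and are not taken from Codesota; the values are copied from the table.

```python
from datetime import date

# Rows copied from the table above, in shuffled order:
# (benchmark, rank, number after the slash, result date).
rows = [
    ("GSM8K", 28, 48, date(2024, 2, 1)),
    ("SuperGLUE", 4, 7, date(2023, 12, 19)),
    ("SNLI", 3, 8, date(2023, 12, 19)),
]

def sort_rows(rows):
    """Hypothetical helper: best rank first, ties broken by most recent date."""
    # Negating the date ordinal makes newer dates sort before older ones.
    return sorted(rows, key=lambda r: (r[1], -r[3].toordinal()))

for i, (name, rank, slash_value, when) in enumerate(sort_rows(rows), start=1):
    print(f"{i:02d} {name} #{rank}/{slash_value} {when.isoformat()}")
```

Under this rule the three rows come out as SNLI, SuperGLUE, GSM8K, which matches the order shown in the table.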

§ 03 · Strengths by area

Where Gemini Ultra actually performs.

Natural Language Processing

benchmark

avg rank #3.0


§ 04 · Papers

1 paper with results for Gemini Ultra.

2023-12-19 · Natural Language Processing · 2 results
Gemini: A Family of Highly Capable Multimodal Models

§ 05 · Related models

Other Google DeepMind models scored on Codesota.

Gemma 3 12B IT

12B params · 2 results

Gemma 3 4B IT

4B params · 2 results

400M params · 1 result

BBF (Bigger, Better, Faster)

Unknown params · 0 results

§ 06 · Sources & freshness

Where these numbers come from.

arxiv

results

gsm8k-shadow-page

result

2 of 3 rows marked verified. · first result 2023-12-19, latest 2024-02-01.
