GPT-5.

OpenAI · API

§ 01 · Benchmarks

Every benchmark GPT-5 has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	GSM8K	Reasoning · Mathematical Reasoning	accuracy	99.2%	#2/32	2025-08-01	source ↗
02	HLE	Reasoning · Multi-step Reasoning	accuracy	25.3%	#2/13	—	unverified
03	LiveCodeBench Pro	Computer Code · Code Generation	elo	2176.00	#2/9	—	source ↗
04	LiveCodeBench	Computer Code · Code Generation	pass@1	85.0%	#3/30	—	source ↗
05	HumanEval	Computer Code · Code Generation	pass@1	95.1%	#4/42	2025-12-01	source ↗
06	GPQA	Reasoning · Multi-step Reasoning	accuracy	89.0%	#5/33	—	source ↗
07	MMLU	Reasoning · Commonsense Reasoning	accuracy	90.8%	#8/41	2025-09-01	source ↗
08	MMLU-Pro	Reasoning · Commonsense Reasoning	accuracy	87.1%	#11/20	2026-04-20	source ↗
09	SWE-bench Verified	Computer Code · Code Generation	resolve-rate	74.9%	#13/39	—	source ↗
10	SWE-bench Verified	Agentic AI · SWE-bench	resolve-rate	74.9%	#20/81	—	source ↗

The Rank column shows this model’s position among all models scored on the same benchmark and metric; the number after the slash is the number of competitors. A #1 rank (shown in red) marks the current SOTA. Rows are sorted by rank, then by newest result.
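The ordering described in the note above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the `Row` type and the `parse_rank` and `sort_rows` helpers are hypothetical names, and placing undated rows after dated ones within the same rank is an assumption (it matches the table, where GSM8K precedes the undated #2 rows).

```python
from datetime import date
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One table row, reduced to the fields the ordering needs (hypothetical type)."""
    benchmark: str
    rank: int                     # number before the slash in the Rank cell
    scored: int                   # number after the slash in the Rank cell
    result_date: Optional[date]   # None where the Date cell shows a dash


def parse_rank(cell: str) -> tuple[int, int]:
    """Split a Rank cell such as '#2/32' into (2, 32)."""
    position, total = cell.lstrip("#").split("/")
    return int(position), int(total)


def sort_rows(rows: list[Row]) -> list[Row]:
    """Sort by rank ascending, then newest result first.

    Assumption: rows without a date come after dated rows of the same rank.
    """
    def key(row: Row) -> tuple[int, bool, int]:
        dated = row.result_date is not None
        ordinal = row.result_date.toordinal() if dated else 0
        return (row.rank, not dated, -ordinal)

    return sorted(rows, key=key)


rows = [
    Row("HLE", *parse_rank("#2/13"), None),
    Row("MMLU", *parse_rank("#8/41"), date(2025, 9, 1)),
    Row("GSM8K", *parse_rank("#2/32"), date(2025, 8, 1)),
]
for row in sort_rows(rows):
    print(row.benchmark, f"#{row.rank}/{row.scored}", row.result_date)
```

Because `sorted` is stable, rows that tie on both rank and date keep their input order.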

§ 02 · Strengths by area

Where GPT-5 actually performs.

§ 04 · Related models

Other OpenAI models scored on Codesota.

GPT-4o

Undisclosed params · 35 results · 9 SOTA

Undisclosed params · 8 results

GPT-4.1

7 results

§ 05 · Sources & freshness

Where these numbers come from.

editorial · results
gsm8k-shadow-page-timeline · result
livecodebench-pro-official · result
artificial-analysis · result
shadow-page-humaneval · result
openai-gpt-5-launch · result
codesota-shadow-mmlu · result
pricepertoken · result
openai-blog · result

3 of 10 rows marked verified · first result 2025-08-01, latest 2026-04-20.