Codesota · Models · GLM-4.5 · Zhipu AI · 3 results · 3 benchmarks
Model card

GLM-4.5.

Zhipu AI · open-source
§ 02 · Benchmarks

Every benchmark GLM-4.5 has a recorded score for.

| #  | Benchmark          | Area · Task                       | Metric       | Value | Rank   | Date       |
| 01 | MMLU-Pro           | Reasoning · Commonsense Reasoning | accuracy     | 84.6% | #27/71 | 2025-08-08 |
| 02 | SWE-bench Verified | Agentic AI · SWE-bench            | resolve-rate | 64.2% | #52/81 | -          |
| 03 | HLE                | Reasoning · Multi-step Reasoning  | accuracy     | 8.3%  | #56/74 | -          |
The Rank column shows this model's position among all models scored on the same benchmark and metric; the number after the slash is the count of competitors. A #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
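As a rough illustration of how the rank strings above can be read, the sketch below parses a value such as `#27/71` into a position and a competitor count, then orders rows by rank and, for ties, by newest date. The `parse_rank` and `sort_rows` helpers and the row tuples are hypothetical and are not part of Codesota; treating rows without a date as oldest is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Rank(NamedTuple):
    position: int     # the model's place on the leaderboard
    competitors: int  # the number after the slash


def parse_rank(text: str) -> Rank:
    """Parse a rank string of the form '#27/71'."""
    match = re.fullmatch(r"#(\d+)/(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised rank string: {text!r}")
    return Rank(int(match.group(1)), int(match.group(2)))


# (benchmark, rank string, ISO date or None) copied from the table above.
rows: list[tuple[str, str, Optional[str]]] = [
    ("HLE", "#56/74", None),
    ("MMLU-Pro", "#27/71", "2025-08-08"),
    ("SWE-bench Verified", "#52/81", None),
]


def sort_rows(items):
    """Sort by rank position, breaking ties by newest date first."""
    # Two stable passes: newest date first (missing dates sort last),
    # then ascending rank position.
    by_date = sorted(items, key=lambda r: r[2] or "", reverse=True)
    return sorted(by_date, key=lambda r: parse_rank(r[1]).position)


for name, rank_text, date in sort_rows(rows):
    rank = parse_rank(rank_text)
    print(f"{name}: #{rank.position} of {rank.competitors} ({date or 'no date'})")
```

ISO dates compare correctly as plain strings, which is why no date parsing is needed for the tie-break.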
§ 03 · Strengths by area

Where GLM-4.5 actually performs.

Reasoning: 2 benchmarks · avg rank #41.5
Agentic AI: 1 benchmark · avg rank #52.0
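The per-area averages are consistent with a plain arithmetic mean of the rank positions in the benchmark table (27 and 56 for Reasoning, 52 for Agentic AI). The sketch below reproduces that arithmetic; the site does not state its method, so the use of a simple mean is an assumption, and the `average_rank_by_area` helper is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (area, rank position) pairs taken from the benchmark table.
results = [
    ("Reasoning", 27),   # MMLU-Pro
    ("Agentic AI", 52),  # SWE-bench Verified
    ("Reasoning", 56),   # HLE
]


def average_rank_by_area(pairs):
    """Group rank positions by area and take the arithmetic mean of each group."""
    by_area = defaultdict(list)
    for area, rank in pairs:
        by_area[area].append(rank)
    return {area: mean(ranks) for area, ranks in by_area.items()}


averages = average_rank_by_area(results)
for area, avg in averages.items():
    print(f"{area}: avg rank #{avg:.1f}")
```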
§ 05 · Related models

Other Zhipu AI models scored on Codesota.

GLM-5: 130B params · 8 results · 1 SOTA
GLM-4.5: 7 results
GLM-4.5-Air: 6 results
GLM-4.7: 4 results
GLM-4.5-Air: 3 results
GLM-OCR: 3 results
GLM-4.6: 1 result
GLM-4.7: 1 result
§ 06 · Sources & freshness

Where these numbers come from.

paperswithcode: 1 result
editorial: 1 result
scale-hle-official: 1 result
2 of 3 rows marked verified.