Codesota · Models · GLM-5 · Zhipu AI · 14 results · 4 benchmarks
Model card

GLM-5.

Zhipu AI · open-source · 130B params
§ 01 · Benchmarks

Every benchmark GLM-5 has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date |
|---|---|---|---|---|---|---|
| 01 | React Native Evals | Mobile Development · React Native Code Generation | animation-satisfaction | 66.0% | #4/10 | |
| 02 | SWE-Bench | Computer Code · Code Generation | resolve-rate-agentic | 77.8% | #7/25 | 2026-01-01 |
| 03 | React Native Evals | Mobile Development · React Native Code Generation | requirement-satisfaction | 74.2% | #8/10 | |
| 04 | React Native Evals | Mobile Development · React Native Code Generation | navigation-satisfaction | 86.7% | #8/10 | |
| 05 | SWE-Bench | Computer Code · Code Generation | resolve-rate | 77.8% | #9/32 | 2026-01-01 |
| 06 | React Native Evals | Mobile Development · React Native Code Generation | async-state-satisfaction | 73.8% | #9/10 | |
| 07 | SWE-bench Verified | Agentic AI · SWE-bench | resolve-rate | 77.8% | #11/81 | |
| 08 | PLCC | Natural Language Processing · Polish Cultural Competency | grammar | 82.0% | #16/165 | |
| 09 | PLCC | Natural Language Processing · Polish Cultural Competency | geography | 91.0% | #21/165 | |
| 10 | PLCC | Natural Language Processing · Polish Cultural Competency | history | 88.0% | #28/165 | |
| 11 | PLCC | Natural Language Processing · Polish Cultural Competency | average | 80.0% | #33/165 | |
| 12 | PLCC | Natural Language Processing · Polish Cultural Competency | culture-and-tradition | 81.0% | #37/165 | |
| 13 | PLCC | Natural Language Processing · Polish Cultural Competency | vocabulary | 72.0% | #39/165 | |
| 14 | PLCC | Natural Language Processing · Polish Cultural Competency | art-and-entertainment | 66.0% | #47/165 | |
The Rank column shows this model’s position among all models scored on the same benchmark and metric; the number after the slash is the count of competitors. A #1 rank, shown in red, marks the current SOTA. Rows are sorted by rank, then by newest result.
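The rank notation and the sort order described in this note can be read mechanically. The following is a minimal sketch, assuming the `#position/count` format seen in the Rank column; `parse_rank` and `sort_key` are hypothetical helpers and not part of any Codesota API. Placing undated rows after dated ones within a tie is also an assumption, chosen because it matches the order of rows 05 and 06.

```python
import re
from datetime import date
from typing import NamedTuple, Optional, Tuple


class Rank(NamedTuple):
    position: int     # this model's place on the benchmark + metric
    competitors: int  # the number after the slash


def parse_rank(text: str) -> Rank:
    """Parse a rank string such as '#7/25' (hypothetical helper)."""
    match = re.fullmatch(r"#(\d+)/(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised rank format: {text!r}")
    return Rank(int(match.group(1)), int(match.group(2)))


def sort_key(row: Tuple[str, Optional[date]]) -> Tuple[int, int]:
    """Order by rank position, then newest result first.

    Assumption: a row without a date sorts after dated rows in a tie.
    """
    rank_text, day = row
    position = parse_rank(rank_text).position
    newest_first = -day.toordinal() if day is not None else 0
    return (position, newest_first)


# A few (rank, date) pairs taken from the table, deliberately shuffled.
rows = [
    ("#9/10", None),
    ("#7/25", date(2026, 1, 1)),
    ("#9/32", date(2026, 1, 1)),
    ("#4/10", None),
]
ordered = [rank for rank, _ in sorted(rows, key=sort_key)]
```

With this key, `ordered` comes out in the same relative order as the corresponding rows of the table.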
§ 02 · Strengths by area

Where GLM-5 actually performs.

Mobile Development · 1 benchmark · avg rank #7.3
Computer Code · 1 benchmark · avg rank #8.0
Agentic AI · 1 benchmark · avg rank #11.0
Natural Language Processing · 1 benchmark · avg rank #31.6
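The per-area averages are consistent with taking the mean of the rank positions listed in § 01, grouped by area. The sketch below works under that assumption; how the site actually aggregates is not stated on this page. The rank positions are copied from the § 01 table. The half-up rounding is also an assumption, inferred from Mobile Development's mean of 7.25 being displayed as #7.3; `average_rank_by_area` is a hypothetical name.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, List, Tuple

# (area, rank position) for each of the 14 rows in the benchmarks table.
RANKS: List[Tuple[str, int]] = [
    ("Mobile Development", 4),
    ("Computer Code", 7),
    ("Mobile Development", 8),
    ("Mobile Development", 8),
    ("Computer Code", 9),
    ("Mobile Development", 9),
    ("Agentic AI", 11),
    ("Natural Language Processing", 16),
    ("Natural Language Processing", 21),
    ("Natural Language Processing", 28),
    ("Natural Language Processing", 33),
    ("Natural Language Processing", 37),
    ("Natural Language Processing", 39),
    ("Natural Language Processing", 47),
]


def average_rank_by_area(rows: List[Tuple[str, int]]) -> Dict[str, str]:
    """Mean rank position per area, rounded half-up to one decimal place."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for area, position in rows:
        grouped[area].append(position)

    result: Dict[str, str] = {}
    for area, positions in grouped.items():
        mean = Decimal(sum(positions)) / Decimal(len(positions))
        # Assumed rounding mode: half-up, so 7.25 becomes 7.3.
        result[area] = str(mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))
    return result


averages = average_rank_by_area(RANKS)
```

Under these assumptions the computed values agree with the four averages shown above.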
§ 03 · Papers

1 paper with results for GLM-5.

  1. 2023-10-10 · Computer Code · 1 result

    SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

    Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao et al.
§ 04 · Related models

Other Zhipu AI models scored on Codesota.

GLM-4.5 · 2 results
GLM-4.5-Air · 1 result
GLM-4.6 · 1 result
GLM-4.7 · 1 result
GLM-4.7-Flash · 1 result
GLM-OCR · 1 result
GLM-4.5 · 0 results
GLM-4.5-Air · 0 results
§ 05 · Sources & freshness

Where these numbers come from.

sdadas/PLCC · 7 results
Callstack Incubator · 4 results
zhipu-agent · 1 result
swebench-leaderboard · 1 result
editorial · 1 result
14 of 14 rows marked verified.