Codesota · Models · Kimi-K2 · Moonshot.AI · 8 results · 2 benchmarks
Model card

Kimi-K2.

Moonshot.AI · open-source
§ 01 · Benchmarks

Every benchmark Kimi-K2 has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | SWE-Bench Verified | Computer Code · Code Generation | resolve-rate | 65.8% | #24/39 | | source ↗ |
| 02 | PLCC | Natural Language Processing · Polish Cultural Competency | culture-and-tradition | 67.0% | #72/165 | | source ↗ |
| 03 | PLCC | Natural Language Processing · Polish Cultural Competency | history | 73.0% | #82/165 | | source ↗ |
| 04 | PLCC | Natural Language Processing · Polish Cultural Competency | average | 62.0% | #84/165 | | source ↗ |
| 05 | PLCC | Natural Language Processing · Polish Cultural Competency | vocabulary | 54.0% | #84/165 | | source ↗ |
| 06 | PLCC | Natural Language Processing · Polish Cultural Competency | grammar | 58.0% | #86/165 | | source ↗ |
| 07 | PLCC | Natural Language Processing · Polish Cultural Competency | art-and-entertainment | 50.0% | #88/165 | | source ↗ |
| 08 | PLCC | Natural Language Processing · Polish Cultural Competency | geography | 70.0% | #91/165 | | source ↗ |
The Rank column shows this model's position against all other models scored on the same benchmark and metric (competitors after the slash). #1 in red means current SOTA. Rows are sorted by rank, then by newest result.
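As a rough illustration of the rank notation, the sketch below parses cells such as `#24/39` and orders rows by position. The names `Rank`, `parse_rank`, and `rows` are hypothetical and not part of any Codesota API. The "newest result" tie-break is left out because no dates are recorded for these rows.

```python
import re
from typing import NamedTuple


class Rank(NamedTuple):
    position: int  # number after '#'
    field: int     # number after the slash


def parse_rank(cell: str) -> Rank:
    """Parse a rank cell of the form '#<position>/<field>'."""
    match = re.fullmatch(r"#(\d+)/(\d+)", cell.strip())
    if match is None:
        raise ValueError(f"unrecognised rank cell: {cell!r}")
    return Rank(int(match.group(1)), int(match.group(2)))


# A few (benchmark, metric, rank) rows copied from the table.
rows = [
    ("PLCC", "geography", "#91/165"),
    ("SWE-Bench Verified", "resolve-rate", "#24/39"),
    ("PLCC", "history", "#82/165"),
]

# Sort by rank position only; ties would need a date, which is not available here.
ordered = sorted(rows, key=lambda row: parse_rank(row[2]).position)
```

Sorting by position alone compares ranks across benchmarks with different numbers of scored models, so #24 of 39 and #72 of 165 are not directly comparable in relative terms.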
§ 02 · Strengths by area

Where Kimi-K2 actually performs.

Computer Code · 1 benchmark · avg rank #24.0
Natural Language Processing · 1 benchmark · avg rank #83.9
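The per-area averages appear consistent with a plain mean of the rank positions over every scored row in that area (seven PLCC metric rows for Natural Language Processing, one row for Computer Code). That reading is inferred from the numbers in the benchmark table, not from documented Codesota behaviour. A minimal sketch under that assumption, with the hypothetical helper `average_rank_by_area`:

```python
from collections import defaultdict
from statistics import mean

# (area, rank position) pairs copied from the benchmark table.
rank_rows = [
    ("Computer Code", 24),
    ("Natural Language Processing", 72),
    ("Natural Language Processing", 82),
    ("Natural Language Processing", 84),
    ("Natural Language Processing", 84),
    ("Natural Language Processing", 86),
    ("Natural Language Processing", 88),
    ("Natural Language Processing", 91),
]


def average_rank_by_area(rows):
    """Group rank positions by area and return the mean, rounded to one decimal."""
    by_area = defaultdict(list)
    for area, position in rows:
        by_area[area].append(position)
    return {area: round(float(mean(ps)), 1) for area, ps in by_area.items()}


averages = average_rank_by_area(rank_rows)
```

Under this assumption the seven PLCC positions sum to 587, and 587 / 7 rounds to 83.9, matching the figure shown for Natural Language Processing.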
§ 04 · Related models

Other Moonshot.AI models scored on Codesota.

Kimi-K2-0905 · 0 results
Kimi-K2.5 · 0 results
§ 05 · Sources & freshness

Where these numbers come from.

sdadas/PLCC · 7 results
kimi-techreport · 1 result
7 of 8 rows marked verified.