Polish Cultural Competency · benchmark dataset · 2025 · PL

Polish Linguistic and Cultural Competency Benchmark.

Evaluates LLMs on Polish linguistic and cultural knowledge across six categories: art & entertainment, culture & tradition, geography, grammar, history, and vocabulary. Each category is scored as accuracy on a 0-100 scale. Created by Dadas et al. (2025).
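The leaderboard also lists an "average" metric alongside the six categories. A minimal sketch of how such an aggregate could be computed, assuming it is the unweighted mean of the six category accuracies rounded to two decimals (the page does not state the formula; the fractional parts of the listed averages, such as .17, .33, .50, .67 and .83, are consistent with that reading). The function name and the example scores are hypothetical and do not correspond to any model on the board.

```python
# Assumption: "average" = unweighted mean of the six category accuracies.
CATEGORIES = (
    "art-and-entertainment",
    "culture-and-tradition",
    "geography",
    "grammar",
    "history",
    "vocabulary",
)


def average_score(per_category: dict[str, float]) -> float:
    """Return the mean accuracy over all six categories, rounded to 2 decimals."""
    missing = [c for c in CATEGORIES if c not in per_category]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    total = sum(per_category[c] for c in CATEGORIES)
    return round(total / len(CATEGORIES), 2)


# Hypothetical scores, for illustration only.
example = {
    "art-and-entertainment": 80,
    "culture-and-tradition": 70,
    "geography": 90,
    "grammar": 60,
    "history": 75,
    "vocabulary": 66,
}
print(average_score(example))
```

Raising on a missing category, rather than averaging over whatever is present, keeps a partially evaluated model from being ranked against fully evaluated ones.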

§ 01 · Leaderboard

Best published scores.

1155 results indexed across 7 metrics. Shaded row marks current SOTA; ties broken by submission date.
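The ordering rule stated above (higher score first, ties broken by submission date) can be sketched as a two-key sort. This is an illustration under assumptions: the page does not say which direction the date tie-break runs, so the sketch assumes the earlier submission ranks first, and the `Result` type, `rank` function, and model names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Result:
    model: str
    score: float
    submitted: date


def rank(results: list[Result]) -> list[Result]:
    """Sort by score descending; on equal scores, earlier submission first (assumed)."""
    return sorted(results, key=lambda r: (-r.score, r.submitted))


# Hypothetical entries, for illustration only.
results = [
    Result("model-a", 88.0, date(2026, 4, 10)),
    Result("model-b", 91.5, date(2026, 4, 12)),
    Result("model-c", 88.0, date(2026, 4, 2)),
]
ranked = rank(results)
for position, r in enumerate(ranked, start=1):
    print(f"{position:02d} {r.model} {r.score}")
```

Because every row on this page shows the same month-level date (Apr 2026), the tie-break would need day-level submission dates to have any effect.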


Primary metric: average (higher is better)
All metrics: art-and-entertainment, average, culture-and-tradition, geography, grammar, history, vocabulary
art-and-entertainment
165 rows
# | Model | Org | Submitted | Paper / code | art-and-entertainment
01 | Gemini-3.0-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 95
02 | Gemini-3.1-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 95
03 | Gemini-3-Flash-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 91
04 | GPT-5.4-2026-03-05 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 91
05 | Gemini-2.5-Pro-Preview-06-05 [OSS] | Google | Apr 2026 | sdadas/PLCC | 91
06 | GPT-4.5-preview-2025-02-27 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 90
07 | GPT-5-Pro-2025-10-06 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 88
08 | Gemini-2.5-Pro-Exp-03-25 [OSS] | Google | Apr 2026 | sdadas/PLCC | 88
09 | GPT-5.4-2026-03-05 (low reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 87
10 | Grok 4 [API] | xAI | Apr 2026 | sdadas/PLCC | 86
11 | O1-2024-12-17 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 86
12 | GPT-5-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 85
13 | GPT-5.1-2025-11-13 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 85
14 | GPT-4o-2024-05-13 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 83
15 | Gemini-Exp-1206 [OSS] | Google | Apr 2026 | sdadas/PLCC | 83
16 | O3-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 83
17 | GPT-4o-2024-11-20 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 82
18 | GPT-4o-2024-08-06 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 82
19 | Claude-3.7-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 80
20 | GPT-5.2-2025-12-11 (xhigh reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 79
21 | GPT-5.4-2026-03-05 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 79
22 | GPT-5.2-2025-12-11 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 78
23 | Gemini-2.5-Flash-Preview-04-17 [OSS] | Google | Apr 2026 | sdadas/PLCC | 78
24 | GPT-4.1-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 77
25 | Claude-3.7-Sonnet-Thinking [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 77
26 | Claude-3.5-Sonnet-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 77
27 | GPT-5.4-mini-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 76
28 | Claude-Opus-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 75
29 | GPT-5.2-2025-12-11 (medium reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 74
30 | Claude-Opus-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 74
31 | Claude-3.5-Sonnet-20240620 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 73
32 | Claude 3 Opus [API] | Anthropic | Apr 2026 | sdadas/PLCC | 73
33 | Claude Opus 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 72
34 | PLLuM-8x7B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 72
35 | GPT-5.1-2025-11-13 (default reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 72
36 | Gemini-2.0-Flash-Thinking-Exp-01-21 [OSS] | Google | Apr 2026 | sdadas/PLCC | 72
37 | PLLuM-12B-nc-chat-250715 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 72
38 | DeepSeek-V3.2-Speciale [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 71
39 | Grok-3-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 71
40 | GPT-5.2-2025-12-11 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 70
41 | Kimi-K2.5 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 69
42 | Bielik-11B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 69
43 | DeepSeek-v3.1 (thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 69
44 | Gemini-2.0-Flash-Experimental [OSS] | Google | Apr 2026 | sdadas/PLCC | 68
45 | Claude-Sonnet-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 67
46 | Claude-Opus-4.1 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 67
47 | DeepSeek R1 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 66
48 | GLM-5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 66
49 | DeepSeek-R1-0528 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 65
50 | Llama-3.1-Tulu-3-405B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 64
51 | DeepSeek-v3-0324 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 64
52 | MiMo-V2-Pro [OSS] | Xiaomi | Apr 2026 | sdadas/PLCC | 64
53 | GLM-4.7 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 64
54 | DeepSeek-v3.1 (no thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 63
55 | Mistral-Large-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 63
56 | Kimi K2-Thinking-0905 [OSS] | Moonshot AI | Apr 2026 | sdadas/PLCC | 63
57 | Qwen3.5-397B-A17B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 63
58 | O4-Mini-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 62
59 | Gemini-Pro-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 62
60 | GPT-5-mini-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 62
61 | Claude-Sonnet-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 61
62 | DeepSeek-V3.2 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 61
63 | GPT-5.4-mini-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 61
64 | Grok-3-Mini-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 61
65 | DeepSeek-V3 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 61
66 | Bielik-2.6 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 61
67 | GPT-4 Turbo [API] | OpenAI | Apr 2026 | sdadas/PLCC | 61
68 | PLLuM-12B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 59
69 | DeepSeek-v3.2-Exp [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 59
70 | Grok-4-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 59
71 | GLM-4.6 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 59
72 | Bielik-2.3 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 58
73 | Grok-2-1212 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 57
74 | Mistral-Medium-3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 56
75 | Llama-3.1-405b [OSS] | Meta | Apr 2026 | sdadas/PLCC | 56
76 | GLM-4.5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 56
77 | Bielik-2.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 55
78 | Claude Sonnet 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 55
79 | Grok-4.20 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 55
80 | Llama-PLLuM-70B-chat-250801 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 54
81 | Grok-4.1-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 54
82 | Bielik-2.2 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 54
83 | Kimi-K2-0905 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 54
84 | Qwen3.5-122B-A10B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 53
85 | Mistral-Small-4 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 53
86 | Bielik-2.5 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 52
87 | GPT-4.1-mini-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 51
88 | Qwen3 Max [OSS] | Alibaba Cloud | Apr 2026 | sdadas/PLCC | 50
89 | Kimi-K2 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 50
90 | GPT-5.4-nano-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 50
91 | GPT-4 | OpenAI | Apr 2026 | sdadas/PLCC | 49
92 | Llama-PLLuM-70B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 49
93 | Mistral-Large-2407 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 48
94 | PLLuM-12B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 48
95 | GLM-4.5-Air [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 48
96 | GPT-5-nano-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 47
97 | Llama-4-Maverick [OSS] | Meta | Apr 2026 | sdadas/PLCC | 46
98 | O3-mini-2025-01-31 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 46
99 | Claude-3.0-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 46
100 | WizardLM-2-8x22b [OSS] | Microsoft | Apr 2026 | sdadas/PLCC | 45
101 | PLLuM-8x7B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 45
102 | Mixtral-8x22b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 45
103 | Command-A-03-2025 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 44
104 | Qwen3.5-35B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 44
105 | Command-R-Plus-08-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 44
106 | Gemma-3-27b | Google | Apr 2026 | sdadas/PLCC | 43
107 | Qwen3-Next-80B-A3B-Thinking [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 43
108 | Llama-3.3-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 43
109 | Bielik-0.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 43
110 | MiniMax-M2.7 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 43
111 | Qwen-Max [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 43
112 | Claude-3.5-Haiku-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 43
113 | GPT-OSS-120b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 42
114 | Llama-3.1-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 42
115 | GPT-4o-mini-2024-07-18 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 42
116 | Llama-3.0-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 40
117 | Command-R-Plus-04-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 39
118 | GPT-3.5-turbo [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 39
119 | Bielik-Minitron-7B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 39
120 | Mistral-Large-2411 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 39
121 | MiniMax-M2.5 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 39
122 | Mistral-Small-3.2-24B-2506 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 38
123 | O1-mini-2024-09-12 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 38
124 | Qwen3.5-27B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 37
125 | Qwen3-235B-A22B | Alibaba | Apr 2026 | sdadas/PLCC | 37
126 | Claude-Haiku-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 36
127 | Mistral-Small-3.1-24B-2503 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 35
128 | Qwen3-Next-80B-A3B-Instruct [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 34
129 | Mistral-Small-24B-2501 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 33
130 | Llama-PLLuM-8B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 33
131 | Gemini-Flash-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 33
132 | Gemma-2-27b [OSS] | Google | Apr 2026 | sdadas/PLCC | 32
133 | Mixtral-8x7b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 31
134 | GLM-4.7-Flash [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 31
135 | GPT-4.1-nano-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 30
136 | Magistral-Small-2506 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 30
137 | EuroLLM-9B [OSS] | UTTER | Apr 2026 | sdadas/PLCC | 30
138 | Bielik-4.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 28
139 | Bielik-1.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 27
140 | Qwen-Plus [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 26
141 | GPT-5.4-nano-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 26
142 | Qwen-2.5-72b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 25
143 | Ministral-14b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 25
144 | Llama-4-Scout [OSS] | Meta | Apr 2026 | sdadas/PLCC | 23
145 | Phi-4 | Microsoft | Apr 2026 | sdadas/PLCC | 23
146 | Qwen3.5-9B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 22
147 | Mistral-7b-v0.3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 22
148 | Qwen-2.5-14b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 21
149 | Qwen3-32B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 21
150 | Mistral-Nemo [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 20
151 | Ministral-8b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 20
152 | Qwen3-30B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 19
153 | Llama-3.1-8B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 19
154 | Gemma-2-9b [OSS] | Google | Apr 2026 | sdadas/PLCC | 19
155 | GPT-OSS-20b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 19
156 | Qwen-2.5-32b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 17
157 | Qwen-Turbo-2024-11-01 [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 15
158 | Command-R7B [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 14
159 | Qwen3-14B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 14
160 | Ministral-8b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 14
161 | Qwen3-8B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 12
162 | Qwen3.5-4B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 12
163 | Ministral-3b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 11
164 | Qwen3.5-2B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 5.00
165 | Qwen-2.5-7b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 5.00
average · primary
165 rows
# | Model | Org | Submitted | Paper / code | average
01 | Gemini-3.1-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 97
02 | Gemini-3.0-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 95.83
03 | GPT-5.4-2026-03-05 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 92.17
04 | Gemini-2.5-Pro-Preview-06-05 [OSS] | Google | Apr 2026 | sdadas/PLCC | 92.17
05 | Gemini-3-Flash-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 91.67
06 | GPT-5-Pro-2025-10-06 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 91
07 | GPT-5.4-2026-03-05 (low reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 90.50
08 | Grok 4 [API] | xAI | Apr 2026 | sdadas/PLCC | 90.50
09 | GPT-5-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89.50
10 | Gemini-2.5-Pro-Exp-03-25 [OSS] | Google | Apr 2026 | sdadas/PLCC | 89.50
11 | GPT-5.2-2025-12-11 (xhigh reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89.33
12 | O3-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89.17
13 | O1-2024-12-17 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89.17
14 | GPT-5.1-2025-11-13 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 88.83
15 | GPT-5.2-2025-12-11 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 87.17
16 | GPT-4.5-preview-2025-02-27 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 86.50
17 | GPT-5.4-mini-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 85.17
18 | GPT-5.2-2025-12-11 (medium reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 85
19 | GPT-5.4-2026-03-05 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 84.33
20 | Gemini-2.5-Flash-Preview-04-17 [OSS] | Google | Apr 2026 | sdadas/PLCC | 83.50
21 | Gemini-Exp-1206 [OSS] | Google | Apr 2026 | sdadas/PLCC | 83
22 | Claude-3.5-Sonnet-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 82.67
23 | GPT-4o-2024-05-13 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 82.33
24 | Claude-3.7-Sonnet-Thinking [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 82.17
25 | Claude-Opus-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 81.83
26 | Claude-3.7-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 81.50
27 | GPT-4o-2024-08-06 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 81.33
28 | GPT-4o-2024-11-20 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 81.33
29 | DeepSeek-V3.2-Speciale [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 81
30 | Claude-3.5-Sonnet-20240620 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 80.67
31 | GPT-4.1-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 80.33
32 | Claude-Opus-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 80.33
33 | GLM-5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 80
34 | Claude-Opus-4.1 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 79
35 | GPT-5.2-2025-12-11 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 78.83
36 | DeepSeek-v3.1 (thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 78.67
37 | Claude Opus 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 78.67
38 | MiMo-V2-Pro [OSS] | Xiaomi | Apr 2026 | sdadas/PLCC | 78.50
39 | Kimi-K2.5 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 77.83
40 | GPT-5.1-2025-11-13 (default reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 77.83
41 | Claude-Sonnet-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 77.67
42 | GPT-5-mini-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 77.50
43 | Grok-3-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 77.17
44 | DeepSeek-R1-0528 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 76.17
45 | DeepSeek R1 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 76
46 | Qwen3.5-397B-A17B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 75
47 | Gemini-2.0-Flash-Thinking-Exp-01-21 [OSS] | Google | Apr 2026 | sdadas/PLCC | 74.83
48 | Gemini-2.0-Flash-Experimental [OSS] | Google | Apr 2026 | sdadas/PLCC | 74.17
49 | Claude 3 Opus [API] | Anthropic | Apr 2026 | sdadas/PLCC | 73.83
50 | GLM-4.7 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 73.50
51 | GPT-5.4-mini-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 73
52 | O4-Mini-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 72.83
53 | Grok-4.1-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 72.33
54 | DeepSeek-V3.2 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 71.67
55 | Kimi K2-Thinking-0905 [OSS] | Moonshot AI | Apr 2026 | sdadas/PLCC | 71.67
56 | Grok-3-Mini-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 71.33
57 | DeepSeek-v3-0324 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 71
58 | Claude-Sonnet-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 71
59 | DeepSeek-v3.1 (no thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 71
60 | Bielik-11B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 70.67
61 | Mistral-Large-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 70.67
62 | GLM-4.6 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 70.67
63 | Grok-4-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 70.17
64 | DeepSeek-v3.2-Exp [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 70
65 | PLLuM-12B-nc-chat-250715 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 69.67
66 | Gemini-Pro-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 69.67
67 | DeepSeek-V3 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 69.17
68 | Qwen3.5-122B-A10B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 68.33
69 | Claude Sonnet 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 68.17
70 | PLLuM-8x7B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 68.17
71 | Grok-4.20 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 67.83
72 | GPT-4 Turbo [API] | OpenAI | Apr 2026 | sdadas/PLCC | 67
73 | Mistral-Medium-3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 66.83
74 | GLM-4.5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 66.50
75 | Grok-2-1212 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 66
76 | GPT-5.4-nano-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 65.83
77 | Bielik-2.6 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 65.50
78 | Llama-3.1-Tulu-3-405B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 63.83
79 | MiniMax-M2.7 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 63.33
80 | Bielik-2.2 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 63
81 | GPT-5-nano-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 62.50
82 | GPT-4.1-mini-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 62.17
83 | Bielik-2.3 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 62.17
84 | Kimi-K2 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 62
85 | Bielik-2.5 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 62
86 | Qwen3 Max [OSS] | Alibaba Cloud | Apr 2026 | sdadas/PLCC | 61.33
87 | Kimi-K2-0905 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 61
88 | Bielik-2.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 61
89 | Llama-3.1-405b [OSS] | Meta | Apr 2026 | sdadas/PLCC | 60
90 | MiniMax-M2.5 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 59.67
91 | GPT-4 | OpenAI | Apr 2026 | sdadas/PLCC | 59.50
92 | PLLuM-12B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 59.50
93 | O3-mini-2025-01-31 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 59.33
94 | Llama-PLLuM-70B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 58.50
95 | Llama-4-Maverick [OSS] | Meta | Apr 2026 | sdadas/PLCC | 58.17
96 | Llama-PLLuM-70B-chat-250801 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 58
97 | Claude-3.5-Haiku-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 57.83
98 | Qwen3.5-35B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 57
99 | GPT-4o-mini-2024-07-18 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 56.83
100 | Claude-3.0-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 56.50
101 | Mistral-Small-4 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 56.33
102 | Command-A-03-2025 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 56.17
103 | Qwen3-235B-A22B | Alibaba | Apr 2026 | sdadas/PLCC | 55
104 | GLM-4.5-Air [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 54.67
105 | Qwen3-Next-80B-A3B-Thinking [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 54.33
106 | GPT-OSS-120b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 54.33
107 | Qwen3.5-27B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 54.33
108 | Mistral-Large-2407 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 54.17
109 | PLLuM-8x7B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 54.17
110 | Bielik-Minitron-7B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 53
111 | Mistral-Large-2411 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 52
112 | O1-mini-2024-09-12 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 51.67
113 | WizardLM-2-8x22b [OSS] | Microsoft | Apr 2026 | sdadas/PLCC | 51.50
114 | Qwen-Max [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 50.83
115 | Claude-Haiku-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 50.67
116 | Command-R-Plus-08-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 50.17
117 | Mixtral-8x22b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 49.83
118 | Command-R-Plus-04-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 49.33
119 | Llama-3.3-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 48.83
120 | Llama-3.1-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 47.83
121 | Gemma-3-27b | Google | Apr 2026 | sdadas/PLCC | 47.33
122 | PLLuM-12B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 47
123 | Bielik-0.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 46.67
124 | Gemini-Flash-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 46.50
125 | Mistral-Small-3.2-24B-2506 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 46.17
126 | GPT-5.4-nano-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 44.17
127 | GPT-4.1-nano-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 43.67
128 | GPT-3.5-turbo [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 43.33
129 | Mistral-Small-3.1-24B-2503 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 43.33
130 | Qwen3-Next-80B-A3B-Instruct [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 43
131 | Llama-3.0-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 43
132 | Gemma-2-27b [OSS] | Google | Apr 2026 | sdadas/PLCC | 42.67
133 | GLM-4.7-Flash [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 42.33
134 | Bielik-4.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 42.33
135 | Llama-4-Scout [OSS] | Meta | Apr 2026 | sdadas/PLCC | 41.50
136 | EuroLLM-9B [OSS] | UTTER | Apr 2026 | sdadas/PLCC | 41
137 | Qwen3.5-9B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 40.33
138 | Magistral-Small-2506 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 39.33
139 | Qwen-2.5-72b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 39.17
140 | Ministral-14b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 39
141 | Mistral-Small-24B-2501 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 39
142 | Llama-PLLuM-8B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 38.50
143 | Qwen-Plus [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 38.50
144 | Qwen3-32B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 37.67
145 | Mixtral-8x7b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 35.33
146 | Ministral-8b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 35.17
147 | Qwen3-30B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 33
148 | GPT-OSS-20b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 32.33
149 | Qwen-2.5-32b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 30.50
150 | Qwen3-14B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 30.33
151 | Qwen3.5-4B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 29.67
152 | Phi-4 | Microsoft | Apr 2026 | sdadas/PLCC | 29.17
153 | Gemma-2-9b [OSS] | Google | Apr 2026 | sdadas/PLCC | 29.17
154 | Qwen-Turbo-2024-11-01 [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 28.50
155 | Bielik-1.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 27.50
156 | Qwen-2.5-14b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 26.67
157 | Qwen3-8B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 26
158 | Mistral-Nemo [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 23
159 | Command-R7B [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 22.83
160 | Llama-3.1-8B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 22.67
161 | Ministral-3b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 22.33
162 | Mistral-7b-v0.3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 21.83
163 | Ministral-8b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 20.67
164 | Qwen-2.5-7b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 17.67
165 | Qwen3.5-2B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 13.83
culture-and-tradition
165 rows
# | Model | Org | Submitted | Paper / code | culture-and-tradition
01 | Gemini-3.1-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 100
02 | Gemini-3.0-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 99
03 | Gemini-3-Flash-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 98
04 | Gemini-2.5-Pro-Preview-06-05 [OSS] | Google | Apr 2026 | sdadas/PLCC | 96
05 | Grok 4 [API] | xAI | Apr 2026 | sdadas/PLCC | 95
06 | GPT-5-Pro-2025-10-06 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 94
07 | GPT-5.4-2026-03-05 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 93
08 | GPT-5.2-2025-12-11 (xhigh reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 93
09 | GPT-5.4-2026-03-05 (low reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 93
10 | O1-2024-12-17 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 92
11 | GPT-4o-2024-05-13 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 92
12 | GPT-4.5-preview-2025-02-27 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 92
13 | O3-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 91
14 | Gemini-2.5-Pro-Exp-03-25 [OSS] | Google | Apr 2026 | sdadas/PLCC | 91
15 | GPT-5.1-2025-11-13 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 90
16 | Grok-3-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 90
17 | Gemini-Exp-1206 [OSS] | Google | Apr 2026 | sdadas/PLCC | 90
18 | GPT-4o-2024-08-06 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89
19 | GPT-5-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89
20 | GPT-4o-2024-11-20 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89
21 | GPT-5.4-2026-03-05 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 88
22 | Claude-3.5-Sonnet-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 87
23 | GPT-5.2-2025-12-11 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 87
24 | Claude-Opus-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 86
25 | GPT-5.2-2025-12-11 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 86
26 | Gemini-2.5-Flash-Preview-04-17 [OSS] | Google | Apr 2026 | sdadas/PLCC | 85
27 | Claude-3.5-Sonnet-20240620 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 85
28 | GPT-4.1-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 84
29 | GPT-5.2-2025-12-11 (medium reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 84
30 | Claude-3.7-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 83
31 | GPT-5.4-mini-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 83
32 | Claude-Opus-4.1 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 83
33 | GPT-5.1-2025-11-13 (default reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 82
34 | Claude-Opus-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 82
35 | Claude-Sonnet-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 82
36 | Claude-3.7-Sonnet-Thinking [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 82
37 | Claude Opus 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 81
38 | GLM-5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 81
39 | MiMo-V2-Pro [OSS] | Xiaomi | Apr 2026 | sdadas/PLCC | 79
40 | GLM-4.7 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 79
41 | Kimi-K2.5 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 78
42 | DeepSeek-V3.2 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 78
43 | Bielik-11B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 78
44 | Gemini-2.0-Flash-Experimental [OSS] | Google | Apr 2026 | sdadas/PLCC | 78
45 | Gemini-Pro-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 77
46 | DeepSeek-v3.1 (thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 76
47 | PLLuM-8x7B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 76
48 | Gemini-2.0-Flash-Thinking-Exp-01-21 [OSS] | Google | Apr 2026 | sdadas/PLCC | 76
49 | GLM-4.6 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 76
50 | DeepSeek-V3.2-Speciale [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 76
51 | Claude 3 Opus [API] | Anthropic | Apr 2026 | sdadas/PLCC | 76
52 | DeepSeek-v3-0324 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 76
53 | DeepSeek R1 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 75
54 | PLLuM-12B-nc-chat-250715 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 75
55 | DeepSeek-R1-0528 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 75
56 | Mistral-Large-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 75
57 | Grok-4.1-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 74
58 | GPT-5-mini-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 74
59 | GPT-4 Turbo [API] | OpenAI | Apr 2026 | sdadas/PLCC | 74
60 | DeepSeek-V3 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 73
61 | O4-Mini-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 73
62 | Qwen3.5-397B-A17B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 73
63 | GPT-5.4-mini-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 73
64 | Claude-Sonnet-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 72
65 | Claude Sonnet 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 72
66 | Kimi K2-Thinking-0905 [OSS] | Moonshot AI | Apr 2026 | sdadas/PLCC | 71
67 | Grok-4-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 71
68 | DeepSeek-v3.2-Exp [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 71
69 | DeepSeek-v3.1 (no thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 69
70 | Bielik-2.6 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 68
71 | GLM-4.5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 68
72 | Mistral-Medium-3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 67
73 | Grok-2-1212 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 67
74 | Grok-3-Mini-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 67
75 | Kimi-K2 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 67
76 | Grok-4.20 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 65
77 | PLLuM-12B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 65
78 | Bielik-2.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 64
79 | Llama-PLLuM-70B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 64
80 | Llama-3.1-Tulu-3-405B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 64
81 | Kimi-K2-0905 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 63
82 | GPT-4 | OpenAI | Apr 2026 | sdadas/PLCC | 63
83 | GPT-4.1-mini-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 62
84 | Qwen3.5-122B-A10B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 62
85 | Llama-PLLuM-70B-chat-250801 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 62
86 | Claude-3.5-Haiku-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 62
87 | Bielik-2.5 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 61
88 | Bielik-2.3 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 61
89 | Bielik-2.2 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 60
90 | PLLuM-8x7B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 60
91 | MiniMax-M2.7 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 59
92 | MiniMax-M2.5 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 59
93 | GPT-5-nano-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 59
94 | GPT-4o-mini-2024-07-18 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 57
95 | GPT-5.4-nano-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 57
96 | Llama-3.1-405b [OSS] | Meta | Apr 2026 | sdadas/PLCC | 57
97 | Bielik-Minitron-7B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 57
98 | Qwen3 Max [OSS] | Alibaba Cloud | Apr 2026 | sdadas/PLCC | 57
99 | Command-A-03-2025 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 55
100 | Gemma-3-27b | Google | Apr 2026 | sdadas/PLCC | 55
101 | Claude-3.0-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 53
102 | Mistral-Large-2407 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 52
103 | Bielik-0.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 52
104 | Mistral-Large-2411 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 52
105 | Claude-Haiku-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 52
106 | Command-R-Plus-04-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 52
107 | Llama-4-Maverick [OSS] | Meta | Apr 2026 | sdadas/PLCC | 52
108 | GLM-4.5-Air [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 51
109 | O3-mini-2025-01-31 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 51
110 | WizardLM-2-8x22b [OSS] | Microsoft | Apr 2026 | sdadas/PLCC | 50
111 | Qwen-Max [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 50
112 | PLLuM-12B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 49
113 | Command-R-Plus-08-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 49
114 | Mistral-Small-4 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 49
115 | Qwen3.5-35B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 46
116 | Qwen3.5-27B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 46
117 | GPT-OSS-120b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 46
118 | Qwen3-235B-A22B | Alibaba | Apr 2026 | sdadas/PLCC | 45
119 | Qwen3-Next-80B-A3B-Thinking [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 45
120 | GPT-5.4-nano-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 44
121 | Bielik-4.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 44
122 | O1-mini-2024-09-12 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 44
123 | Mixtral-8x22b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 41
124 | Gemini-Flash-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 41
125 | Llama-3.1-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 41
126 | Gemma-2-27b [OSS] | Google | Apr 2026 | sdadas/PLCC | 41
127 | GPT-4.1-nano-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 40
128 | EuroLLM-9B [OSS] | UTTER | Apr 2026 | sdadas/PLCC | 40
129 | Llama-3.3-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 40
130 | GLM-4.7-Flash [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 40
131 | Mistral-Small-3.2-24B-2506 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 39
132 | Mistral-Small-3.1-24B-2503 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 39
133 | GPT-3.5-turbo [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 38
134 | Llama-3.0-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 38
135 | Qwen3-Next-80B-A3B-Instruct [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 36
136 | Qwen3.5-9B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 36
137 | Llama-4-Scout [OSS] | Meta | Apr 2026 | sdadas/PLCC | 35
138 | Llama-PLLuM-8B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 34
139 | Qwen-Plus [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 32
140 | Ministral-8b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 30
141 | Qwen-2.5-72b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 30
142 | Qwen3-30B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 30
143 | Mistral-Small-24B-2501 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 29
144 | Ministral-14b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 29
145 | Magistral-Small-2506 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 29
146 | Qwen3-32B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 28
147 | Mixtral-8x7b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 27
148 | GPT-OSS-20b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 26
149 | Bielik-1.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 25
150 | Qwen3.5-4B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 24
151 | Gemma-2-9b [OSS] | Google | Apr 2026 | sdadas/PLCC | 23
152 | Qwen-2.5-32b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 21
153 | Qwen-Turbo-2024-11-01 [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 20
154 | Command-R7B [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 18
155 | Qwen-2.5-14b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 17
156 | Ministral-3b-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 17
157 | Phi-4 | Microsoft | Apr 2026 | sdadas/PLCC | 17
158 | Qwen3-14B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 16
159 | Qwen3.5-2B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 13
160 | Qwen3-8B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 13
161 | Llama-3.1-8B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 13
162 | Mistral-Nemo [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 13
163 | Ministral-8b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 12
164 | Qwen-2.5-7b [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 11
165 | Mistral-7b-v0.3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 9.00
geography
165 rows
# | Model | Org | Submitted | Paper / code | geography
01 | Gemini-3.1-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 100
02 | Gemini-3.0-Pro-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 100
03 | Gemini-2.5-Pro-Preview-06-05 [OSS] | Google | Apr 2026 | sdadas/PLCC | 98
04 | O3-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 97
05 | GPT-5.1-2025-11-13 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 97
06 | GPT-5-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 97
07 | GPT-5.4-2026-03-05 (low reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 97
08 | Gemini-2.5-Pro-Exp-03-25 [OSS] | Google | Apr 2026 | sdadas/PLCC | 97
09 | GPT-5-Pro-2025-10-06 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 96
10 | GPT-5.4-2026-03-05 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 96
11 | Gemini-3-Flash-Preview [OSS] | Google | Apr 2026 | sdadas/PLCC | 96
12 | O1-2024-12-17 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 95
13 | GPT-5.2-2025-12-11 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 95
14 | GPT-5.2-2025-12-11 (xhigh reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 94
15 | GPT-5-mini-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 94
16 | DeepSeek-V3.2-Speciale [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 94
17 | Grok 4 [API] | xAI | Apr 2026 | sdadas/PLCC | 94
18 | GPT-5.2-2025-12-11 (medium reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 94
19 | Gemini-2.5-Flash-Preview-04-17 [OSS] | Google | Apr 2026 | sdadas/PLCC | 94
20 | GPT-5.4-mini-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 92
21 | GLM-5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 91
22 | GPT-4.5-preview-2025-02-27 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 90
23 | GPT-4o-2024-05-13 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89
24 | DeepSeek-v3.1 (thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 89
25 | GPT-4.1-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 89
26 | MiMo-V2-Pro [OSS] | Xiaomi | Apr 2026 | sdadas/PLCC | 89
27 | Claude-Opus-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 88
28 | GPT-5.4-2026-03-05 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 88
29 | GPT-4o-2024-08-06 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 88
30 | GLM-4.7 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 88
31 | O4-Mini-2025-04-16 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 88
32 | Claude-3.7-Sonnet-Thinking [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 87
33 | Claude-3.7-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 87
34 | Claude-3.5-Sonnet-20240620 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 86
35 | Claude-Opus-4.1 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 86
36 | GPT-4o-2024-11-20 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 86
37 | Kimi-K2.5 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 86
38 | Gemini-Exp-1206 [OSS] | Google | Apr 2026 | sdadas/PLCC | 86
39 | GPT-5.2-2025-12-11 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 86
40 | GPT-5.1-2025-11-13 (default reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 86
41 | Grok-4.1-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 85
42 | DeepSeek-R1-0528 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 85
43 | Qwen3.5-397B-A17B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 85
44 | Claude-3.5-Sonnet-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 85
45 | Kimi K2-Thinking-0905 [OSS] | Moonshot AI | Apr 2026 | sdadas/PLCC | 84
46 | DeepSeek R1 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 84
47 | Gemini-2.0-Flash-Thinking-Exp-01-21 [OSS] | Google | Apr 2026 | sdadas/PLCC | 84
48 | Claude-Opus-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 84
49 | Grok-3-Mini-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 84
50 | Claude Opus 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 83
51 | Grok-3-Beta [OSS] | xAI | Apr 2026 | sdadas/PLCC | 83
52 | Qwen3.5-122B-A10B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 83
53 | GLM-4.6 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 82
54 | GPT-5.4-mini-2026-03-17 (no reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 82
55 | MiniMax-M2.7 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 82
56 | DeepSeek-v3.1 (no thinking) [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 82
57 | Claude-Sonnet-4.6 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 81
58 | DeepSeek-v3.2-Exp [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 80
59 | GPT-5-nano-2025-08-07 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 80
60 | Claude 3 Opus [API] | Anthropic | Apr 2026 | sdadas/PLCC | 80
61 | PLLuM-12B-nc-chat-250715 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 79
62 | GPT-4 Turbo [API] | OpenAI | Apr 2026 | sdadas/PLCC | 79
63 | Gemini-2.0-Flash-Experimental [OSS] | Google | Apr 2026 | sdadas/PLCC | 79
64 | DeepSeek-V3 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 79
65 | GLM-4.5 [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 79
66 | Grok-4-Fast [OSS] | xAI | Apr 2026 | sdadas/PLCC | 79
67 | Claude-Sonnet-4.5 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 79
68 | DeepSeek-v3-0324 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 78
69 | O3-mini-2025-01-31 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 78
70 | DeepSeek-V3.2 [OSS] | DeepSeek | Apr 2026 | sdadas/PLCC | 78
71 | Mistral-Medium-3 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 77
72 | Claude Sonnet 4 [API] | Anthropic | Apr 2026 | sdadas/PLCC | 77
73 | Grok-2-1212 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 77
74 | GPT-5.4-nano-2026-03-17 (high reasoning) [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 77
75 | Mistral-Large-2512 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 76
76 | Bielik-11B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 75
77 | Qwen3 Max [OSS] | Alibaba Cloud | Apr 2026 | sdadas/PLCC | 75
78 | GPT-4.1-mini-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 75
79 | Bielik-2.6 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 75
80 | Llama-3.1-405b [OSS] | Meta | Apr 2026 | sdadas/PLCC | 74
81 | Grok-4.20 [OSS] | xAI | Apr 2026 | sdadas/PLCC | 74
82 | Gemini-Pro-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 74
83 | Qwen3.5-35B-A3B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 73
84 | PLLuM-8x7B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 73
85 | Bielik-2.5 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 72
86 | Claude-3.5-Haiku-20241022 [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 72
87 | Bielik-2.2 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 72
88 | Llama-4-Maverick [OSS] | Meta | Apr 2026 | sdadas/PLCC | 71
89 | GPT-OSS-120b [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 71
90 | Llama-3.1-Tulu-3-405B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 71
91 | Kimi-K2 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 70
92 | PLLuM-12B-nc-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 70
93 | GPT-4o-mini-2024-07-18 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 69
94 | Qwen3-235B-A22B | Alibaba | Apr 2026 | sdadas/PLCC | 69
95 | MiniMax-M2.5 [OSS] | MiniMaxAI | Apr 2026 | sdadas/PLCC | 68
96 | Bielik-2.3 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 68
97 | Llama-PLLuM-70B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 68
98 | Bielik-2.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 68
99 | GPT-4 | OpenAI | Apr 2026 | sdadas/PLCC | 67
100 | Kimi-K2-0905 [OSS] | Moonshot.AI | Apr 2026 | sdadas/PLCC | 67
101 | Command-A-03-2025 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 67
102 | O1-mini-2024-09-12 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 66
103 | PLLuM-8x7B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 66
104 | Claude-3.0-Sonnet [OSS] | Anthropic | Apr 2026 | sdadas/PLCC | 65
105 | Mistral-Small-4 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 64
106 | GLM-4.5-Air [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 64
107 | Qwen3-Next-80B-A3B-Thinking [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 64
108 | Qwen3.5-27B [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 64
109 | Llama-PLLuM-70B-chat-250801 [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 63
110 | Mistral-Large-2407 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 63
111 | Bielik-Minitron-7B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 62
112 | Command-R-Plus-08-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 61
113 | Bielik-0.1 [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 61
114 | Gemini-Flash-1.5 [OSS] | Google | Apr 2026 | sdadas/PLCC | 61
115 | Mistral-Large-2411 [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 61
116 | WizardLM-2-8x22b [OSS] | Microsoft | Apr 2026 | sdadas/PLCC | 60
117 | Mixtral-8x22b [OSS] | Mistral | Apr 2026 | sdadas/PLCC | 59
118 | GPT-4.1-nano-2025-04-14 [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 59
119 | Llama-3.3-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 59
120 | Llama-3.1-70B [OSS] | Meta | Apr 2026 | sdadas/PLCC | 58
121 | GPT-3.5-turbo [OSS] | OpenAI | Apr 2026 | sdadas/PLCC | 55
122 | GLM-4.7-Flash [OSS] | Zhipu AI | Apr 2026 | sdadas/PLCC | 55
123 | PLLuM-12B-chat [OSS] | PLLuM | Apr 2026 | sdadas/PLCC | 54
124 | EuroLLM-9B [OSS] | UTTER | Apr 2026 | sdadas/PLCC | 54
125 | Qwen-Max [OSS] | Alibaba | Apr 2026 | sdadas/PLCC | 53
126 | Command-R-Plus-04-2024 [OSS] | Cohere | Apr 2026 | sdadas/PLCC | 53
127 | Bielik-4.5B-v3.0-Instruct [OSS] | SpeakLeash | Apr 2026 | sdadas/PLCC | 53
128Claude-Haiku-4.5OSSAnthropicApr 2026sdadas/PLCC52
129GPT-5.4-nano-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC52
130Mistral-Small-3.2-24B-2506OSSMistralApr 2026sdadas/PLCC51
131Gemma-3-27bGoogleApr 2026sdadas/PLCC51
132Llama-4-ScoutOSSMetaApr 2026sdadas/PLCC51
133Llama-3.0-70BOSSMetaApr 2026sdadas/PLCC49
134Gemma-2-27bOSSGoogleApr 2026sdadas/PLCC47
135Qwen3-Next-80B-A3B-InstructOSSAlibabaApr 2026sdadas/PLCC46
136Llama-PLLuM-8B-chatOSSPLLuMApr 2026sdadas/PLCC46
137Mistral-Small-3.1-24B-2503OSSMistralApr 2026sdadas/PLCC45
138Qwen-2.5-72bOSSAlibabaApr 2026sdadas/PLCC45
139Ministral-14b-2512OSSMistralApr 2026sdadas/PLCC45
140Magistral-Small-2506OSSMistralApr 2026sdadas/PLCC45
141Qwen3.5-9BOSSAlibabaApr 2026sdadas/PLCC44
142Mixtral-8x7bOSSMistralApr 2026sdadas/PLCC44
143Mistral-Small-24B-2501OSSMistralApr 2026sdadas/PLCC42
144Qwen-PlusOSSAlibabaApr 2026sdadas/PLCC42
145Ministral-8b-2512OSSMistralApr 2026sdadas/PLCC39
146Qwen3-32BOSSAlibabaApr 2026sdadas/PLCC37
147Bielik-1.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC35
148GPT-OSS-20bOSSOpenAIApr 2026sdadas/PLCC35
149Phi-4MicrosoftApr 2026sdadas/PLCC35
150Command-R7BOSSCohereApr 2026sdadas/PLCC33
151Llama-3.1-8BOSSMetaApr 2026sdadas/PLCC31
152Qwen3-30B-A3BOSSAlibabaApr 2026sdadas/PLCC31
153Qwen3-14BOSSAlibabaApr 2026sdadas/PLCC30
154Gemma-2-9bOSSGoogleApr 2026sdadas/PLCC30
155Qwen-Turbo-2024-11-01OSSAlibabaApr 2026sdadas/PLCC30
156Qwen3-8BOSSAlibabaApr 2026sdadas/PLCC27
157Mistral-7b-v0.3OSSMistralApr 2026sdadas/PLCC27
158Qwen3.5-4BOSSAlibabaApr 2026sdadas/PLCC27
159Mistral-NemoOSSMistralApr 2026sdadas/PLCC26
160Qwen-2.5-32bOSSAlibabaApr 2026sdadas/PLCC25
161Ministral-3b-2512OSSMistralApr 2026sdadas/PLCC24
162Qwen-2.5-14bOSSAlibabaApr 2026sdadas/PLCC23
163Ministral-8bOSSMistralApr 2026sdadas/PLCC19
164Qwen-2.5-7bOSSAlibabaApr 2026sdadas/PLCC17
165Qwen3.5-2BOSSAlibabaApr 2026sdadas/PLCC12
grammar
165 rows
# · Model · Org · Submitted · Paper / code · grammar
01Gemini-3.1-Pro-PreviewOSSGoogleApr 2026sdadas/PLCC93
02Gemini-3.0-Pro-PreviewOSSGoogleApr 2026sdadas/PLCC91
03GPT-5.4-2026-03-05 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC90
04Grok 4APIxAIApr 2026sdadas/PLCC90
05GPT-5.2-2025-12-11 (xhigh reasoning)OSSOpenAIApr 2026sdadas/PLCC89
06GPT-5.4-2026-03-05 (low reasoning)OSSOpenAIApr 2026sdadas/PLCC88
07GPT-5.2-2025-12-11 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC87
08Gemini-2.5-Pro-Preview-06-05OSSGoogleApr 2026sdadas/PLCC86
09GPT-5.4-mini-2026-03-17 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC85
10GPT-5-Pro-2025-10-06 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC85
11O3-2025-04-16OSSOpenAIApr 2026sdadas/PLCC85
12Gemini-3-Flash-PreviewOSSGoogleApr 2026sdadas/PLCC85
13O1-2024-12-17OSSOpenAIApr 2026sdadas/PLCC84
14DeepSeek-V3.2-SpecialeOSSDeepSeekApr 2026sdadas/PLCC84
15GPT-5-2025-08-07OSSOpenAIApr 2026sdadas/PLCC84
16GLM-5OSSZhipu AIApr 2026sdadas/PLCC82
17GPT-5-mini-2025-08-07OSSOpenAIApr 2026sdadas/PLCC82
18GPT-5.1-2025-11-13 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC82
19GPT-5.2-2025-12-11 (medium reasoning)OSSOpenAIApr 2026sdadas/PLCC82
20Claude-Sonnet-4.6OSSAnthropicApr 2026sdadas/PLCC80
21Kimi-K2.5OSSMoonshot.AIApr 2026sdadas/PLCC80
22Claude-3.7-Sonnet-ThinkingOSSAnthropicApr 2026sdadas/PLCC80
23Claude-Opus-4.5OSSAnthropicApr 2026sdadas/PLCC79
24GPT-5.4-2026-03-05 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC79
25Gemini-2.5-Pro-Exp-03-25OSSGoogleApr 2026sdadas/PLCC79
26MiMo-V2-ProOSSXiaomiApr 2026sdadas/PLCC79
27Claude-3.5-Sonnet-20241022OSSAnthropicApr 2026sdadas/PLCC79
28Claude-Opus-4.6OSSAnthropicApr 2026sdadas/PLCC77
29Gemini-2.5-Flash-Preview-04-17OSSGoogleApr 2026sdadas/PLCC77
30Claude Opus 4APIAnthropicApr 2026sdadas/PLCC76
31Qwen3.5-397B-A17BOSSAlibabaApr 2026sdadas/PLCC76
32Claude-3.5-Sonnet-20240620OSSAnthropicApr 2026sdadas/PLCC75
33DeepSeek-v3.1 (thinking)OSSDeepSeekApr 2026sdadas/PLCC75
34DeepSeek R1OSSDeepSeekApr 2026sdadas/PLCC74
35Claude-Opus-4.1OSSAnthropicApr 2026sdadas/PLCC74
36Claude-3.7-SonnetOSSAnthropicApr 2026sdadas/PLCC74
37GPT-4.5-preview-2025-02-27OSSOpenAIApr 2026sdadas/PLCC74
38GPT-5.4-nano-2026-03-17 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC74
39DeepSeek-R1-0528OSSDeepSeekApr 2026sdadas/PLCC73
40Qwen3.5-122B-A10BOSSAlibabaApr 2026sdadas/PLCC73
41Kimi K2-Thinking-0905OSSMoonshot AIApr 2026sdadas/PLCC73
42O4-Mini-2025-04-16OSSOpenAIApr 2026sdadas/PLCC72
43Grok-4.1-FastOSSxAIApr 2026sdadas/PLCC72
44Grok-4-FastOSSxAIApr 2026sdadas/PLCC72
45Grok-4.20OSSxAIApr 2026sdadas/PLCC72
46MiniMax-M2.7OSSMiniMaxAIApr 2026sdadas/PLCC72
47MiniMax-M2.5OSSMiniMaxAIApr 2026sdadas/PLCC71
48Grok-3-Mini-BetaOSSxAIApr 2026sdadas/PLCC71
49GPT-5.4-mini-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC70
50GPT-5.1-2025-11-13 (default reasoning)OSSOpenAIApr 2026sdadas/PLCC70
51GPT-4o-2024-05-13OSSOpenAIApr 2026sdadas/PLCC70
52GPT-5-nano-2025-08-07OSSOpenAIApr 2026sdadas/PLCC69
53Gemini-Exp-1206OSSGoogleApr 2026sdadas/PLCC69
54GPT-5.2-2025-12-11 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC69
55Gemini-2.0-Flash-Thinking-Exp-01-21OSSGoogleApr 2026sdadas/PLCC68
56Claude-Sonnet-4.5OSSAnthropicApr 2026sdadas/PLCC68
57O3-mini-2025-01-31OSSOpenAIApr 2026sdadas/PLCC67
58GPT-4o-2024-11-20OSSOpenAIApr 2026sdadas/PLCC67
59GPT-4.1-2025-04-14OSSOpenAIApr 2026sdadas/PLCC67
60Mistral-Large-2512OSSMistralApr 2026sdadas/PLCC67
61Claude 3 OpusAPIAnthropicApr 2026sdadas/PLCC66
62GPT-4o-2024-08-06OSSOpenAIApr 2026sdadas/PLCC66
63DeepSeek-V3.2OSSDeepSeekApr 2026sdadas/PLCC66
64GLM-4.7OSSZhipu AIApr 2026sdadas/PLCC66
65Qwen3-235B-A22BAlibabaApr 2026sdadas/PLCC66
66Qwen3.5-35B-A3BOSSAlibabaApr 2026sdadas/PLCC66
67Qwen3-Next-80B-A3B-ThinkingOSSAlibabaApr 2026sdadas/PLCC65
68Gemini-2.0-Flash-ExperimentalOSSGoogleApr 2026sdadas/PLCC65
69Grok-3-BetaOSSxAIApr 2026sdadas/PLCC65
70GPT-OSS-120bOSSOpenAIApr 2026sdadas/PLCC64
71DeepSeek-v3.1 (no thinking)OSSDeepSeekApr 2026sdadas/PLCC64
72Grok-2-1212OSSxAIApr 2026sdadas/PLCC64
73DeepSeek-v3-0324OSSDeepSeekApr 2026sdadas/PLCC64
74GLM-4.6OSSZhipu AIApr 2026sdadas/PLCC63
75DeepSeek-v3.2-ExpOSSDeepSeekApr 2026sdadas/PLCC63
76Claude Sonnet 4APIAnthropicApr 2026sdadas/PLCC63
77GPT-4.1-mini-2025-04-14OSSOpenAIApr 2026sdadas/PLCC62
78Qwen3.5-27BOSSAlibabaApr 2026sdadas/PLCC62
79DeepSeek-V3OSSDeepSeekApr 2026sdadas/PLCC62
80O1-mini-2024-09-12OSSOpenAIApr 2026sdadas/PLCC61
81Mistral-Medium-3OSSMistralApr 2026sdadas/PLCC61
82Llama-4-MaverickOSSMetaApr 2026sdadas/PLCC59
83GLM-4.5OSSZhipu AIApr 2026sdadas/PLCC59
84Kimi-K2-0905OSSMoonshot.AIApr 2026sdadas/PLCC59
85Claude-Haiku-4.5OSSAnthropicApr 2026sdadas/PLCC59
86GPT-4OpenAIApr 2026sdadas/PLCC58
87Qwen3 MaxOSSAlibaba CloudApr 2026sdadas/PLCC58
88Gemini-Pro-1.5OSSGoogleApr 2026sdadas/PLCC58
89Kimi-K2OSSMoonshot.AIApr 2026sdadas/PLCC58
90Llama-3.1-405bOSSMetaApr 2026sdadas/PLCC57
91Claude-3.5-Haiku-20241022OSSAnthropicApr 2026sdadas/PLCC57
92Bielik-11B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC57
93Claude-3.0-SonnetOSSAnthropicApr 2026sdadas/PLCC56
94Mistral-Small-4OSSMistralApr 2026sdadas/PLCC56
95GPT-4 TurboAPIOpenAIApr 2026sdadas/PLCC56
96Llama-3.1-Tulu-3-405BOSSMetaApr 2026sdadas/PLCC56
97Bielik-2.6OSSSpeakLeashApr 2026sdadas/PLCC55
98GPT-4o-mini-2024-07-18OSSOpenAIApr 2026sdadas/PLCC55
99GPT-OSS-20bOSSOpenAIApr 2026sdadas/PLCC54
100Llama-PLLuM-70B-chat-250801OSSPLLuMApr 2026sdadas/PLCC54
101Qwen3.5-9BOSSAlibabaApr 2026sdadas/PLCC54
102Mistral-Large-2411OSSMistralApr 2026sdadas/PLCC54
103Bielik-2.2OSSSpeakLeashApr 2026sdadas/PLCC53
104Mistral-Small-3.2-24B-2506OSSMistralApr 2026sdadas/PLCC53
105PLLuM-12B-nc-chat-250715OSSPLLuMApr 2026sdadas/PLCC52
106Qwen3-Next-80B-A3B-InstructOSSAlibabaApr 2026sdadas/PLCC52
107GLM-4.5-AirOSSZhipu AIApr 2026sdadas/PLCC52
108Mistral-Large-2407OSSMistralApr 2026sdadas/PLCC51
109Llama-4-ScoutOSSMetaApr 2026sdadas/PLCC51
110Qwen-MaxOSSAlibabaApr 2026sdadas/PLCC51
111Bielik-2.5OSSSpeakLeashApr 2026sdadas/PLCC51
112Llama-PLLuM-70B-chatOSSPLLuMApr 2026sdadas/PLCC50
113Mixtral-8x22bOSSMistralApr 2026sdadas/PLCC50
114Mistral-Small-3.1-24B-2503OSSMistralApr 2026sdadas/PLCC50
115Bielik-Minitron-7B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC50
116Bielik-2.1OSSSpeakLeashApr 2026sdadas/PLCC50
117Qwen3-30B-A3BOSSAlibabaApr 2026sdadas/PLCC49
118Command-A-03-2025OSSCohereApr 2026sdadas/PLCC49
119Bielik-2.3OSSSpeakLeashApr 2026sdadas/PLCC49
120WizardLM-2-8x22bOSSMicrosoftApr 2026sdadas/PLCC49
121Llama-3.3-70BOSSMetaApr 2026sdadas/PLCC49
122Qwen3-32BOSSAlibabaApr 2026sdadas/PLCC48
123Magistral-Small-2506OSSMistralApr 2026sdadas/PLCC47
124PLLuM-8x7B-nc-chatOSSPLLuMApr 2026sdadas/PLCC47
125Qwen-PlusOSSAlibabaApr 2026sdadas/PLCC47
126Gemma-3-27bGoogleApr 2026sdadas/PLCC46
127Gemma-2-27bOSSGoogleApr 2026sdadas/PLCC46
128Gemini-Flash-1.5OSSGoogleApr 2026sdadas/PLCC46
129Qwen3-14BOSSAlibabaApr 2026sdadas/PLCC46
130Mistral-Small-24B-2501OSSMistralApr 2026sdadas/PLCC45
131Qwen3.5-4BOSSAlibabaApr 2026sdadas/PLCC45
132GPT-5.4-nano-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC45
133Llama-3.0-70BOSSMetaApr 2026sdadas/PLCC45
134GPT-4.1-nano-2025-04-14OSSOpenAIApr 2026sdadas/PLCC45
135Command-R-Plus-04-2024OSSCohereApr 2026sdadas/PLCC45
136Qwen-2.5-72bOSSAlibabaApr 2026sdadas/PLCC45
137Llama-3.1-70BOSSMetaApr 2026sdadas/PLCC44
138Ministral-8b-2512OSSMistralApr 2026sdadas/PLCC44
139Ministral-14b-2512OSSMistralApr 2026sdadas/PLCC44
140GLM-4.7-FlashOSSZhipu AIApr 2026sdadas/PLCC44
141Qwen-2.5-32bOSSAlibabaApr 2026sdadas/PLCC43
142Command-R-Plus-08-2024OSSCohereApr 2026sdadas/PLCC43
143PLLuM-8x7B-chatOSSPLLuMApr 2026sdadas/PLCC42
144GPT-3.5-turboOSSOpenAIApr 2026sdadas/PLCC41
145PLLuM-12B-nc-chatOSSPLLuMApr 2026sdadas/PLCC41
146EuroLLM-9BOSSUTTERApr 2026sdadas/PLCC39
147Qwen3-8BOSSAlibabaApr 2026sdadas/PLCC38
148Gemma-2-9bOSSGoogleApr 2026sdadas/PLCC38
149PLLuM-12B-chatOSSPLLuMApr 2026sdadas/PLCC37
150Bielik-4.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC35
151Phi-4MicrosoftApr 2026sdadas/PLCC34
152Mixtral-8x7bOSSMistralApr 2026sdadas/PLCC34
153Qwen-2.5-14bOSSAlibabaApr 2026sdadas/PLCC34
154Llama-PLLuM-8B-chatOSSPLLuMApr 2026sdadas/PLCC33
155Qwen-Turbo-2024-11-01OSSAlibabaApr 2026sdadas/PLCC33
156Mistral-NemoOSSMistralApr 2026sdadas/PLCC31
157Ministral-3b-2512OSSMistralApr 2026sdadas/PLCC30
158Llama-3.1-8BOSSMetaApr 2026sdadas/PLCC29
159Bielik-0.1OSSSpeakLeashApr 2026sdadas/PLCC29
160Qwen-2.5-7bOSSAlibabaApr 2026sdadas/PLCC29
161Mistral-7b-v0.3OSSMistralApr 2026sdadas/PLCC27
162Ministral-8bOSSMistralApr 2026sdadas/PLCC24
163Bielik-1.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC23
164Command-R7BOSSCohereApr 2026sdadas/PLCC23
165Qwen3.5-2BOSSAlibabaApr 2026sdadas/PLCC19
history
165 rows
# · Model · Org · Submitted · Paper / code · history
01Gemini-3.1-Pro-PreviewOSSGoogleApr 2026sdadas/PLCC98
02Gemini-3.0-Pro-PreviewOSSGoogleApr 2026sdadas/PLCC95
03GPT-5.2-2025-12-11 (xhigh reasoning)OSSOpenAIApr 2026sdadas/PLCC94
04Grok 4APIxAIApr 2026sdadas/PLCC94
05GPT-5.4-2026-03-05 (low reasoning)OSSOpenAIApr 2026sdadas/PLCC93
06GPT-5.4-2026-03-05 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC92
07Gemini-2.5-Pro-Exp-03-25OSSGoogleApr 2026sdadas/PLCC92
08Gemini-2.5-Pro-Preview-06-05OSSGoogleApr 2026sdadas/PLCC92
09Gemini-3-Flash-PreviewOSSGoogleApr 2026sdadas/PLCC92
10Claude-3.7-Sonnet-ThinkingOSSAnthropicApr 2026sdadas/PLCC92
11DeepSeek-R1-0528OSSDeepSeekApr 2026sdadas/PLCC91
12GPT-5-2025-08-07OSSOpenAIApr 2026sdadas/PLCC91
13GPT-5-Pro-2025-10-06 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC91
14Claude-Opus-4.1OSSAnthropicApr 2026sdadas/PLCC91
15Claude-3.5-Sonnet-20241022OSSAnthropicApr 2026sdadas/PLCC91
16Claude-3.7-SonnetOSSAnthropicApr 2026sdadas/PLCC90
17O1-2024-12-17OSSOpenAIApr 2026sdadas/PLCC90
18GPT-5.2-2025-12-11 (medium reasoning)OSSOpenAIApr 2026sdadas/PLCC90
19DeepSeek-V3.2-SpecialeOSSDeepSeekApr 2026sdadas/PLCC90
20GPT-5.2-2025-12-11 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC90
21GPT-4.5-preview-2025-02-27OSSOpenAIApr 2026sdadas/PLCC90
22O3-2025-04-16OSSOpenAIApr 2026sdadas/PLCC89
23Kimi-K2.5OSSMoonshot.AIApr 2026sdadas/PLCC89
24Claude-3.5-Sonnet-20240620OSSAnthropicApr 2026sdadas/PLCC89
25GPT-5.1-2025-11-13 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC89
26DeepSeek-v3.1 (thinking)OSSDeepSeekApr 2026sdadas/PLCC89
27GPT-5.4-mini-2026-03-17 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC89
28GLM-5OSSZhipu AIApr 2026sdadas/PLCC88
29Gemini-Exp-1206OSSGoogleApr 2026sdadas/PLCC88
30GLM-4.6OSSZhipu AIApr 2026sdadas/PLCC87
31MiMo-V2-ProOSSXiaomiApr 2026sdadas/PLCC87
32Claude-Opus-4.6OSSAnthropicApr 2026sdadas/PLCC87
33GPT-5.4-2026-03-05 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC87
34Claude-Opus-4.5OSSAnthropicApr 2026sdadas/PLCC87
35Claude Opus 4APIAnthropicApr 2026sdadas/PLCC87
36Claude 3 OpusAPIAnthropicApr 2026sdadas/PLCC86
37DeepSeek-v3.1 (no thinking)OSSDeepSeekApr 2026sdadas/PLCC86
38GPT-4o-2024-08-06OSSOpenAIApr 2026sdadas/PLCC86
39Gemini-2.5-Flash-Preview-04-17OSSGoogleApr 2026sdadas/PLCC86
40GLM-4.7OSSZhipu AIApr 2026sdadas/PLCC85
41Grok-3-BetaOSSxAIApr 2026sdadas/PLCC85
42GPT-5.2-2025-12-11 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC85
43Claude-Sonnet-4.5OSSAnthropicApr 2026sdadas/PLCC85
44DeepSeek R1OSSDeepSeekApr 2026sdadas/PLCC85
45GPT-4.1-2025-04-14OSSOpenAIApr 2026sdadas/PLCC85
46GPT-4o-2024-11-20OSSOpenAIApr 2026sdadas/PLCC84
47Grok-3-Mini-BetaOSSxAIApr 2026sdadas/PLCC84
48Grok-4.1-FastOSSxAIApr 2026sdadas/PLCC84
49DeepSeek-v3.2-ExpOSSDeepSeekApr 2026sdadas/PLCC83
50GPT-5-mini-2025-08-07OSSOpenAIApr 2026sdadas/PLCC83
51Gemini-2.0-Flash-ExperimentalOSSGoogleApr 2026sdadas/PLCC83
52Qwen3.5-397B-A17BOSSAlibabaApr 2026sdadas/PLCC83
53DeepSeek-V3.2OSSDeepSeekApr 2026sdadas/PLCC82
54GPT-4o-2024-05-13OSSOpenAIApr 2026sdadas/PLCC82
55DeepSeek-v3-0324OSSDeepSeekApr 2026sdadas/PLCC82
56Claude-Sonnet-4.6OSSAnthropicApr 2026sdadas/PLCC82
57GPT-5.1-2025-11-13 (default reasoning)OSSOpenAIApr 2026sdadas/PLCC82
58GPT-5.4-mini-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC82
59Grok-4.20OSSxAIApr 2026sdadas/PLCC82
60Grok-4-FastOSSxAIApr 2026sdadas/PLCC81
61Claude Sonnet 4APIAnthropicApr 2026sdadas/PLCC81
62Kimi K2-Thinking-0905OSSMoonshot AIApr 2026sdadas/PLCC80
63Gemini-2.0-Flash-Thinking-Exp-01-21OSSGoogleApr 2026sdadas/PLCC80
64Gemini-Pro-1.5OSSGoogleApr 2026sdadas/PLCC79
65Mistral-Large-2512OSSMistralApr 2026sdadas/PLCC79
66Qwen3.5-122B-A10BOSSAlibabaApr 2026sdadas/PLCC78
67Mistral-Medium-3OSSMistralApr 2026sdadas/PLCC78
68Bielik-11B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC78
69DeepSeek-V3OSSDeepSeekApr 2026sdadas/PLCC77
70Bielik-2.2OSSSpeakLeashApr 2026sdadas/PLCC77
71O4-Mini-2025-04-16OSSOpenAIApr 2026sdadas/PLCC77
72GLM-4.5OSSZhipu AIApr 2026sdadas/PLCC77
73Llama-4-MaverickOSSMetaApr 2026sdadas/PLCC76
74Bielik-2.3OSSSpeakLeashApr 2026sdadas/PLCC76
75GPT-4 TurboAPIOpenAIApr 2026sdadas/PLCC76
76GPT-5.4-nano-2026-03-17 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC76
77Llama-3.1-Tulu-3-405BOSSMetaApr 2026sdadas/PLCC75
78Bielik-2.5OSSSpeakLeashApr 2026sdadas/PLCC75
79Grok-2-1212OSSxAIApr 2026sdadas/PLCC74
80Llama-PLLuM-70B-chatOSSPLLuMApr 2026sdadas/PLCC74
81Qwen3 MaxOSSAlibaba CloudApr 2026sdadas/PLCC74
82Command-A-03-2025OSSCohereApr 2026sdadas/PLCC73
83Kimi-K2OSSMoonshot.AIApr 2026sdadas/PLCC73
84PLLuM-12B-nc-chat-250715OSSPLLuMApr 2026sdadas/PLCC73
85PLLuM-8x7B-nc-chatOSSPLLuMApr 2026sdadas/PLCC73
86GPT-5-nano-2025-08-07OSSOpenAIApr 2026sdadas/PLCC73
87Claude-3.0-SonnetOSSAnthropicApr 2026sdadas/PLCC73
88Llama-3.1-405bOSSMetaApr 2026sdadas/PLCC73
89Bielik-2.1OSSSpeakLeashApr 2026sdadas/PLCC73
90Qwen3-Next-80B-A3B-ThinkingOSSAlibabaApr 2026sdadas/PLCC72
91GPT-4OpenAIApr 2026sdadas/PLCC72
92Bielik-2.6OSSSpeakLeashApr 2026sdadas/PLCC72
93Mistral-Large-2407OSSMistralApr 2026sdadas/PLCC71
94Qwen3-235B-A22BAlibabaApr 2026sdadas/PLCC70
95Kimi-K2-0905OSSMoonshot.AIApr 2026sdadas/PLCC70
96PLLuM-12B-nc-chatOSSPLLuMApr 2026sdadas/PLCC70
97MiniMax-M2.5OSSMiniMaxAIApr 2026sdadas/PLCC69
98Llama-PLLuM-70B-chat-250801OSSPLLuMApr 2026sdadas/PLCC69
99Mixtral-8x22bOSSMistralApr 2026sdadas/PLCC69
100Llama-3.1-70BOSSMetaApr 2026sdadas/PLCC68
101PLLuM-8x7B-chatOSSPLLuMApr 2026sdadas/PLCC68
102Qwen3.5-35B-A3BOSSAlibabaApr 2026sdadas/PLCC68
103WizardLM-2-8x22bOSSMicrosoftApr 2026sdadas/PLCC67
104O3-mini-2025-01-31OSSOpenAIApr 2026sdadas/PLCC67
105GPT-4o-mini-2024-07-18OSSOpenAIApr 2026sdadas/PLCC67
106GPT-4.1-mini-2025-04-14OSSOpenAIApr 2026sdadas/PLCC67
107GLM-4.5-AirOSSZhipu AIApr 2026sdadas/PLCC66
108Llama-3.3-70BOSSMetaApr 2026sdadas/PLCC65
109GPT-OSS-120bOSSOpenAIApr 2026sdadas/PLCC65
110MiniMax-M2.7OSSMiniMaxAIApr 2026sdadas/PLCC64
111Llama-3.0-70BOSSMetaApr 2026sdadas/PLCC64
112Bielik-Minitron-7B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC64
113Mistral-Small-4OSSMistralApr 2026sdadas/PLCC64
114Mistral-Large-2411OSSMistralApr 2026sdadas/PLCC64
115Qwen3.5-27BOSSAlibabaApr 2026sdadas/PLCC63
116Qwen-MaxOSSAlibabaApr 2026sdadas/PLCC63
117PLLuM-12B-chatOSSPLLuMApr 2026sdadas/PLCC61
118O1-mini-2024-09-12OSSOpenAIApr 2026sdadas/PLCC61
119Claude-3.5-Haiku-20241022OSSAnthropicApr 2026sdadas/PLCC61
120Command-R-Plus-04-2024OSSCohereApr 2026sdadas/PLCC61
121Command-R-Plus-08-2024OSSCohereApr 2026sdadas/PLCC61
122Mistral-Small-3.2-24B-2506OSSMistralApr 2026sdadas/PLCC61
123Claude-Haiku-4.5OSSAnthropicApr 2026sdadas/PLCC60
124Bielik-0.1OSSSpeakLeashApr 2026sdadas/PLCC58
125Qwen3-Next-80B-A3B-InstructOSSAlibabaApr 2026sdadas/PLCC58
126GPT-5.4-nano-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC57
127Mixtral-8x7bOSSMistralApr 2026sdadas/PLCC56
128Qwen3-32BOSSAlibabaApr 2026sdadas/PLCC55
129Bielik-4.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC55
130Magistral-Small-2506OSSMistralApr 2026sdadas/PLCC54
131GLM-4.7-FlashOSSZhipu AIApr 2026sdadas/PLCC54
132Mistral-Small-3.1-24B-2503OSSMistralApr 2026sdadas/PLCC54
133Qwen-2.5-72bOSSAlibabaApr 2026sdadas/PLCC54
134Gemma-2-27bOSSGoogleApr 2026sdadas/PLCC53
135Gemma-3-27bGoogleApr 2026sdadas/PLCC52
136Ministral-14b-2512OSSMistralApr 2026sdadas/PLCC52
137Gemini-Flash-1.5OSSGoogleApr 2026sdadas/PLCC51
138GPT-3.5-turboOSSOpenAIApr 2026sdadas/PLCC51
139GPT-4.1-nano-2025-04-14OSSOpenAIApr 2026sdadas/PLCC50
140Llama-PLLuM-8B-chatOSSPLLuMApr 2026sdadas/PLCC50
141Mistral-Small-24B-2501OSSMistralApr 2026sdadas/PLCC49
142EuroLLM-9BOSSUTTERApr 2026sdadas/PLCC49
143Qwen3.5-9BOSSAlibabaApr 2026sdadas/PLCC48
144Llama-4-ScoutOSSMetaApr 2026sdadas/PLCC47
145Qwen-PlusOSSAlibabaApr 2026sdadas/PLCC46
146Qwen-2.5-32bOSSAlibabaApr 2026sdadas/PLCC44
147Ministral-8b-2512OSSMistralApr 2026sdadas/PLCC43
148Qwen-Turbo-2024-11-01OSSAlibabaApr 2026sdadas/PLCC42
149Qwen3-30B-A3BOSSAlibabaApr 2026sdadas/PLCC42
150Qwen3-14BOSSAlibabaApr 2026sdadas/PLCC42
151Qwen3-8BOSSAlibabaApr 2026sdadas/PLCC41
152Phi-4MicrosoftApr 2026sdadas/PLCC40
153Qwen-2.5-14bOSSAlibabaApr 2026sdadas/PLCC37
154GPT-OSS-20bOSSOpenAIApr 2026sdadas/PLCC37
155Qwen3.5-4BOSSAlibabaApr 2026sdadas/PLCC36
156Gemma-2-9bOSSGoogleApr 2026sdadas/PLCC35
157Ministral-8bOSSMistralApr 2026sdadas/PLCC33
158Bielik-1.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC32
159Mistral-7b-v0.3OSSMistralApr 2026sdadas/PLCC30
160Ministral-3b-2512OSSMistralApr 2026sdadas/PLCC30
161Mistral-NemoOSSMistralApr 2026sdadas/PLCC28
162Command-R7BOSSCohereApr 2026sdadas/PLCC27
163Llama-3.1-8BOSSMetaApr 2026sdadas/PLCC25
164Qwen-2.5-7bOSSAlibabaApr 2026sdadas/PLCC23
165Qwen3.5-2BOSSAlibabaApr 2026sdadas/PLCC14
vocabulary
165 rows
# · Model · Org · Submitted · Paper / code · vocabulary
01Gemini-3.1-Pro-PreviewOSSGoogleApr 2026sdadas/PLCC96
02Gemini-3.0-Pro-PreviewOSSGoogleApr 2026sdadas/PLCC95
03GPT-5-Pro-2025-10-06 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC92
04GPT-5.4-2026-03-05 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC91
05GPT-5-2025-08-07OSSOpenAIApr 2026sdadas/PLCC91
06Gemini-2.5-Pro-Preview-06-05OSSGoogleApr 2026sdadas/PLCC90
07Gemini-2.5-Pro-Exp-03-25OSSGoogleApr 2026sdadas/PLCC90
08O3-2025-04-16OSSOpenAIApr 2026sdadas/PLCC90
09GPT-5.1-2025-11-13 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC90
10O1-2024-12-17OSSOpenAIApr 2026sdadas/PLCC88
11Gemini-3-Flash-PreviewOSSGoogleApr 2026sdadas/PLCC88
12GPT-5.2-2025-12-11 (xhigh reasoning)OSSOpenAIApr 2026sdadas/PLCC87
13GPT-5.2-2025-12-11 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC86
14GPT-5.4-mini-2026-03-17 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC86
15GPT-5.2-2025-12-11 (medium reasoning)OSSOpenAIApr 2026sdadas/PLCC86
16GPT-5.4-2026-03-05 (low reasoning)OSSOpenAIApr 2026sdadas/PLCC85
17GPT-5.4-2026-03-05 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC85
18Grok 4APIxAIApr 2026sdadas/PLCC84
19GPT-4.5-preview-2025-02-27OSSOpenAIApr 2026sdadas/PLCC83
20Gemini-Exp-1206OSSGoogleApr 2026sdadas/PLCC82
21Gemini-2.5-Flash-Preview-04-17OSSGoogleApr 2026sdadas/PLCC81
22GPT-4o-2024-11-20OSSOpenAIApr 2026sdadas/PLCC80
23GPT-4.1-2025-04-14OSSOpenAIApr 2026sdadas/PLCC80
24GPT-4o-2024-05-13OSSOpenAIApr 2026sdadas/PLCC78
25Claude-Opus-4.6OSSAnthropicApr 2026sdadas/PLCC78
26Claude-3.5-Sonnet-20241022OSSAnthropicApr 2026sdadas/PLCC77
27GPT-5.2-2025-12-11 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC77
28GPT-4o-2024-08-06OSSOpenAIApr 2026sdadas/PLCC77
29Claude-Opus-4.5OSSAnthropicApr 2026sdadas/PLCC76
30Claude-3.5-Sonnet-20240620OSSAnthropicApr 2026sdadas/PLCC76
31GPT-5.1-2025-11-13 (default reasoning)OSSOpenAIApr 2026sdadas/PLCC75
32Claude-3.7-SonnetOSSAnthropicApr 2026sdadas/PLCC75
33Claude-3.7-Sonnet-ThinkingOSSAnthropicApr 2026sdadas/PLCC75
34DeepSeek-v3.1 (thinking)OSSDeepSeekApr 2026sdadas/PLCC74
35Claude-Sonnet-4.6OSSAnthropicApr 2026sdadas/PLCC74
36Claude-Opus-4.1OSSAnthropicApr 2026sdadas/PLCC73
37MiMo-V2-ProOSSXiaomiApr 2026sdadas/PLCC73
38Claude Opus 4APIAnthropicApr 2026sdadas/PLCC73
39DeepSeek R1OSSDeepSeekApr 2026sdadas/PLCC72
40Gemini-2.0-Flash-ExperimentalOSSGoogleApr 2026sdadas/PLCC72
41GLM-5OSSZhipu AIApr 2026sdadas/PLCC72
42DeepSeek-V3.2-SpecialeOSSDeepSeekApr 2026sdadas/PLCC71
43Qwen3.5-397B-A17BOSSAlibabaApr 2026sdadas/PLCC70
44GPT-5.4-mini-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC70
45GPT-5-mini-2025-08-07OSSOpenAIApr 2026sdadas/PLCC70
46Gemini-2.0-Flash-Thinking-Exp-01-21OSSGoogleApr 2026sdadas/PLCC69
47Grok-3-BetaOSSxAIApr 2026sdadas/PLCC69
48DeepSeek-R1-0528OSSDeepSeekApr 2026sdadas/PLCC68
49PLLuM-8x7B-nc-chatOSSPLLuMApr 2026sdadas/PLCC68
50Gemini-Pro-1.5OSSGoogleApr 2026sdadas/PLCC68
51Bielik-11B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC67
52PLLuM-12B-nc-chat-250715OSSPLLuMApr 2026sdadas/PLCC67
53Kimi-K2.5OSSMoonshot.AIApr 2026sdadas/PLCC65
54Grok-4.1-FastOSSxAIApr 2026sdadas/PLCC65
55DeepSeek-V3.2OSSDeepSeekApr 2026sdadas/PLCC65
56O4-Mini-2025-04-16OSSOpenAIApr 2026sdadas/PLCC65
57DeepSeek-v3.2-ExpOSSDeepSeekApr 2026sdadas/PLCC64
58Mistral-Large-2512OSSMistralApr 2026sdadas/PLCC64
59DeepSeek-V3OSSDeepSeekApr 2026sdadas/PLCC63
60Bielik-2.6OSSSpeakLeashApr 2026sdadas/PLCC62
61Mistral-Medium-3OSSMistralApr 2026sdadas/PLCC62
62Bielik-2.2OSSSpeakLeashApr 2026sdadas/PLCC62
63DeepSeek-v3-0324OSSDeepSeekApr 2026sdadas/PLCC62
64Claude 3 OpusAPIAnthropicApr 2026sdadas/PLCC62
65DeepSeek-v3.1 (no thinking)OSSDeepSeekApr 2026sdadas/PLCC62
66Qwen3.5-122B-A10BOSSAlibabaApr 2026sdadas/PLCC61
67Grok-3-Mini-BetaOSSxAIApr 2026sdadas/PLCC61
68GPT-5.4-nano-2026-03-17 (high reasoning)OSSOpenAIApr 2026sdadas/PLCC61
69Claude-Sonnet-4.5OSSAnthropicApr 2026sdadas/PLCC61
70Bielik-2.3OSSSpeakLeashApr 2026sdadas/PLCC61
71Bielik-2.5OSSSpeakLeashApr 2026sdadas/PLCC61
72Claude Sonnet 4APIAnthropicApr 2026sdadas/PLCC61
73MiniMax-M2.7OSSMiniMaxAIApr 2026sdadas/PLCC60
74GLM-4.5OSSZhipu AIApr 2026sdadas/PLCC60
75Kimi K2-Thinking-0905OSSMoonshot AIApr 2026sdadas/PLCC59
76GLM-4.7OSSZhipu AIApr 2026sdadas/PLCC59
77Grok-4-FastOSSxAIApr 2026sdadas/PLCC59
78Grok-4.20OSSxAIApr 2026sdadas/PLCC59
79Grok-2-1212OSSxAIApr 2026sdadas/PLCC57
80GLM-4.6OSSZhipu AIApr 2026sdadas/PLCC57
81Bielik-2.1OSSSpeakLeashApr 2026sdadas/PLCC56
82GPT-4.1-mini-2025-04-14OSSOpenAIApr 2026sdadas/PLCC56
83GPT-4 TurboAPIOpenAIApr 2026sdadas/PLCC56
84Kimi-K2OSSMoonshot.AIApr 2026sdadas/PLCC54
85Qwen3 MaxOSSAlibaba CloudApr 2026sdadas/PLCC54
86Qwen3.5-27BOSSAlibabaApr 2026sdadas/PLCC54
87Llama-3.1-Tulu-3-405BOSSMetaApr 2026sdadas/PLCC53
88Kimi-K2-0905OSSMoonshot.AIApr 2026sdadas/PLCC53
89Claude-3.5-Haiku-20241022OSSAnthropicApr 2026sdadas/PLCC52
90Mistral-Small-4OSSMistralApr 2026sdadas/PLCC52
91MiniMax-M2.5OSSMiniMaxAIApr 2026sdadas/PLCC52
92PLLuM-12B-nc-chatOSSPLLuMApr 2026sdadas/PLCC52
93GPT-4o-mini-2024-07-18OSSOpenAIApr 2026sdadas/PLCC51
94Command-A-03-2025OSSCohereApr 2026sdadas/PLCC49
95GPT-4OpenAIApr 2026sdadas/PLCC48
96GPT-5-nano-2025-08-07OSSOpenAIApr 2026sdadas/PLCC47
97Gemini-Flash-1.5OSSGoogleApr 2026sdadas/PLCC47
98GLM-4.5-AirOSSZhipu AIApr 2026sdadas/PLCC47
99O3-mini-2025-01-31OSSOpenAIApr 2026sdadas/PLCC47
100Command-R-Plus-04-2024OSSCohereApr 2026sdadas/PLCC46
101Llama-PLLuM-70B-chatOSSPLLuMApr 2026sdadas/PLCC46
102Bielik-Minitron-7B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC46
103Llama-PLLuM-70B-chat-250801OSSPLLuMApr 2026sdadas/PLCC46
104Claude-3.0-SonnetOSSAnthropicApr 2026sdadas/PLCC46
105Claude-Haiku-4.5OSSAnthropicApr 2026sdadas/PLCC45
106Qwen3.5-35B-A3BOSSAlibabaApr 2026sdadas/PLCC45
107Llama-4-MaverickOSSMetaApr 2026sdadas/PLCC45
108Qwen-MaxOSSAlibabaApr 2026sdadas/PLCC45
109PLLuM-8x7B-chatOSSPLLuMApr 2026sdadas/PLCC44
110Llama-3.1-405bOSSMetaApr 2026sdadas/PLCC43
111Qwen3-235B-A22BAlibabaApr 2026sdadas/PLCC43
112Command-R-Plus-08-2024OSSCohereApr 2026sdadas/PLCC43
113Mistral-Large-2411OSSMistralApr 2026sdadas/PLCC42
114Llama-4-ScoutOSSMetaApr 2026sdadas/PLCC42
115GPT-5.4-nano-2026-03-17 (no reasoning)OSSOpenAIApr 2026sdadas/PLCC41
116Mistral-Large-2407OSSMistralApr 2026sdadas/PLCC40
117O1-mini-2024-09-12OSSOpenAIApr 2026sdadas/PLCC40
118Bielik-4.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC39
119Ministral-14b-2512OSSMistralApr 2026sdadas/PLCC39
120GPT-4.1-nano-2025-04-14OSSOpenAIApr 2026sdadas/PLCC38
121Qwen3.5-9BOSSAlibabaApr 2026sdadas/PLCC38
122WizardLM-2-8x22bOSSMicrosoftApr 2026sdadas/PLCC38
123GPT-OSS-120bOSSOpenAIApr 2026sdadas/PLCC38
124Qwen-PlusOSSAlibabaApr 2026sdadas/PLCC38
125Gemma-3-27bGoogleApr 2026sdadas/PLCC37
126Mistral-Small-3.1-24B-2503OSSMistralApr 2026sdadas/PLCC37
127Bielik-0.1OSSSpeakLeashApr 2026sdadas/PLCC37
128Qwen3-32BOSSAlibabaApr 2026sdadas/PLCC37
129Llama-3.3-70BOSSMetaApr 2026sdadas/PLCC37
130Qwen3-Next-80B-A3B-ThinkingOSSAlibabaApr 2026sdadas/PLCC37
131Gemma-2-27bOSSGoogleApr 2026sdadas/PLCC37
132Mistral-Small-24B-2501OSSMistralApr 2026sdadas/PLCC36
133GPT-3.5-turboOSSOpenAIApr 2026sdadas/PLCC36
134Qwen-2.5-72bOSSAlibabaApr 2026sdadas/PLCC36
135Ministral-8b-2512OSSMistralApr 2026sdadas/PLCC35
136Llama-PLLuM-8B-chatOSSPLLuMApr 2026sdadas/PLCC35
137Mixtral-8x22bOSSMistralApr 2026sdadas/PLCC35
138Mistral-Small-3.2-24B-2506OSSMistralApr 2026sdadas/PLCC35
139Llama-3.1-70BOSSMetaApr 2026sdadas/PLCC34
140Qwen3-14BOSSAlibabaApr 2026sdadas/PLCC34
141EuroLLM-9BOSSUTTERApr 2026sdadas/PLCC34
142Qwen3.5-4BOSSAlibabaApr 2026sdadas/PLCC34
143Qwen-2.5-32bOSSAlibabaApr 2026sdadas/PLCC33
144PLLuM-12B-chatOSSPLLuMApr 2026sdadas/PLCC33
145Qwen3-Next-80B-A3B-InstructOSSAlibabaApr 2026sdadas/PLCC32
146Magistral-Small-2506OSSMistralApr 2026sdadas/PLCC31
147Qwen-Turbo-2024-11-01OSSAlibabaApr 2026sdadas/PLCC31
148Gemma-2-9bOSSGoogleApr 2026sdadas/PLCC30
149GLM-4.7-FlashOSSZhipu AIApr 2026sdadas/PLCC30
150Qwen-2.5-14bOSSAlibabaApr 2026sdadas/PLCC28
151Qwen3-30B-A3BOSSAlibabaApr 2026sdadas/PLCC27
152Phi-4MicrosoftApr 2026sdadas/PLCC26
153Qwen3-8BOSSAlibabaApr 2026sdadas/PLCC25
154Bielik-1.5B-v3.0-InstructOSSSpeakLeashApr 2026sdadas/PLCC23
155GPT-OSS-20bOSSOpenAIApr 2026sdadas/PLCC23
156Command-R7BOSSCohereApr 2026sdadas/PLCC22
157Ministral-3b-2512OSSMistralApr 2026sdadas/PLCC22
158Ministral-8bOSSMistralApr 2026sdadas/PLCC22
159Llama-3.0-70BOSSMetaApr 2026sdadas/PLCC22
160Qwen-2.5-7bOSSAlibabaApr 2026sdadas/PLCC21
161Mistral-NemoOSSMistralApr 2026sdadas/PLCC20
162Qwen3.5-2BOSSAlibabaApr 2026sdadas/PLCC20
163Mixtral-8x7bOSSMistralApr 2026sdadas/PLCC20
164Llama-3.1-8BOSSMetaApr 2026sdadas/PLCC19
165Mistral-7b-v0.3OSSMistralApr 2026sdadas/PLCC16
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
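
The ordering described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not Codesota's actual ranking code: the `Row` type, the `rank_rows` helper, and the placeholder model names are hypothetical, and the assumption that the earlier submission wins a tie is ours (the leaderboard only states that ties are broken by submission date).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Row:
    """One leaderboard entry for a single metric (hypothetical structure)."""
    model: str
    metric: str
    score: float
    submitted: date


def rank_rows(rows, metric):
    """Return the rows for one metric, highest score first.

    Ties are broken by submission date; earlier-first is an assumption.
    """
    selected = [r for r in rows if r.metric == metric]
    return sorted(selected, key=lambda r: (-r.score, r.submitted))


# Placeholder data, not taken from the leaderboard.
rows = [
    Row("model-a", "grammar", 85, date(2026, 4, 2)),
    Row("model-b", "grammar", 90, date(2026, 4, 3)),
    Row("model-c", "grammar", 85, date(2026, 4, 1)),
    Row("model-d", "history", 99, date(2026, 4, 1)),
]
ranked = rank_rows(rows, "grammar")
print([r.model for r in ranked])  # ['model-b', 'model-c', 'model-a']
```

Sorting on the tuple `(-score, submitted)` keeps the tie-break explicit, so two models with the same score always appear in a deterministic order.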
§ 03 · Progress

1 step
of state of the art.

Each row below marks a model that broke the previous record on average. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores win. Each subsequent entry improved upon the previous best.

SOTA line · average
  1. Apr 2, 2026 · Gemini-3.1-Pro-Preview · Google · 97
Fig 3 · SOTA-setting models only. 1 entry, dated Apr 2026.
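
One way to derive a SOTA line like the one above is a running maximum over submissions in date order. The sketch below assumes that only strict improvements count as a new step and that same-day submissions keep their input order. The `sota_steps` function and the sample entries are hypothetical, not Codesota's implementation or data.

```python
from datetime import date


def sota_steps(submissions):
    """Return the record-setting entries from (date, model, score) tuples.

    Entries are visited in date order; an entry becomes a step only if its
    score is strictly higher than every earlier score (assumed rule).
    """
    steps = []
    best = float("-inf")
    for when, model, score in sorted(submissions, key=lambda s: s[0]):
        if score > best:
            steps.append((when, model, score))
            best = score
    return steps


# Placeholder submissions, not taken from the leaderboard.
example = [
    (date(2026, 1, 10), "model-a", 70),
    (date(2026, 2, 5), "model-b", 68),
    (date(2026, 3, 1), "model-c", 75),
    (date(2026, 3, 9), "model-d", 75),
]
steps = sota_steps(example)
print([model for _, model, _ in steps])  # ['model-a', 'model-c']
```

Here `model-b` never led and `model-d` only matched the record, so neither appears as a step.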
§ 06 · Contribute

Have a score that beats
this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with frozen commit + seed
  • 03 Declared evaluation environment (Python, deps)
  • 04 One row per metric declared by this dataset
  • 05 A contact so we can follow up on discrepancies
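
Before submitting, the "one row per metric" requirement can be checked locally. The sketch below is a hypothetical pre-flight check, not Codesota's validator: the metric names come from the leaderboard's "All metrics" list, the 0-100 range from the dataset description, and the `check_submission` function and its input shape are assumptions.

```python
from collections import Counter

# Metric names as listed under "All metrics" on this page.
METRICS = (
    "art-and-entertainment",
    "average",
    "culture-and-tradition",
    "geography",
    "grammar",
    "history",
    "vocabulary",
)


def check_submission(rows):
    """Return a list of problems for (metric, score) rows; empty means OK."""
    problems = []
    counts = Counter(metric for metric, _ in rows)
    for metric in METRICS:
        if counts[metric] == 0:
            problems.append(f"missing metric: {metric}")
        elif counts[metric] > 1:
            problems.append(f"duplicate metric: {metric}")
    for metric, score in rows:
        if metric not in METRICS:
            problems.append(f"unknown metric: {metric}")
        elif not 0 <= score <= 100:
            problems.append(f"score out of range for {metric}: {score}")
    return problems


# Placeholder scores, not real results.
complete = [(metric, 50) for metric in METRICS]
print(check_submission(complete))       # []
print(check_submission(complete[:-1]))  # ['missing metric: vocabulary']
```

Returning a list of problems rather than raising on the first one lets a submitter fix every issue in a single pass.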