Codesota · Models · Mixtral-8x22b · Mistral · 17 results · 3 benchmarks
Model card

Mixtral-8x22b.

Mistral · open-source
§ 01 · Benchmarks

Every benchmark Mixtral-8x22b has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date |
|---|---|---|---|---|---|---|
| 01 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | extraction | 9.6% | #10/50 | |
| 02 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | writing | 9.3% | #10/50 | |
| 03 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | roleplay | 9.1% | #10/50 | |
| 04 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | coding | 6.5% | #13/50 | |
| 05 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | math | 6.9% | #13/50 | |
| 06 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | pl-score | 8.2% | #14/50 | |
| 07 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | reasoning | 6.3% | #16/50 | |
| 08 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | stem | 9.3% | #17/50 | |
| 09 | Polish MT-Bench | Natural Language Processing · Polish Conversation Quality | humanities | 9.1% | #28/50 | |
| 10 | GSM8K | Reasoning · Mathematical Reasoning | accuracy | 88.0% | #29/32 | 2024-04-01 |
| 11 | PLCC | Natural Language Processing · Polish Cultural Competency | history | 69.0% | #97/165 | |
| 12 | PLCC | Natural Language Processing · Polish Cultural Competency | art-and-entertainment | 45.0% | #100/165 | |
| 13 | PLCC | Natural Language Processing · Polish Cultural Competency | grammar | 50.0% | #112/165 | |
| 14 | PLCC | Natural Language Processing · Polish Cultural Competency | average | 49.8% | #117/165 | |
| 15 | PLCC | Natural Language Processing · Polish Cultural Competency | geography | 59.0% | #117/165 | |
| 16 | PLCC | Natural Language Processing · Polish Cultural Competency | culture-and-tradition | 41.0% | #123/165 | |
| 17 | PLCC | Natural Language Processing · Polish Cultural Competency | vocabulary | 35.0% | #135/165 | |
The Rank column shows this model’s position among all models scored on the same benchmark and metric; the number after the slash is the number of competitors. #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
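The ordering described in the note can be sketched as a two-part sort key. This is a minimal illustration, not Codesota's implementation: the tuple layout and the `sort_key` helper are hypothetical, and placing undated rows after dated ones at the same rank is an assumption, since the page does not say how missing dates are handled.

```python
from datetime import date

# A few rows copied from the benchmark table, deliberately out of order.
# Layout (hypothetical): (label, rank, field_size, result_date or None)
rows = [
    ("PLCC history", 97, 165, None),
    ("GSM8K accuracy", 29, 32, date(2024, 4, 1)),
    ("Polish MT-Bench coding", 13, 50, None),
    ("Polish MT-Bench extraction", 10, 50, None),
]

def sort_key(row):
    """Sort by ascending rank, then by newest result date."""
    _, rank, _, result_date = row
    # Negating the ordinal puts newer dates first within one rank.
    # Undated rows get 0, which sorts after every dated row (assumption).
    newest_first = -result_date.toordinal() if result_date else 0
    return (rank, newest_first)

ordered = sorted(rows, key=sort_key)
for label, rank, field_size, result_date in ordered:
    print(f"#{rank}/{field_size}  {label}  {result_date or ''}")
```

Because Python's `sorted` is stable, rows that tie on both rank and date keep their input order.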
§ 02 · Strengths by area

Where Mixtral-8x22b actually performs.

Reasoning · 1 benchmark · avg rank #29.0
Natural Language Processing · 2 benchmarks · avg rank #58.3
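The per-area averages appear to be a plain mean of the rank numbers over every row in that area. That is inferred from the figures, not documented by the page: the 16 Natural Language Processing ranks average 58.25, which the page shows as #58.3. A minimal sketch under that assumption, with a hypothetical tuple layout:

```python
from collections import defaultdict

# (area, benchmark, rank) for all 17 rows of the benchmark table.
results = (
    [("Natural Language Processing", "Polish MT-Bench", r)
     for r in (10, 10, 10, 13, 13, 14, 16, 17, 28)]
    + [("Reasoning", "GSM8K", 29)]
    + [("Natural Language Processing", "PLCC", r)
       for r in (97, 100, 112, 117, 117, 123, 135)]
)

ranks_by_area = defaultdict(list)
benchmarks_by_area = defaultdict(set)
for area, benchmark, rank in results:
    ranks_by_area[area].append(rank)
    benchmarks_by_area[area].add(benchmark)

# Unweighted mean over rows, so a benchmark with more metrics counts more.
avg_rank = {area: sum(r) / len(r) for area, r in ranks_by_area.items()}
benchmark_count = {area: len(b) for area, b in benchmarks_by_area.items()}

for area in avg_rank:
    print(f"{area}: {benchmark_count[area]} benchmark(s), avg rank {avg_rank[area]}")
```

Note that the mean mixes ranks from fields of different sizes (50, 32, and 165 models), so it is only a rough indicator of relative standing.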
§ 04 · Related models

Other Mistral models scored on Codesota.

Mistral OCR 3 · 6 results
Codestral 22B · Unknown params · 2 results
Devstral 2 · 1 result
Devstral Small 2505 · 1 result
Mistral Large 3 · 123B params · 1 result
Mistral OCR 2 · 1 result
Devstral Small · 0 results
Magistral-Small-2506 · 0 results
§ 05 · Sources & freshness

Where these numbers come from.

SpeakLeash/MT-Bench-PL · 9 results
sdadas/PLCC · 7 results
gsm8k-shadow-page · 1 result
16 of 17 rows marked verified.