Model card

Mistral-Medium-3.

Mistral · open-source · 8 results · 2 benchmarks
§ 02 · Benchmarks

Every benchmark Mistral-Medium-3 has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | PLCC | Natural Language Processing · Polish Cultural Competency | vocabulary | 62.0% | #60/165 | | source ↗ |
| 02 | PLCC | Natural Language Processing · Polish Cultural Competency | history | 78.0% | #66/165 | | source ↗ |
| 03 | HLE | Reasoning · Multi-step Reasoning | accuracy | 4.5% | #69/74 | | source ↗ |
| 04 | PLCC | Natural Language Processing · Polish Cultural Competency | geography | 77.0% | #71/165 | | source ↗ |
| 05 | PLCC | Natural Language Processing · Polish Cultural Competency | culture-and-tradition | 67.0% | #72/165 | | source ↗ |
| 06 | PLCC | Natural Language Processing · Polish Cultural Competency | average | 66.8% | #73/165 | | source ↗ |
| 07 | PLCC | Natural Language Processing · Polish Cultural Competency | art-and-entertainment | 56.0% | #74/165 | | source ↗ |
| 08 | PLCC | Natural Language Processing · Polish Cultural Competency | grammar | 61.0% | #80/165 | | source ↗ |

The Rank column shows this model's position among all models scored on the same benchmark and metric (competitors after the slash). A #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
§ 03 · Strengths by area

Where Mistral-Medium-3 actually performs.

Reasoning · 1 benchmark · avg rank #69.0
Natural Language Processing · 1 benchmark · avg rank #70.9
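
The per-area averages above appear to be a plain arithmetic mean of the ranks in the benchmark table, grouped by area and rounded to one decimal. The following is a minimal sketch under that assumption; the `ROWS` list and the `average_rank_by_area` helper are illustrative names, not part of Codesota, and the ranks are copied by hand from the table.

```python
from collections import defaultdict

# (area, rank) pairs copied from the benchmark table; the grouping and
# rounding rule is an assumption that happens to reproduce the page's figures.
ROWS = [
    ("Natural Language Processing", 60),  # PLCC vocabulary
    ("Natural Language Processing", 66),  # PLCC history
    ("Reasoning", 69),                    # HLE accuracy
    ("Natural Language Processing", 71),  # PLCC geography
    ("Natural Language Processing", 72),  # PLCC culture-and-tradition
    ("Natural Language Processing", 73),  # PLCC average
    ("Natural Language Processing", 74),  # PLCC art-and-entertainment
    ("Natural Language Processing", 80),  # PLCC grammar
]


def average_rank_by_area(rows):
    """Return the mean rank per area, rounded to one decimal place."""
    ranks = defaultdict(list)
    for area, rank in rows:
        ranks[area].append(rank)
    return {area: round(sum(r) / len(r), 1) for area, r in ranks.items()}


if __name__ == "__main__":
    for area, avg in average_rank_by_area(ROWS).items():
        print(f"{area}: avg rank #{avg}")
```

Under this assumption the seven PLCC ranks average to 70.9 and the single HLE rank gives 69.0, matching the figures listed for each area.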
§ 05 · Related models

Other Mistral models scored on Codesota.

Mistral OCR 3 · 6 results
Codestral 22B · Unknown params · 2 results
Devstral 2 · 1 result
Devstral Small 2505 · 1 result
Mistral Large 3 · 123B params · 1 result
Mistral OCR 2 · 1 result
Mixtral-8x22b · 1 result
Devstral Small · 0 results
§ 06 · Sources & freshness

Where these numbers come from.

sdadas/PLCC · 7 results
scale-hle-official · 1 result
8 of 8 rows marked verified.