Codesota · Benchmark · PLCC

PLCC.

PLCC evaluates LLMs on Polish linguistic and cultural knowledge across six categories: art & entertainment, culture & tradition, geography, grammar, history, and vocabulary. Each category is scored as accuracy on a 0-100 scale. Created by Dadas et al. (2025).

§ 01 · Leaderboard

Results by metric.

Geography

Geography is one of the six PLCC categories, scored as accuracy (0-100). Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Geography: verified, paper, vendor, community, unverified

Muted rows were not state of the art when published — an earlier or same-year result already scored better.

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	100	2026
02	Gemini-3.0-Pro-Preview	verified	100	2026
03	Gemini-2.5-Pro-Preview-06-05	verified	98	2026
04	Gemini-2.5-Pro-Exp-03-25	verified	97	2026
05	GPT-5.4-2026-03-05 (low reasoning)	verified	97	2026
06	GPT-5-2025-08-07	verified	97	2026
07	GPT-5.1-2025-11-13 (high reasoning)	verified	97	2026
08	O3-2025-04-16	verified	97	2026
09	Gemini-3-Flash-Preview	verified	96	2026
10	GPT-5.4-2026-03-05 (high reasoning)	verified	96	2026
11	GPT-5-Pro-2025-10-06 (high reasoning)	verified	96	2026
12	O1-2024-12-17	verified	95	2026
13	GPT-5.2-2025-12-11 (high reasoning)	verified	95	2026
14	DeepSeek-V3.2-Speciale	verified	94	2026
15	Gemini-2.5-Flash-Preview-04-17	verified	94	2026
16	GPT-5-mini-2025-08-07	verified	94	2026
17	GPT-5.2-2025-12-11 (medium reasoning)	verified	94	2026
18	GPT-5.2-2025-12-11 (xhigh reasoning)	verified	94	2026
19	Grok 4	verified	94	2026
20	GPT-5.4-mini-2026-03-17 (high reasoning)	verified	92	2026
21	GLM-5	verified	91	2026
22	GPT-4.5-preview-2025-02-27	verified	90	2026
23	DeepSeek-v3.1 (thinking)	verified	89	2026
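
The muted-row rule stated above the table can be sketched in a few lines. This is only an illustration of one reading of that rule, not Codesota's actual implementation: the `Row` type and `is_muted` function are hypothetical names, and treating ties as not muted (a strictly higher score is required) is an assumption. The three sample rows are copied from the Geography table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard row (hypothetical structure for this sketch)."""
    model: str
    score: float
    year: int


def is_muted(row: Row, rows: list[Row]) -> bool:
    # Assumed reading of the rule: a row is muted when some result from an
    # earlier or the same year has a strictly higher score. Ties are not
    # treated as "scored better" here; the page does not specify this.
    return any(o.year <= row.year and o.score > row.score for o in rows)


# Sample rows taken from the Geography table.
rows = [
    Row("Gemini-3.1-Pro-Preview", 100, 2026),
    Row("Gemini-3.0-Pro-Preview", 100, 2026),
    Row("GLM-5", 91, 2026),
]

muted = [r.model for r in rows if is_muted(r, rows)]
```

Under this reading, the two rows tied at 100 stay unmuted and only the lower-scoring row is flagged. Which rows the site actually mutes cannot be recovered from the text of this page.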

Culture And Tradition

Culture And Tradition is one of the six PLCC categories, scored as accuracy (0-100).

Higher is better

Trust tiers for Culture And Tradition: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	100	2026
02	Gemini-3.0-Pro-Preview	verified	99	2026
03	Gemini-3-Flash-Preview	verified	98	2026
04	Gemini-2.5-Pro-Preview-06-05	verified	96	2026
05	Grok 4	verified	95	2026
06	GPT-5-Pro-2025-10-06 (high reasoning)	verified	94	2026
07	GPT-5.2-2025-12-11 (xhigh reasoning)	verified	93	2026
08	GPT-5.4-2026-03-05 (high reasoning)	verified	93	2026
09	GPT-5.4-2026-03-05 (low reasoning)	verified	93	2026
10	O1-2024-12-17	verified	92	2026
11	GPT-4o-2024-05-13	verified	92	2026
12	GPT-4.5-preview-2025-02-27	verified	92	2026
13	O3-2025-04-16	verified	91	2026
14	Gemini-2.5-Pro-Exp-03-25	verified	91	2026
15	GPT-5.1-2025-11-13 (high reasoning)	verified	90	2026
16	Grok-3-Beta	verified	90	2026
17	Gemini-Exp-1206	verified	90	2026
18	GPT-4o-2024-11-20	verified	89	2026
19	GPT-5-2025-08-07	verified	89	2026
20	GPT-4o-2024-08-06	verified	89	2026

History

History is one of the six PLCC categories, scored as accuracy (0-100).

Higher is better

Trust tiers for History: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	98	2026
02	Gemini-3.0-Pro-Preview	verified	95	2026
03	Grok 4	verified	94	2026
04	GPT-5.2-2025-12-11 (xhigh reasoning)	verified	94	2026
05	GPT-5.4-2026-03-05 (low reasoning)	verified	93	2026
06	Gemini-3-Flash-Preview	verified	92	2026
07	GPT-5.4-2026-03-05 (high reasoning)	verified	92	2026
08	Gemini-2.5-Pro-Exp-03-25	verified	92	2026
09	Claude-3.7-Sonnet-Thinking	verified	92	2026
10	Gemini-2.5-Pro-Preview-06-05	verified	92	2026
11	Claude-Opus-4.1	verified	91	2026
12	DeepSeek-R1-0528	verified	91	2026
13	Claude-3.5-Sonnet-20241022	verified	91	2026
14	GPT-5-2025-08-07	verified	91	2026
15	GPT-5-Pro-2025-10-06 (high reasoning)	verified	91	2026
16	GPT-5.2-2025-12-11 (medium reasoning)	verified	90	2026
17	DeepSeek-V3.2-Speciale	verified	90	2026
18	GPT-4.5-preview-2025-02-27	verified	90	2026
19	O1-2024-12-17	verified	90	2026
20	Claude-3.7-Sonnet	verified	90	2026
21	GPT-5.2-2025-12-11 (high reasoning)	verified	90	2026
22	O3-2025-04-16	verified	89	2026
23	DeepSeek-v3.1 (thinking)	verified	89	2026
24	Kimi-K2.5	verified	89	2026

Average

Average is the overall PLCC score. The listed values are consistent with the unweighted mean of the six category scores; for example, Gemini-3.1-Pro-Preview scores (100 + 100 + 98 + 96 + 95 + 93) / 6 = 97.

Higher is better

Trust tiers for Average: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	97	2026
02	Gemini-3.0-Pro-Preview	verified	95.833333	2026
03	GPT-5.4-2026-03-05 (high reasoning)	verified	92.166667	2026
04	Gemini-2.5-Pro-Preview-06-05	verified	92.166667	2026
05	Gemini-3-Flash-Preview	verified	91.666667	2026
06	GPT-5-Pro-2025-10-06 (high reasoning)	verified	91	2026
07	Grok 4	verified	90.5	2026
08	GPT-5.4-2026-03-05 (low reasoning)	verified	90.5	2026
09	GPT-5-2025-08-07	verified	89.5	2026
10	Gemini-2.5-Pro-Exp-03-25	verified	89.5	2026
11	GPT-5.2-2025-12-11 (xhigh reasoning)	verified	89.333333	2026
12	O3-2025-04-16	verified	89.166667	2026
13	O1-2024-12-17	verified	89.166667	2026
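
The Average values appear to be the unweighted mean of the six category scores. The sketch below checks that assumption for three models whose scores in all six categories are listed on this page. The function name `plcc_average` and the dictionary keys are illustrative choices, not identifiers used by PLCC or Codesota, and rounding to six decimals is assumed only because the table shows six decimal places.

```python
# Category scores copied from the per-category tables on this page.
CATEGORY_SCORES = {
    "Gemini-3.1-Pro-Preview": {
        "geography": 100, "culture_and_tradition": 100, "history": 98,
        "vocabulary": 96, "art_and_entertainment": 95, "grammar": 93,
    },
    "Gemini-3.0-Pro-Preview": {
        "geography": 100, "culture_and_tradition": 99, "history": 95,
        "vocabulary": 95, "art_and_entertainment": 95, "grammar": 91,
    },
    "GPT-5.4-2026-03-05 (high reasoning)": {
        "geography": 96, "culture_and_tradition": 93, "history": 92,
        "vocabulary": 91, "art_and_entertainment": 91, "grammar": 90,
    },
}


def plcc_average(scores: dict[str, float]) -> float:
    """Unweighted mean of the six category scores, rounded to six decimals."""
    if len(scores) != 6:
        raise ValueError("expected one score for each of the six categories")
    return round(sum(scores.values()) / len(scores), 6)


averages = {model: plcc_average(s) for model, s in CATEGORY_SCORES.items()}
```

For these three models the computed values match the Average table (97, 95.833333, and 92.166667). Models further down the tables cannot be checked this way, because the per-category lists on this page do not show every score.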

Vocabulary

Vocabulary is one of the six PLCC categories, scored as accuracy (0-100).

Higher is better

Trust tiers for Vocabulary: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	96	2026
02	Gemini-3.0-Pro-Preview	verified	95	2026
03	GPT-5-Pro-2025-10-06 (high reasoning)	verified	92	2026
04	GPT-5-2025-08-07	verified	91	2026
05	GPT-5.4-2026-03-05 (high reasoning)	verified	91	2026
06	Gemini-2.5-Pro-Exp-03-25	verified	90	2026
07	Gemini-2.5-Pro-Preview-06-05	verified	90	2026
08	GPT-5.1-2025-11-13 (high reasoning)	verified	90	2026
09	O3-2025-04-16	verified	90	2026

Art And Entertainment

Art And Entertainment is one of the six PLCC categories, scored as accuracy (0-100).

Higher is better

Trust tiers for Art And Entertainment: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	95	2026
02	Gemini-3.0-Pro-Preview	verified	95	2026
03	GPT-5.4-2026-03-05 (high reasoning)	verified	91	2026
04	Gemini-3-Flash-Preview	verified	91	2026
05	Gemini-2.5-Pro-Preview-06-05	verified	91	2026
06	GPT-4.5-preview-2025-02-27	verified	90	2026

Grammar

Grammar is one of the six PLCC categories, scored as accuracy (0-100).

Higher is better

Trust tiers for Grammar: verified, paper, vendor, community, unverified

Rank	Model	Trust	Score	Year
01	Gemini-3.1-Pro-Preview	verified	93	2026
02	Gemini-3.0-Pro-Preview	verified	91	2026
03	GPT-5.4-2026-03-05 (high reasoning)	verified	90	2026
04	Grok 4	verified	90	2026
05	GPT-5.2-2025-12-11 (xhigh reasoning)	verified	89	2026
