PLCC evaluates LLMs on Polish linguistic and cultural knowledge across six categories: art and entertainment, culture and tradition, geography, grammar, history, and vocabulary. Each category is scored as accuracy on a 0-100 scale. The benchmark was created by Dadas et al. (2025).
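To make the scoring concrete, the sketch below computes per-category accuracy on a 0-100 scale from graded answers. It is a minimal illustration, not the benchmark's official scoring code: the function name `accuracy_by_category`, the `(category, is_correct)` input format, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def accuracy_by_category(graded):
    """Return accuracy in percent (0-100) for each category.

    `graded` is an iterable of (category, is_correct) pairs.
    This input format is hypothetical, chosen for the example.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in graded:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {c: 100.0 * correct[c] / totals[c] for c in totals}


# Hypothetical graded answers, invented for illustration only.
sample = [
    ("geography", True),
    ("geography", True),
    ("geography", False),
    ("geography", True),
    ("grammar", True),
    ("grammar", False),
]

scores = accuracy_by_category(sample)
print(scores)
```

With the invented sample above, geography scores 75.0 (3 of 4 correct) and grammar scores 50.0 (1 of 2 correct).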
Geography, Culture and Tradition, and History are reported evaluation metrics for PLCC. Codesota tracks published model scores on each metric so readers can compare state-of-the-art results across sources and model families. Higher is better.
Average is also a reported evaluation metric for PLCC. Higher is better.
| Rank | Model | Trust | Score | Year |
|---|---|---|---|---|
| 01 | Gemini-3.1-Pro-Preview | verified | 97 | 2026 |
| 02 | Gemini-3.0-Pro-Preview | verified | 95.833333 | 2026 |
| 03 | GPT-5.4-2026-03-05 (high reasoning) | verified | 92.166667 | 2026 |
| 04 | Gemini-2.5-Pro-Preview-06-05 | verified | 92.166667 | 2026 |
| 05 | Gemini-3-Flash-Preview | verified | 91.666667 | 2026 |
| 06 | GPT-5-Pro-2025-10-06 (high reasoning) | verified | 91 | 2026 |
| 07 | Grok 4 | verified | 90.5 | 2026 |
| 08 | GPT-5.4-2026-03-05 (low reasoning) | verified | 90.5 | 2026 |
| 09 | GPT-5-2025-08-07 | verified | 89.5 | 2026 |
| 10 | Gemini-2.5-Pro-Exp-03-25 | verified | 89.5 | 2026 |
| 11 | GPT-5.2-2025-12-11 (xhigh reasoning) | verified | 89.333333 | 2026 |
| 12 | O3-2025-04-16 | verified | 89.166667 | 2026 |
| 13 | O1-2024-12-17 | verified | 89.166667 | 2026 |
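The fractional scores in the table above (for example 95.833333) are consistent with an unweighted mean of six whole-number category scores. The sketch below shows that calculation under that assumption; the function name `plcc_average`, the category keys, and the category scores are hypothetical and do not correspond to any model in the table.

```python
CATEGORIES = (
    "art_and_entertainment",
    "culture_and_tradition",
    "geography",
    "grammar",
    "history",
    "vocabulary",
)


def plcc_average(category_scores):
    """Return the unweighted mean of the six category accuracies.

    Assumption: the Average metric is a plain mean over categories.
    """
    missing = [c for c in CATEGORIES if c not in category_scores]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    return sum(category_scores[c] for c in CATEGORIES) / len(CATEGORIES)


# Hypothetical category accuracies, invented for illustration only.
example_scores = {
    "art_and_entertainment": 80,
    "culture_and_tradition": 85,
    "geography": 90,
    "grammar": 75,
    "history": 88,
    "vocabulary": 81,
}

average = plcc_average(example_scores)
print(round(average, 6))  # prints 83.166667
```

The six invented scores sum to 499, so the mean is 499 / 6, which shows the same repeating-decimal pattern as the fractional scores in the table.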
Vocabulary is a reported evaluation metric for PLCC. Higher is better.
| Rank | Model | Trust | Score | Year |
|---|---|---|---|---|
| 01 | Gemini-3.1-Pro-Preview | verified | 96 | 2026 |
| 02 | Gemini-3.0-Pro-Preview | verified | 95 | 2026 |
| 03 | GPT-5-Pro-2025-10-06 (high reasoning) | verified | 92 | 2026 |
| 04 | GPT-5-2025-08-07 | verified | 91 | 2026 |
| 05 | GPT-5.4-2026-03-05 (high reasoning) | verified | 91 | 2026 |
| 06 | Gemini-2.5-Pro-Exp-03-25 | verified | 90 | 2026 |
| 07 | Gemini-2.5-Pro-Preview-06-05 | verified | 90 | 2026 |
| 08 | GPT-5.1-2025-11-13 (high reasoning) | verified | 90 | 2026 |
| 09 | O3-2025-04-16 | verified | 90 | 2026 |
Art and Entertainment is a reported evaluation metric for PLCC. Higher is better.
| Rank | Model | Trust | Score | Year |
|---|---|---|---|---|
| 01 | Gemini-3.1-Pro-Preview | verified | 95 | 2026 |
| 02 | Gemini-3.0-Pro-Preview | verified | 95 | 2026 |
| 03 | GPT-5.4-2026-03-05 (high reasoning) | verified | 91 | 2026 |
| 04 | Gemini-3-Flash-Preview | verified | 91 | 2026 |
| 05 | Gemini-2.5-Pro-Preview-06-05 | verified | 91 | 2026 |
| 06 | GPT-4.5-preview-2025-02-27 | verified | 90 | 2026 |
Grammar is a reported evaluation metric for PLCC. Higher is better.
| Rank | Model | Trust | Score | Year |
|---|---|---|---|---|
| 01 | Gemini-3.1-Pro-Preview | verified | 93 | 2026 |
| 02 | Gemini-3.0-Pro-Preview | verified | 91 | 2026 |
| 03 | GPT-5.4-2026-03-05 (high reasoning) | verified | 90 | 2026 |
| 04 | Grok 4 | verified | 90 | 2026 |
| 05 | GPT-5.2-2025-12-11 (xhigh reasoning) | verified | 89 | 2026 |