Polish Emotional Intelligence · benchmark dataset · 2025 · PL

Polish Emotional Intelligence Benchmark (EQ-Bench v2 PL).

Evaluates LLMs on emotional intelligence in Polish, using the EQ-Bench v2 methodology adapted to the Polish language. Models predict changes in emotional intensity across 171 questions. The reported score is adjusted for parseability: Benchmark Score × (Parseable / 171), where Parseable is the number of questions whose answer could be parsed. Created by SpeakLeash.
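
The parseability adjustment can be sketched in a few lines of Python. This is a minimal illustration, not the SpeakLeash implementation: the function name `adjusted_score`, its parameters, and the input numbers are hypothetical, and it assumes the raw benchmark score is a plain number that is scaled by the fraction of parseable answers.

```python
TOTAL_QUESTIONS = 171  # question count stated in the benchmark description


def adjusted_score(benchmark_score: float, parseable: int,
                   total: int = TOTAL_QUESTIONS) -> float:
    """Scale a raw benchmark score by the fraction of parseable answers."""
    if not 0 <= parseable <= total:
        raise ValueError("parseable must be between 0 and total")
    return benchmark_score * (parseable / total)


# Illustrative inputs only; these are not leaderboard values.
print(round(adjusted_score(80.0, 171), 2))  # every answer parseable: score unchanged
print(round(adjusted_score(80.0, 153), 2))  # 18 unparseable answers lower the score
```

Under this reading, a model that answers well but often breaks the expected output format is penalised in proportion to the answers that could not be parsed.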

§ 01 · Leaderboard

Best published scores.

101 results indexed across 1 metric. Shaded row marks current SOTA; ties broken by submission date.


Primary metric: eq-score · higher is better · 101 rows
| # | Model | Access | Org | Submitted | Paper / code | eq-score |
|---|---|---|---|---|---|---|
| 01 | mistralai/Mistral-Large-Instruct-2407 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 78.07 |
| 02 | mistralai/Mistral-Large-Instruct-2411 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 77.29 |
| 03 | Meta-Llama-3.1-405B-Instruct-FP8 | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 77.23 |
| 04 | GPT-4o-2024-08-06 | Open | OpenAI | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 75.15 |
| 05 | gpt-4-turbo-2024-04-09 | Open | | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 74.59 |
| 06 | speakleash/Bielik-11B-v2.6-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 73.70 |
| 07 | deepseek-ai/DeepSeek-V3-0324 (API) | API | deepseek-ai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 73.46 |
| 08 | Mistral-Small-Instruct-2409 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 72.85 |
| 09 | CYFRAGOVPL/Llama-PLLuM-70B-chat | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 72.56 |
| 10 | meta-llama/Meta-Llama-3.1-70B-Instruct | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 72.53 |
| 11 | speakleash/Bielik-11B-v2.5-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 72.00 |
| 12 | Qwen/Qwen2-72B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 71.23 |
| 13 | meta-llama/Meta-Llama-3-70B-Instruct | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 71.21 |
| 14 | speakleash/Bielik-11B-v3.0-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 71.20 |
| 15 | GPT-4o-mini-2024-07-18 | Open | OpenAI | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 71.15 |
| 16 | Qwen/Qwen2.5-32B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 71.15 |
| 17 | speakleash/Bielik-11B-v2.3-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 70.86 |
| 18 | meta-llama/Llama-3.3-70B-Instruct | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 70.73 |
| 19 | mistralai/Mistral-Small-24B-Instruct-2501 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 70.52 |
| 20 | CYFRAGOVPL/Llama-PLLuM-70B-instruct | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 69.99 |
| 21 | alpindale/WizardLM-2-8x22B (API) | API | alpindale | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 69.56 |
| 22 | Qwen/Qwen2.5-14B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 69.17 |
| 23 | speakleash/Bielik-11B-v2.2-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 69.05 |
| 24 | Qwen2-72B | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 68.93 |
| 25 | Qwen/Qwen2.5-72B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 68.49 |
| 26 | speakleash/Bielik-11B-v2.0-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 68.24 |
| 27 | Qwen/Qwen1.5-72B-Chat | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 68.03 |
| 28 | mistralai/Mixtral-8x22B-Instruct-v0.1 (API) | API | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 67.63 |
| 29 | THUDM/glm-4-9b-chat | Open | THUDM | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 61.79 |
| 30 | mistralai/Mistral-Nemo-Instruct-2407 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 61.76 |
| 31 | speakleash/Bielik-11B-v2.1-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 60.07 |
| 32 | Qwen1.5-32B-Chat | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 59.63 |
| 33 | openchat/openchat-3.5-0106-gemma | Open | openchat | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 59.58 |
| 34 | microsoft/phi-4 | Open | microsoft | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 59.10 |
| 35 | Qwen/Qwen2.5-7B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 58.58 |
| 36 | aya-23-35B | Open | CohereForAI | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 58.41 |
| 37 | GPT-3.5-turbo | Open | OpenAI | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 57.70 |
| 38 | Qwen2-57B-A14B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 57.64 |
| 39 | mistralai/Mixtral-8x7B-Instruct-v0.1 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 57.61 |
| 40 | c4ai-command-r-v01 | Open | CohereForAI | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 56.43 |
| 41 | Phi-3-medium-4k-instruct | Open | microsoft | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 56.40 |
| 42 | upstage/SOLAR-10.7B-Instruct-v1.0 | Open | upstage | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 55.21 |
| 43 | CYFRAGOVPL/pllum-12b-nc-chat-250715 | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 55.17 |
| 44 | Hermes-2-Theta-Llama-3-8B | Open | NousResearch | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 54.88 |
| 45 | NeuralDaredevil-8B-abliterated | Open | mlabonne | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 54.74 |
| 46 | Hermes-2-Pro-Llama-3-8B | Open | NousResearch | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 54.57 |
| 47 | utter-project/EuroLLM-9B-Instruct | Open | utter-project | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 54.11 |
| 48 | Qwen1.5-32B | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 54.03 |
| 49 | Qwen2-7B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 53.74 |
| 50 | speakleash/Bielik-4.5B-v3.0-Instruct | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 53.58 |
| 51 | recurrentgemma-9b-it | Open | google | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 52.82 |
| 52 | CYFRAGOVPL/PLLuM-12B-chat | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 52.26 |
| 53 | Qwen1.5-72B | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 51.44 |
| 54 | microsoft/Phi-4-mini-instruct | Open | microsoft | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 50.52 |
| 55 | berkeley-nest/Starling-LM-7B-alpha | Open | berkeley-nest | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 49.63 |
| 56 | Nous-Hermes-2-SOLAR-10.7B | Open | NousResearch | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 49.27 |
| 57 | openchat-3.5-1210 | Open | openchat | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 49.04 |
| 58 | Delexa-7b | Open | lex-hue | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 48.46 |
| 59 | Qwen1.5-14B-Chat | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 47.96 |
| 60 | CYFRAGOVPL/PLLuM-8x7B-nc-chat | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 47.29 |
| 61 | Mistral-7B-Instruct-v0.2 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 47.02 |
| 62 | meta-llama/Meta-Llama-3-8B-Instruct | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 46.53 |
| 63 | Yi-1.5-9B-Chat | Open | 01-ai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 46.50 |
| 64 | 01-ai/Yi-1.5-34B-Chat | Open | 01-ai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 46.32 |
| 65 | CYFRAGOVPL/Llama-PLLuM-8B-chat | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 46.20 |
| 66 | meta-llama/Llama-3.2-3B-Instruct | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 46.19 |
| 67 | aya-23-8B | Open | CohereForAI | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 45.43 |
| 68 | openchat/openchat-3.5-0106 | Open | openchat | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 45.42 |
| 69 | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | Open | nvidia | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 45.28 |
| 70 | CYFRAGOVPL/PLLuM-8x7B-chat | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 45.22 |
| 71 | mistralai/Mistral-7B-Instruct-v0.3 | Open | mistralai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 45.21 |
| 72 | Kruk-7B-SP-001 | Open | Remek | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 44.44 |
| 73 | Starling-LM-7B-beta | Open | Nexusflow | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 43.78 |
| 74 | OpenChat3.5-0106-Spichlerz-Bocian | Open | Remek | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 42.84 |
| 75 | falcon-11B | Open | tiiuae | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 42.41 |
| 76 | CYFRAGOVPL/PLLuM-8x7B-nc-instruct | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 41.75 |
| 77 | OpenChat3.5-0106-Spichlerz-Inst-001 | Open | Remek | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 41.60 |
| 78 | internlm2-chat-7b-sft | Open | internlm | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 41.38 |
| 79 | CYFRAGOVPL/PLLuM-8x7B-instruct | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 39.55 |
| 80 | internlm2-chat-7b | Open | internlm | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 39.53 |
| 81 | Llama3-ChatQA-1.5-8B | Open | nvidia | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 39.36 |
| 82 | Meta-Llama-3-70B | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 39.09 |
| 83 | OpenHermes-2.5-Mistral-7B | Open | teknium | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 37.48 |
| 84 | internlm/internlm2-chat-20b | Open | internlm | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 36.31 |
| 85 | CYFRAGOVPL/PLLuM-12B-instruct | Open | CYFRAGOVPL | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 36.21 |
| 86 | Qwen/Qwen2.5-3B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 35.87 |
| 87 | Qwen2-7B | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 35.51 |
| 88 | OpenHermes-13B | Open | teknium | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 34.91 |
| 89 | Bielik-SOLAR-LIKE-10.7B-Instruct-v0.1 | Open | TeeZee | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 34.17 |
| 90 | speakleash/Bielik-7B-Instruct-v0.1 | Open | speakleash | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 31.26 |
| 91 | Qwen/Qwen2.5-1.5B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 27.63 |
| 92 | Llama-3-8B-Omnibus-1-PL-v01-INSTRUCT | Open | Remek | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 26.63 |
| 93 | Phi-3-mini-4k-instruct | Open | microsoft | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 26.08 |
| 94 | Voicelab/trurl-2-13b-academic | Open | Voicelab | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 24.56 |
| 95 | Qwen1.5-7B-Chat | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 23.98 |
| 96 | Qwen1.5-7B | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 20.95 |
| 97 | meta-llama/Llama-3.2-1B-Instruct | Open | meta-llama | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 17.82 |
| 98 | gemma-1.1-2b-it | Open | google | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 16.47 |
| 99 | Qwen2-1.5B-Instruct | Open | Qwen | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 14.79 |
| 100 | internlm2-chat-1_8b | Open | internlm | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 12.13 |
| 101 | Yi-1.5-6B-Chat | Open | 01-ai | Apr 2026 | SpeakLeash/Polish-EQ-Bench | 4.89 |
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
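
The ordering rule stated for this table (higher eq-score first, ties broken by submission date) can be sketched as a two-key sort. This is an illustration under stated assumptions, not Codesota's code: the `Row` type, the `rank` function, the placeholder model names, and the dates are all hypothetical, and the sketch assumes the earlier submission wins a tie.

```python
from datetime import date
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    eq_score: float
    submitted: date


def rank(rows: list[Row]) -> list[Row]:
    # Higher score first; on equal scores the earlier submission ranks higher
    # (assumption: the page only says ties are "broken by submission date").
    return sorted(rows, key=lambda r: (-r.eq_score, r.submitted))


# Placeholder rows with made-up dates, for illustration only.
rows = [
    Row("model-b", 71.15, date(2026, 4, 3)),
    Row("model-a", 78.07, date(2026, 4, 2)),
    Row("model-c", 71.15, date(2026, 4, 1)),
]
ranked = rank(rows)
for position, row in enumerate(ranked, start=1):
    print(f"{position:02d} {row.model} {row.eq_score:.2f}")
```

Negating the score keeps the sort to a single pass while leaving the date key in ascending order.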
§ 03 · Progress

1 step of state of the art.

Each row below marks a model that broke the previous record on eq-score. Intermediate submissions are kept in the leaderboard above; only SOTA-setting entries are re-listed here.

Higher scores win. Each subsequent entry improved upon the previous best.

SOTA line · eq-score
  1. Apr 2, 2026 · mistralai/Mistral-Large-Instruct-2407 · mistralai · 78.07
Fig 3 · SOTA-setting models only. 1 entry, from Apr 2026.
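
The record-setting rule described in this section amounts to a running maximum over submissions in date order. The sketch below shows one way to compute it; the function name `sota_steps`, the tuple layout, and the sample entries are hypothetical, and it assumes a new entry must strictly exceed the previous best to count as a step.

```python
from datetime import date


def sota_steps(entries):
    """Return the (submitted, model, score) entries that beat every earlier score."""
    steps = []
    best = float("-inf")
    # Process submissions chronologically; sorted() is stable, so entries
    # sharing a date keep their input order.
    for submitted, model, score in sorted(entries, key=lambda e: e[0]):
        if score > best:  # assumption: an equal score does not set a new record
            steps.append((submitted, model, score))
            best = score
    return steps


# Placeholder entries with made-up values, for illustration only.
entries = [
    (date(2026, 4, 5), "model-b", 58.5),
    (date(2026, 4, 2), "model-a", 61.0),
    (date(2026, 4, 9), "model-c", 70.2),
]
steps = sota_steps(entries)
for submitted, model, score in steps:
    print(submitted.isoformat(), model, score)
```

With these inputs only `model-a` and `model-c` are listed, since `model-b` arrived later than `model-a` with a lower score.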
§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with frozen commit + seed
  • 03 Declared evaluation environment (Python, deps)
  • 04 One row per metric declared by this dataset
  • 05 A contact so we can follow up on discrepancies