Polish Text Understanding
Evaluating language models on understanding Polish text: sentiment, implicatures, phraseology, tricky questions, and hallucination resistance.
Datasets: 1
Results: 0
Canonical metric: average
Canonical Benchmark
CPTU-Bench
Evaluates LLMs on understanding Polish text across four dimensions: sentiment analysis, language understanding (implicatures, author intent), phraseology (idioms, phraseological compounds), and tricky questions (logic, ambiguity, hallucination resistance). Each category is scored on a 0-5 scale. The benchmark contains 378 hand-written examples and was created by SpeakLeash/Spichlerz.
Primary metric: average
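As an illustration of how an "average" metric over the four categories could be computed, here is a minimal sketch. It assumes the primary metric is the unweighted mean of the four per-category scores, which the page does not state explicitly; the benchmark may aggregate differently. The category keys, the function name average_score, and the example scores are hypothetical and are not real results.

```python
from statistics import mean

# Hypothetical keys for the four dimensions named in the benchmark description.
CATEGORIES = ("sentiment", "understanding", "phraseology", "tricky_questions")


def average_score(scores: dict[str, float]) -> float:
    """Return the unweighted mean of the per-category scores.

    Assumption: each category score lies in the 0-5 range and all four
    categories carry equal weight.
    """
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for category in CATEGORIES:
        if not 0 <= scores[category] <= 5:
            raise ValueError(f"{category} score out of range 0-5: {scores[category]}")
    return mean(scores[c] for c in CATEGORIES)


# Made-up scores for illustration only.
example = {
    "sentiment": 4.0,
    "understanding": 3.5,
    "phraseology": 3.0,
    "tricky_questions": 2.5,
}
print(average_score(example))  # 3.25
```

Validating the range up front keeps a mis-scaled score (for example a percentage) from silently skewing the average.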
Top 10
Leading models on CPTU-Bench.
No results yet.
All datasets
1 dataset tracked for this task.