CodeSOTA Polish
1,000 synthetic and real Polish text images across 5 degradation levels (clean to severe). The benchmark tests character-level OCR accuracy on Polish diacritics and includes synthetic categories designed to resist training-data contamination. Categories: synth_random (pure character recognition), synth_words (Markov-generated words), real_corpus (Pan Tadeusz, official documents), wikipedia (potential contamination baseline).
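The page does not say which metric the benchmark uses, so the following is only a minimal sketch of one common way to score character-level OCR on diacritics: character error rate (CER) based on Levenshtein distance, plus a comparison against the same strings with diacritics stripped. The function names (`cer`, `strip_diacritics`, `diacritic_gap`) are hypothetical and are not part of any CodeSOTA tooling.

```python
import unicodedata


def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    # Normalize to NFC so "z" + combining dot compares equal to "ż".
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if not ref:
        return 0.0 if not hyp else 1.0
    return levenshtein(ref, hyp) / len(ref)


def strip_diacritics(text: str) -> str:
    """Map Polish letters to their base ASCII letters."""
    # "ł" has no Unicode decomposition, so it is handled explicitly.
    text = text.replace("ł", "l").replace("Ł", "L")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def diacritic_gap(ref: str, hyp: str) -> float:
    """Assumed illustrative measure: CER minus CER with diacritics ignored.

    A large gap suggests the OCR output gets base letters right
    but drops or confuses the diacritical marks.
    """
    return cer(ref, hyp) - cer(strip_diacritics(ref), strip_diacritics(hyp))


if __name__ == "__main__":
    reference = "Zażółć gęślą jaźń"
    prediction = "Zazolc gesla jazn"  # hypothetical OCR output without diacritics
    print(f"CER: {cer(reference, prediction):.3f}")
    print(f"Diacritic gap: {diacritic_gap(reference, prediction):.3f}")
```

Reporting such scores separately for each category and degradation level would show whether a model degrades faster on diacritics than on base letters, though how the benchmark itself aggregates results is not stated here.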
Benchmark Stats
Models: 0
Papers: 0
Metrics: 0
SOTA History
Not enough data to show trend.
No results yet on this benchmark