CodeSOTA Polish

1,000 synthetic and real Polish text images spanning five degradation levels, from clean to severe. The benchmark tests character-level OCR accuracy on Polish diacritics and includes synthetic categories designed to resist training-data contamination. It has four categories: synth_random (pure character recognition), synth_words (Markov-generated words), real_corpus (Pan Tadeusz, official documents), and wikipedia (a potential-contamination baseline).
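This page lists no metrics yet, so the following is only an illustrative sketch of how character-level accuracy on diacritics could be measured. It assumes a character error rate (CER) based on Levenshtein distance; comparing CER before and after stripping diacritics gives a rough indication of how many errors come from diacritics alone. The function names (`edit_distance`, `cer`, `strip_diacritics`) are hypothetical and not part of the benchmark.

```python
import unicodedata

# Hypothetical helper table: maps each Polish diacritic to its base letter.
# Note that "ł" has no Unicode decomposition, so an explicit table is used
# instead of NFD-based stripping.
_STRIP_TABLE = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    # Normalize to NFC so precomposed and decomposed forms compare equal.
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def strip_diacritics(text: str) -> str:
    """Replace Polish diacritics with their base letters."""
    return unicodedata.normalize("NFC", text).translate(_STRIP_TABLE)


reference = "żółć"
prediction = "zolc"  # an OCR output that lost every diacritic
full_cer = cer(reference, prediction)
base_cer = cer(strip_diacritics(reference), strip_diacritics(prediction))
print(full_cer, base_cer)
```

In this example every character of the reference carries a diacritic, so the full CER is 1.0 while the diacritic-stripped CER is 0.0: all errors are attributable to diacritics. How the benchmark itself scores submissions is not stated on this page.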

Benchmark Stats

Models: 0
Papers: 0
Metrics: 0

No benchmark results available yet for CodeSOTA Polish.

Check back soon as we continue collecting data.