CodeSOTA Polish.

Name: CodeSOTA Polish Benchmark Results
Creator: Unknown
License: https://creativecommons.org/licenses/by/4.0/

The benchmark contains 1,000 synthetic and real Polish text images at 5 degradation levels, ranging from clean to severe. It tests character-level OCR on Polish diacritics and includes synthetic categories intended to be contamination-resistant. The four categories are: synth_random (pure character recognition), synth_words (Markov-generated words), real_corpus (Pan Tadeusz and official documents), and wikipedia (a baseline where contamination is possible).
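
The description above does not say which metric the benchmark reports. As an illustration only, the sketch below assumes a character error rate (CER) based on Levenshtein distance, which is a common choice for character-level OCR. The names edit_distance, cer, and fold_diacritics are hypothetical helpers, not part of any CodeSOTA tooling. Comparing the raw CER with the CER after folding diacritics to ASCII gives a rough sense of how many errors are diacritic confusions.

```python
import unicodedata

# Assumption: Polish diacritics are folded to their closest ASCII letters.
# "ł" has no Unicode decomposition, so an explicit table is used.
_FOLD_TABLE = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def fold_diacritics(text: str) -> str:
    """Replace Polish diacritic letters with plain ASCII letters."""
    return unicodedata.normalize("NFC", text).translate(_FOLD_TABLE)


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: an OCR output that dropped every diacritic.
reference = "zażółć gęślą jaźń"
hypothesis = "zazolc gesla jazn"

raw_cer = cer(reference, hypothesis)
folded_cer = cer(fold_diacritics(reference), fold_diacritics(hypothesis))
print(f"raw CER: {raw_cer:.3f}, CER ignoring diacritics: {folded_cer:.3f}")
```

In this made-up example all errors disappear once diacritics are folded, which indicates that the recognizer reads the base letters correctly and only loses the diacritic marks.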

§ 01 · Leaderboard

Results by metric.

No results yet on this benchmark

