Optical Character Recognition | 2021 | pl

PolEval 2021 OCR Post-Correction Task

A corpus of 979 Polish books (69,000 pages) published between 1791 and 1998. The task focuses on OCR post-correction using NLP methods and serves as a major benchmark for Polish historical document processing.

Samples: 69,000
Metrics: CER, WER, correction accuracy
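
The two error-rate metrics listed above are usually defined through edit distance, so a small sketch may help make them concrete. The code below assumes the common textbook definitions: CER is the character-level Levenshtein distance divided by the reference length, and WER is the same ratio computed over whitespace-separated tokens. The function names `edit_distance`, `cer`, and `wer` are illustrative and are not taken from the PolEval evaluation scripts, whose exact normalization and scoring rules may differ. Correction accuracy is not sketched because this page does not define it.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate over whitespace-separated tokens (an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    gold = "the quick brown fox"
    ocr = "the qu1ck brown f0x"
    print(f"CER = {cer(gold, ocr):.3f}")
    print(f"WER = {wer(gold, ocr):.3f}")
```

For a post-correction system, the same functions can be applied twice, once to the raw OCR output and once to the corrected output, so that the change in CER and WER against the gold transcription shows whether the correction helped.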

No benchmark results indexed for this dataset yet.
