Polish SOTA · Open Source · R&D

Rys OCR.

State-of-the-art Polish text recognition, fine-tuned to handle Polish diacritics correctly (ą, ć, ę, ł, ń, ó, ś, ź, ż). This is the first release of an ongoing R&D effort.

§ 01 · First fine-tune results

The headline numbers.

CER reduction: 71.3% (5.58% to 1.60%)
WER reduction: 46.1% (13.37% to 7.21%)
Training images: 10k synthetic Polish documents
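
The reduction percentages are relative to the baseline error, not absolute differences. A minimal sketch of that arithmetic, using only the figures reported on this page (the function name `relative_reduction` is illustrative):

```python
def relative_reduction(baseline: float, finetuned: float) -> float:
    """Return the relative error reduction in percent."""
    return (baseline - finetuned) / baseline * 100.0


# Error rates reported for the baseline and the fine-tuned model, in percent.
cer_reduction = relative_reduction(5.58, 1.60)
wer_reduction = relative_reduction(13.37, 7.21)

print(f"CER reduction: {cer_reduction:.1f}%")  # 71.3%
print(f"WER reduction: {wer_reduction:.1f}%")  # 46.1%
```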
§ 02 · Technical details

How it's built.

Model Architecture

Base Model: PaddleOCR-VL
Parent Base: ERNIE-4.5-0.3B
Method: LoRA (Low-Rank Adaptation)
LoRA Rank: 16
LoRA Alpha: 32
Target Modules: q_proj, k_proj, v_proj, o_proj
VRAM Required: 4-6 GB
License: Apache 2.0
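
To give a feel for what rank 16 and alpha 32 imply, the sketch below computes the LoRA scaling factor (alpha / rank) and the number of trainable parameters the adapter adds per transformer layer. In PEFT these settings correspond to `LoraConfig(r=16, lora_alpha=32, target_modules=[...])`. The `LoraSpec` class and the hidden size of 1024 are assumptions for illustration only; the page does not state the projection dimensions, and k_proj/v_proj may be narrower in practice.

```python
from dataclasses import dataclass


@dataclass
class LoraSpec:
    """Illustrative container for the LoRA settings listed above."""
    rank: int = 16
    alpha: int = 32
    target_modules: tuple = ("q_proj", "k_proj", "v_proj", "o_proj")

    @property
    def scaling(self) -> float:
        # The low-rank update B @ A is multiplied by alpha / rank.
        return self.alpha / self.rank

    def params_per_layer(self, d_in: int, d_out: int) -> int:
        # Each adapted module adds A (rank x d_in) and B (d_out x rank).
        per_module = self.rank * (d_in + d_out)
        return per_module * len(self.target_modules)


spec = LoraSpec()
HIDDEN = 1024  # hypothetical hidden size; assumes all four projections are square

print(spec.scaling)                           # 2.0
print(spec.params_per_layer(HIDDEN, HIDDEN))  # 131072
```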

Training Data

10,000 synthetic Polish document images across 7 categories:

  • Addresses
  • Invoice lines
  • Receipt lines
  • Dates
  • Names
  • Prices
  • Phrases

Training: 1 epoch, AdamW optimizer, linear LR schedule

Framework: PEFT 0.18.0 + Transformers
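
The page only says "linear LR schedule", so the peak learning rate, batch size, and warmup length are unknown. As a rough sketch of how a linear decay behaves over a single epoch, the function below uses hypothetical values (peak rate 1e-4, batch size 8, no warmup); none of these come from the source.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical setup: 10,000 images, batch size 8, one epoch.
TOTAL_STEPS = 10_000 // 8  # 1250 optimizer steps
PEAK_LR = 1e-4             # assumed; not stated on the page

for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS, PEAK_LR))
```

With no warmup, the rate starts at the peak, reaches half the peak at the midpoint, and hits zero on the final step.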

§ 03 · Benchmark results

Baseline vs fine-tuned.

Metric                      Baseline   Fine-tuned   Improvement
Character Error Rate (CER)  5.58%      1.60%        71.3% lower
Word Error Rate (WER)       13.37%     7.21%        46.1% lower
Exact Match                 74%        76%          2 points higher

Key improvement: the fine-tune resolves confusion on Polish diacritics (ł, ę, ś, etc.).
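
For readers who want to reproduce metrics of this kind on their own data, here is a minimal sketch of CER and WER based on Levenshtein distance. It is a common definition, not necessarily the exact evaluation script used for the numbers above, and the sample strings are made up for illustration.

```python
import unicodedata


def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character edits divided by reference length."""
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    return edit_distance(ref, hyp) / max(1, len(ref))


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word edits divided by reference word count."""
    ref_words = unicodedata.normalize("NFC", ref).split()
    hyp_words = unicodedata.normalize("NFC", hyp).split()
    return edit_distance(ref_words, hyp_words) / max(1, len(ref_words))


# Made-up example: the hypothesis drops every diacritic.
reference = "Łódź, ul. Żeromskiego 5"
hypothesis = "Lodz, ul. Zeromskiego 5"
print(f"CER: {cer(reference, hypothesis):.3f}")  # 4 edits over 23 characters
print(f"WER: {wer(reference, hypothesis):.3f}")  # 2 wrong words out of 4
```

The example shows why diacritic errors hurt WER more than CER: a single wrong character makes the whole word count as an error.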

§ 04 · Quick start

Inference in Python.

python
from transformers import AutoModelForCausalLM, AutoProcessor
from peft import PeftModel
from PIL import Image

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "PaddlePaddle/PaddleOCR-VL",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "anon13370/RysOCR")

processor = AutoProcessor.from_pretrained(
    "anon13370/RysOCR",
    trust_remote_code=True
)

# Run inference
image = Image.open("your_document.png")
prompt = "OCR: "

inputs = processor(images=image, text=prompt, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}

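outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt
prompt_len = inputs["input_ids"].shape[1]
text = processor.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(text)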
§ 05 · Contribute

Help build Polish OCR SOTA.

This is the first fine-tune in ongoing R&D. We need your help to push Polish OCR to the next level.

Datasets

Contribute Datasets

Real Polish documents needed: invoices, receipts, historical documents, handwritten notes, street signs.

  • Scanned documents with ground truth
  • Photos of Polish text in the wild
  • Historical Polish manuscripts
  • Specialized domain texts (medical, legal)
Benchmarks

Run Benchmarks

Help us evaluate Rys OCR on more Polish-specific benchmarks and compare with other models.

  • Polish document benchmarks
  • Diacritic-specific test sets
  • Cross-model comparisons
  • Domain-specific evaluations
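
One possible shape for a diacritic-specific evaluation is to separate predictions that differ from the reference only in diacritics from those with other errors. The sketch below does this by folding Polish letters to their ASCII base forms. The function names and sample pairs are illustrative assumptions, not part of any existing Rys OCR tooling.

```python
import unicodedata
from collections import Counter

# Explicit mapping is needed because "ł" has no Unicode decomposition.
_FOLD = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def fold_diacritics(text: str) -> str:
    """Replace Polish diacritic letters with their ASCII base letters."""
    return unicodedata.normalize("NFC", text).translate(_FOLD)


def classify(ref: str, hyp: str) -> str:
    """Label a prediction as exact, diacritic_only, or other."""
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    if ref == hyp:
        return "exact"
    if fold_diacritics(ref) == fold_diacritics(hyp):
        return "diacritic_only"
    return "other"


# Made-up (reference, prediction) pairs.
pairs = [
    ("żółw", "żółw"),
    ("żółw", "zolw"),
    ("Gdańsk", "Gdansk"),
    ("Kraków", "Krakow1"),
]
print(Counter(classify(ref, hyp) for ref, hyp in pairs))
```

A high share of `diacritic_only` errors would point at the diacritic handling specifically, rather than at general recognition quality.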
R&D

Join R&D

Collaborate on next iterations: architecture experiments, training strategies, deployment optimization.

  • Model architecture research
  • Training pipeline improvements
  • Edge deployment optimization
  • Multi-language expansion
§ 06 · Roadmap

What's next.

v0.1 - First Fine-Tune
10k synthetic images, LoRA on PaddleOCR-VL. 71% CER reduction.

v0.2 - Real Data
Train on real Polish documents. Expand domain coverage.

v0.3 - Handwriting
Add handwritten Polish text recognition capability.

v1.0 - Production Ready
Full benchmark coverage, optimized inference, API deployment.
§ 07 · Limitations

Known limitations.

  • Optimized for printed Polish text; accuracy on handwritten text is not guaranteed, and handwriting support is planned for v0.3
  • Best results on clean document scans
  • Requires loading both base model and LoRA weights for inference
  • Trained on synthetic data only (v0.1)
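
The need to load two sets of weights can often be avoided by merging a LoRA adapter into its base model once and saving the result. The sketch below uses PEFT's `merge_and_unload()` for this. It is untested against this particular model: the helper name `merge_adapter` and the output directory are hypothetical, and because the base model relies on `trust_remote_code=True`, the merged checkpoint may still need the base repository's custom code files to load.

```python
def merge_adapter(base_id: str, adapter_id: str, out_dir: str) -> None:
    """Merge a LoRA adapter into its base model and save a standalone copy."""
    # Imported lazily so the helper can be defined without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        trust_remote_code=True,
        torch_dtype="auto",
    )
    model = PeftModel.from_pretrained(base, adapter_id)

    # Fold the low-rank updates into the base weights and drop the adapter wrappers.
    merged = model.merge_and_unload()
    merged.save_pretrained(out_dir)


# Example call (downloads both repositories; the output path is hypothetical):
# merge_adapter("PaddlePaddle/PaddleOCR-VL", "anon13370/RysOCR", "rys-ocr-merged")
```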