Rys OCR
State-of-the-art Polish text recognition. Fine-tuned for correct handling of Polish diacritics (ą, ć, ę, ł, ń, ó, ś, ź, ż). First release for ongoing R&D.
Model Architecture
- Base Model: PaddleOCR-VL
- Parent Base: ERNIE-4.5-0.3B
- Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 16
- LoRA Alpha: 32
- Target Modules: q_proj, k_proj, v_proj, o_proj
- VRAM Required: 4-6 GB
- License: Apache 2.0
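To make the rank and alpha values above concrete, here is a minimal sketch of the standard LoRA arithmetic: the low-rank update is scaled by alpha / rank, and each adapted linear layer adds rank × (d_in + d_out) trainable parameters. The hidden size of 1024 is a placeholder assumption for illustration only. It is not taken from this model card, and the real k_proj and v_proj shapes may differ from q_proj and o_proj.

```python
# Illustrative LoRA arithmetic. The hidden size below is an assumed placeholder,
# not the actual dimension of PaddleOCR-VL / ERNIE-4.5-0.3B.
LORA_RANK = 16
LORA_ALPHA = 32
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]


def lora_scaling(rank: int, alpha: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_params_per_module(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one adapted layer (A: rank x d_in, B: d_out x rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_RANK, LORA_ALPHA)

hidden = 1024  # hypothetical square projection size, for illustration only
per_layer = len(TARGET_MODULES) * lora_params_per_module(hidden, hidden, LORA_RANK)

print(f"scaling = {scaling}")
print(f"adapter params per layer (assuming {hidden}x{hidden} projections) = {per_layer}")
```

With rank 16 and alpha 32 the scaling factor is 2.0, regardless of the assumed layer size.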
Training Data
10,000 synthetic Polish document images across 7 categories.
Training: 1 epoch, AdamW optimizer, linear LR schedule
Framework: PEFT 0.18.0 + Transformers
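As a rough illustration of what a linear LR schedule over one epoch looks like, the sketch below decays the learning rate linearly to zero across the run, with optional warmup. The batch size, base learning rate, and warmup length are assumptions chosen for the example; none of them is stated in this card.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup (optional), then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical values: neither the batch size nor the learning rate is given above.
num_images = 10_000
batch_size = 8   # assumption
base_lr = 1e-4   # assumption
total_steps = num_images // batch_size  # optimizer steps in one epoch

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, base_lr))
```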
Benchmark Results
| Metric | Baseline | Fine-tuned | Improvement |
|---|---|---|---|
| Character Error Rate (CER) | 5.58% | 1.60% | ↓ 71.3% |
| Word Error Rate (WER) | 13.37% | 7.21% | ↓ 46.1% |
| Exact Match | 74% | 76% | ↑ 2 pp |
Key improvement: Resolved Polish diacritic confusion (ł, ę, ś, etc.)
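The card does not say which tool produced these numbers. As a sketch of the usual definitions, CER and WER are the Levenshtein edit distance between prediction and ground truth, divided by the reference length in characters or words. The helper names and the sample strings below are illustrative, not part of the actual evaluation script. The last two lines reproduce the relative reductions reported in the table.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits divided by reference length."""
    return edit_distance(reference, hypothesis) / max(1, len(reference))


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(1, len(ref_words))


def relative_reduction(baseline: float, fine_tuned: float) -> float:
    """Relative error reduction in percent."""
    return (baseline - fine_tuned) / baseline * 100


reference = "Zażółć gęślą jaźń"   # illustrative ground truth
hypothesis = "Zazolc gęślą jaźń"  # illustrative output with dropped diacritics

print(cer(reference, hypothesis))                     # 4 edits over 17 characters
print(wer(reference, hypothesis))                     # 1 wrong word out of 3
print(round(relative_reduction(5.58, 1.60), 1))       # CER row of the table
print(round(relative_reduction(13.37, 7.21), 1))      # WER row of the table
```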
Quick Start
from transformers import AutoModelForCausalLM, AutoProcessor
from peft import PeftModel
from PIL import Image
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "PaddlePaddle/PaddleOCR-VL",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "anon13370/RysOCR")
processor = AutoProcessor.from_pretrained(
    "anon13370/RysOCR",
    trust_remote_code=True,
)
# Run inference
image = Image.open("your_document.png")
prompt = "OCR: "
inputs = processor(images=image, text=prompt, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=256)
text = processor.decode(outputs[0], skip_special_tokens=True)
print(text)
Help Build Polish OCR SOTA
This is the first fine-tune in ongoing R&D. We need your help to push Polish OCR to the next level.
Contribute Datasets
Real Polish documents needed: invoices, receipts, historical documents, handwritten notes, street signs.
- Scanned documents with ground truth
- Photos of Polish text in the wild
- Historical Polish manuscripts
- Specialized domain texts (medical, legal)
Run Benchmarks
Help us evaluate Rys OCR on more Polish-specific benchmarks and compare with other models.
- Polish document benchmarks
- Diacritic-specific test sets
- Cross-model comparisons
- Domain-specific evaluations
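One possible starting point for a diacritic-specific evaluation is to count errors where the model produced the right base letter but the wrong (or no) diacritic. The sketch below is a simplified, hypothetical metric and not an existing benchmark: it compares strings position by position, so it is only meaningful when prediction and ground truth have the same length.

```python
# Explicit map: "ł" has no Unicode canonical decomposition,
# so NFD-based accent stripping alone would miss it.
_FOLD = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def fold_diacritics(text: str) -> str:
    """Replace Polish diacritic letters with their base Latin letters."""
    return text.translate(_FOLD)


def diacritic_confusions(reference: str, hypothesis: str) -> int:
    """Count aligned positions where the characters differ only by diacritic.

    Simplification: positions are compared one to one, so insertions or
    deletions in the hypothesis are not handled.
    """
    return sum(
        1
        for r, h in zip(reference, hypothesis)
        if r != h and fold_diacritics(r) == fold_diacritics(h)
    )


print(diacritic_confusions("żółw", "zolw"))  # ż/z, ó/o, ł/l differ
print(diacritic_confusions("kot", "kat"))    # ordinary substitution, not counted
```

A fuller version would first align the two strings (for example with an edit-distance backtrace) before classifying each substitution.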
Join R&D
Collaborate on next iterations: architecture experiments, training strategies, deployment optimization.
- Model architecture research
- Training pipeline improvements
- Edge deployment optimization
- Multi-language expansion
Roadmap
Known Limitations
- Optimized for printed Polish text; handwritten recognition may vary
- Best results on clean document scans
- Requires loading both base model and LoRA weights for inference
- Trained on synthetic data only (v0.1)
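Regarding the need to load both base model and LoRA weights: PEFT can fold adapter weights into the base model with `merge_and_unload()`, which may allow saving a single standalone checkpoint. The sketch below is untested against PaddleOCR-VL specifically. The helper name and output directory are hypothetical, and models loaded with `trust_remote_code=True` may need their custom code files copied alongside the saved weights.

```python
def merge_and_save(adapter_id: str, base_id: str, output_dir: str) -> None:
    """Merge LoRA weights into the base model and save a standalone checkpoint."""
    # Imported lazily so that defining this helper does not load the heavy dependencies.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        trust_remote_code=True,
        torch_dtype="auto",
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    merged = model.merge_and_unload()  # folds the LoRA deltas into the base weights
    merged.save_pretrained(output_dir)


# Example call (downloads both models; the output directory name is hypothetical):
# merge_and_save("anon13370/RysOCR", "PaddlePaddle/PaddleOCR-VL", "rys-ocr-merged")
```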