Codesota · OCR · dots.ocr 3B
Open source · Dec 2025

dots.ocr 3B: unified multilingual document parser.

A 3B parameter open-source model that handles text, tables, and formulas in 100+ languages with a single unified architecture.

Apache 2.0 · 3B params · Runs locally · Unified model
§ 01 · Differentiator

100+ language support.

Unlike most OCR models focused on English/Chinese, dots.ocr delivers SOTA performance on low-resource languages including Tibetan, Kannada, Russian, Arabic, and more.

§ 02 · Headline numbers

The four metrics that matter.

  • OmniDocBench composite: 88.41
  • Text accuracy: 95.2% (0.048 normalized edit distance)
  • Table TEDS (structure): 86.8%
  • Formula CDM (LaTeX): 83.2%
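The text score above is derived from a normalized edit distance (0.048, i.e. about 4.8% of characters wrong per page). As a minimal sketch, such a score can be computed with a classic Levenshtein distance normalized by string length; OmniDocBench's exact normalization may differ in details.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance (insert/delete/substitute).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def normalized_edit_distance(pred: str, ref: str) -> float:
    # Normalize by the longer string so the score lands in [0, 1].
    if not pred and not ref:
        return 0.0
    return levenshtein(pred, ref) / max(len(pred), len(ref))

print(normalized_edit_distance("hallo world", "hello world"))  # 1 edit over 11 chars
```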


§ 03 · OmniDocBench comparison

Where dots.ocr lands.

OmniDocBench tests end-to-end document parsing across 1,355 pages with text, tables, and formulas. Composite score: ((1-TextEditDist)*100 + TableTEDS + FormulaCDM) / 3
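Plugging dots.ocr 3B's reported sub-scores into this formula reproduces the composite (to rounding):

```python
# Reproduce dots.ocr 3B's composite from its reported sub-scores.
text_edit_dist = 0.048   # normalized text edit distance
table_teds     = 86.8    # table structure score (%)
formula_cdm    = 83.2    # formula score (%)

text_score = (1 - text_edit_dist) * 100           # 95.2
composite  = (text_score + table_teds + formula_cdm) / 3
print(round(composite, 1))  # 88.4, matching the reported 88.41 up to rounding
```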

Model           Composite   Text    Tables   Formulas
PaddleOCR-VL    92.86       -       93.5%    -
dots.ocr 3B     88.41       95.2%   86.8%    83.2%
Mistral OCR 3   79.75       90.1%   70.9%    78.2%
clearOCR        31.7        84.6%   0.8%     ~10%
§ 04 · Key advantages

Four reasons to use it.

Unified Architecture

Single 1.7B LLM foundation handles layout detection, text recognition, table parsing, and formula extraction. No multi-model pipelines.

Multilingual SOTA

Best-in-class on low-resource languages. Tested on dots.ocr-bench covering 1,493 images across 100 languages.

Prompt-Based Control

Natural language prompts define tasks: full parsing, layout-only, or targeted region extraction.

Efficient Inference

Faster than larger models while maintaining accuracy. Runs on consumer GPUs.
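The prompt-based control described above amounts to selecting a task by prompt string rather than by model. Only the full-parse prompt appears in the code example on this page; the other prompt strings below are placeholders to illustrate the pattern, not confirmed model vocabulary.

```python
# Hypothetical prompt table illustrating prompt-based task selection.
# "<parse_document>" is the only prompt shown on this page; the others
# are placeholders, not confirmed dots.ocr vocabulary.
TASK_PROMPTS = {
    "full_parse":  "<parse_document>",
    "layout_only": "<detect_layout>",      # placeholder
    "region_ocr":  "<ocr_region> {bbox}",  # placeholder
}

def build_prompt(task: str, **kwargs) -> str:
    # Look up the task's prompt template and fill in any parameters
    # (e.g. a bounding box for targeted region extraction).
    return TASK_PROMPTS[task].format(**kwargs)

print(build_prompt("region_ocr", bbox="[10, 20, 300, 120]"))
```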

§ 05 · olmOCR benchmark

Strong on table recognition.

  • Pass rate: 79.1%
  • Table recognition: 88.3% (#1 ranking)
  • Parameters: 3B
§ 06 · Code example

Loading the model.

from transformers import AutoModelForCausalLM, AutoProcessor
from PIL import Image
import torch

# Load dots.ocr model
model = AutoModelForCausalLM.from_pretrained(
    "rednote-hilab/dots.ocr-3b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # needed if the repo ships custom model code
)
processor = AutoProcessor.from_pretrained(
    "rednote-hilab/dots.ocr-3b", trust_remote_code=True
)

# Parse document
image = Image.open("document.png")
inputs = processor(
    text="<parse_document>",  # Prompt for full parsing
    images=image,
    return_tensors="pt"
).to(model.device)  # move tensors to the device the weights loaded onto

outputs = model.generate(**inputs, max_new_tokens=4096)
result = processor.decode(outputs[0], skip_special_tokens=True)
print(result)  # JSON with layout, text, tables, formulas
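The exact schema of the returned JSON is not documented on this page. Assuming a list of layout elements with `category` and `text` keys (an assumption, not the confirmed dots.ocr output format), post-processing might look like:

```python
import json

# Assumed output schema for illustration: a list of layout elements,
# each carrying a category label and its recognized content.
result = json.dumps([
    {"category": "text",    "text": "Quarterly report"},
    {"category": "table",   "text": "<table>...</table>"},
    {"category": "formula", "text": "E = mc^2"},
])

elements = json.loads(result)
tables = [el for el in elements if el["category"] == "table"]
print(f"{len(tables)} table(s) found")
```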
§ 07 · When to use

Fit for purpose.

Excellent For
  • Multilingual documents (100+ languages)
  • Low-resource languages (Tibetan, Kannada, etc.)
  • On-premise/private deployment
  • Tables + formulas in one model
  • Cost-sensitive high-volume processing
Considerations
  • Requires GPU for efficient inference
  • 3B model needs ~8GB VRAM
  • PaddleOCR-VL still leads on pure accuracy
  • API solutions may be faster to integrate
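The ~8GB VRAM figure is consistent with a quick back-of-envelope check: a 3B-parameter model in bfloat16 needs about 5.6GB for weights alone, before KV cache and activations.

```python
# Back-of-envelope VRAM estimate for a 3B-parameter model in bfloat16.
params = 3e9
bytes_per_param = 2  # bfloat16 is 2 bytes per parameter
weights_gb = params * bytes_per_param / 1024**3
print(round(weights_gb, 1))  # ~5.6 GB for weights alone
# KV cache and activations push total usage toward the ~8 GB cited above.
```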
§ 08 · Verdict

The multilingual SOTA, today.

  • OmniDocBench: 88.41 (#2 open source)
  • Table TEDS: 86.8% (strong tables)
  • Languages: 100+ (multilingual SOTA)

dots.ocr 3B is the best open-source choice for multilingual document parsing. It combines text, table, and formula recognition in a single efficient model.

For English/Chinese-only workloads, PaddleOCR-VL has slightly higher accuracy. For API-based workflows, Mistral OCR 3 is faster to integrate but less accurate.

Best use case: Organizations needing to process documents in multiple languages with data privacy requirements.

GitHub: rednote-hilab/dots.ocr
License: Apache 2.0
Model: 3B parameters, runs on consumer GPUs

