Mistral OCR 3: Independent Benchmark Results
We ran the full OmniDocBench benchmark (1,355 images) ourselves. Here's what we found.
Independently Verified Benchmark
CodeSOTA ran the full OmniDocBench evaluation suite on December 19, 2025. We processed all 1,355 images through Mistral's OCR API and computed metrics using the official evaluation tools.
OmniDocBench Results (Verified)
OmniDocBench is a comprehensive document parsing benchmark with 1,355 pages across 9 document types.
The composite score formula: `((1 - TextEditDist) * 100 + TableTEDS + FormulaCDM) / 3`
| Metric | Mistral OCR 3 (verified) | GPT-4o | PaddleOCR-VL |
|---|---|---|---|
| Composite Score | 79.75 | ~85 | 92.86 |
| Text Edit Distance ↓ | 0.099 | 0.02 | 0.03 |
| Table TEDS | 70.9% | - | 93.5% |
| Table Structure TEDS | 75.3% | - | - |
| Formula Edit Distance ↓ | 0.218 | - | - |
| Reading Order | 91.6% | - | - |
Lower is better for edit distance metrics (marked ↓). CodeSOTA verification date: December 19, 2025.
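Plugging the verified numbers back into the composite formula is a useful sanity check. Note that the table reports formula edit distance rather than FormulaCDM, so the sketch below back-solves for the CDM value the composite implies; the resulting ~78.25 is derived arithmetic, not a measured result:

```python
# Back-solve the composite score formula:
# composite = ((1 - text_edit_dist) * 100 + table_teds + formula_cdm) / 3
text_edit_dist = 0.099  # verified
table_teds = 70.9       # verified
composite = 79.75       # verified

text_score = (1 - text_edit_dist) * 100  # 90.1
implied_formula_cdm = 3 * composite - text_score - table_teds
print(f"Implied FormulaCDM: {implied_formula_cdm:.2f}")  # 78.25
```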
Performance by Document Type
Mistral OCR 3 performs best on academic and exam papers but struggles with newspapers:
| Document Type | Text Accuracy | Table TEDS |
|---|---|---|
| Academic Literature | 97.9% | 83.0% |
| Exam Papers | 92.8% | 88.0% |
| Books | 93.9% | 82.7% |
| Research Reports | 95.8% | 82.0% |
| Magazines | 97.9% | 71.0% |
| PPT Slides | 95.7% | 72.6% |
| Newspapers | 67.0% | 58.3% |
Performance by Language
| Language | Text Accuracy |
|---|---|
| English | 94% |
| Chinese | 86% |
Pricing
- $2 per 1,000 pages for real-time processing via the API
- 50% discount for async processing ($1 per 1,000 pages)
Our full benchmark run cost $2.71 for 1,355 images on the standard API, which works out to the $2-per-1,000-pages rate.
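A quick check on the bill; a minimal sketch, assuming the $2 real-time rate implied by our invoice and the stated 50% async discount:

```python
# Verify the benchmark cost against the per-page rate.
pages = 1355
rate_per_1000 = 2.00   # standard real-time rate, USD (assumed from our bill)
async_discount = 0.50  # async processing is half price

realtime = pages / 1000 * rate_per_1000
print(f"Real-time: ${realtime:.2f}")                         # $2.71
print(f"Async:     ${realtime * (1 - async_discount):.2f}")  # $1.36
```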
Code Example
```python
from mistralai import Mistral
import base64

client = Mistral(api_key="your-api-key")

# Load the document and base64-encode it
with open("document.pdf", "rb") as f:
    doc_data = base64.b64encode(f.read()).decode()

# OCR with Mistral OCR 3, passing the PDF as a data URL
response = client.ocr.process(
    model="mistral-ocr-2512",
    document={
        "type": "document_url",
        "document_url": f"data:application/pdf;base64,{doc_data}",
    },
)

# Output is markdown with HTML tables, one entry per page
print("\n\n".join(page.markdown for page in response.pages))
```
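If you want per-page output rather than one concatenated string, the same response works; a minimal sketch, assuming the `pages` shape used above:

```python
# Write each page's markdown to its own file.
for i, page in enumerate(response.pages):
    with open(f"page_{i:03d}.md", "w", encoding="utf-8") as out:
        out.write(page.markdown)
```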
When to Use Mistral OCR 3
Use it for:
- Academic papers (97.9% text accuracy)
- Exam papers (92.8% text, 88.0% table TEDS)
- Research reports and books
- Cost-sensitive, high-volume OCR
- English text extraction (94% accuracy)
Avoid it for:
- Newspapers (67.0% text accuracy)
- Complex multi-column layouts
- Chinese text (86% vs. 94% for English)
- Workloads where table quality is critical (70.9% TEDS vs. PaddleOCR-VL's 93.5%)
Verdict
Composite score: 79.75, mid-tier performance on OmniDocBench.
Mistral OCR 3 sits between traditional OCR and the top VLMs. It trails PaddleOCR-VL (92.86) and GPT-4o (~85) by a wide margin, but at $1-2 per 1,000 pages it is one of the cheapest options.
Best use case: High-volume academic/research document processing where cost matters more than absolute accuracy.
Model ID: mistral-ocr-2512
API: docs.mistral.ai
Verified: December 19, 2025 by CodeSOTA