Codesota · OCR · Table Extraction

Benchmark · 2025

Table Extraction OCR.

Which OCR model actually preserves table structure? We tested seven models, including Claude, GPT-5.4, Mistral OCR, Docling, and PaddleOCR-VL, on real-world tables using the TEDS metric.

§ 01 · Quick Answer

Three picks.

Best overall (TEDS 93.52)

PaddleOCR-VL

free, open source

Best for complex tables

MinerU 2.5

TEDS 89.8 on tables with merged cells

Best for financial docs

Claude

lowest hallucination

§ 02 · What is TEDS

Tree-Edit-Distance-based Similarity.

TEDS measures how accurately an OCR model preserves table structure. It compares the predicted table's HTML/tree structure against the ground truth.

How TEDS Works

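Convert each table to a tree structure (rows, cells, content)
Calculate the minimum number of edit operations (insert, delete, relabel) needed to transform the predicted tree into the ground truth
Normalize by tree size: 1 - (edits / max_nodes), where max_nodes is the node count of the larger tree
Multiply by 100 to report the score, which ranges from 0 (completely wrong) to 100 (perfect)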
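
The steps above can be sketched in a few lines. This is a simplified illustration, not the official PubTabNet evaluation script: trees are written by hand as nested tuples, every edit costs 1 (the official metric scores cell-text changes with a normalized string distance), and the names `forest_dist` and `teds` are chosen here for the example.

```python
from functools import lru_cache

# A tree is (label, children); children is a tuple of trees.
# A forest is a tuple of trees.

def size(forest):
    """Count every node in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_dist(f1, f2):
    """Minimum number of unit-cost edits turning ordered forest f1 into f2."""
    if not f1 and not f2:
        return 0
    if not f1:
        return size(f2)  # insert everything that is missing
    if not f2:
        return size(f1)  # delete everything that is extra
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        # delete the rightmost root of f1, promoting its children
        forest_dist(f1[:-1] + kids1, f2) + 1,
        # insert the rightmost root of f2
        forest_dist(f1, f2[:-1] + kids2) + 1,
        # match the two roots, relabeling if their labels differ
        forest_dist(kids1, kids2) + forest_dist(f1[:-1], f2[:-1]) + (label1 != label2),
    )

def teds(pred, truth):
    """TEDS on a 0-100 scale: 100 * (1 - edits / max_nodes)."""
    max_nodes = max(size((pred,)), size((truth,)))
    return 100.0 * (1 - forest_dist((pred,), (truth,)) / max_nodes)

truth = ("table", (("tr", (("td:Revenue", ()), ("td:120", ()))),))
wrong_value = ("table", (("tr", (("td:Revenue", ()), ("td:128", ()))),))
missing_cell = ("table", (("tr", (("td:Revenue", ()),)),))

if __name__ == "__main__":
    print(teds(truth, truth))         # identical trees
    print(teds(wrong_value, truth))   # one cell relabeled
    print(teds(missing_cell, truth))  # one cell missing
```

With a four-node ground-truth tree, a single wrong or missing cell costs one edit out of four nodes. Small tables are therefore penalized heavily for each error, while the same error in a large table barely moves the score.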

What TEDS Captures

Row and column structure preservation
Cell content accuracy
Merged cell handling (colspan/rowspan)
Cell alignment and ordering

TEDS was introduced in the PubTabNet paper (2019) and is now the standard metric for table extraction evaluation. Higher is better.

90+

Excellent — Production ready

80–90

Good — Minor structure errors

<80

Needs post-processing

§ 03 · Benchmark Results

Seven models, ranked by TEDS overall.

Model	Type	TEDS Simple	TEDS Complex	TEDS Overall	Structure	Speed	Cost / 1k tables	Notes
PaddleOCR-VL	OSS	96.8	91.2	93.52	97%	850ms	Free	Best TEDS score, open source
MinerU 2.5	OSS	95.1	89.8	91.90	95%	1470ms	Free	Excellent structure, outputs LaTeX
GPT-5.4	API	94.2	87.5	90.10	92%	2300ms	$7.50	Good reasoning, hallucinates on complex tables
Claude Sonnet 4	API	93.8	86.9	89.50	91%	2800ms	$6.00	Low hallucination, good for financial tables
dots.ocr 3B	OSS	92.4	86.8	88.90	90%	920ms	Free	New SOTA contender, 100+ languages
Docling	OSS	89.2	82.4	85.10	88%	680ms	Free	Fast local processing, good for PDFs
Mistral OCR 3	API	91.5	70.9	79.75	85%	1200ms	$4.00	Fast API, struggles with merged cells

TEDS Simple: Tables without merged cells. TEDS Complex: Tables with rowspan/colspan. Structure: Row/column alignment accuracy. Speed: Per-table processing time.

§ 04 · Structure Preservation

What models get wrong, and right.

Table structure preservation goes beyond text extraction. We tested how each model handles common challenges.

What Models Get Wrong

× Merged cells: Most models split merged cells incorrectly or duplicate content
× Multi-row headers: Complex headers get flattened or reordered
× Empty cells: Often skipped, causing column misalignment
× Nested tables: Inner tables extracted as flat text
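
Merged-cell and empty-cell errors can be spotted mechanically by expanding a model's HTML output into a rectangular grid. The sketch below uses only the Python standard library. `TableGrid` and `html_to_grid` are names invented for this example, and it assumes a single, non-nested table with well-formed `tr`/`td`/`th` tags.

```python
from html.parser import HTMLParser

class TableGrid(HTMLParser):
    """Collect cells into a {(row, col): text} map, repeating merged cells."""

    def __init__(self):
        super().__init__()
        self.cells = {}
        self.row = -1
        self._open = None  # (colspan, rowspan, text parts) of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
        elif tag in ("td", "th"):
            a = dict(attrs)
            self._open = (int(a.get("colspan", 1)), int(a.get("rowspan", 1)), [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._open is not None:
            colspan, rowspan, parts = self._open
            col = 0
            # skip columns already filled by a rowspan from an earlier row
            while (self.row, col) in self.cells:
                col += 1
            text = "".join(parts).strip()
            for dr in range(rowspan):
                for dc in range(colspan):
                    self.cells[(self.row + dr, col + dc)] = text
            self._open = None

def html_to_grid(html):
    """Return a list of rows; None marks a position no cell covered."""
    parser = TableGrid()
    parser.feed(html)
    parser.close()
    if not parser.cells:
        return []
    n_rows = max(r for r, _ in parser.cells) + 1
    n_cols = max(c for _, c in parser.cells) + 1
    return [[parser.cells.get((r, c)) for c in range(n_cols)] for r in range(n_rows)]

good = ("<table><tr><th colspan='2'>Q1</th><th>Q2</th></tr>"
        "<tr><td>1</td><td></td><td>3</td></tr></table>")
skipped_empty = ("<table><tr><th colspan='2'>Q1</th><th>Q2</th></tr>"
                 "<tr><td>1</td><td>3</td></tr></table>")

if __name__ == "__main__":
    print(html_to_grid(good))
    print(html_to_grid(skipped_empty))
```

In the second input the empty cell was dropped, so the value 3 slides into the wrong column and the last position comes back as `None`. Any `None` in the grid is a cheap signal that the extraction needs review.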

Best at Each Challenge

+ Merged cells: MinerU 2.5 handles colspan/rowspan correctly
+ Multi-row headers: PaddleOCR-VL preserves hierarchy
+ Empty cells: Claude maintains column alignment
+ Nested tables: GPT-5.4 can reason about structure

Challenge	PaddleOCR-VL	MinerU	GPT-5.4	Claude	Docling
Simple grid tables	Excellent	Excellent	Excellent	Excellent	Good
Merged cells (colspan)	Excellent	Excellent	Good	Good	Partial
Multi-row headers	Excellent	Good	Good	Good	Partial
Borderless tables	Good	Excellent	Excellent	Excellent	Good
Tables with images	Good	Good	Excellent	Excellent	Partial
Rotated/skewed tables	Excellent	Good	Good	Good	Poor

§ 05 · Code Examples

Four integrations.

Claude Sonnet 4 — Table Extraction

API

import anthropic
import base64
client = anthropic.Anthropic()

with open("table.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": image_data
                }
            },
            {
                "type": "text",
                "text": """Extract the table from this image.
Return as markdown table with exact cell values.
Preserve merged cells using colspan notation."""
            }
        ]
    }]
)
print(response.content[0].text)

PaddleOCR — Table Structure Recognition

Open Source

# PP-Structure pipeline from the PaddleOCR 2.x package.
# PaddleOCR 3.x reorganized this API, so pin paddleocr<3 or adapt the import.
from paddleocr import PPStructure

# Initialize layout analysis with table recognition and OCR enabled
table_engine = PPStructure(table=True, ocr=True)

# Process image
result = table_engine("table.png")

# Each detected region is a dict; table regions carry their HTML in res["html"]
for item in result:
    if item["type"] == "table":
        html_table = item["res"]["html"]
        print(html_table)

        # Convert to markdown if needed
        # from markdownify import markdownify
        # print(markdownify(html_table))

Docling — PDF Table Extraction

Open Source

from docling.document_converter import DocumentConverter

# Initialize converter with the default pipeline options
converter = DocumentConverter()

# Process document
result = converter.convert("document.pdf")

# Tables are collected on the document, not on individual pages
for table in result.document.tables:
    # Get as markdown (recent Docling versions prefer doc=result.document here)
    print(table.export_to_markdown())

    # Or as pandas DataFrame
    # df = table.export_to_dataframe()
    # print(df)

MinerU — Scientific Table Extraction

Open Source

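import json
import subprocess
from pathlib import Path

# MinerU 2.x is driven mainly through its CLI. This sketch assumes the
# `mineru` command is installed and writes a *_content_list.json per document.
out_dir = Path("output")
subprocess.run(["mineru", "-p", "research_paper.pdf", "-o", str(out_dir)], check=True)

# The output subfolder layout depends on the backend, so search for the file
for content_file in out_dir.rglob("*_content_list.json"):
    blocks = json.loads(content_file.read_text(encoding="utf-8"))

    # Process tables: table blocks are assumed to carry HTML in "table_body"
    for block in blocks:
        if block.get("type") == "table":
            print(block.get("table_body", ""))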

§ 06 · Use Cases

By document type.

Financial Reports

Best · Claude Sonnet 4

Lowest hallucination rate (0.09%). Critical for financial accuracy where invented numbers are unacceptable.

Alternative · PaddleOCR-VL for high volume, local processing

Scientific Papers

Best · MinerU 2.5

LaTeX equation support + 95% structure preservation. Handles complex multi-row headers.

Alternative · PaddleOCR-VL for simpler tables

Invoices & Receipts

Best · PaddleOCR-VL

Best TEDS score (93.52) + free + fast. Line item extraction is accurate.

Alternative · Docling for PDF invoices specifically

Data Entry Automation

Best · GPT-5.4

Can output structured JSON directly. Good for forms with varied layouts.

Alternative · Claude for lower error rates

Historical Documents

Best · PaddleOCR-VL

Handles degraded scans better. 96.8% on simple tables even with noise.

Alternative · dots.ocr for multilingual historical docs

High-Volume Processing

Best · Docling

Fastest local processing (680ms). No API costs. Apache 2.0 license.

Alternative · PaddleOCR-VL for higher accuracy

§ 07 · Cost Analysis

10,000 tables / month.

Open source costs assume cloud GPU at $0.50/hour.

Model	Cost / Table	10k / mo	TEDS Score	Monthly $ / TEDS Point
PaddleOCR-VL	$0.00012	$1.20	93.52	$0.013
Docling	$0.00009	$0.90	85.1	$0.011
MinerU 2.5	$0.00020	$2.00	91.9	$0.022
Claude Sonnet 4	$0.006	$60.00	89.5	$0.67
GPT-5.4	$0.0075	$75.00	90.1	$0.83

Bottom line: Open source models (PaddleOCR-VL, MinerU) cost roughly 30-60× less than API models at this volume, with comparable or better accuracy. API models are worth the premium at low volume or when you need reasoning capabilities.
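
The open source per-table figures appear to follow from the benchmark latencies and the $0.50/hour GPU price. The sketch below reproduces them under one assumption made here, not stated in the methodology: tables are processed one at a time on a fully utilized GPU. The function and constant names are illustrative.

```python
GPU_USD_PER_HOUR = 0.50      # cloud GPU price assumed in this section
TABLES_PER_MONTH = 10_000

# Per-table latency in milliseconds, from the benchmark results
LATENCY_MS = {"PaddleOCR-VL": 850, "Docling": 680, "MinerU 2.5": 1470}

# API price per table, i.e. the benchmark's cost per 1k tables divided by 1000
API_USD_PER_TABLE = {"Claude Sonnet 4": 0.006, "GPT-5.4": 0.0075}

def cost_per_table(latency_ms: float) -> float:
    """GPU cost of one table: seconds used times price per second."""
    return (latency_ms / 1000.0) * GPU_USD_PER_HOUR / 3600.0

def monthly_cost(usd_per_table: float) -> float:
    """Cost of processing TABLES_PER_MONTH tables."""
    return usd_per_table * TABLES_PER_MONTH

if __name__ == "__main__":
    for name, ms in LATENCY_MS.items():
        per_table = cost_per_table(ms)
        print(f"{name}: ${per_table:.5f} per table, ${monthly_cost(per_table):.2f} per month")
    for name, per_table in API_USD_PER_TABLE.items():
        print(f"{name}: ${per_table:.4f} per table, ${monthly_cost(per_table):.2f} per month")
```

Real GPU deployments rarely reach full utilization, and batching changes per-table latency, so treat these as lower bounds on self-hosted cost and not as quotes.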

§ 08 · Methodology

How we tested.

All benchmarks were run on standardized test sets, including PubTabNet, TableBank, and FinTabNet. TEDS scores were calculated using the official evaluation script from the PubTabNet paper.

Simple tables: No merged cells, uniform grid structure
Complex tables: Merged cells, multi-row headers, nested structures
Speed: Average of 100 table extractions on standardized hardware
Structure preservation: Manual evaluation of 500 random samples
