Docling API Reference

Complete reference for classes, methods, configuration options, and constants.

DocumentConverter

Main class for converting documents.

from docling.document_converter import DocumentConverter

converter = DocumentConverter(
    format_options=None,  # Optional dict of per-format configuration
)

Methods

Method               Parameters                    Returns
convert(source)      source: str | Path | URL      ConversionResult
convert_all(source)  source: Iterable[str | Path]  Iterator[ConversionResult]
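
A minimal usage sketch for single-document conversion, assuming `docling` is installed. The helper names `markdown_path_for` and `convert_to_markdown` are hypothetical; `convert()` and `result.document.export_to_markdown()` are the calls listed in this reference. The docling import is deferred so the module loads even where the package is missing.

```python
from pathlib import Path


def markdown_path_for(source: str) -> Path:
    """Return the source path with its suffix replaced by .md (hypothetical helper)."""
    return Path(source).with_suffix(".md")


def convert_to_markdown(source: str) -> Path:
    """Convert one local document and write its Markdown next to the source."""
    # Imported lazily so this module still loads without docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)  # a ConversionResult
    out_path = markdown_path_for(source)
    out_path.write_text(result.document.export_to_markdown(), encoding="utf-8")
    return out_path
```

Calling `convert_to_markdown("report.pdf")` would write `report.md` beside the input. The sketch targets local paths only; a URL source would need a different rule for choosing the output file.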

PdfPipelineOptions

Configuration for the PDF processing pipeline.

from docling.datamodel.pipeline_options import PdfPipelineOptions

options = PdfPipelineOptions()

Parameter                Type                   Default            Description
do_ocr                   bool                   True               Enable OCR for scanned content
do_table_structure       bool                   True               Enable table detection
ocr_options              OcrOptions             RapidOcrOptions()  OCR engine configuration
images_scale             float                  1.0                Scale factor for images
table_structure_options  TableStructureOptions  -                  Table parsing options
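
The options object takes effect only once it is passed to a converter. The sketch below shows one way to wire it in through `format_options`, assuming `docling` is installed. `PdfFormatOption` is a docling class not listed in this reference; `pdf_pipeline_settings` and `build_pdf_converter` are hypothetical helper names, and the setting values are illustrative rather than recommended.

```python
def pdf_pipeline_settings(scanned: bool) -> dict:
    """Keyword arguments for PdfPipelineOptions (illustrative values)."""
    return {
        "do_ocr": scanned,           # run OCR only for scanned input
        "do_table_structure": True,  # keep table detection enabled
        "images_scale": 2.0,         # render page images at twice the default scale
    }


def build_pdf_converter(scanned: bool = True):
    """Build a DocumentConverter whose PDF pipeline uses the settings above."""
    # Imported lazily so this module still loads without docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions(**pdf_pipeline_settings(scanned))
    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
```

Keeping the settings in a plain dict makes them easy to log or compare before any model is loaded.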

OCR Engine Options

EasyOcrOptions

Parameter             Type       Default
lang                  list[str]  ["en"]
use_gpu               bool       True
confidence_threshold  float      0.5
force_full_page_ocr   bool       False

RapidOcrOptions

Parameter            Type  Description
det_model_path       str   Path to detection model
rec_model_path       str   Path to recognition model
cls_model_path       str   Path to classification model
force_full_page_ocr  bool  Force OCR on all pages

TesseractOcrOptions

Parameter            Type       Description
lang                 list[str]  Language codes (e.g., ["eng"])
force_full_page_ocr  bool       Force OCR on all pages

Requires the TESSDATA_PREFIX environment variable, pointing to the Tesseract language data directory.
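
The engine option classes above are selected by assigning one to `ocr_options` on `PdfPipelineOptions`. A hedged sketch, assuming `docling` and the chosen engine are installed: `tessdata_configured` and `ocr_pipeline_options` are hypothetical helpers, and the engine keys ("easyocr", "rapidocr", "tesseract") are labels invented for this example.

```python
import os


def tessdata_configured(env=None) -> bool:
    """True if TESSDATA_PREFIX is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("TESSDATA_PREFIX"))


def ocr_pipeline_options(engine: str = "easyocr"):
    """Return PdfPipelineOptions with the requested OCR engine selected."""
    # Imported lazily so this module still loads without docling installed.
    from docling.datamodel.pipeline_options import (
        EasyOcrOptions,
        PdfPipelineOptions,
        RapidOcrOptions,
        TesseractOcrOptions,
    )

    if engine == "easyocr":
        ocr = EasyOcrOptions(lang=["en"], force_full_page_ocr=True)
    elif engine == "rapidocr":
        ocr = RapidOcrOptions(force_full_page_ocr=True)
    elif engine == "tesseract":
        # Fail early instead of letting Tesseract fail on a missing data path.
        if not tessdata_configured():
            raise RuntimeError("TESSDATA_PREFIX is not set")
        ocr = TesseractOcrOptions(force_full_page_ocr=True)
    else:
        raise ValueError(f"unknown OCR engine: {engine!r}")
    return PdfPipelineOptions(do_ocr=True, ocr_options=ocr)
```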

VLM Model Specifications

Pre-configured model specs for the VLM pipeline.

from docling.datamodel import vlm_model_specs

Constant                     Model                 Backend       Best For
SMOLDOCLING_MLX              SmolDocling-256M      MLX           Apple Silicon
SMOLDOCLING_TRANSFORMERS     SmolDocling-256M      Transformers  CPU / CUDA
GRANITEDOCLING_MLX           Granite-Docling-258M  MLX           Apple Silicon
GRANITEDOCLING_VLLM          Granite-Docling-258M  vLLM          NVIDIA GPU (fastest)
GRANITEDOCLING_TRANSFORMERS  Granite-Docling-258M  Transformers  CPU / CUDA
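
The "Best For" column can be turned into a small selection rule. The sketch below is one possible reading of that column, not an official recommendation; `pick_vlm_spec_name` and `build_vlm_converter` are hypothetical helpers. `VlmPipelineOptions`, `VlmPipeline`, and `PdfFormatOption` are docling names not listed in this reference, and the example assumes `docling[vlm]` is installed.

```python
def pick_vlm_spec_name(system: str, machine: str, has_cuda: bool) -> str:
    """Map a platform description to a constant name from vlm_model_specs."""
    if system == "Darwin" and machine == "arm64":
        return "GRANITEDOCLING_MLX"       # Apple Silicon
    if has_cuda:
        return "GRANITEDOCLING_VLLM"      # NVIDIA GPU
    return "SMOLDOCLING_TRANSFORMERS"     # CPU fallback


def build_vlm_converter(spec_name: str):
    """Build a DocumentConverter that runs PDFs through the VLM pipeline."""
    # Imported lazily so this module still loads without docling installed.
    from docling.datamodel import vlm_model_specs
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import VlmPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling.pipeline.vlm_pipeline import VlmPipeline

    options = VlmPipelineOptions(vlm_options=getattr(vlm_model_specs, spec_name))
    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_cls=VlmPipeline,
                pipeline_options=options,
            )
        }
    )
```

A caller might pass `platform.system()` and `platform.machine()` from the standard library, plus its own CUDA check, to `pick_vlm_spec_name`.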

Model Specifications

Spec             SmolDocling-256M                Granite-Docling-258M
Parameters       256M                            258M
Vision Encoder   SigLIP base (93M)               siglip2-base-patch16-512
Language Model   SmolLM-2 (135M)                 Granite 165M
Inference Speed  ~0.35s/page (A100)              ~0.35s/page (A100)
VRAM Usage       ~489 MB                         ~500 MB
License          Apache 2.0                      Apache 2.0
HuggingFace      ds4sd/SmolDocling-256M-preview  ibm-granite/granite-docling-258M

Export Formats

Methods available on result.document:

Method                Returns  Use Case
export_to_markdown()  str      Human-readable, LLM input
export_to_html()      str      Web display
export_to_dict()      dict     JSON serialization, lossless
export_to_text()      str      Plain text extraction
export_to_doctags()   str      Native Docling format
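
A small dispatcher can pick one of the export methods above by name. This is a sketch only: `EXPORT_METHODS` and `export_document` are hypothetical names, the format keys are invented for the example, and `document` is expected to be `result.document` from a conversion. The dict returned by `export_to_dict()` is serialized with `json.dumps`, assuming it contains only JSON-compatible values.

```python
import json

# Hypothetical format keys mapped to the export methods listed above.
EXPORT_METHODS = {
    "markdown": "export_to_markdown",
    "html": "export_to_html",
    "json": "export_to_dict",
    "text": "export_to_text",
    "doctags": "export_to_doctags",
}


def export_document(document, fmt: str) -> str:
    """Export a document in the requested format and return it as a string."""
    if fmt not in EXPORT_METHODS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    payload = getattr(document, EXPORT_METHODS[fmt])()
    # export_to_dict() returns a dict; everything else is already a string.
    if isinstance(payload, dict):
        return json.dumps(payload, indent=2)
    return payload
```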

Table Export Methods

Method                       Returns
table.export_to_markdown()   str (Markdown table)
table.export_to_dataframe()  pandas.DataFrame
table.export_to_html()       str (HTML table)
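
As an illustration of the table methods, the sketch below writes every detected table to a CSV file. It assumes the tables are reachable as `document.tables` on `result.document` and that `pandas` is installed; `save_tables_as_csv` and the `table-N.csv` naming scheme are hypothetical. Some docling versions may also accept or expect the parent document as an argument to the table export methods, so treat the bare call as an assumption.

```python
from pathlib import Path


def save_tables_as_csv(document, out_dir) -> list:
    """Write each table in the document to out_dir/table-N.csv and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, table in enumerate(document.tables, start=1):
        frame = table.export_to_dataframe()  # a pandas.DataFrame
        path = out_dir / f"table-{index}.csv"
        frame.to_csv(path, index=False)
        paths.append(path)
    return paths
```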

Supported Input Formats

from docling.datamodel.base_models import InputFormat

Documents

  • InputFormat.PDF - PDF files
  • InputFormat.DOCX - Word documents
  • InputFormat.PPTX - PowerPoint
  • InputFormat.XLSX - Excel
  • InputFormat.HTML - Web pages

Media

  • InputFormat.IMAGE - PNG, JPG, TIFF
  • InputFormat.AUDIO - Audio files such as WAV and MP3 (ASR)
  • InputFormat.VTT - Subtitles
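
The `InputFormat` members can also restrict what a converter accepts. A hedged sketch, assuming `docling` is installed: `allowed_formats` is a `DocumentConverter` parameter not listed in this reference, and `SUFFIX_TO_FORMAT`, `format_name_for`, and `build_restricted_converter` are hypothetical helpers. The suffix table is only a pre-filter for this example; docling performs its own format detection.

```python
from pathlib import Path

# Hypothetical pre-filter: file suffix -> InputFormat member name.
SUFFIX_TO_FORMAT = {
    ".pdf": "PDF",
    ".docx": "DOCX",
    ".pptx": "PPTX",
    ".xlsx": "XLSX",
    ".html": "HTML",
    ".png": "IMAGE",
    ".jpg": "IMAGE",
    ".tiff": "IMAGE",
}


def format_name_for(path: str):
    """Return the InputFormat member name for a path, or None if unknown."""
    return SUFFIX_TO_FORMAT.get(Path(path).suffix.lower())


def build_restricted_converter(format_names):
    """Build a DocumentConverter that only accepts the named formats."""
    # Imported lazily so this module still loads without docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.document_converter import DocumentConverter

    allowed = [InputFormat[name] for name in format_names]
    return DocumentConverter(allowed_formats=allowed)
```

For example, `build_restricted_converter(["PDF", "DOCX"])` would yield a converter limited to PDF and Word input.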

Installation Extras

Command                           Includes
pip install docling               Core + RapidOCR
pip install "docling[vlm]"        + VLM pipeline support
pip install "docling[easyocr]"    + EasyOCR engine
pip install "docling[tesserocr]"  + Tesseract binding
pip install "docling[ocrmac]"     + macOS native OCR
pip install "docling[asr]"        + Audio speech recognition
pip install "docling[cuda]"       + NVIDIA CUDA support
pip install "docling[mac_intel]"  + Intel Mac (PyTorch 2.2.2)

Combine extras: pip install "docling[vlm,easyocr]"
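
To see which optional OCR engines are importable in the current environment, a quick standard-library check can help. The mapping from extra name to importable module below is an assumption based on each package's usual import name, and `installed_extras` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed mapping: extra name -> top-level module it is expected to install.
OPTIONAL_MODULES = {
    "easyocr": "easyocr",
    "tesserocr": "tesserocr",
    "ocrmac": "ocrmac",
}


def installed_extras() -> dict:
    """Report, per extra, whether its module can be found without importing it."""
    return {
        extra: find_spec(module) is not None
        for extra, module in OPTIONAL_MODULES.items()
    }
```

`find_spec` only locates the module, so the check stays fast and does not load any models.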