Document parsing that runs on your Mac.
Hardparse uses PaddleOCR-VL, the highest-scoring open-source OCR model on OmniDocBench v1.5, to extract tables, formulas, and structured text from any document. No cloud. No subscriptions. Your documents never leave your machine.
Why engineers are switching from cloud OCR
The economics of document parsing changed in October 2025. Here's what that means for you.
Cheaper than Textract
AWS Textract costs $65,000/mo for 1M pages. PaddleOCR-VL on a single GPU handles the same volume for $390. Hardparse puts this on your Mac for a one-time $25.
Beats GPT-4o on documents
PaddleOCR-VL scores 92.56% on OmniDocBench — higher than GPT-4o, Gemini 2.5 Pro, and every commercial API we've tested.
Never sent to any server
The Mac app is fully local. No API keys, no internet, no data collection. Process contracts, medical records, financial statements — everything stays on your machine.
How Hardparse compares
| Feature | Hardparse (Mac) | Hardparse API | AWS Textract | Google Doc AI | GPT-4o |
|---|---|---|---|---|---|
| OmniDocBench Score | 92.56% | 92.56% | ~85% | ~87% | ~88% |
| Table extraction | Native | Native | Add-on ($$$) | Add-on | Prompt-based |
| Math / LaTeX | Yes | Yes | No | No | Partial |
| Handwriting | Yes | Yes | Yes | Yes | Yes |
| Languages | 109 | 109 | ~20 | ~60 | 90+ |
| Privacy | 100% local | Cloud | Cloud | Cloud | Cloud |
| Pricing | $25 once | Free / $49/mo | $65K/1M pg | $30K/1M pg | Token-based |
| Internet required | No | Yes | Yes | Yes | Yes |
| Output formats | MD, JSON, TXT | MD, JSON | JSON | JSON | Text |
Accuracy scores from OmniDocBench v1.5. Pricing based on 1M pages/month at standard tier. See full cost analysis.
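The pricing gap can be checked with simple arithmetic. The sketch below uses only the figures quoted on this page and assumes the $390 single-GPU figure covers the same 1M pages per month as the Textract figure. The break-even number is a derived illustration, not a published figure, and it ignores hardware you already own, electricity, and your time.

```python
# Figures quoted on this page; treat them as inputs, not measurements.
PAGES_PER_MONTH = 1_000_000
TEXTRACT_MONTHLY_USD = 65_000    # AWS Textract at 1M pages/month
SELF_HOSTED_MONTHLY_USD = 390    # PaddleOCR-VL on a single GPU (assumed: same volume)
MAC_APP_ONE_TIME_USD = 25        # Hardparse Mac app, one-time purchase


def per_page(monthly_usd: float, pages: int = PAGES_PER_MONTH) -> float:
    """Cost per page in USD for a given monthly bill."""
    return monthly_usd / pages


textract_per_page = per_page(TEXTRACT_MONTHLY_USD)
self_hosted_per_page = per_page(SELF_HOSTED_MONTHLY_USD)
cost_ratio = TEXTRACT_MONTHLY_USD / SELF_HOSTED_MONTHLY_USD

# Number of pages at which Textract's per-page bill equals the app's one-time price.
break_even_pages = MAC_APP_ONE_TIME_USD / textract_per_page

print(f"Textract:    ${textract_per_page:.5f} per page")
print(f"Self-hosted: ${self_hosted_per_page:.5f} per page")
print(f"Ratio:       {cost_ratio:.1f}x")
print(f"Break-even:  {break_even_pages:.0f} pages")
```

Under these assumptions, Textract works out to 6.5 cents per page, and the one-time app price matches Textract's bill after a few hundred pages.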
Choose your workflow
Same engine, two ways to use it.
Mac App
Drag and drop. Get structured text. Everything runs on your Mac's GPU via Metal. No internet, no accounts, no data leaves your machine.
macOS 14+ · Apple Silicon · 2.1 GB
API
One POST request. Structured output. Drop it into your pipeline and forget about OCR infrastructure.
No credit card · 500 free calls · Upgrade anytime
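For pipelines written in Python, the "one POST request" can be sketched with the standard library alone. This mirrors the curl example on this page (same URL, bearer token, and `file` form field). The helper name `build_parse_request` and the `application/octet-stream` part type are assumptions for illustration, not a documented Hardparse client.

```python
import urllib.request
import uuid

# Endpoint taken from the curl example on this page.
API_URL = "https://api.hardparse.com/v1/parse"


def build_parse_request(filename: str, data: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with a single 'file' field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        # Assumption: a generic binary content type is accepted for uploads.
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Usage (performs a real network call, so it is left commented out):
# import json
# with open("invoice.pdf", "rb") as f:
#     req = build_parse_request("invoice.pdf", f.read(), "hp_your_key")
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```

Building the request separately from sending it keeps the upload logic easy to inspect and to test without network access.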
What people parse
Invoices & Receipts
Extract line items, totals, tax info into structured data
Academic Papers
Tables, citations, equations rendered as LaTeX
Bank Statements
Transaction tables parsed into rows and columns
Contracts & Legal
Clauses, signatures, handwritten notes — fully local
Medical Records
HIPAA-friendly: zero data leaves your machine
Engineering Drawings
Annotations, dimensions, part numbers extracted
Multilingual Docs
109 languages including CJK, Arabic, Devanagari
Screenshots
Paste from clipboard, get structured text instantly
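Several of these use cases end with tables as rows and columns. Assuming table regions arrive as Markdown pipe tables, as in the sample API response on this page, a minimal sketch for turning one into row dictionaries could look like this. The function name `parse_markdown_table` is hypothetical, and the parser is deliberately simple: it does not handle escaped pipes or empty leading or trailing cells.

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Convert a simple Markdown pipe table into a list of row dictionaries."""
    lines = [line.strip() for line in markdown.splitlines() if line.strip()]
    if len(lines) < 2:
        return []

    def cells(line: str) -> list[str]:
        # Drop the outer pipes, then split on the inner ones.
        return [cell.strip() for cell in line.strip("|").split("|")]

    header = cells(lines[0])
    # lines[1] is the |---|---| separator row, so data rows start at lines[2].
    return [dict(zip(header, cells(line))) for line in lines[2:]]


# Table string copied from the sample response on this page.
sample = "| Item | Qty | Price |\n|---|---|---|\n| Widget | 100 | $5.00 |"
rows = parse_markdown_table(sample)
print(rows)
```

All values stay as strings; converting `Qty` or `Price` to numbers is left to the caller because currency and locale formats vary by document.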
Three lines to parse a document
curl -X POST https://api.hardparse.com/v1/parse \
-H "Authorization: Bearer hp_your_key" \
-F "file=@invoice.pdf"
# Response:
{
  "regions": [
    { "type": "table", "confidence": 0.97, "markdown": "| Item | Qty | Price |\n|---|---|---|\n| Widget | 100 | $5.00 |" },
    { "type": "text", "confidence": 0.99, "markdown": "## Invoice #2847\nDate: March 15, 2026" },
    { "type": "handwriting", "confidence": 0.94, "markdown": "Approved - JS" }
  ],
  "processing_time_ms": 1240
}

The benchmark story behind Hardparse
In October 2025, PaddleOCR-VL launched with 0.9 billion parameters and scored 92.56% on OmniDocBench, beating GPT-4o, Gemini 2.5 Pro, and every commercial API we've tested. That is a model roughly 220x smaller than GPT-4 that is better at reading documents. Hardparse is the easiest way to use that model.
Frequently asked questions
Do I need an internet connection?
Not for the Mac app. The AI models are bundled in the app (2.1 GB download), and everything runs on your Mac's GPU. The API requires an internet connection.
How accurate is it on tables?
PaddleOCR-VL is the top-scoring model on OmniDocBench table extraction. In our testing, it handles nested tables, merged cells, and borderless tables that break AWS Textract.
Can I use it for sensitive documents?
The Mac app processes everything locally. No data is sent anywhere. No telemetry, no analytics, no data collection at all. This makes it suitable for legal, medical, and financial documents.
What about Intel Macs?
The app requires Apple Silicon (M1 or later) for GPU acceleration via Metal. Intel Macs are not supported.
How does the API compare to running it myself?
The API handles infrastructure, scaling, and model updates. If you need to process documents at scale without managing GPUs, the API is the easier path. If privacy is paramount, the Mac app keeps everything local.
Is there a free trial of the Mac app?
The Mac App Store doesn't support free trials, but Apple offers refunds within 14 days. The API has a permanent free tier with 500 calls/month — you can test accuracy there before buying the app.
Stop paying per page.
One purchase. Unlimited documents. The highest-accuracy OCR model, running on your Mac.
Mac App: one-time purchase, no subscription · API: 500 free calls/month, no credit card