Codesota · The Smart Router
One API · every task · three tiers
Issue: April 22, 2026
§ 00 · The pitch

One API. Every task.
Three tiers.

Codesota is the smart router for AI agents and apps. A single unified API across OCR, speech, vision, code, retrieval, and translation — with SOTA, balanced, and cheap tiers picked from the open benchmark registry. Not one vendor's models. Not one modality. Every task, measured, routed.

§ 01 · Audience

Why a router, and why now.

Three readers. One product. The same unified API serves the agent builder, the app builder, and the research team — because the registry underneath is the same registry for all of them.

Agent builders

Eight SDKs, one integration.

An agent like OpenClaw, Hermes, or a Claude-Code-class tool needs TTS, STT, OCR, code, retrieval, and vision all at once. Today that means eight SDKs, eight billing portals, eight rate-limit ceilings, and eight model-deprecation cycles. Codesota collapses all of it into a single bearer token and a single contract.

App builders

You shouldn't read a paper to transcribe a receipt.

Picking the right model per task means reading benchmark papers, reconciling leaderboards, renting GPUs, and stitching APIs. The router reads the registry for you and calls the model that wins on your chosen tier — SOTA, balanced, or cheap — so the product team can ship the feature instead of the infrastructure.

Research teams

What's SOTA is what ships.

When the benchmark registry moves — a new OCR model clears OmniDocBench, a new embedder takes MTEB — the router picks it up on the next call. No drift between the number on the leaderboard and the model in production. The assay is the contract.

§ 02 · Contract

The three-tier grammar.

One request parameter, three possible answers. The tier grammar is the same for every endpoint — /v1/ocr, /v1/tts, /v1/stt — because the benchmark underneath is the arbiter, not a vendor's marketing page.


  • Param · {"tier": "sota" | "balanced" | "cheap"}
  • Default · balanced
  • Arbiter · the registry at /tasks
  • Tier spec · applies to every /v1/<task> endpoint
  • sota: frontier model for the task; criterion: highest score on the canonical benchmark; cost profile: highest $/call. OCR example: GPT-5.4, Gemini 2.5 Pro, Claude Opus 4.6
  • balanced: best quality-per-dollar; criterion: cost-adjusted score across the benchmark; cost profile: an order of magnitude cheaper than SOTA. OCR example: open model on GPU cloud, hosted by us
  • cheap: OSS self-hosted, amortised; criterion: floor set by benchmark score, not vibes; cost profile: lowest $/call, latency varies. OCR example: PaddleOCR-VL-1.5, dots.ocr, MonkeyOCR-pro
Adjacent to, not the same as

Other platforms route — but not across modalities, not across vendors, and not on benchmark-derived tiers. A brief map:

  • OpenRouter routes within the LLM modality — one contract for text generation, many vendors. Codesota routes across modalities (OCR, TTS, STT, vision, code) under one contract.
  • Replicate and Together host models across modalities, but you still pick the model. Codesota picks per call, per tier, per benchmark.
  • Hyperscalers (AWS, Azure, GCP) offer task APIs, but each is locked to the vendor's own models. Codesota routes across vendors and across closed/open lines.
Fig 2 · The three tiers are a public contract, not a pricing page. The benchmark registry at /tasks is the arbiter for every pick. Nothing is model-hardcoded.
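The arbitration the tier table describes can be sketched in a few lines. This is an illustrative reading of the contract, not the production router: the registry rows mirror the OmniDocBench table in § 03, and the score floor for the cheap tier is a made-up stand-in.

```python
# Illustrative sketch of tier arbitration -- not the production router.
# Rows mirror the OmniDocBench table in § 03: (model, score, $ per 1K pages).
# The score_floor value is a hypothetical stand-in.
REGISTRY_OCR = [
    ("PaddleOCR-VL-1.5", 94.50,  0.09),
    ("dots.ocr 3B",      88.41,  0.04),
    ("MonkeyOCR-pro",    86.96,  0.03),
    ("GPT-5.4",          85.80, 15.00),
    ("Gemini 2.5 Pro",   84.20, 12.50),
    ("Mistral OCR 3",    83.40,  1.00),
]

def pick(tier: str, rows=REGISTRY_OCR, score_floor: float = 85.0) -> str:
    if tier == "sota":        # highest score on the canonical benchmark
        return max(rows, key=lambda r: r[1])[0]
    if tier == "balanced":    # best quality-per-dollar
        return max(rows, key=lambda r: r[1] / r[2])[0]
    if tier == "cheap":       # cheapest model that clears the score floor
        eligible = [r for r in rows if r[1] >= score_floor]
        return min(eligible, key=lambda r: r[2])[0]
    raise ValueError(f"unknown tier: {tier!r}")

print(pick("sota"))  # PaddleOCR-VL-1.5
```

Note that on this registry an open model wins "sota" outright, which is exactly the § 03 claim: the arbiter is the benchmark, not the vendor.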
§ 03 · Proof

One endpoint, already shipping.

We turned the thesis into a single endpoint. POST /v1/ocr is live today at hardparse.com. On OmniDocBench, the open-source winner is measurably better and two orders of magnitude cheaper than the closed frontier.


  • Endpoint · POST /v1/ocr
  • Hosted at · hardparse.com
  • Benchmark · OmniDocBench · Mar 2026
  • Full table · /ocr
OmniDocBench · top-6 · price per 1K pages
  #    Model              Org       Kind         Score   $/1K     Cost basis
  01   PaddleOCR-VL-1.5   Baidu     open         94.50   $0.09    self-hosted, amortised
  02   dots.ocr 3B        Rednote   open         88.41   $0.04    self-hosted, amortised
  03   MonkeyOCR-pro      OSS       open         86.96   $0.03    self-hosted, amortised
  04   GPT-5.4            OpenAI    closed API   85.80   $15.00   vendor API · retail
  05   Gemini 2.5 Pro     Google    closed API   84.20   $12.50   vendor API · retail
  06   Mistral OCR 3      Mistral   closed API   83.40   $1.00    vendor API · retail
Footnote · how the 167× is computed

PaddleOCR-VL-1.5's self-hosted cost of $0.09/1K pages is the amortised inference cost on a single A100 at typical utilisation: COGS, not a retail price. GPT-5.4's $15/1K is OpenAI's published list price. So this is a COGS-vs-retail comparison, which flatters the delta. For a straight retail-vs-retail comparison, the full table at /ocr lists the actual hosted price you can buy today for every row.
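The arithmetic behind the footnote, using the table's own numbers:

```python
# The 167x in the footnote: GPT-5.4's retail price per 1K pages divided by
# PaddleOCR-VL-1.5's amortised self-hosted cost per 1K pages.
gpt_retail = 15.00    # $/1K pages, vendor list price
paddle_cogs = 0.09    # $/1K pages, amortised on a single A100
ratio = gpt_retail / paddle_cogs
print(f"{ratio:.0f}x")  # 167x
```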

Fig 3 · OmniDocBench composite score combines text accuracy, layout understanding, and table extraction. Shaded row is the tier-1 winner the router picks for tier: "sota" on OCR today.
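A call against the live endpoint might look like the sketch below. Only the POST /v1/ocr path, the bearer token, and the tier grammar come from this page; the request body field (document_url) and the token value are hypothetical placeholders, not the documented API shape.

```python
import json
import urllib.request

# Sketch of a tiered call to the live OCR endpoint. The endpoint path, bearer
# auth, and tier grammar come from the page; the "document_url" field and the
# token are hypothetical placeholders.
def build_ocr_request(document_url: str, tier: str = "balanced",
                      token: str = "YOUR_TOKEN") -> urllib.request.Request:
    if tier not in ("sota", "balanced", "cheap"):
        raise ValueError(f"unknown tier: {tier!r}")
    body = json.dumps({"document_url": document_url, "tier": tier}).encode()
    return urllib.request.Request(
        "https://hardparse.com/v1/ocr",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # one token for every task
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_ocr_request("https://example.com/receipt.pdf", tier="cheap")
# urllib.request.urlopen(req) would send it; omitted here.
```

Switching tiers is a one-word change to the request body; nothing else in the integration moves.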
§ 04 · Manifesto

Intelligence as a commodity.

Oil has grades. Electricity has tariffs. Shipping has class codes. Every mature market commoditises by standardising the contract, not the molecule. Intelligence is next — and the contract is the thing worth building.

OpenAI, Anthropic, Google are refineries. They ship something extraordinary, but a refinery's output is only useful once the market around it standardises how you buy, price, and substitute it. On OCR today, we can already quote three interchangeable grades against the same assay. The rest of the tasks follow the same shape.

  • Grade · Brent vs WTI → sota / balanced / cheap
  • Contract · barrel specification → POST /v1/<task>
  • Quality cert · assay report → the Codesota benchmark
  • Spot price · $ per barrel → $ per 1K calls

Codesota is building the assay and the contract. The benchmark registry is the assay report. The task endpoint is the grade. Everything else is implementation detail.

§ 05 · Roadmap

Tasks in flight.

What has shipped, what's in design partnership, what's ranked and waiting its turn. We publish this table as-is so readers can see the gap between the registry and the router — and help close it.


Live means the endpoint accepts production traffic today. In design means we are co-building with one or more partner teams. Ranked means the benchmark exists in the registry but no endpoint is served yet.

Endpoint status · Apr 2026
Sorted: live → design → ranked
  Task                        Modality     Registry coverage         Endpoint        Status
  Document OCR                Vision       OmniDocBench · Mar 2026   /v1/ocr         Live
  Text-to-Speech              Speech       ranked                    /v1/tts         In design
  Speech-to-Text (ASR)        Speech       ranked                    /v1/stt         In design
  Visual Question Answering   Multimodal   ranked                    /v1/vqa         In design
  Translation                 Text         ranked · not served       (none)          Ranked
Fig 5 · The final row stands in for the long tail of benchmarks the registry tracks but the router does not yet serve. Full index: /tasks.
§ 06 · Designed for

Agents we're designed for.

Public products whose shape matches the router. Listed as design targets, not customer logos. If you're building one of these, the rest of this page is for you.

Design target

OpenClaw

Open-source desktop agent that takes real actions — files, browsing, shell. Replaces eight vendor SDKs with one router contract.

Needs: OCR · code · retrieval · STT
Design target

Hermes

Nous Research's Hermes Agent routes through 20+ models on OpenRouter today. Codesota folds the cross-modality work (OCR, speech, vision) into the same tier contract.

Needs: multimodal I/O · cost-tier control
Design target

Claude Code

Coding agent class that benefits from cheap-tier retrieval and embedded OCR when the user drops a screenshot or PDF. One bearer token for the non-LLM work.

Needs: OCR · retrieval · code
Design target

General automation agents

RPA tools, DevOps agents, QA bots. All of them need the same non-LLM modalities behind a contract they can depend on for two years.

Needs: TTS · STT · OCR · retrieval
§ 07 · Design partners

Register as a design partner.

Three doors, one building. Pick the one that matches what you're shipping this quarter.

Codesota is independent, open, and built in Warsaw. The benchmark registry is free to read forever. The router is free to try on OCR. Paid tiers come after design partnership — not before.

Door 01

I want the OCR endpoint today

Production-ready. Bearer token, one POST, tiered response. Start on hardparse.com.

Open hardparse.com →
Door 02

I want my task next

Propose the benchmark, the dataset, and the endpoint shape. We co-build in design partnership.

Submit a task proposal →
Door 03

I'm building an agent

One API, every non-LLM modality, three tiers — drop in as a design partner before we price the general tier.

Contact the team →