/api/sota — the registry, callable.
One GET, one task, one dated pick. Free, CORS-open, no auth, agent-discoverable. Returns the current state-of-the-art per task as JSON, with full provenance — benchmark id, metric, paper link, model card link, and a stable snapshot id you can cache against.
CodeSOTA does not run your inference. We are the assay; you call the model at your own provider. That separation is the product — it’s why the answers are credible.
Three curls.
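For callers working from Python rather than a shell, a request can be sketched roughly as follows. The host `codesota.com` and the path `/api/sota` come from this page, and `tier` is named in the status-code table. The `task` query-parameter name is an assumption made for illustration, so check it against the live endpoint before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the task is passed as a `task` query parameter on this URL.
BASE_URL = "https://codesota.com/api/sota"


def build_sota_url(task: str, tier: str = "sota") -> str:
    """Build the GET URL for one task (short alias or full id)."""
    params = {"task": task}
    if tier != "sota":
        # v0.1 answers 501 for balanced/cheap, so tier is only sent on request.
        params["tier"] = tier
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_sota(task: str, timeout: float = 10.0) -> dict:
    """Perform the GET and decode the JSON body. No auth header is needed."""
    with urllib.request.urlopen(build_sota_url(task), timeout=timeout) as resp:
        return json.load(resp)
```

`build_sota_url` is kept separate from `fetch_sota` so the URL logic can be checked without network access.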
Example payload.
What the OCR call returns. TTS also supports constraint filters for language, deployment, commercial use, streaming, cloning, latency, cost, params, RAM, sample rate, and code-switching.
Schema reference.
| Field | Type | Meaning |
|---|---|---|
| task | string | Short alias if one exists, otherwise the full DB id. |
| task_full_id | string | Always the full DB id. Stable across alias renames. |
| task_name | string | Human-readable task name from the registry. |
| benchmark | string | Dataset id of the winning benchmark for this task. |
| benchmark_version | string \| null | Reserved for v0.2 quarterly snapshots; currently null. |
| tier | "sota" | Always "sota" in v0.1. balanced/cheap return 501. |
| as_of | ISO 8601 string \| null | Date the winning result was scored. |
| snapshot_id | string | Format: reg-YYYY-MM-DD-xxxxxx. Stable across re-fetches that don’t change the pick. |
| pick | object | The SOTA model entry for this task; see the pick.* rows below. |
| pick.model_id | string | CodeSOTA canonical model id. |
| pick.model_url | URL | Direct link to the model card on codesota.com. |
| pick.vendor | string \| null | Vendor or upstream lab if known. |
| pick.provider_hints | string[] \| null | Reserved for v0.2: where you can run this model. |
| pick.score | number | Numeric score, in the metric’s native units. |
| pick.score_metric | string | snake_case <benchmark>_<metric>, e.g. omnidocbench_composite. |
| pick.metric_id | string | Raw metric id from the registry. |
| pick.higher_is_better | boolean | Direction. true for accuracy/pass@k; false for CER/WER/FID/loss. |
| pick.benchmark | object | { id, name } of the benchmark dataset. |
| pick.cost_per_1k_usd | number \| null | Reserved for v0.2: unified vendor pricing. |
| pick.cost_basis | string \| null | Reserved for v0.2: describes the cost model used. |
| pick.result_date | YYYY-MM-DD \| null | Date of the winning result. |
| runners_up | pick[] | Up to 3 entries below the pick, same shape. |
| registry_url | URL | Curated task hub on codesota.com (e.g. /ocr). |
| methodology_url | URL | Always /methodology — published, dated, citable. |
| retrieved_at | ISO 8601 | Timestamp the response was generated. |
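Because `pick.higher_is_better` flips between metrics, comparisons on `pick.score` should go through that flag rather than a bare `>`. The sketch below shows one way a client might do this. The `Pick` type covers only a subset of the fields above, the helper names are hypothetical, and the sample values are made up for illustration rather than taken from the registry.

```python
from typing import Optional, TypedDict


class Pick(TypedDict, total=False):
    """Subset of the pick object; field names follow the schema table."""
    model_id: str
    model_url: str
    vendor: Optional[str]
    score: float
    score_metric: str
    higher_is_better: bool
    result_date: Optional[str]


def beats(candidate: Pick, incumbent: Pick) -> bool:
    """True if candidate outscores incumbent on the same metric."""
    if candidate["score_metric"] != incumbent["score_metric"]:
        raise ValueError("scores on different metrics are not comparable")
    if candidate["higher_is_better"]:
        return candidate["score"] > incumbent["score"]
    return candidate["score"] < incumbent["score"]


def summarize(response: dict) -> str:
    """One-line, human-readable summary of a response."""
    pick = response["pick"]
    arrow = "higher is better" if pick["higher_is_better"] else "lower is better"
    return f'{response["task"]}: {pick["model_id"]} {pick["score"]} {pick["score_metric"]} ({arrow})'


# Made-up values for illustration only; not real registry data.
SAMPLE = {
    "task": "ocr",
    "snapshot_id": "reg-2000-01-01-aaaaaa",
    "pick": {"model_id": "example-model-a", "score": 0.9,
             "score_metric": "example_metric", "higher_is_better": True},
    "runners_up": [
        {"model_id": "example-model-b", "score": 0.8,
         "score_metric": "example_metric", "higher_is_better": True},
    ],
}
```

Refusing to compare across different `score_metric` values is a design choice of this sketch, not documented API behavior.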
Short names accepted.
The DB carries verbose, hyphenated task ids. The API accepts both forms.
| Short | Full id |
|---|---|
| ocr | document-ocr |
| code | code-generation |
| asr | speech-recognition |
| stt | speech-recognition |
| tts | text-to-speech |
| vqa | visual-question-answering |
| caption | image-captioning |
| t2i | text-to-image |
| t2v | text-to-video |
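A client that wants a stable cache key can normalize to the full id locally. The mapping below mirrors the table above. The `resolve_task` helper is hypothetical, and whether the server treats aliases case-sensitively is not stated here, so this sketch leaves the input untouched.

```python
# Short alias -> full DB id, copied from the alias table.
ALIASES = {
    "ocr": "document-ocr",
    "code": "code-generation",
    "asr": "speech-recognition",
    "stt": "speech-recognition",
    "tts": "text-to-speech",
    "vqa": "visual-question-answering",
    "caption": "image-captioning",
    "t2i": "text-to-image",
    "t2v": "text-to-video",
}


def resolve_task(task: str) -> str:
    """Return the full id for a short alias; pass anything else through unchanged."""
    return ALIASES.get(task, task)
```

Since `asr` and `stt` map to the same full id, keying a cache on `resolve_task(task)` avoids storing the same pick twice.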
Status codes.
| HTTP | When | Body |
|---|---|---|
| 404 | Unknown task id (or alias not registered) | { error, hint, see } |
| 404 | Task is registered but has zero scored runs | { error, hint, see } |
| 501 | Caller asked for tier=balanced or tier=cheap (v0.1 supports only sota) | { error, hint, see } |
| 503 | Registry database unreachable | { error } |
| 500 | Unexpected query failure | { error, hint } |
All error responses include CORS headers, a JSON body, and a hint or see field where useful. see typically points back to the index endpoint or the task hub on codesota.com.
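One way a client might act on these codes is sketched below. The retry policy (retry 500 and 503, give up on 404 and 501) is this example's assumption about sensible client behavior, not something the API prescribes, and `ErrorAdvice` and `advise` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorAdvice:
    retry: bool      # whether trying again later could plausibly succeed
    message: str     # readable summary built from the error body


def advise(status: int, body: dict) -> ErrorAdvice:
    """Turn a non-200 status and its JSON body into a retry decision."""
    detail = body.get("error", "unknown error")
    # hint and see are optional; prefer hint when both are present.
    pointer = body.get("hint") or body.get("see")
    message = f"{status}: {detail}"
    if pointer:
        message += f" ({pointer})"
    # Assumed policy: 503/500 may be transient; 404/501 will not change on retry.
    return ErrorAdvice(retry=status in (500, 503), message=message)
```

A caller would typically log `message` and back off before retrying when `retry` is true.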
What you can count on.
- Caching. Cache-Control: public, max-age=300, s-maxage=300. Vercel’s edge serves most hits without touching the database.
- CORS. Access-Control-Allow-Origin: *. No credentials, no cookies. Browser code from any origin can call the endpoint.
- Rate limit. None today. Reasonable use: a router polling every minute is fine; a crawler hitting every task every second is not. We’ll publish quotas if abuse warrants them.
- Versioning. The path is permanent. Reserved fields (provider_hints, cost_per_1k_usd, cost_basis, benchmark_version) publish as null today and populate in v0.2 — additions only, never breaking.
- Discoverability. Allowed in robots.txt for AI and search crawlers. Documented in llms.txt for LLM-grounded answers.
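The caching and snapshot guarantees above suggest a small client-side cache: reuse a response for up to 300 seconds, and compare `snapshot_id` to detect a changed pick. The following is a minimal sketch under those assumptions. `SotaCache` is a hypothetical helper, and the `fetch` callable is supplied by the caller rather than being part of any CodeSOTA client.

```python
import time
from typing import Callable, Dict, Tuple

MAX_AGE_SECONDS = 300  # mirrors the documented max-age


class SotaCache:
    """In-memory cache keyed by task, expiring entries after MAX_AGE_SECONDS."""

    def __init__(self, fetch: Callable[[str], dict],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch    # caller-supplied function: task -> response dict
        self._clock = clock    # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, task: str) -> dict:
        now = self._clock()
        hit = self._entries.get(task)
        if hit is not None and now - hit[0] < MAX_AGE_SECONDS:
            return hit[1]      # still fresh, no request made
        fresh = self._fetch(task)
        self._entries[task] = (now, fresh)
        return fresh

    def pick_changed(self, task: str, last_seen_snapshot_id: str) -> bool:
        """True if the current snapshot_id differs from the one seen earlier."""
        return self.get(task)["snapshot_id"] != last_seen_snapshot_id
```

A router polling once a minute through this cache would make at most one request per task every five minutes, well within the reasonable-use guidance.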
Why this is not a router.
S&P doesn’t trade. UL doesn’t sell appliances. CodeSOTA doesn’t route inference. The reason every score in the registry is credible is that we have no financial stake in which model wins — adding inference markup would put exactly the wrong incentive in the loop.
/api/sota is the assay. You call the model at your own provider — OpenAI, Anthropic, Replicate, fal, self-hosted, whatever. Your latency, your billing, your circuit breakers. Our SOTA pick stays editorially independent of where you run inference.
If you want a reference router built on this endpoint, see hardparse.com — a separate product that demonstrates how to consume the registry. Note the separation: hardparse takes inference traffic, codesota does not.