/api/sota — the registry, callable.
One GET, one task, one dated pick. Free, CORS-open, no auth, agent-discoverable. Returns the current state-of-the-art per task as JSON, with full provenance — benchmark id, metric, paper link, model card link, and a stable snapshot id you can cache against.
CodeSOTA does not run your inference. We are the assay; you call the model at your own provider. That separation is the product — it’s why the answers are credible.
Three curls.
Example payload.
What the OCR call returns. Fields documented in § 03.
Schema reference.
| Field | Type | Meaning |
|---|---|---|
| task | string | Short alias if one exists, otherwise the full DB id. |
| task_full_id | string | Always the full DB id. Stable across alias renames. |
| task_name | string | Human-readable task name from the registry. |
| benchmark | string | Dataset id of the winning benchmark for this task. |
| benchmark_version | string \| null | Reserved for v0.2 quarterly snapshots — currently null. |
| tier | "sota" | Always "sota" in v0.1. balanced/cheap return 501. |
| as_of | ISO 8601 string \| null | Date the winning result was scored. |
| snapshot_id | string | reg-YYYY-MM-DD-xxxxxx. Stable across re-fetches that don’t change the pick. |
| pick | object | The SOTA model entry for this task — see schema below. |
| pick.model_id | string | CodeSOTA canonical model id. |
| pick.model_url | URL | Direct link to the model card on codesota.com. |
| pick.vendor | string \| null | Vendor or upstream lab if known. |
| pick.provider_hints | string[] \| null | Reserved — where you can run this model. v0.2. |
| pick.score | number | Numeric score, in the metric’s native units. |
| pick.score_metric | string | snake_case <benchmark>_<metric>, e.g. omnidocbench_composite. |
| pick.metric_id | string | Raw metric id from the registry. |
| pick.higher_is_better | boolean | Direction. true for accuracy/pass@k; false for CER/WER/FID/loss. |
| pick.benchmark | object | { id, name } of the benchmark dataset. |
| pick.cost_per_1k_usd | number \| null | Reserved — unified vendor pricing. v0.2. |
| pick.cost_basis | string \| null | Reserved — narrates the cost model used. v0.2. |
| pick.result_date | YYYY-MM-DD \| null | Date of the winning result. |
| runners_up | pick[] | Up to 3 entries below the pick, same shape. |
| registry_url | URL | Curated task hub on codesota.com (e.g. /ocr). |
| methodology_url | URL | Always /methodology — published, dated, citable. |
| retrieved_at | ISO 8601 | Timestamp the response was generated. |
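As a rough illustration of how the pick fields fit together, the Python sketch below compares two pick-shaped entries using higher_is_better and renders a one-line summary that tolerates the nullable vendor field. The helper names `beats` and `summarize` are hypothetical and not part of the API; only the field names come from the table above.

```python
from typing import Any, Mapping


def beats(candidate: Mapping[str, Any], incumbent: Mapping[str, Any]) -> bool:
    """Return True if candidate scores strictly better than incumbent.

    Both arguments are pick-shaped objects: the pick itself or a
    runners_up entry. The direction comes from higher_is_better.
    """
    if candidate["score_metric"] != incumbent["score_metric"]:
        raise ValueError("scores on different metrics are not comparable")
    if candidate["higher_is_better"]:
        return candidate["score"] > incumbent["score"]
    return candidate["score"] < incumbent["score"]


def summarize(payload: Mapping[str, Any]) -> str:
    """One-line description of a response, tolerating a null vendor."""
    pick = payload["pick"]
    vendor = pick.get("vendor") or "vendor unknown"
    direction = "higher is better" if pick["higher_is_better"] else "lower is better"
    return (
        f"{payload['task']}: {pick['model_id']} ({vendor}) "
        f"scores {pick['score']} on {pick['score_metric']} ({direction})"
    )
```

For a lower-is-better metric such as CER, `beats` returns True when the candidate's score is the smaller of the two. Refusing to compare entries with different score_metric values is a design choice of this sketch, since scores are reported in each metric's native units.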
Short names accepted.
The DB carries verbose, hyphenated task ids. The API accepts both forms.
| Short | Full id |
|---|---|
| ocr | document-ocr |
| code | code-generation |
| asr | speech-recognition |
| stt | speech-recognition |
| tts | text-to-speech |
| vqa | visual-question-answering |
| caption | image-captioning |
| t2i | text-to-image |
| t2v | text-to-video |
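The API resolves these aliases itself, so a client never has to. A local mirror of the table can still be handy, for example to make asr and stt share one cache key. The following is a minimal sketch; `ALIASES` and `resolve_task` are hypothetical client-side names, and the mapping is copied from the table above.

```python
# Client-side mirror of the alias table; the server remains the source of truth.
ALIASES = {
    "ocr": "document-ocr",
    "code": "code-generation",
    "asr": "speech-recognition",
    "stt": "speech-recognition",
    "tts": "text-to-speech",
    "vqa": "visual-question-answering",
    "caption": "image-captioning",
    "t2i": "text-to-image",
    "t2v": "text-to-video",
}


def resolve_task(task: str) -> str:
    """Return the full DB id for a known short alias.

    Anything that is not a known alias is passed through unchanged, so
    full ids (and ids this mirror has not caught up with) still work.
    """
    return ALIASES.get(task, task)
```

Passing unknown names through unchanged, rather than raising, keeps the mirror from rejecting tasks that the registry knows about but this copy of the table does not.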
Status codes.
| HTTP | When | Body |
|---|---|---|
| 404 | Unknown task id (or alias not registered) | { error, hint, see } |
| 404 | Task is registered but has zero scored runs | { error, hint, see } |
| 501 | Caller asked for tier=balanced or tier=cheap (v0.1 supports only sota) | { error, hint, see } |
| 503 | Registry database unreachable | { error } |
| 500 | Unexpected query failure | { error, hint } |
All error responses include CORS headers, a JSON body, and a hint or see field where useful. see typically points back to the index endpoint or the task hub on codesota.com.
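One way a client might act on the table above is sketched below in Python. It assumes a successful response arrives as HTTP 200 (the table lists only errors) and treats 500 and 503 as worth retrying later, which is this sketch's policy rather than documented guidance. `Outcome` and `classify` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Outcome:
    action: str             # "use", "fix_request", "retry_later", or "give_up"
    message: Optional[str]  # the body's error text, if present
    hint: Optional[str]     # the body's hint, if present
    see: Optional[str]      # the body's see link, if present


def classify(status: int, body: Mapping[str, Any]) -> Outcome:
    """Map an HTTP status and decoded JSON body to a client action."""
    if status == 200:  # assumption: success is reported as 200
        return Outcome("use", None, None, None)
    if status in (404, 501):
        # Unknown task, no scored runs, or unsupported tier: retrying
        # the same request will not help.
        action = "fix_request"
    elif status in (500, 503):
        # Registry unreachable or unexpected failure: possibly transient.
        action = "retry_later"
    else:
        action = "give_up"
    return Outcome(action, body.get("error"), body.get("hint"), body.get("see"))
```

Using `body.get` for every field means the 503 body, which carries only error, is handled by the same code path as the richer 404 and 501 bodies.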
What you can count on.
- Caching. Cache-Control: public, max-age=300, s-maxage=300. Vercel’s edge serves most hits without touching the database.
- CORS. Access-Control-Allow-Origin: *. No credentials, no cookies. Browser code from any origin can call the endpoint.
- Rate limit. None today. Reasonable use: a router polling every minute is fine; a crawler hitting every task every second is not. We’ll publish quotas if abuse warrants them.
- Versioning. The path is permanent. Reserved fields (provider_hints, cost_per_1k_usd, cost_basis, benchmark_version) publish as null today and populate in v0.2 — additions only, never breaking.
- Discoverability. Allowed in robots.txt for AI and search crawlers. Documented in llms.txt for LLM-grounded answers.
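A client that polls can mirror the documented max-age=300 locally and use snapshot_id to detect when the pick has moved. The sketch below is one possible shape, not a reference client: `SotaCache` and `snapshot_changed` are hypothetical names, and `fetch` stands in for whatever function the caller uses to perform the GET and decode the JSON.

```python
import time
from typing import Any, Callable, Dict, Mapping, Tuple

MAX_AGE_SECONDS = 300  # mirrors the documented Cache-Control max-age


class SotaCache:
    """Per-task in-memory cache that refetches after MAX_AGE_SECONDS."""

    def __init__(
        self,
        fetch: Callable[[str], Mapping[str, Any]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # caller-supplied: task name -> decoded payload
        self._clock = clock  # injectable so tests need not sleep
        self._entries: Dict[str, Tuple[float, Mapping[str, Any]]] = {}

    def get(self, task: str) -> Mapping[str, Any]:
        now = self._clock()
        entry = self._entries.get(task)
        if entry is not None and now - entry[0] < MAX_AGE_SECONDS:
            return entry[1]  # still fresh: no network call
        payload = self._fetch(task)
        self._entries[task] = (now, payload)
        return payload


def snapshot_changed(old: Mapping[str, Any], new: Mapping[str, Any]) -> bool:
    """True when two responses for the same task carry different snapshot ids."""
    return old["snapshot_id"] != new["snapshot_id"]
```

Because snapshot_id is stable across re-fetches that do not change the pick, a router can skip reconfiguring itself whenever `snapshot_changed` returns False for consecutive responses.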
Why this is not a router.
S&P doesn’t trade. UL doesn’t sell appliances. CodeSOTA doesn’t route inference. The reason every score in the registry is credible is that we have no financial stake in which model wins — adding inference markup would put exactly the wrong incentive in the loop.
/api/sota is the assay. You call the model at your own provider — OpenAI, Anthropic, Replicate, fal, self-hosted, whatever. Your latency, your billing, your circuit breakers. Our SOTA pick stays editorially independent of where you run inference.
If you want a reference router built on this endpoint, see hardparse.com — a separate product that demonstrates how to consume the registry. Note the separation: hardparse takes inference traffic, codesota does not.