How it works

The Data Flywheel

CodeSOTA starts where Papers with Code left off. Community contributions and verified results, compounding over time.

Traction

  • 12,600+ unique visitors
  • 30,400+ page views
  • 125 countries
  • +71% MoM growth (Mar 2026)

  1. United States: 3,666
  2. Poland: 2,428
  3. Singapore: 1,161
  4. Germany: 464
  5. India: 447
  6. China: 422
  7. France: 335
  8. Japan: 310
  9. United Kingdom: 294
  10. South Korea: 272

Last 12 months. Analytics via Vercel.

The cycle

01. Seed data

79,817 papers and 9,327 benchmarks from Papers with Code. Seven years of ML research, preserved.

02. Fresh verification

We run benchmarks ourselves, flag stale claims, and correct results that papers got wrong.

03. Community submissions

Researchers submit new results. Vendors update scores. The community catches errors and fills gaps.

04. Continuous improvement

Better data attracts more users. More users submit more results. Quality compounds.

Where we are now

  • 1,519 PWC results indexed
  • 146 datasets covered
  • 17 research areas

Live now

  • OCR benchmarks — 16+ models, 9 datasets
  • Browse system — PWC archive integrated
  • SOTA timeline charts on dataset pages
  • Open JSON API at /data/
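
A SOTA timeline of the kind listed above can be thought of as a running best over dated results. The following is a minimal sketch of that idea, not the site's implementation: the `Result` fields (`model`, `score`, `published`) and the `higher_is_better` flag are hypothetical names chosen for illustration, and any data passed in would be the caller's own.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Result:
    # Hypothetical record shape; the real JSON schema may differ.
    model: str
    score: float
    published: date


def sota_timeline(results, higher_is_better=True):
    """Return the results that set a new best score, in date order."""
    timeline = []
    best = None
    for r in sorted(results, key=lambda item: item.published):
        if best is None:
            improved = True
        elif higher_is_better:
            improved = r.score > best
        else:
            improved = r.score < best
        if improved:
            best = r.score
            timeline.append(r)
    return timeline
```

For error-rate metrics, where a lower score is better, pass `higher_is_better=False`. Ties do not displace the earlier result, so the first model to reach a score keeps its place on the timeline.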

Coming next

  • Speech recognition benchmarks
  • Code generation leaderboards
  • Automated benchmark verification
  • Community voting on accuracy
  • Contributor profiles

How to contribute

Submit new results

Every dataset page has a submission form. We review submissions and add verified results to the leaderboards.

Report issues

Incorrect score? Broken link? Results that don't match the source? Use the report button on any dataset page. Accurate data is the foundation.

Build on the data

All data is open JSON. Build dashboards, cite in papers, integrate into tools.

/data/benchmarks.json
/data/datasets.json
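
As a starting point for building on the data, here is a minimal sketch of loading one of these files with the Python standard library. Only the two paths come from this page. `BASE_URL` is a placeholder to replace with the site's real origin, and because the JSON schema is not documented here, the `describe` helper (a hypothetical name) only reports the top-level shape rather than assuming any field names.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Placeholder origin: substitute the real site address before fetching.
BASE_URL = "https://example.org"

# The two paths listed on this page.
DATA_PATHS = {
    "benchmarks": "/data/benchmarks.json",
    "datasets": "/data/datasets.json",
}


def data_url(name, base_url=BASE_URL):
    """Build the full URL for one of the published JSON files."""
    return urljoin(base_url, DATA_PATHS[name])


def fetch_json(name, base_url=BASE_URL, timeout=10):
    """Download and parse one JSON file (performs a network request)."""
    with urlopen(data_url(name, base_url), timeout=timeout) as response:
        return json.load(response)


def describe(payload):
    """Summarise a parsed payload without assuming its schema."""
    if isinstance(payload, list):
        return f"list with {len(payload)} items"
    if isinstance(payload, dict):
        return f"object with keys: {', '.join(sorted(payload))}"
    return type(payload).__name__
```

With the real origin in place, `print(describe(fetch_json("benchmarks")))` gives a quick look at the structure before writing any code that depends on specific fields.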

Why this matters

When Meta shut down Papers with Code, it showed that critical research infrastructure can't depend on corporate goodwill.

The old model

  • Company acquires community resource
  • Promises to keep it open
  • Priorities change, resource dies
  • Community scrambles to preserve data

The new model

  • Open data from day one
  • Community contributions drive growth
  • No single point of failure
  • Value compounds over time

Built on Papers with Code

79,817 papers, 9,327 benchmarks, and 5,628 datasets from seven years of ML research. This foundation lets us focus on what PWC couldn't: continuous verification and community-driven updates.

Inherited → Adding

  • Historical benchmark results → Fresh verification (2026)
  • Paper-to-code linkages → Working code verification
  • Static leaderboards → Community submissions + voting
  • Aggregated claims → We run benchmarks ourselves
  • Research-focused metrics → Practical recommendations

Join the flywheel

Every submission improves the data. Every user makes the flywheel spin faster.