The Data Flywheel
CodeSOTA starts where Papers with Code left off. We're building the definitive ML benchmark resource through community contributions and verified results.
Seed Data
PWC's 79,817 papers and 9,327 benchmarks form our foundation: seven years of ML research, preserved and indexed.
Fresh Verification
We run benchmarks ourselves. Flag stale claims. Update results that papers got wrong. No more trusting 3-year-old numbers.
Community Submissions
Researchers submit new results. Vendors update their model scores. The community catches errors and fills gaps.
Continuous Improvement
Better data attracts more users. More users submit more results. The cycle accelerates. Quality compounds.
Where We Are Now
- 1,519 PWC results indexed
- 146 datasets covered
- 16+ OCR models verified
Live now:
- OCR benchmarks - 16+ models, 9 datasets
- Browse system - PWC archive data integrated
- SOTA timeline charts on dataset pages
- Result submission forms on every benchmark
- Open JSON API at /data/
Coming next:
- Speech recognition benchmarks
- Code generation leaderboards
- Automated benchmark verification pipeline
- Community voting on result accuracy
- Contributor profiles and attribution
How to Contribute
Submit New Results
Got a new model result? Published a paper with benchmark scores? Every dataset page has a submission form at the bottom. We review submissions and add verified results to the leaderboards.
Browse datasets to submit results
Report Issues
Found an incorrect score? Spotted a broken paper link? Results that don't match the source? Use the "Report Issue" button on any dataset page. Accurate data is the foundation.
Report an issue
Build on the Data
All our data is open. The PWC archive, our benchmarks, everything is JSON you can fetch and use. Build analysis tools, research dashboards, or integrate into your workflow.
- /data/pwc-archive.json
- /data/benchmarks.json
- /data/datasets.json
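As a starting point, a small analysis script could look like the sketch below. It uses only the Python standard library; the base URL is a placeholder, and the field name `dataset` is an assumption about the JSON schema, so inspect the actual files before relying on it.

```python
"""Minimal sketch: pull CodeSOTA's open JSON data and summarize it.

Assumptions (not confirmed by the site): BASE_URL is a placeholder, and
each entry in benchmarks.json has a "dataset" field. Check the real
JSON structure and adjust the field names accordingly.
"""
import json
import urllib.request
from collections import Counter

BASE_URL = "https://codesota.example"  # placeholder; replace with the real host


def fetch_json(path: str):
    """Download one of the open JSON files, e.g. /data/benchmarks.json."""
    with urllib.request.urlopen(f"{BASE_URL}{path}") as resp:
        return json.load(resp)


benchmarks = fetch_json("/data/benchmarks.json")
datasets = fetch_json("/data/datasets.json")

# Count benchmark results per dataset -- assumes each benchmark entry
# carries a "dataset" field; adjust to the actual schema.
per_dataset = Counter(entry.get("dataset", "unknown") for entry in benchmarks)

print(f"{len(datasets)} datasets, {len(benchmarks)} benchmark entries")
for name, count in per_dataset.most_common(5):
    print(f"{count:5d}  {name}")
```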
Spread the Word
The flywheel only works with community participation. Share CodeSOTA with your research group, link to it in your papers, mention it when discussing benchmarks. More users = better data for everyone.
Why This Matters
"When Meta shut down Papers with Code, it proved that critical research infrastructure can't depend on corporate goodwill. The ML community needs benchmark tracking that's community-owned and community-driven."
The old model (broken)
- Company acquires community resource
- Promises to keep it open "forever"
- Priorities change, resource gets shut down
- Community scrambles to preserve data
The new model (sustainable)
- Open data from day one
- Community contributions drive growth
- No single point of failure
- Value compounds over time
Built on Papers with Code
We're not starting from scratch. CodeSOTA is built on the preserved Papers with Code dataset - 79,817 papers, 9,327 benchmarks, and 5,628 datasets from seven years of ML research. This foundation lets us focus on what PWC couldn't do: continuous verification and community-driven updates.
| What we inherited | What we're adding |
|---|---|
| Historical benchmark results | Fresh verification (Dec 2025) |
| Paper-to-code linkages | Working code verification |
| Static leaderboards | Community submissions + voting |
| Aggregated claims | We run benchmarks ourselves |
| Research-focused metrics | Practical recommendations |
Join the Flywheel
Every submission improves the data. Every issue report increases accuracy. Every user makes the flywheel spin faster.