LiveCodeBench Pro is an Elo-rated competitive programming benchmark built from continuously updated Codeforces, ICPC, and IOI problems. Each LLM is treated as a virtual Codeforces contestant, and its rating is fit by Bayesian maximum a posteriori (MAP) Elo estimation on the standard Codeforces scale (roughly 800 for a novice to roughly 3800 for a top human). It was built by Olympiad medalists and is designed to limit contamination.
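To make the rating fit concrete, here is a minimal sketch of a MAP Elo estimate in Python. It is not the benchmark's published implementation. It assumes the usual Elo logistic curve with a 400-point scale for the chance of solving a problem, and a Gaussian prior on the model's rating. The names `solve_probability` and `map_elo`, the prior parameters `prior_mean` and `prior_sigma`, and the search bounds are all hypothetical choices made for illustration.

```python
import math


def solve_probability(rating, difficulty):
    """Elo-style chance that a contestant rated `rating` solves a problem rated `difficulty`."""
    return 1.0 / (1.0 + 10.0 ** ((difficulty - rating) / 400.0))


def map_elo(results, prior_mean=1500.0, prior_sigma=350.0, lo=0.0, hi=4000.0):
    """Estimate a rating from (difficulty, solved) pairs by maximizing the log-posterior.

    The prior (mean 1500, sigma 350) and the [lo, hi] search range are
    illustrative assumptions, not values taken from the benchmark.
    """
    k = math.log(10.0) / 400.0  # slope of the logistic curve in rating points

    def gradient(r):
        # Derivative of the Gaussian log-prior with respect to r.
        g = -(r - prior_mean) / prior_sigma ** 2
        # Derivative of the Bernoulli log-likelihood: k * (outcome - predicted).
        for difficulty, solved in results:
            g += k * ((1.0 if solved else 0.0) - solve_probability(r, difficulty))
        return g

    # The log-posterior is concave, so its gradient is decreasing in r and
    # bisection finds the maximizer (clamped to [lo, hi] if it lies outside).
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if gradient(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical outcomes: (problem rating, whether the model solved it).
    outcomes = [(1200, True), (1600, True), (2000, False), (2400, False)]
    print(round(map_elo(outcomes)))
```

With no results the estimate falls back to the prior mean, and each solved problem pulls it upward in proportion to how unlikely that solve was under the current rating. A wider `prior_sigma` lets the observed outcomes dominate sooner.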
Elo is the reported evaluation metric for LiveCodeBench Pro. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Elo | Year |
|---|---|---|---|---|
| 01 | Gemini 3.1 Pro | verified | 2887 | 2026 |
| 02 | Gemini 3 Pro | verified | 2439 | 2026 |
| 03 | GPT-5 | verified | 2176 | 2026 |
| 04 | o4-mini | verified | 2092 | 2026 |
| 05 | Gemini 2.5 Pro | verified | 1769 | 2026 |
| 06 | Qwen3-235B-A22B | verified | 1673 | 2026 |
| 07 | Claude Sonnet 4.5 | verified | 1412 | 2026 |
| 08 | Gemini 2.5 Flash | verified | 1288 | 2026 |
| 09 | DeepSeek R1 | verified | 1161 | 2026 |
| 10 | o3 | verified | 1010 | 2026 |