APPS comprises 10,000 coding problems from Codewars, AtCoder, Kattis, and Codeforces, ranging in difficulty from introductory to competition level.
Pass@5 is the reported evaluation metric for APPS. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
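For readers unfamiliar with the metric, the sketch below shows one common way to compute pass@k: the unbiased estimator introduced with the Codex evaluation (Chen et al., 2021). This page does not state that the scores listed here were computed this way. The function names `pass_at_k` and `mean_pass_at_k` and the sample counts in the usage line are illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: the k in pass@k (5 for the metric reported on this page)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are wrong, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that a random size-k subset is all failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical example: 10 samples, 2 of them correct, k = 5.
print(round(pass_at_k(10, 2, 5), 4))  # 0.7778
```

The value returned is a fraction between 0 and 1; multiplying by 100 gives a percentage, which is how the scores in the table are expressed.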
Muted rows were not state of the art when published: an earlier or same-year result already scored better.
| Rank | Model | Trust | Score (Pass@5, %) | Year |
|---|---|---|---|---|
| 01 | CodeLlama-34B | verified | 32.81 | 2023 |
| 02 | CodeLlama-13B | verified | 23.74 | 2023 |
| 03 | CodeLlama-7B | verified | 10.76 | 2023 |
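The muting rule stated above the table can be sketched as a simple comparison. This is only one reading of that note: it assumes year-level granularity, treats ties as not muted, and uses hypothetical names (`Entry`, `is_muted`). Which rows the site actually mutes cannot be seen from the text alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: float  # Pass@5, in percent
    year: int


def is_muted(entry: Entry, entries: list[Entry]) -> bool:
    """True if a result from the same or an earlier year scored strictly higher."""
    return any(
        other.year <= entry.year and other.score > entry.score
        for other in entries
    )


# Rows copied from the table above.
rows = [
    Entry("CodeLlama-34B", 32.81, 2023),
    Entry("CodeLlama-13B", 23.74, 2023),
    Entry("CodeLlama-7B", 10.76, 2023),
]

muted = [row.model for row in rows if is_muted(row, rows)]
print(muted)  # ['CodeLlama-13B', 'CodeLlama-7B']
```

Under this reading, only the top-scoring 2023 entry counts as state of the art at publication, because all three results share the same year.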