| 01 | o4-mini: OpenAI model card. MBPP pass@1. | verified | 94.9 | 2026 | Source ↗ |
| 02 | o3-mini: OpenAI o3-mini model card. MBPP pass@1. | verified | 93.3 | 2026 | Source ↗ |
| 03 | Claude Opus 4: Anthropic model card. MBPP pass@1. | verified | 92 | 2026 | Source ↗ |
| 04 | Claude 3.5 Sonnet (Oct 2024): Qwen2.5-Coder tech report, Table 16. | verified | 91 | 2024 | Source ↗ |
| 05 | GPT-4.1: OpenAI GPT-4.1 model card. MBPP pass@1. | verified | 90.9 | 2026 | Source ↗ |
| 06 | Qwen2.5-Coder-32B-Instruct: Qwen2.5-Coder tech report, Table 16. | verified | 90.2 | 2024 | Source ↗ |
| 07 | Qwen2.5-Coder 32B: Table 2, arXiv:2409.12186. MBPP pass@1. | verified | 90.2 | 2024 | Paper ↗ Code ↗ |
| 08 | Claude Sonnet 4: Anthropic model card. MBPP pass@1. | verified | 89.6 | 2026 | Source ↗ |
| 09 | DeepSeek-Coder-V2-Instruct: Qwen2.5-Coder tech report, Table 16. | verified | 89.4 | 2024 | Source ↗ |
| 10 | DeepSeek-V3: DeepSeek-V3 tech report. MBPP pass@1. | verified | 89.3 | 2026 | Source ↗ |
| 11 | Claude 3.5 Sonnet | unverified | 89.2 | 2025 | Source ↗ |
| 12 | claude-35-sonnet | paper | 89.2 | 2025 | Source ↗ |
| 13 | GPT-4o | unverified | 87.8 | 2025 | Source ↗ |
| 14 | GPT-4o (Aug 2024): Qwen2.5-Coder tech report, Table 16. | verified | 86.8 | 2024 | Source ↗ |
| 15 | Qwen2.5-Coder-7B-Instruct: Qwen2.5-Coder tech report, Table 16. | verified | 83.5 | 2024 | Source ↗ |
| 16 | Codestral 22B v0.1: Qwen2.5-Coder tech report, Table 16. | verified | 78.2 | 2024 | Source ↗ |
| 17 | Llama 4 Maverick: Meta Llama 4 Maverick model card. | verified | 77.6 | 2025 | Source ↗ |
| 18 | Llama 4 Maverick (17B-128E): Meta Llama 4 Maverick model card. | verified | 77.6 | 2025 | Source ↗ |
| 19 | Codestral 22B: Mistral official blog, May 2024. MBPP pass@1. | verified | 75.4 | 2024 | Source ↗ |
| 20 | Gemma-3-27b: Gemma 3 tech report. | verified | 74.4 | 2025 | Source ↗ |
| 21 | Gemma 3 27B IT: Gemma 3 tech report. | verified | 74.4 | 2025 | Source ↗ |
| 22 | Gemma 3 12B IT: Gemma 3 tech report. | verified | 73 | 2025 | Source ↗ |
| 23 | Llama 4 Scout (17B-16E): Meta Llama 4 Scout model card, pre-trained. | verified | 67.8 | 2025 | Source ↗ |
| 24 | Llama-4-Scout: Meta Llama 4 Scout model card, pre-trained. | verified | 67.8 | 2025 | Source ↗ |
| 25 | Gemma 3 4B IT: Gemma 3 tech report. | verified | 63.2 | 2025 | Source ↗ |
| 26 | Code Llama 34B: Code Llama paper, arXiv:2308.12950. MBPP pass@1. | verified | 62.6 | 2026 | Source ↗ |
| 27 | StarCoder2 15B: Table 2, arXiv:2402.19173. StarCoder2-15B base model. | verified | 54.4 | 2024 | Paper ↗ Code ↗ |