CrossCodeEval is a multilingual benchmark (Python, TypeScript, Java, C#) for code completion that can only be solved with context from other files in the repository. It contains 1000 examples per language, drawn from GitHub repos. The primary metric is Exact Match.
Exact Match is the reported evaluation metric for CrossCodeEval. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
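
As a rough illustration of how an Exact Match score is computed, the sketch below counts a completion as correct only when it equals the reference string. The helper names (`exact_match`, `exact_match_score`) and the whitespace trimming are assumptions made for this example; the benchmark's official scoring script may normalize completions differently.

```python
def exact_match(prediction: str, reference: str) -> bool:
    """Return True when the completion equals the reference.

    Assumption: leading and trailing whitespace is ignored. The official
    CrossCodeEval scoring may apply different normalization.
    """
    return prediction.strip() == reference.strip()


def exact_match_score(predictions: list[str], references: list[str]) -> float:
    """Percentage of examples whose prediction exactly matches its reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Hypothetical completions for illustration only, not benchmark data.
preds = ["return self.client.get(url)", "x = 1", "foo(bar)"]
refs = ["return self.client.get(url)", "x = 2", "foo(bar)  "]

score = exact_match_score(preds, refs)
print(f"Exact Match: {score:.1f}")
```

Here two of the three hypothetical completions match their references, so the score is two thirds expressed as a percentage. Because the metric is all-or-nothing per example, a completion that differs by a single token earns no credit.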
| Rank | Model | Trust | Score (Exact Match) | Year | Links |
|---|---|---|---|---|---|
| 01 | Claude Sonnet 4 | verified | 44.5 | 2026 | Source |
| 02 | Qwen2.5-Coder 32B | verified | 43.7 | 2024 | Paper, Code |
| 03 | DeepSeek-Coder-V2-Instruct | verified | 41.3 | 2024 | Paper, Code |
| 04 | GPT-4o | verified | 38.2 | 2023 | Paper, Code |
| 05 | Codestral 22B | verified | 35.6 | 2024 | Source |
| 06 | StarCoder2 15B | verified | 32.1 | 2024 | Paper, Code |