TransCoder (GeeksForGeeks) is a code translation benchmark covering C++, Java, and Python, built from 852 GeeksForGeeks programming problems. Each function has a correct implementation in all three languages. The primary metric is Computational Accuracy (CA): a translation counts as correct only if it passes all of its unit tests.
Computational Accuracy is the reported evaluation metric for TransCoder (GeeksForGeeks). Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
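As a rough illustration of how a CA-style score is computed, the sketch below treats each translated function as a Python callable paired with a list of unit tests, and reports the percentage of functions that pass every one of their tests. The names `passes_all_tests`, `computational_accuracy`, and `toy_problems` are hypothetical and are not part of the benchmark's actual harness, which compiles and runs C++, Java, and Python code in their own runtimes.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical representation: a test case is (positional_args, expected_output).
TestCase = Tuple[tuple, Any]
Problem = Tuple[Callable[..., Any], Sequence[TestCase]]


def passes_all_tests(candidate: Callable[..., Any], tests: Sequence[TestCase]) -> bool:
    """Return True only if the candidate gives the expected output on every test."""
    for args, expected in tests:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # A crash counts as a failed translation in this sketch.
            return False
    return True


def computational_accuracy(problems: Sequence[Problem]) -> float:
    """Percentage of problems whose candidate passes all of its unit tests."""
    if not problems:
        raise ValueError("need at least one problem")
    passed = sum(passes_all_tests(cand, tests) for cand, tests in problems)
    return 100.0 * passed / len(problems)


# Toy stand-ins for translated functions (assumed data, for illustration only).
toy_problems: List[Problem] = [
    (lambda a, b: a + b, [((1, 2), 3), ((0, 0), 0)]),    # passes both tests
    (lambda xs: xs[0], [(([3, 1],), 3), (([1, 3],), 3)]),  # wrong on the second test
    (lambda n: 1 // n, [((0,), 0)]),                      # raises ZeroDivisionError
    (lambda x: x * x, [((4,), 16)]),                      # passes
]

score = computational_accuracy(toy_problems)
print(f"CA = {score:.1f}")  # prints "CA = 50.0"
```

Note that the scoring is all-or-nothing per function: the second toy candidate passes one of its two tests but still contributes nothing to the score.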
Higher is better
| Rank | Model | Trust | Score (CA, %) | Year | Links |
|---|---|---|---|---|---|
| 01 | Claude Sonnet 4 | verified | 89.4 | 2026 | Source |
| 02 | GPT-4o | verified | 88.2 | 2024 | Paper, Code |
| 03 | Qwen2.5-Coder 32B | verified | 86.3 | 2024 | Paper, Code |
| 04 | DeepSeek-Coder-V2-Instruct | verified | 84.6 | 2024 | Paper, Code |
| 05 | StarCoder2 15B | verified | 78.4 | 2024 | Paper, Code |
| 06 | CodeT5+ | verified | 72.1 | 2023 | Paper, Code |
| 07 | TransCoder | verified | 68.7 | 2020 | Paper, Code |