Program Repair.

Automatically fixing bugs in code.

1 dataset · 5 results · Canonical metric: correct-patches
§ 02 · Canonical benchmark

The reference dataset.

Defects4J

Standard program repair benchmark with 835 real bugs from 17 open-source Java projects. Each bug comes with a developer fix and a test suite that triggers the failure. The primary metric is the number of correctly fixed bugs: bugs for which a generated patch is plausible (passes the test suite) and is also judged correct.

Primary metric: correct-patches
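
As a rough illustration of how a correct-patches count could be tallied, the sketch below uses the usual program-repair convention that a patch is plausible when it passes the bug's test suite and correct when it is also judged equivalent to the developer fix. The `PatchResult` record, its field names, and the sample bug identifiers are hypothetical; this page does not describe the scoring pipeline behind the leaderboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchResult:
    """One candidate patch for one bug (hypothetical record layout)."""
    bug_id: str           # illustrative identifier such as "Lang-1"
    passes_tests: bool    # patch passes the bug's full test suite (plausible)
    judged_correct: bool  # patch judged equivalent to the developer fix


def count_patches(results):
    """Return (plausible_bugs, correct_bugs), counting each bug at most once."""
    plausible = {r.bug_id for r in results if r.passes_tests}
    # A patch only counts as correct if it is also plausible.
    correct = {r.bug_id for r in results if r.passes_tests and r.judged_correct}
    return len(plausible), len(correct)


# Made-up sample data, not taken from any leaderboard entry.
results = [
    PatchResult("Lang-1", True, True),
    PatchResult("Lang-1", True, False),    # second candidate for the same bug
    PatchResult("Math-2", True, False),    # plausible but not correct
    PatchResult("Chart-3", False, False),  # fails the tests
]

plausible, correct = count_patches(results)
print(plausible, correct)  # 2 1
```

Counting by bug rather than by patch keeps several candidate patches for the same bug from inflating the score, which is why the sketch collects bug identifiers in sets.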
§ 03 · Top 10

Leading models.

Leading models on Defects4J.

#   Model            correct-patches   Year   Source
1   SRepair          101               2026   paper
2   Claude Opus 4    89.0              2026   paper
3   GPT-4o           82.0              2026   paper
4   ChatRepair       78.0              2026   paper
5   AlphaRepair      23.0              2026   paper


§ 04 · All datasets

Tracked datasets.

1 dataset tracked for this task.

Defects4J (canonical) · 5 results · correct-patches · Top: SRepair, 101
§ 05 · Related tasks

Other tasks in Computer Code.

Bug Detection · Code Completion · Code Generation · Code Summarization · Code Translation