Codesota · Benchmark · CodeSearchNet

CodeSearchNet.

A benchmark for code summarization (docstring generation) across six programming languages: Python, Java, JavaScript, PHP, Ruby, and Go. It contains over 2M (code, docstring) pairs. The primary metric is BLEU-4.

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.

BLEU-4

BLEU-4 is the primary evaluation metric reported for CodeSearchNet. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
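For readers unfamiliar with the metric, the following is a minimal sketch of corpus-level BLEU-4 with one reference per generated docstring. It assumes whitespace tokenization, uniform weights over 1- to 4-grams, and a 0 to 100 scale; the function names `ngrams` and `corpus_bleu4` are illustrative, and the evaluation scripts behind the scores on this page may differ in tokenization and other details.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(hypotheses, references):
    """Corpus-level BLEU-4 (no smoothing), one reference per hypothesis.

    Both arguments are lists of token lists. Returns a score in [0, 100].
    """
    matches = [0] * 4  # clipped n-gram matches, per order
    totals = [0] * 4   # n-grams produced by the model, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, 5):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Counter intersection clips each n-gram to its reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a single zero precision makes the whole score zero.
    if min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / 4
    # Brevity penalty: penalize output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Illustrative example (made-up docstrings, not benchmark data).
hyp = "return the sum of two numbers".split()
ref = "return the sum of two integers".split()
print(round(corpus_bleu4([hyp], [ref]), 2))  # 75.98
```

In the example the n-gram precisions are 5/6, 4/5, 3/4 and 2/3, whose geometric mean is (1/3)^(1/4), and the brevity penalty is 1 because both strings have six tokens.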

Trust tiers for BLEU-4: verified, paper, vendor, community, unverified.

All scores are BLEU-4 on the Python split.

| Rank | Model | Trust | Score | Year | Notes |
|------|-------|-------|-------|------|-------|
| 01 | GPT-4o | verified | 25.3 | 2026 | LLM code summarization evaluation study (arxiv:2407.01511) |
| 02 | Qwen2.5-Coder 32B | verified | 23.4 | 2024 | Qwen2.5-Coder paper |
| 03 | DeepSeek-Coder-V2-Instruct | verified | 22.8 | 2024 | DeepSeek-Coder-V2 paper |
| 04 | CodeT5+ 2B | verified | 21.36 | 2023 | CodeT5+ paper, Table 4 |
| 05 | CodeT5+ | verified | 20.01 | 2023 | CodeT5+ paper (220M encoder-decoder variant) |
| 06 | UniXcoder | verified | 19.06 | 2022 | UniXcoder paper, Table 2 |
| 07 | CodeBERT | verified | 17.65 | 2020 | CodeBERT paper, Table 3 |

Smoothed BLEU-4

Smoothed BLEU-4 is also reported for CodeSearchNet. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
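Smoothing matters for short docstrings, where a single missing 4-gram would otherwise drive the score to zero. The sketch below shows one common variant, sentence-level BLEU-4 with add-one smoothing on the 2- to 4-gram precisions. The function name `smoothed_bleu4`, the whitespace tokenization, and the choice of smoothing are assumptions for illustration; whether they match the script used for the scores on this page has not been verified.

```python
import math
from collections import Counter


def smoothed_bleu4(hypothesis, reference):
    """Sentence-level BLEU-4 with add-one smoothing for n > 1.

    Both arguments are token lists. Returns a score in [0, 100].
    """
    if not hypothesis:
        return 0.0
    log_precision = 0.0
    for n in range(1, 5):
        hyp_counts = Counter(
            tuple(hypothesis[i:i + n]) for i in range(len(hypothesis) - n + 1)
        )
        ref_counts = Counter(
            tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)
        )
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matched = sum((hyp_counts & ref_counts).values())
        total = sum(hyp_counts.values())
        # Add one to numerator and denominator for bigrams and above only.
        smooth = 1 if n > 1 else 0
        if matched + smooth == 0:
            return 0.0  # no unigram overlap at all
        log_precision += math.log((matched + smooth) / (total + smooth)) / 4
    # Brevity penalty: penalize output shorter than the reference.
    brevity_penalty = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return 100 * brevity_penalty * math.exp(log_precision)


# Illustrative example (made-up docstrings, not benchmark data).
hyp = "returns the sum".split()
ref = "returns the sum of two numbers".split()
print(round(smoothed_bleu4(hyp, ref), 2))  # 36.79
```

The three-token hypothesis contains no 4-grams, so unsmoothed BLEU-4 would be 0. With smoothing, every precision is 1 and the score is determined by the brevity penalty, exp(1 - 6/3).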

Trust tiers for Smoothed BLEU-4: verified, paper, vendor, community, unverified.

All rows are from the paper "CodeBERT: A Pre-Trained Model for Programming and Natural Languages".

| Rank | Model | Trust | Score | Year |
|------|-------|-------|-------|------|
| 01 | CodeBERT (MLM+RTD) | verified | 15.99 | 2020 |
| 02 | CodeBERT (MLM) | verified | 15.55 | 2020 |
| 03 | pre-train w/ code only | verified | 15.15 | 2020 |
| 04 | CodeBERT (RTD) | verified | 15.03 | 2020 |
| 05 | RoBERTa | verified | 14.52 | 2020 |
| 06 | Transformer | verified | 14.31 | 2020 |
| 07 | seq2seq | verified | 13.36 | 2020 |