Optical Character Recognition · benchmark dataset · 2020 · EN

cnn-/-daily-mail.

Dataset from Papers With Code

Saturated benchmark · last significant update Jun 2022

ROUGE-based evaluation on this benchmark is saturated: no significant improvements have been reported since 2022. Modern summarization evaluation relies instead on LLM-as-judge methods (G-Eval), human preference evaluations, or factual consistency metrics.
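
For readers unfamiliar with the metrics named in the leaderboard, the following is a minimal sketch of ROUGE-1, ROUGE-2, and ROUGE-L as F1 scores. It assumes lowercased whitespace tokenization and no stemming, so it is a simplified illustration and not the official scorer used by any row in the tables. The helper names (`ngrams`, `rouge_n`, `rouge_l`) are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap (simplified tokenization)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    c = candidate.lower().split()
    r = reference.lower().split()
    return f1(lcs_length(c, r), len(c), len(r))


candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
print(rouge_n(candidate, reference, 1))
print(rouge_n(candidate, reference, 2))
print(rouge_l(candidate, reference))
```

These functions return values between 0 and 1. The leaderboard scores appear to be on a 0 to 100 scale, so multiply by 100 before comparing.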

§ 01 · Leaderboard

Best published scores.

101 results indexed across 4 metrics. Shaded row marks current SOTA; ties broken by submission date.
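
The ordering rule can be sketched as a two-key sort. The rows below are copied from the rouge-2 table, where T5-11B and Fourier Transformer tie at 21.55. The assumption that the earlier submission ranks first is inferred from that tie and is not stated by the page. The `rank` helper is hypothetical.

```python
from datetime import date

# (model, rouge-2 score, submission month) taken from the rouge-2 table.
rows = [
    ("Fourier Transformer", 21.55, date(2023, 5, 1)),
    ("GLM-XXLarge", 21.40, date(2021, 3, 1)),
    ("T5-11B", 21.55, date(2019, 10, 1)),
    ("PEGASUS + SummaReranker", 22.55, date(2022, 3, 1)),
]


def rank(rows):
    """Sort by score (descending), then by submission date (ascending, assumed)."""
    return sorted(rows, key=lambda r: (-r[1], r[2]))


for position, (model, score, submitted) in enumerate(rank(rows), start=1):
    print(f"{position:02d} {model} {score:.2f} {submitted:%b %Y}")
```

Under this assumption T5-11B is placed ahead of Fourier Transformer, which matches positions 07 and 08 of the rouge-2 table.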


Primary metric: accuracy · higher is better
All metrics: ppl, rouge-1, rouge-2, rouge-l
ppl
2 rows
| # | Model | Org | Submitted | Paper / code | ppl |
|---|---|---|---|---|---|
| 01 | Bottom-Up Sum | | Aug 2018 | Bottom-Up Abstractive Summarization · code | 32.75 |
| 02 | C2F + ALTERNATE | | Sep 2017 | papers-with-code | 23.60 |
rouge-1
33 rows
| # | Model | Org | Submitted | Paper / code | rouge-1 |
|---|---|---|---|---|---|
| 01 | Scrambled code + broken (alter) | | Oct 2022 | Universal Evasion Attacks on Summarization Scoring · code | 48.18 |
| 02 | BRIO (OSS) | Yale NLP | Mar 2022 | BRIO: Bringing Order to Abstractive Summarization | 47.78 |
| 03 | PEGASUS + SummaReranker | | Mar 2022 | SummaReranker: A Multi-Task Mixture-of-Experts Re-rankin… · code | 47.16 |
| 04 | GPT-3.5-Turbo + TriSum Rationale | OpenAI | Mar 2024 | TriSum: Learning Summarization Ability from Large Langua… | 46.70 |
| 05 | BRIDO (OSS) | BRIDO Authors | Feb 2025 | BRIDO: Bringing Democratic Order to Abstractive Summariz… | 45.81 |
| 06 | TriSum-J (OSS) | TriSum Authors | Mar 2024 | TriSum: Learning Summarization Ability from Large Langua… | 45.70 |
| 07 | Fourier Transformer | | May 2023 | Fourier Transformer: Fast Long Range Modeling by Removin… · code | 44.76 |
| 08 | GLM-XXLarge | | Mar 2021 | GLM: General Language Model Pretraining with Autoregress… · code | 44.70 |
| 09 | HAT-BART | | Apr 2021 | Hierarchical Learning for Generation with Long Source Se… | 44.48 |
| 10 | MatchSum (RoBERTa-base) | | Apr 2020 | Extractive Summarization as Text Matching · code | 44.41 |
| 11 | Hie-BART | | Jun 2021 | papers-with-code | 44.35 |
| 12 | MatchSum (BERT-base) | | Apr 2020 | Extractive Summarization as Text Matching · code | 44.22 |
| 13 | BertSumExt | | Aug 2019 | Text Summarization with Pretrained Encoders · code | 43.85 |
| 14 | BigBird-Pegasus | | Jul 2020 | Big Bird: Transformers for Longer Sequences · code | 43.84 |
| 15 | T5-11B (OSS) | Google | Oct 2019 | Exploring the Limits of Transfer Learning with a Unified… · code | 43.52 |
| 16 | SumHiS (OSS) | SumHiS Authors | Jun 2024 | SumHiS: Extractive Summarization Exploiting Hidden Struc… | 43.48 |
| 17 | BERTSUM+Transformer | | Mar 2019 | Fine-tune BERT for Extractive Summarization · code | 43.25 |
| 18 | UniLM (Abstractive Summarization) | | May 2019 | Unified Language Model Pre-training for Natural Language… · code | 43.08 |
| 19 | Selector+Pointer Generator | | Sep 2019 | Mixture Content Selection for Diverse Sequence Generatio… · code | 41.72 |
| 20 | NeuSUM | | Jul 2018 | Neural Document Summarization by Jointly Learning to Sco… · code | 41.59 |
| 21 | Bottom-Up Sum | | Aug 2018 | Bottom-Up Abstractive Summarization · code | 41.22 |
| 22 | Llama-2-70B-chat (OSS) | Meta AI | Jul 2025 | An Evaluation of Large Language Models on Text Summariza… | 40.98 |
| 23 | TaLK Convolutions (Deep) | | Feb 2020 | Time-aware Large Kernel Convolutions · code | 40.59 |
| 24 | Lead-3 | | Apr 2017 | Get To The Point: Summarization with Pointer-Generator N… · code | 40.34 |
| 25 | TaLK Convolutions (Standard) | | Feb 2020 | Time-aware Large Kernel Convolutions · code | 40.03 |
| 26 | ML + RL (Paulus et al., 2017) | | May 2017 | A Deep Reinforced Model for Abstractive Summarization · code | 39.87 |
| 27 | DynamicConv | | Jan 2019 | Pay Less Attention with Lightweight and Dynamic Convolut… · code | 39.84 |
| 28 | LightConv | | Jan 2019 | Pay Less Attention with Lightweight and Dynamic Convolut… · code | 39.52 |
| 29 | Synthesizer (R+V) | | May 2020 | Synthesizer: Rethinking Self-Attention in Transformer Mo… · code | 38.57 |
| 30 | ML + Intra-Attention (Paulus et al., 2017) | | May 2017 | A Deep Reinforced Model for Abstractive Summarization · code | 38.30 |
| 31 | Mistral-7B-Instruct-v0.1 (OSS) | Mistral AI | Jul 2025 | An Evaluation of Large Language Models on Text Summariza… | 37.44 |
| 32 | C2F + ALTERNATE | | Sep 2017 | papers-with-code | 31.10 |
| 33 | GPT-2 | | Feb 2019 | papers-with-code · code | 29.34 |
rouge-2
33 rows
| # | Model | Org | Submitted | Paper / code | rouge-2 |
|---|---|---|---|---|---|
| 01 | SumHiS (OSS) | SumHiS Authors | Jun 2024 | SumHiS: Extractive Summarization Exploiting Hidden Struc… | 32.52 |
| 02 | BRIO (OSS) | Yale NLP | Mar 2022 | BRIO: Bringing Order to Abstractive Summarization | 23.75 |
| 03 | GPT-3.5-Turbo + TriSum Rationale | OpenAI | Mar 2024 | TriSum: Learning Summarization Ability from Large Langua… | 23.50 |
| 04 | BRIDO (OSS) | BRIDO Authors | Feb 2025 | BRIDO: Bringing Democratic Order to Abstractive Summariz… | 22.95 |
| 05 | TriSum-J (OSS) | TriSum Authors | Mar 2024 | TriSum: Learning Summarization Ability from Large Langua… | 22.70 |
| 06 | PEGASUS + SummaReranker | | Mar 2022 | SummaReranker: A Multi-Task Mixture-of-Experts Re-rankin… · code | 22.55 |
| 07 | T5-11B (OSS) | Google | Oct 2019 | Exploring the Limits of Transfer Learning with a Unified… · code | 21.55 |
| 08 | Fourier Transformer | | May 2023 | Fourier Transformer: Fast Long Range Modeling by Removin… · code | 21.55 |
| 09 | GLM-XXLarge | | Mar 2021 | GLM: General Language Model Pretraining with Autoregress… · code | 21.40 |
| 10 | Hie-BART | | Jun 2021 | papers-with-code | 21.37 |
| 11 | HAT-BART | | Apr 2021 | Hierarchical Learning for Generation with Long Source Se… | 21.31 |
| 12 | BigBird-Pegasus | | Jul 2020 | Big Bird: Transformers for Longer Sequences · code | 21.11 |
| 13 | MatchSum (RoBERTa-base) | | Apr 2020 | Extractive Summarization as Text Matching · code | 20.86 |
| 14 | MatchSum (BERT-base) | | Apr 2020 | Extractive Summarization as Text Matching · code | 20.62 |
| 15 | UniLM (Abstractive Summarization) | | May 2019 | Unified Language Model Pre-training for Natural Language… · code | 20.43 |
| 16 | BertSumExt | | Aug 2019 | Text Summarization with Pretrained Encoders · code | 20.34 |
| 17 | BERTSUM+Transformer | | Mar 2019 | Fine-tune BERT for Extractive Summarization · code | 20.24 |
| 18 | Scrambled code + broken (alter) | | Oct 2022 | Universal Evasion Attacks on Summarization Scoring · code | 19.84 |
| 19 | NeuSUM | | Jul 2018 | Neural Document Summarization by Jointly Learning to Sco… · code | 19.01 |
| 20 | TaLK Convolutions (Deep) | | Feb 2020 | Time-aware Large Kernel Convolutions · code | 18.97 |
| 21 | Selector+Pointer Generator | | Sep 2019 | Mixture Content Selection for Diverse Sequence Generatio… · code | 18.74 |
| 22 | Bottom-Up Sum | | Aug 2018 | Bottom-Up Abstractive Summarization · code | 18.68 |
| 23 | TaLK Convolutions (Standard) | | Feb 2020 | Time-aware Large Kernel Convolutions · code | 18.45 |
| 24 | Lead-3 | | Apr 2017 | Get To The Point: Summarization with Pointer-Generator N… · code | 17.70 |
| 25 | Llama-2-70B-chat (OSS) | Meta AI | Jul 2025 | An Evaluation of Large Language Models on Text Summariza… | 17.23 |
| 26 | Mistral-7B-Instruct-v0.1 (OSS) | Mistral AI | Jul 2025 | An Evaluation of Large Language Models on Text Summariza… | 16.42 |
| 27 | DynamicConv | | Jan 2019 | Pay Less Attention with Lightweight and Dynamic Convolut… · code | 16.25 |
| 28 | Synthesizer (R+V) | | May 2020 | Synthesizer: Rethinking Self-Attention in Transformer Mo… · code | 16.24 |
| 29 | LightConv | | Jan 2019 | Pay Less Attention with Lightweight and Dynamic Convolut… · code | 15.97 |
| 30 | ML + RL (Paulus et al., 2017) | | May 2017 | A Deep Reinforced Model for Abstractive Summarization · code | 15.82 |
| 31 | C2F + ALTERNATE | | Sep 2017 | papers-with-code | 15.40 |
| 32 | ML + Intra-Attention (Paulus et al., 2017) | | May 2017 | A Deep Reinforced Model for Abstractive Summarization · code | 14.81 |
| 33 | GPT-2 | | Feb 2019 | papers-with-code · code | 8.27 |
rouge-l
33 rows
| # | Model | Org | Submitted | Paper / code | rouge-l |
|---|---|---|---|---|---|
| 01 | Scrambled code + broken (alter) | | Oct 2022 | Universal Evasion Attacks on Summarization Scoring · code | 45.35 |
| 02 | BRIO (OSS) | Yale NLP | Mar 2022 | BRIO: Bringing Order to Abstractive Summarization | 44.55 |
| 03 | PEGASUS + SummaReranker | | Mar 2022 | SummaReranker: A Multi-Task Mixture-of-Experts Re-rankin… · code | 43.87 |
| 04 | BRIDO (OSS) | BRIDO Authors | Feb 2025 | BRIDO: Bringing Democratic Order to Abstractive Summariz… | 42.51 |
| 05 | SumHiS (OSS) | SumHiS Authors | Jun 2024 | SumHiS: Extractive Summarization Exploiting Hidden Struc… | 42.44 |
| 06 | TriSum-J (OSS) | TriSum Authors | Mar 2024 | TriSum: Learning Summarization Ability from Large Langua… | 41.90 |
| 07 | HAT-BART | | Apr 2021 | Hierarchical Learning for Generation with Long Source Se… | 41.52 |
| 08 | GLM-XXLarge | | Mar 2021 | GLM: General Language Model Pretraining with Autoregress… · code | 41.40 |
| 09 | Fourier Transformer | | May 2023 | Fourier Transformer: Fast Long Range Modeling by Removin… · code | 41.34 |
| 10 | Hie-BART | | Jun 2021 | papers-with-code | 41.05 |
| 11 | BigBird-Pegasus | | Jul 2020 | Big Bird: Transformers for Longer Sequences · code | 40.74 |
| 12 | GPT-3.5-Turbo + TriSum Rationale | OpenAI | Mar 2024 | TriSum: Learning Summarization Ability from Large Langua… | 40.70 |
| 13 | T5-11B (OSS) | Google | Oct 2019 | Exploring the Limits of Transfer Learning with a Unified… · code | 40.69 |
| 14 | MatchSum (RoBERTa-base) | | Apr 2020 | Extractive Summarization as Text Matching · code | 40.55 |
| 15 | MatchSum (BERT-base) | | Apr 2020 | Extractive Summarization as Text Matching · code | 40.38 |
| 16 | UniLM (Abstractive Summarization) | | May 2019 | Unified Language Model Pre-training for Natural Language… · code | 40.34 |
| 17 | BertSumExt | | Aug 2019 | Text Summarization with Pretrained Encoders · code | 39.90 |
| 18 | BERTSUM+Transformer | | Mar 2019 | Fine-tune BERT for Extractive Summarization · code | 39.63 |
| 19 | Selector+Pointer Generator | | Sep 2019 | Mixture Content Selection for Diverse Sequence Generatio… · code | 38.79 |
| 20 | Bottom-Up Sum | | Aug 2018 | Bottom-Up Abstractive Summarization · code | 38.34 |
| 21 | NeuSUM | | Jul 2018 | Neural Document Summarization by Jointly Learning to Sco… · code | 37.98 |
| 22 | ML + RL (Paulus et al., 2017) | | May 2017 | A Deep Reinforced Model for Abstractive Summarization · code | 36.90 |
| 23 | TaLK Convolutions (Deep) | | Feb 2020 | Time-aware Large Kernel Convolutions · code | 36.81 |
| 24 | DynamicConv | | Jan 2019 | Pay Less Attention with Lightweight and Dynamic Convolut… · code | 36.73 |
| 25 | Lead-3 | | Apr 2017 | Get To The Point: Summarization with Pointer-Generator N… · code | 36.57 |
| 26 | LightConv | | Jan 2019 | Pay Less Attention with Lightweight and Dynamic Convolut… · code | 36.51 |
| 27 | TaLK Convolutions (Standard) | | Feb 2020 | Time-aware Large Kernel Convolutions · code | 36.13 |
| 28 | Synthesizer (R+V) | | May 2020 | Synthesizer: Rethinking Self-Attention in Transformer Mo… · code | 35.95 |
| 29 | ML + Intra-Attention (Paulus et al., 2017) | | May 2017 | A Deep Reinforced Model for Abstractive Summarization · code | 35.49 |
| 30 | C2F + ALTERNATE | | Sep 2017 | papers-with-code | 28.80 |
| 31 | Llama-2-70B-chat (OSS) | Meta AI | Jul 2025 | An Evaluation of Large Language Models on Text Summariza… | 27.52 |
| 32 | GPT-2 | | Feb 2019 | papers-with-code · code | 26.58 |
| 33 | Mistral-7B-Instruct-v0.1 (OSS) | Mistral AI | Jul 2025 | An Evaluation of Large Language Models on Text Summariza… | 24.53 |
Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
§ 04 · Literature

24 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs
  • 01 A public checkpoint or API endpoint
  • 02 A reproduction script with frozen commit + seed
  • 03 Declared evaluation environment (Python, deps)
  • 04 One row per metric declared by this dataset
  • 05 A contact so we can follow up on discrepancies
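
The checklist above could be gathered into a single record before submitting. The sketch below is purely illustrative: the page does not publish a submission schema, so the field names, the `build_manifest` helper, and the example values are all hypothetical. It records the Python version and installed dependency versions, and reports which of the dataset's four declared metrics have no score yet.

```python
import platform
from importlib import metadata

# Metric names as listed on the leaderboard for this dataset.
DECLARED_METRICS = ("ppl", "rouge-1", "rouge-2", "rouge-l")


def dependency_versions(packages):
    """Look up installed versions; packages that are absent are marked as such."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def build_manifest(checkpoint, commit, seed, scores, contact, packages=()):
    """Assemble a hypothetical submission record (not an official format)."""
    unknown = sorted(set(scores) - set(DECLARED_METRICS))
    if unknown:
        raise ValueError(f"metrics not declared by the dataset: {unknown}")
    return {
        "checkpoint": checkpoint,  # item 01
        "commit": commit,          # item 02: frozen commit
        "seed": seed,              # item 02: seed
        "environment": {           # item 03
            "python": platform.python_version(),
            "deps": dependency_versions(packages),
        },
        "rows": [                  # item 04: one row per reported metric
            {"metric": m, "score": scores[m]} for m in DECLARED_METRICS if m in scores
        ],
        "missing_metrics": [m for m in DECLARED_METRICS if m not in scores],
        "contact": contact,        # item 05
    }


manifest = build_manifest(
    checkpoint="https://example.org/checkpoint",  # placeholder URL
    commit="abc1234",                             # placeholder commit hash
    seed=0,
    scores={"rouge-1": 44.0, "rouge-2": 21.0},    # made-up values
    contact="author@example.org",
)
print(manifest["missing_metrics"])
```

Listing missing metrics instead of rejecting the record is a design choice for this sketch, since many leaderboard rows report only the ROUGE metrics.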