| 01 | Scrambled code + broken (alter). From paper: Universal Evasion Attacks on Summarization Scoring | verified | 48.18 | 2022 | Paper ↗ Code ↗ |
| 02 | BRIO. BRIO: Bringing Order to Abstractive Summarization. ACL 2022. BART-large with contrastive learning. SOTA on CNN/DM at time of publication. Score from Table 1 of the paper. | verified | 47.78 | 2022 | Paper ↗ |
| 03 | PEGASUS + SummaReranker. From paper: SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization | verified | 47.16 | 2022 | Paper ↗ Code ↗ |
| 04 | GPT-3.5-Turbo + TriSum Rationale. GPT-3.5-Turbo prompted with TriSum structured rationale. Best ROUGE-1 in TriSum paper Table 2. Zero-shot with structured chain-of-thought prompting. | verified | 46.7 | 2024 | Paper ↗ |
| 05 | BRIDO. BRIDO: Bringing Democratic Order to Abstractive Summarization. arXiv Feb 2025. BART-based with democratic contrastive learning. Trades a slight ROUGE drop versus BRIO for better factual consistency (3.82% G-Eval improvement). Table 2. | verified | 45.81 | 2025 | Paper ↗ |
| 06 | TriSum-J. TriSum: Learning Summarization Ability from LLMs with Structured Rationale. NAACL 2024. TriSum-J = joint learning stage. Table 2. | verified | 45.7 | 2024 | Paper ↗ |
| 07 | Fourier Transformer. From paper: Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator | verified | 44.76 | 2023 | Paper ↗ Code ↗ |
| 08 | GLM-XXLarge. From paper: GLM: General Language Model Pretraining with Autoregressive Blank Infilling | verified | 44.7 | 2021 | Paper ↗ Code ↗ |
| 09 | HAT-BART. From paper: Hierarchical Learning for Generation with Long Source Sequences | verified | 44.48 | 2021 | Paper ↗ |
| 10 | MatchSum (RoBERTa-base). From paper: Extractive Summarization as Text Matching | verified | 44.41 | 2020 | Paper ↗ Code ↗ |
| 11 | Hie-BART. From paper: Hie-BART: Document Summarization with Hierarchical BART | verified | 44.35 | 2021 | Paper ↗ |
| 12 | MatchSum (BERT-base). From paper: Extractive Summarization as Text Matching | verified | 44.22 | 2020 | Paper ↗ Code ↗ |
| 13 | BertSumExt. From paper: Text Summarization with Pretrained Encoders | verified | 43.85 | 2019 | Paper ↗ Code ↗ |
| 14 | BigBird-Pegasus. From paper: Big Bird: Transformers for Longer Sequences | verified | 43.84 | 2020 | Paper ↗ Code ↗ |
| 15 | T5-11B. From paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | verified | 43.52 | 2019 | Paper ↗ Code ↗ |
| 16 | SumHiS. SumHiS: Extractive Summarization Exploiting Hidden Structure. arXiv Jun 2024. With semantic filtering. Extractive approach; ROUGE-2 of 32.52 exceeds prior extractive SOTA by 10%. Table 1. | verified | 43.48 | 2024 | Paper ↗ |
| 17 | BERTSUM+Transformer. From paper: Fine-tune BERT for Extractive Summarization | verified | 43.25 | 2019 | Paper ↗ Code ↗ |
| 18 | UniLM (Abstractive Summarization). From paper: Unified Language Model Pre-training for Natural Language Understanding and Generation | verified | 43.08 | 2019 | Paper ↗ Code ↗ |
| 19 | Selector+Pointer Generator. From paper: Mixture Content Selection for Diverse Sequence Generation | verified | 41.72 | 2019 | Paper ↗ Code ↗ |
| 20 | NeuSUM. From paper: Neural Document Summarization by Jointly Learning to Score and Select Sentences | verified | 41.59 | 2018 | Paper ↗ Code ↗ |
| 21 | Bottom-Up Sum. From paper: Bottom-Up Abstractive Summarization | verified | 41.22 | 2018 | Paper ↗ Code ↗ |
| 22 | Llama-2-70B-chat. Llama-2-70B-chat with 7-shot in-context learning on CNN/DailyMail. Best overall ICL result in arXiv:2507.05123 (Jul 2025). Outperforms zero-shot by a substantial margin. | verified | 40.98 | 2025 | Paper ↗ |
| 23 | TaLK Convolutions (Deep). From paper: Time-aware Large Kernel Convolutions | verified | 40.59 | 2020 | Paper ↗ Code ↗ |
| 24 | Lead-3. From paper: Get To The Point: Summarization with Pointer-Generator Networks | verified | 40.34 | 2017 | Paper ↗ Code ↗ |
| 25 | TaLK Convolutions (Standard). From paper: Time-aware Large Kernel Convolutions | verified | 40.03 | 2020 | Paper ↗ Code ↗ |
| 26 | ML + RL (Paulus et al., 2017). From paper: A Deep Reinforced Model for Abstractive Summarization | verified | 39.87 | 2017 | Paper ↗ Code ↗ |
| 27 | DynamicConv. From paper: Pay Less Attention with Lightweight and Dynamic Convolutions | verified | 39.84 | 2019 | Paper ↗ Code ↗ |
| 28 | LightConv. From paper: Pay Less Attention with Lightweight and Dynamic Convolutions | verified | 39.52 | 2019 | Paper ↗ Code ↗ |
| 29 | Synthesizer (R+V). From paper: Synthesizer: Rethinking Self-Attention in Transformer Models | verified | 38.57 | 2020 | Paper ↗ Code ↗ |
| 30 | ML + Intra-Attention (Paulus et al., 2017). From paper: A Deep Reinforced Model for Abstractive Summarization | verified | 38.3 | 2017 | Paper ↗ Code ↗ |
| 31 | Mistral-7B-Instruct-v0.1. Zero-shot evaluation of Mistral-7B-Instruct-v0.1 on CNN/DailyMail. Best zero-shot 7B result in arXiv:2507.05123 (Jul 2025). Standard ROUGE scoring against reference highlights. | verified | 37.44 | 2025 | Paper ↗ |
| 32 | C2F + ALTERNATE. From paper: Coarse-to-Fine Attention Models for Document Summarization | verified | 31.1 | 2017 | Paper ↗ |
| 33 | GPT-2. From paper: Language Models are Unsupervised Multitask Learners | verified | 29.34 | 2019 | Paper ↗ Code ↗ |
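
Several entries above describe their scores as ROUGE-1 measured against reference highlights. As a rough illustration of what that number captures, here is a minimal sketch of a unigram-overlap F1, reported on a 0-100 scale like the score column. It rests on simplifying assumptions: lowercased whitespace tokenization, no stemming, and a single reference. The function name `rouge1_f1` is hypothetical, and the scorers used in the listed papers may tokenize and aggregate differently, so this will not reproduce the reported values exactly.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between two texts.

    Assumption: lowercased whitespace tokens, no stemming or stopword
    handling, one reference per candidate.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token, which gives
    # the clipped number of matching unigrams.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Five of the six unigrams match in each direction, so P = R = F1 = 5/6.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(f"{100 * score:.2f}")  # prints 83.33
```

Because the metric only counts word overlap, a high score does not by itself imply a fluent or faithful summary, which is worth keeping in mind when reading the top row.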