Continual Learning
Learning new tasks without forgetting old ones.
Continual learning (also called lifelong learning) enables models to learn from non-stationary data streams without losing previous knowledge — the fundamental obstacle is catastrophic forgetting. The three main paradigms are replay-based methods, parameter isolation, and regularization, with experience replay remaining the most practical approach.
History
McCloskey & Cohen document catastrophic interference in neural networks
EWC (Elastic Weight Consolidation) by Kirkpatrick et al. — regularization-based continual learning
PackNet and Progressive Neural Networks use parameter isolation for task-specific subnetworks
Experience Replay variants (ER, MER, A-GEM) show simple replay buffers are highly effective
DER++ combines knowledge distillation with experience replay for a strong baseline
CORe50 and related benchmarks standardize continual learning evaluation
CLS-ER (Complementary Learning Systems) bridges neuroscience theory with replay methods
Continual pretraining of LLMs becomes practically important — how to update without forgetting
CLIP-based continual learning leverages pretrained features for zero-shot class-incremental learning
Foundation model fine-tuning as a continual learning problem — LoRA, adapters, and progressive specialization
How Continual Learning Works
Task Arrival
New data arrives sequentially — new classes, new domains, or distribution shifts — without access to all previous data simultaneously.
Replay / Regularization
To prevent forgetting: (1) replay stored examples from previous tasks, (2) regularize weights to stay near previous solutions (EWC), or (3) freeze/isolate task-specific parameters.
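Option (1) is usually implemented with a fixed-size buffer filled by reservoir sampling, so every example seen so far has an equal chance of being retained. A minimal sketch (the class name and tuple-based storage are illustrative; real implementations store tensors and interleave buffer samples into each training batch):

```python
import random

class ReplayBuffer:
    """Fixed-capacity buffer filled by reservoir sampling: after seeing
    n examples, each one is stored with probability capacity / n."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a stored example with probability capacity / seen.
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.data[idx] = example

    def sample(self, k):
        # Draw a replay mini-batch (without replacement).
        k = min(k, len(self.data))
        return random.sample(self.data, k)
```

During training, each gradient step would use the current batch concatenated with `buffer.sample(k)`, which is essentially the ER baseline.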
Knowledge Consolidation
New knowledge is integrated with existing knowledge through distillation, feature alignment, or complementary learning systems.
Evaluation
Performance is measured on all tasks seen so far — both backward transfer (impact on old tasks) and forward transfer (benefit to new tasks).
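The standard way to report this is an accuracy matrix `R`, where `R[i, j]` is accuracy on task `j` after training on tasks `0..i`; final average accuracy and backward transfer (BWT, following the definitions popularized by the GEM paper) fall out directly. A sketch assuming a square accuracy matrix:

```python
import numpy as np

def cl_metrics(R):
    """R[i, j] = accuracy on task j after training through task i.
    Returns (final average accuracy, backward transfer).
    Negative BWT indicates forgetting of earlier tasks."""
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    avg_acc = R[-1].mean()  # accuracy over all tasks after the last one
    # BWT: how much final accuracy on each old task differs from its
    # accuracy right after it was learned.
    bwt = np.mean([R[-1, j] - R[j, j] for j in range(T - 1)])
    return avg_acc, bwt
```

Forward transfer is computed analogously from the upper triangle of `R`, but it additionally needs a random-initialization baseline per task, so it is omitted here.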
Current Landscape
Continual learning in 2025 is gaining practical importance as foundation models need updating without catastrophic forgetting. The research community has established that simple replay methods (ER, DER++) are strong baselines that are hard to beat consistently. The field is shifting from toy benchmarks (permuted MNIST) to practical scenarios: continual pretraining of LLMs, incremental class learning with pretrained features, and domain adaptation over time. The connection to foundation model fine-tuning (LoRA, adapters) is creating new synergies between continual learning theory and practical model management.
Key Challenges
Catastrophic forgetting — learning new tasks overwrites weights important for old tasks
Task boundary assumption — many methods assume clear task boundaries, which are absent in real-world data streams
Memory constraints — replay buffers consume memory that scales with the number of tasks
Evaluation complexity — comparing methods requires standardized task sequences, which are hard to design fairly
Scalability — most continual learning research uses small datasets (CIFAR, MNIST); scaling to foundation models is different
Quick Recommendations
Practical continual learning
Experience Replay (ER / DER++)
Simple, effective, and consistently competitive across benchmarks
Foundation model updating
LoRA + selective replay
Efficient fine-tuning that preserves base model capabilities
Memory-constrained
EWC / SI (Synaptic Intelligence)
Regularization-based approaches require no replay buffer
Class-incremental learning
FOSTER / MEMO with CLIP features
Leverages pretrained features for incremental class learning
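The EWC/SI recommendation above avoids a replay buffer by adding a quadratic penalty that anchors each parameter to its previous-task value, weighted by an importance estimate (a diagonal Fisher information approximation in EWC). A minimal sketch (dictionary-of-arrays parameters and the `lam` strength are illustrative):

```python
import numpy as np

def ewc_penalty(params, old_params, fisher, lam=1.0):
    """EWC regularizer (sketch): (lam / 2) * sum_i F_i (theta_i - theta*_i)^2.
    `fisher` holds per-parameter importance weights estimated on the
    previous task; important weights are pulled strongly back toward
    their old values, unimportant ones are free to move."""
    total = 0.0
    for name in params:
        diff = params[name] - old_params[name]
        total += (fisher[name] * diff ** 2).sum()
    return 0.5 * lam * total
```

In training, this penalty is added to the new task's loss, so memory cost is a stored copy of the old parameters plus one importance value per parameter, independent of the number of examples seen.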
What's Next
The frontier is continual learning for foundation models — updating LLMs, VLMs, and other large models with new knowledge without degrading existing capabilities. Expect methods that combine efficient fine-tuning (LoRA) with intelligent replay and knowledge consolidation, and practical tools for managing model versions across continuous data streams.
Benchmarks & SOTA
No datasets indexed for this task yet.