Time Series Classification
Classifying time series patterns.
Time series classification assigns labels to temporal sequences — ECG diagnosis, activity recognition, industrial fault detection. ROCKET (RandOm Convolutional KErnel Transform) and InceptionTime dominate benchmarks on the UCR/UEA archives, while foundation models like Chronos are beginning to enable zero-shot classification.
History
UCR Time Series Archive established — becomes the standard evaluation suite
DTW (Dynamic Time Warping) + 1-NN shown to be a surprisingly strong baseline
BOSS (Bag of SFA Symbols) introduces dictionary-based approaches for TSC
COTE/HIVE-COTE ensemble combines multiple distance and feature-based classifiers
InceptionTime applies deep CNN with inception modules, matching HIVE-COTE
ROCKET applies 10K random convolutional kernels, producing two features each — fast, accurate, and scalable
HIVE-COTE 2.0 becomes the most accurate classifier on the UCR archive
MultiROCKET extends ROCKET with multi-scale random convolutions
QUANT and HYDRA push fast feature-based methods further
Time series foundation models applied to classification via embedding extraction
How Time Series Classification Works
Time Series Representation
Raw temporal data (univariate or multivariate) is optionally transformed — z-normalization, resampling, or feature extraction (ROCKET random convolutions, SFA words).
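Z-normalization, the most common of these transforms, rescales each series to zero mean and unit variance so that classifiers compare shape rather than offset or amplitude. A minimal sketch (the guard against constant series is a common convention, not a fixed standard):

```python
import numpy as np

def z_normalize(x):
    # Rescale a series to zero mean, unit variance;
    # leave the scale alone if the series is constant.
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma if sigma > 0 else x - mu

x = np.array([1.0, 2.0, 3.0, 4.0])
z = z_normalize(x)
```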
Feature Extraction
ROCKET: apply 10K random convolutional kernels, extract max value and proportion of positive values per kernel. Alternatively, deep networks (InceptionTime) learn features end-to-end.
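The ROCKET transform described above can be sketched in a few lines of numpy. This is a simplified illustration — the real method also samples a dilation, padding, and kernel-specific scaling per kernel — but it shows the core idea: random kernels, then max and proportion-of-positive-values (PPV) per kernel:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_kernels(n_kernels=100, lengths=(7, 9, 11)):
    # Simplified ROCKET kernels: mean-centred Gaussian weights plus a
    # random bias (real ROCKET also samples dilation and padding).
    kernels = []
    for _ in range(n_kernels):
        w = rng.normal(size=rng.choice(lengths))
        w -= w.mean()
        kernels.append((w, rng.uniform(-1, 1)))
    return kernels

def rocket_features(x, kernels):
    # Two features per kernel: max activation and PPV
    # (the fraction of positive convolution outputs).
    feats = []
    for w, b in kernels:
        conv = np.convolve(x, w, mode="valid") + b
        feats.append(conv.max())
        feats.append((conv > 0).mean())
    return np.array(feats)

kernels = make_kernels()
x = np.sin(np.linspace(0, 6 * np.pi, 200))
f = rocket_features(x, kernels)  # 200 features from a 100-kernel bank
```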
Classification
For ROCKET: a simple ridge classifier or linear SVM on the random features. For deep learning: global average pooling followed by a dense classification layer.
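Putting the feature-extraction and classification steps together, a hedged end-to-end sketch on a toy two-class problem (noise vs. noisy sine; the data, kernel count, and alpha grid here are illustrative choices, not the paper's settings):

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

rng = np.random.default_rng(42)

# Toy problem: class 0 = pure noise, class 1 = noisy sine (length-80 series).
t = np.linspace(0, 4 * np.pi, 80)
X = np.vstack([rng.normal(scale=0.5, size=(50, 80)),
               np.sin(t) + rng.normal(scale=0.5, size=(50, 80))])
y = np.repeat([0, 1], 50)

# Simplified random-convolution features: max + PPV per kernel.
kernels = [(rng.normal(size=9), rng.uniform(-1, 1)) for _ in range(100)]

def featurize(x):
    feats = []
    for w, b in kernels:
        c = np.convolve(x, w - w.mean(), mode="valid") + b
        feats += [c.max(), (c > 0).mean()]
    return feats

F = np.array([featurize(x) for x in X])

# Ridge with built-in cross-validation over the regularisation path:
# cheap to fit even on tens of thousands of features.
clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10)).fit(F, y)
```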
Ensemble (optional)
HIVE-COTE combines multiple representation-specific classifiers (distance, dictionary, interval, shapelet) via weighted voting.
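The weighted-voting step can be illustrated with a small sketch. The function and example data are hypothetical; HIVE-COTE derives each component's weight from its estimated (cross-validation) accuracy:

```python
import numpy as np

def weighted_vote(probas, weights):
    # probas:  list of (n_samples, n_classes) probability arrays, one per
    #          component classifier (distance, dictionary, interval, ...).
    # weights: one weight per component, e.g. its CV accuracy.
    combined = sum(w * p for w, p in zip(weights, probas))
    return combined.argmax(axis=1)

# Two components disagree on sample 1; the higher-weighted one prevails.
p1 = np.array([[0.9, 0.1], [0.4, 0.6]])
p2 = np.array([[0.8, 0.2], [0.6, 0.4]])
pred = weighted_vote([p1, p2], weights=[0.9, 0.6])  # → array([0, 1])
```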
Current Landscape
Time series classification in 2025 has mature, well-benchmarked solutions. ROCKET-family methods offer the best speed-accuracy tradeoff, training in minutes on CPU while matching or exceeding deep learning. HIVE-COTE 2.0 is the accuracy champion but costly. InceptionTime is the go-to deep learning approach. The emerging disruption is foundation model embeddings — using Chronos or TimesFM to extract features from time series, then training a simple classifier on top, which works surprisingly well with few labeled examples.
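The embedding recipe above reduces to two steps: encode each series into a fixed-length vector, then fit a lightweight classifier on a handful of labeled examples. In the sketch below, `embed` is a stand-in random projection, labeled hypothetical — in practice it would call a pretrained encoder such as Chronos or TimesFM (exact embedding API varies by model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
P = rng.normal(size=(64, 128)) / 8.0  # fixed projection for length-64 series

def embed(x):
    # Placeholder embedding (hypothetical): a real pipeline would call a
    # foundation model's encoder here instead of a random projection.
    return np.tanh(x @ P)

# Few-shot setting: only 5 labeled examples per class.
t = np.linspace(0, 4 * np.pi, 64)
X_few = np.vstack([rng.normal(scale=0.3, size=(5, 64)),               # noise
                   np.sin(t) + rng.normal(scale=0.3, size=(5, 64))])  # sine
y_few = np.repeat([0, 1], 5)

E = np.vstack([embed(x) for x in X_few])
clf = LogisticRegression(max_iter=1000).fit(E, y_few)
```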
Key Challenges
Variable-length sequences require alignment or padding strategies that can distort temporal patterns
Class imbalance — many real-world TSC tasks have rare but important classes (fault detection, rare diseases)
Multivariate complexity — correlations across channels in multivariate time series are hard to capture effectively
Interpretability — explaining why a time series was classified a certain way is important for medical and industrial applications
Scalability — HIVE-COTE is highly accurate but computationally expensive; ROCKET scales much better
Quick Recommendations
Fast and accurate baseline
ROCKET / MultiROCKET
Best accuracy-speed tradeoff — trains in minutes, competitive with deep learning
Maximum accuracy
HIVE-COTE 2.0
Most accurate on the UCR archive, but slower to train
Deep learning approach
InceptionTime
Best deep learning baseline, GPU-friendly, handles multivariate naturally
Transfer learning / few-shot
Chronos/TimesFM embeddings + classifier
Foundation model features enable classification with very few labeled examples
What's Next
The frontier is few-shot and zero-shot time series classification using foundation models, enabling classification on new domains without domain-specific training data. Expect advances in multivariate TSC (EEG, sensor arrays), interpretable classification (which temporal patterns drive the decision), and real-time streaming classification for edge deployment.
Benchmarks & SOTA
No datasets indexed for this task yet.
Related Tasks
Time Series Forecasting
Time-series forecasting exploded in 2023-2025 when foundation models crossed over from NLP. Nixtla's TimeGPT (2023), Google's TimesFM (2024), and Amazon's Chronos showed that a single pretrained model can zero-shot forecast diverse series, rivaling task-specific statistical models like ETS and ARIMA. Yet the Monash benchmark and M-competition lineage (M4, M5) reveal an uncomfortable truth: simple ensembles of statistical methods still win on many univariate tasks. The real battle now is multivariate long-horizon forecasting, where PatchTST and iTransformer compete with state-space models like Mamba.
Tabular Classification
Tabular classification — predicting discrete labels from structured rows and columns — remains the one domain where gradient-boosted trees (XGBoost, LightGBM, CatBoost) stubbornly rival deep learning. Despite years of effort, neural approaches like TabNet (2019) and FT-Transformer (2021) only match tree methods on certain splits, and a 2022 NeurIPS study by Grinsztajn et al. confirmed that trees still dominate on medium-sized datasets. The real frontier is AutoML systems (AutoGluon, FLAML) that ensemble both paradigms, and the emerging question of whether foundation models pretrained on millions of tables can finally tip the balance.
Tabular Regression
Tabular regression — predicting continuous values from structured data — powers everything from house-price estimation to demand forecasting and shares the same tree-vs-neural tension as classification. XGBoost and LightGBM remain brutally effective defaults, but recent work on differentiable trees and table-aware transformers (TabPFN, 2022) showed that meta-learned priors can beat tuned GBDTs on small datasets in seconds. The challenge is distribution shift: real-world regression targets drift over time, and most benchmarks (UCI, Kaggle) are static snapshots that hide this problem entirely.