
Tabular Classification

Tabular classification — predicting discrete labels from structured rows and columns — remains the one domain where gradient-boosted trees (XGBoost, LightGBM, CatBoost) stubbornly hold their ground against deep learning. Despite years of effort, neural approaches such as TabNet (2019) and FT-Transformer (2021) only match tree methods on certain splits, and a 2022 NeurIPS study by Grinsztajn et al. confirmed that trees still dominate on medium-sized datasets. The real frontier is AutoML systems (AutoGluon, FLAML) that ensemble both paradigms, and the emerging question of whether foundation models pretrained on millions of tables can finally tip the balance.


Tabular classification predicts categorical outcomes from structured row-column data — the bread and butter of real-world ML. Gradient-boosted trees (XGBoost, LightGBM, CatBoost) remain king, consistently beating deep learning approaches despite years of research into neural tabular methods.

History

2014

XGBoost released — becomes the dominant Kaggle competition tool

2017

LightGBM (Microsoft) introduces gradient-based one-side sampling for faster training

2018

CatBoost (Yandex) handles categorical features natively with ordered target encoding

2019

TabNet (Google) applies attention mechanisms to tabular data

2021

FT-Transformer shows transformers can be competitive on tabular tasks with proper feature tokenization

2022

Grinsztajn et al. show tree-based methods still outperform deep learning on most tabular benchmarks

2023

TabPFN uses in-context learning to predict on small tabular datasets without training

2024

TabR and ModernNCA show retrieval-augmented approaches improve deep tabular performance

2024

Large-scale tabular benchmarks (TabZilla, OpenML-CC18) enable rigorous comparison

2025

LLM-based tabular prediction emerges but still lags behind well-tuned GBDT

How Tabular Classification Works

Tabular Classification Pipeline
1

Data Preprocessing

Handle missing values, encode categorical variables (one-hot, target encoding, CatBoost native), and optionally scale numerical features.

2

Feature Engineering

Create interaction features, polynomial features, and domain-specific transformations that capture known relationships.
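A toy example of the kind of hand-crafted features this step produces; the columns (`debt`, `income`, `tenure_months`) are hypothetical stand-ins for domain knowledge:

```python
# Feature engineering sketch: ratio and interaction features from raw columns.
import pandas as pd

df = pd.DataFrame({
    "debt": [5_000, 12_000, 800],
    "income": [40_000, 60_000, 32_000],
    "tenure_months": [12, 48, 6],
})

# Domain ratio: debt burden relative to income.
df["debt_to_income"] = df["debt"] / df["income"]
# Interaction feature: income scaled by how long the account has been active.
df["income_x_tenure"] = df["income"] * df["tenure_months"]

print(df[["debt_to_income", "income_x_tenure"]])
```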

3

Model Training

For GBDT: sequentially fit shallow trees where each tree corrects residual errors of the ensemble. For neural: tokenize features and process through transformer or MLP layers.

4

Hyperparameter Tuning

Critical for both paradigms — learning rate, tree depth, regularization (GBDT) or architecture, dropout, learning rate schedule (neural).

5

Ensemble and Calibration

Combine multiple models via averaging or stacking, and calibrate probabilities if needed for classification tasks.

Current Landscape

Tabular classification in 2025 remains dominated by gradient-boosted trees. Despite significant research into neural approaches (TabNet, FT-Transformer, TabR, TabPFN), well-tuned XGBoost/LightGBM still wins most head-to-head comparisons on standard benchmarks. The gap narrows on large datasets with many features, where transformers can leverage their capacity. In practice, AutoML tools (AutoGluon) that ensemble both paradigms provide the best results. The field's dirty secret: careful feature engineering and hyperparameter tuning matter more than model architecture.

Key Challenges

Trees vs. neural gap — deep learning still hasn't consistently beaten GBDT on heterogeneous tabular data

Feature engineering dependency — performance on tabular data is highly sensitive to manual feature engineering

Small datasets — many real-world tabular tasks have <10K rows, where deep learning overfits

Heterogeneous features — mixing numerical, categorical, ordinal, and text features in one model is non-trivial

Distribution shift — tabular data in production drifts over time, requiring monitoring and retraining

Quick Recommendations

Default choice for tabular

XGBoost / LightGBM / CatBoost

Consistently the best or near-best on tabular benchmarks; fast, interpretable, robust

Deep learning on tabular

FT-Transformer / TabR

Best neural approaches, competitive with GBDT when well-tuned

Very small datasets (<1K rows)

TabPFN

In-context learning approach that requires no training, works well on tiny datasets

AutoML

AutoGluon / H2O AutoML

Automatic model selection and ensembling across GBDT and neural methods

What's Next

The frontier is foundation models for tabular data — pretrained on millions of tables and fine-tuned for specific tasks. Early results (TabPFN, CARTE) show promise for small-data regimes. Expect LLM-based approaches that treat table rows as serialized text to improve, especially when tables contain text-heavy columns that benefit from language understanding.
