Codesota · Tasks · Continuous Control

Continuous Control.

Continuous control, the problem of learning smooth motor commands in simulated physics, was transformed by MuJoCo and the OpenAI Gym suite in the mid-2010s. SAC (2018) and TD3 (2018) became reliable baselines, but the field shifted toward harder locomotion (humanoid parkour, dexterous hands) and sim-to-real transfer after DeepMind's dm_control and NVIDIA's Isaac Gym raised the bar. DreamerV3 (2023) showed that world-model approaches can match or beat model-free methods across dozens of control tasks with a single hyperparameter set, signaling a move toward generalist RL agents.

Canonical metric: average-return

§ 02 · Canonical benchmark

The reference dataset.

MuJoCo

Physics-based continuous control benchmark. Results are reported on 15 DMControl tasks, which run on the MuJoCo physics engine. The metric is the mean normalized score across tasks (0 = random, 1000 = expert), measured at 1M environment steps.

Primary metric: average-return
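As a rough illustration of how an aggregate score of this kind could be computed, the sketch below rescales each task's mean episode return linearly so that a random policy maps to 0 and an expert maps to 1000, then averages over tasks. The function names, task names, and reference returns are hypothetical and chosen only for the example; the exact normalization and reference values behind the leaderboard are not specified here.

```python
from statistics import mean


def normalized_score(raw_return, random_return, expert_return):
    """Linearly rescale a raw return so that random -> 0 and expert -> 1000."""
    span = expert_return - random_return
    if span <= 0:
        raise ValueError("expert_return must exceed random_return")
    return 1000.0 * (raw_return - random_return) / span


def benchmark_score(per_task_returns, reference):
    """Average the normalized score over all tasks.

    per_task_returns: task name -> list of evaluation episode returns
    reference:        task name -> (random_return, expert_return)
    """
    scores = []
    for task, episode_returns in per_task_returns.items():
        random_ret, expert_ret = reference[task]
        task_mean = mean(episode_returns)
        scores.append(normalized_score(task_mean, random_ret, expert_ret))
    return mean(scores)


# Made-up numbers for two hypothetical tasks (not leaderboard data).
returns = {"task_a": [400.0, 600.0], "task_b": [90.0, 110.0]}
reference = {"task_a": (0.0, 1000.0), "task_b": (0.0, 200.0)}

score = benchmark_score(returns, reference)
print(score)
```

Averaging after normalization keeps tasks with large raw return scales from dominating the aggregate, which is presumably why a normalized mean is used rather than a raw one.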

§ 03 · Top 10

Leading models.

Leading models on MuJoCo.

#	Model	average-return	Year	Source
★	TD-MPC2 (317M params)	960	2026	paper ↗
2	TD-MPC2 (19M params)	953	2026	paper ↗
3	FOWM	945	2026	paper ↗
4	BRO	941	2026	paper ↗
5	TD-MPC2 (5M params)	929	2026	paper ↗
6	DreamerV3	897	2026	paper ↗
7	TD-MPC	857	2026	paper ↗
8	DrQ-v2	799	2026	paper ↗
9	SAC (state-based)	777	2026	paper ↗

§ 04 · All datasets

Tracked datasets.

1 dataset tracked for this task.

MuJoCo

CANONICAL

9 results · average-return

Top: TD-MPC2 (317M params) — 960

§ 05 · Related tasks

Other tasks in Reinforcement Learning.

Atari Games · Offline RL
