Continuous Control

Continuous control (learning smooth motor commands in simulated physics) was transformed by MuJoCo and the OpenAI Gym suite in the mid-2010s. SAC and TD3 (both 2018) became reliable baselines, but the field shifted toward harder problems (humanoid parkour, dexterous manipulation) and sim-to-real transfer after DeepMind's dm_control and NVIDIA's Isaac Gym raised the bar. DreamerV3 (2023) showed that world-model approaches can match or beat model-free methods across dozens of control tasks with a single hyperparameter set, signaling a move toward generalist RL agents.
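As a concrete illustration of these baselines, below is a minimal training sketch using the stable-baselines3 implementation of SAC on a Gym MuJoCo task. The environment choice and the library-default hyperparameters are illustrative assumptions, not the configuration behind the leaderboard below.

```python
# Minimal sketch: train and evaluate SAC on a MuJoCo task.
# Assumes `pip install "stable-baselines3[extra]" "gymnasium[mujoco]"`.
import gymnasium as gym
from stable_baselines3 import SAC
from stable_baselines3.common.evaluation import evaluate_policy

env = gym.make("HalfCheetah-v4")          # illustrative task choice
model = SAC("MlpPolicy", env, verbose=0)  # library-default hyperparameters

# 1M steps matches the evaluation budget cited below; reduce for a smoke test.
model.learn(total_timesteps=1_000_000)

mean_return, std_return = evaluate_policy(model, env, n_eval_episodes=10)
print(f"average-return: {mean_return:.0f} +/- {std_return:.0f}")
```

TD3 drops in by importing `TD3` in place of `SAC`; the two share the same off-policy interface in stable-baselines3.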

1 dataset · 12 results · Canonical metric: average-return

Canonical Benchmark

MuJoCo

Physics-based continuous control benchmark built on the MuJoCo physics engine. Evaluated on 15 DMControl (DeepMind Control Suite) tasks; the metric is the mean normalized score (0 = random, 1000 = expert) at 1M environment steps.

Primary metric: average-return
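A sketch of this evaluation protocol, assuming the dm_control package is available; the task subset, episode count, and random placeholder policy are illustrative stand-ins for a trained agent and the benchmark's exact 15-task list.

```python
# Sketch: average-return over a set of DMControl tasks.
# Assumes `pip install dm_control numpy`; tasks below are a placeholder subset.
import numpy as np
from dm_control import suite

TASKS = [("cheetah", "run"), ("walker", "walk"), ("cartpole", "swingup")]
EPISODES_PER_TASK = 10

def random_policy(spec):
    # Placeholder for a trained agent's action selection.
    return np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)

def episode_return(env):
    """Sum of per-step rewards; DMControl bounds this in [0, 1000]."""
    time_step = env.reset()
    total = 0.0
    while not time_step.last():
        time_step = env.step(random_policy(env.action_spec()))
        total += time_step.reward
    return total

per_task_means = []
for domain, task in TASKS:
    env = suite.load(domain_name=domain, task_name=task)
    returns = [episode_return(env) for _ in range(EPISODES_PER_TASK)]
    per_task_means.append(np.mean(returns))

print(f"average-return across tasks: {np.mean(per_task_means):.1f}")
```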

Top 10

Leading models on MuJoCo.

Rank  Model                  average-return  Year  Source
1     TD3                    5592            2026  paper
2     SAC                    5179            2026  paper
3     PPO                    2038            2026  paper
4     TD-MPC2 (317M params)  960             2026  paper
5     TD-MPC2 (19M params)   953             2026  paper
6     FOWM                   945             2026  paper
7     BRO                    941             2026  paper
8     TD-MPC2 (5M params)    929             2026  paper
9     DreamerV3              897             2026  paper
10    TD-MPC                 857             2026  paper

All datasets

1 dataset tracked for this task.
