
Continuous Control.

Continuous control, the problem of learning smooth motor commands in simulated physics, was transformed by MuJoCo and the OpenAI Gym suite in the mid-2010s. SAC (2018) and TD3 (2018) became reliable baselines, but the field shifted toward harder locomotion (humanoid parkour, dexterous hands) and sim-to-real transfer after DeepMind's dm_control and NVIDIA's Isaac Gym raised the bar. DreamerV3 (2023) showed that world-model approaches can match or beat model-free methods across dozens of control tasks with a single hyperparameter set, signaling a move toward generalist RL agents.

Datasets: 1 · Results: 9 · Canonical metric: average-return
§ 02 · Canonical benchmark

The reference dataset.

MuJoCo

Physics-based continuous control benchmark built on the MuJoCo simulator. Models are evaluated on 15 DMControl tasks; the metric is the mean normalized score (0 = random, 1000 = expert) at 1M environment steps.
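The page does not state how the normalized score is computed. The sketch below assumes a linear rescale of each task's raw return between a random-policy return and an expert return, followed by an unweighted mean across tasks. The function names, task names, and numbers are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def normalized_score(raw_return, random_return, expert_return):
    """Map a raw return onto a 0-1000 scale (0 = random, 1000 = expert).

    Assumption: linear rescaling; the leaderboard's exact formula is not given.
    """
    span = expert_return - random_return
    if span <= 0:
        raise ValueError("expert_return must exceed random_return")
    return 1000.0 * (raw_return - random_return) / span


def mean_normalized_score(results):
    """Average per-task normalized scores.

    `results` maps a task name to (raw_return, random_return, expert_return).
    """
    if not results:
        raise ValueError("results must not be empty")
    return mean(normalized_score(*triple) for triple in results.values())


# Hypothetical numbers for illustration only, not measured results.
example = {
    "task_a": (450.0, 50.0, 850.0),   # halfway between random and expert
    "task_b": (900.0, 100.0, 900.0),  # matches the expert return
}
score = mean_normalized_score(example)
print(score)
```

Under this assumption a score above 1000 is possible when an agent outperforms the expert reference, so leaderboards that use this kind of scale sometimes clip the result.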

Primary metric: average-return
§ 03 · Top 10

Leading models.

Leading models on MuJoCo.

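| # | Model | average-return | Year | Source |
|---|-------|----------------|------|--------|
| 1 | TD-MPC2 (317M params) | 960 | 2026 | paper |
| 2 | TD-MPC2 (19M params) | 953 | 2026 | paper |
| 3 | FOWM | 945 | 2026 | paper |
| 4 | BRO | 941 | 2026 | paper |
| 5 | TD-MPC2 (5M params) | 929 | 2026 | paper |
| 6 | DreamerV3 | 897 | 2026 | paper |
| 7 | TD-MPC | 857 | 2026 | paper |
| 8 | DrQ-v2 | 799 | 2026 | paper |
| 9 | SAC (state-based) | 777 | 2026 | paper |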

§ 04 · All datasets

Tracked datasets.

1 dataset tracked for this task.

MuJoCo (canonical) · 9 results · average-return · Top: TD-MPC2 (317M params), 960
§ 05 · Related tasks

Other tasks in Reinforcement Learning.

Atari Games · Offline RL