RLBench is a large-scale robot learning benchmark with 100 diverse manipulation tasks in simulation. It is a standard multi-task benchmark for language-conditioned robotic manipulation. Results on this page are evaluated on 18 of those tasks, with 100 demonstrations per task.
Average task success rate across 18 RLBench manipulation tasks with 100 demonstrations each.
Higher is better
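As a rough illustration of the metric, the sketch below computes an average task success rate in percent, assuming every task is weighted equally (a macro-average). The function name `average_success_rate`, the task names, and the episode counts are all hypothetical placeholders, not values or code from the benchmark itself.

```python
from statistics import mean


def average_success_rate(results):
    """Return the mean per-task success rate in percent.

    `results` maps a task name to (successful episodes, total episodes).
    Assumption: each task counts equally, regardless of episode count.
    """
    if not results:
        raise ValueError("no task results given")
    per_task = []
    for task, (successes, episodes) in results.items():
        if episodes <= 0 or not 0 <= successes <= episodes:
            raise ValueError(f"invalid counts for task {task!r}")
        per_task.append(100.0 * successes / episodes)
    return mean(per_task)


# Hypothetical counts for three placeholder tasks, not real benchmark data.
example = {"task_a": (20, 25), "task_b": (10, 25), "task_c": (15, 25)}
print(round(average_success_rate(example), 1))  # prints 60.0
```

Averaging per-task rates rather than pooling all episodes keeps a task with more evaluation episodes from dominating the score. Whether the leaderboard computes its scores exactly this way is an assumption.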
Muted rows were not state of the art when published: an earlier or same-year result already scored better.
| Rank | Model | Trust | Score (%) | Year |
|---|---|---|---|---|
| 01 | RVT-2 | verified | 81.4 | 2026 |
| 02 | RVT | verified | 62.9 | 2026 |
| 03 | PerAct | verified | 43.4 | 2026 |