Vision-Language Models · benchmark dataset · EN

Meta-World MT50 (authors' collected dataset).

Meta-World (authors' collected dataset) is a collection of simulated demonstrations for the Meta-World MT50 benchmark, used in the SmolVLA paper (arXiv:2506.01844). According to the Hugging Face dataset card (lerobot/metaworld_mt50), the dataset was created with LeRobot and contains 2,500 episodes (total_frames: 204,806) at fps: 80, stored as parquet and video chunks under the apache-2.0 license. The card's metadata (meta/info.json) lists total_tasks: 49, one fewer than the 50 tasks the paper describes. According to the SmolVLA paper, the authors collected 50 demonstrations for each of the 50 MT50 tasks (50 × 50 = 2,500 episodes). For evaluation, they run 10 trials per task and report a binary success rate averaged across tasks. Hugging Face dataset: https://huggingface.co/datasets/lerobot/metaworld_mt50
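The evaluation protocol described above (binary success per trial, averaged per task, then averaged across tasks) can be sketched as follows. This is a minimal illustration, not the SmolVLA authors' evaluation code: the function name `mean_success_rate`, the task names `task_a` and `task_b`, and the toy outcomes are all assumptions made for the example.

```python
from collections import defaultdict

def mean_success_rate(trials):
    """Return (per-task success rates, mean of those rates across tasks).

    `trials` is an iterable of (task_name, success) pairs, where
    `success` is a bool for one evaluation trial (hypothetical format).
    """
    outcomes = defaultdict(list)
    for task, success in trials:
        outcomes[task].append(1.0 if success else 0.0)
    # Average within each task first, so every task carries equal weight.
    task_rates = {task: sum(v) / len(v) for task, v in outcomes.items()}
    overall = sum(task_rates.values()) / len(task_rates)
    return task_rates, overall

# Toy data: 10 trials per task, with placeholder task names.
trials = (
    [("task_a", True)] * 8 + [("task_a", False)] * 2
    + [("task_b", True)] * 5 + [("task_b", False)] * 5
)
task_rates, overall = mean_success_rate(trials)
print(task_rates, overall)

# Sanity check on the figures quoted from the paper and dataset card.
demos_per_task, num_tasks = 50, 50
total_episodes = demos_per_task * num_tasks
mean_frames_per_episode = 204_806 / total_episodes
```

Averaging within each task before averaging across tasks keeps the score balanced even if some tasks were run for a different number of trials.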

§ 01 · Leaderboard

Best published scores.

No results indexed yet — be the first to submit a score.

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it and publish the score. If it takes the top spot, we will annotate that step on the progress chart with your name.

What a submission needs
  • 01 · A public checkpoint or API endpoint
  • 02 · A reproduction script with frozen commit + seed
  • 03 · Declared evaluation environment (Python, deps)
  • 04 · One row per metric declared by this dataset
  • 05 · A contact so we can follow up on discrepancies
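The checklist above can be turned into a quick completeness check before submitting. This is only a sketch: the field names in `REQUIRED_FIELDS`, the helper `missing_fields`, and the draft values are hypothetical and do not reflect an official Codesota submission format.

```python
# Hypothetical field names mirroring the five checklist items above.
REQUIRED_FIELDS = (
    "checkpoint_or_endpoint",  # 01: public checkpoint or API endpoint
    "reproduction_script",     # 02: reproduction script
    "commit",                  # 02: frozen commit
    "seed",                    # 02: seed
    "environment",             # 03: Python version and dependencies
    "metrics",                 # 04: one row per declared metric
    "contact",                 # 05: contact for follow-up
)

def missing_fields(submission):
    """Return the required fields that are absent or empty.

    A seed of 0 counts as present; only None, "", {} and [] are empty.
    """
    missing = []
    for field in REQUIRED_FIELDS:
        value = submission.get(field)
        if value is None or value == "" or value == {} or value == []:
            missing.append(field)
    return missing

# Placeholder draft with made-up values, missing only the contact.
draft = {
    "checkpoint_or_endpoint": "https://example.com/checkpoint",
    "reproduction_script": "eval.py",
    "commit": "abc1234",
    "seed": 0,
    "environment": {"python": "3.10"},
    "metrics": {"success_rate": 0.65},
}
print(missing_fields(draft))
```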