How many models are tracked on LIBERO-Long?

Codesota tracks 5 models on LIBERO-Long.

When was the LIBERO-Long leaderboard last updated?

The LIBERO-Long leaderboard on Codesota includes results through 2026, with the earliest tracked result from 2024.

Robot Manipulation · benchmark dataset · 2023 · EN

LIBERO-Long (LIBERO-10): 10 long-horizon robot manipulation tasks.

Name: LIBERO-Long (LIBERO-10): 10 long-horizon robot manipulation tasks Benchmark Results
Creator: Codesota
Published: 2024-01-01
License: https://creativecommons.org/licenses/by/4.0/

LIBERO-Long (also called LIBERO-10) is one of four task suites in the LIBERO benchmark for lifelong robot learning. It contains 10 long-horizon manipulation tasks requiring multi-step reasoning and diverse object/spatial/goal knowledge. Reported as success rate (%).
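As a rough illustration of the metric, a success rate can be read as the percentage of evaluation episodes that end in success. The sketch below assumes a hypothetical input format (a dict of per-task boolean episode outcomes) and pools all episodes equally. The number of episodes per task and the pooling choice are assumptions, not details stated on this page.

```python
def success_rate(outcomes_by_task):
    """Return the pooled success rate in percent.

    outcomes_by_task: dict mapping a task name to a list of booleans,
    one per evaluation episode (hypothetical input format).
    """
    total = sum(len(episodes) for episodes in outcomes_by_task.values())
    if total == 0:
        raise ValueError("no episodes to score")
    successes = sum(sum(episodes) for episodes in outcomes_by_task.values())
    return 100.0 * successes / total


# Toy example: two tasks, four episodes each (made-up outcomes).
example = {
    "task_a": [True, True, False, True],
    "task_b": [True, False, False, True],
}
print(success_rate(example))  # 62.5
```

If a paper instead averages per-task rates, the result matches this pooled figure only when every task has the same number of episodes.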

§ 01 · Leaderboard

Best published scores.

5 results indexed across 1 metric. Shaded row marks current SOTA; ties broken by submission date.

Primary: success_rate · higher is better

#	Model	Org	Submitted	Paper / code	success-rate
01	MolmoAct2-Think	—	May 2026	MolmoAct2: Action Reasoning Models for Real-world Deploy… · code	98.10
02	MolmoAct2	—	May 2026	MolmoAct2: Action Reasoning Models for Real-world Deploy… · code	97.20
03	UD-VLA	—	Nov 2025	Unified Diffusion VLA: Vision-Language-Action Model via … · code	92.70
04	SmolVLA (2.25B)	—	Jun 2025	SmolVLA: A Vision-Language-Action Model for Affordable a… · code	88.75
05	OpenVLA	Stanford / Google DeepMind / TRI	Jun 2024	OpenVLA: An Open-Source Vision-Language-Action Model · code	76.50

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
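The ordering rule described here (higher score first, ties broken by submission date) can be sketched as a single sort with a compound key. This is a minimal illustration, not Codesota's implementation. The page does not say which direction the tie-break runs, so the sketch assumes the earlier submission ranks first, and because only month and year are shown, each date is pinned to the first of the month.

```python
from datetime import date

# Rows copied from the leaderboard table. The day of month is not given on
# this page, so each date is pinned to the 1st (an assumption).
rows = [
    ("OpenVLA", date(2024, 6, 1), 76.50),
    ("UD-VLA", date(2025, 11, 1), 92.70),
    ("MolmoAct2", date(2026, 5, 1), 97.20),
    ("SmolVLA (2.25B)", date(2025, 6, 1), 88.75),
    ("MolmoAct2-Think", date(2026, 5, 1), 98.10),
]


def rank(rows):
    # Higher score first; on a tie, the earlier submission first
    # (the tie-break direction is assumed, not stated on the page).
    return sorted(rows, key=lambda row: (-row[2], row[1]))


ranked = rank(rows)
for position, (model, submitted, score) in enumerate(ranked, start=1):
    print(f"{position:02d} {model} {submitted:%b %Y} {score:.2f}")
```

Negating the score keeps the sort ascending overall, so the date can serve as a plain secondary key without a custom comparator.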

§ 04 · Literature

4 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

MolmoAct2: Action Reasoning Models for Real-world Deployment
May 2026 · MolmoAct2-Think, MolmoAct2 · arXiv · Code

Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process
Nov 2025 · UD-VLA · arXiv · Code

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Jun 2025 · SmolVLA (2.25B) · arXiv · Code

OpenVLA: An Open-Source Vision-Language-Action Model
Jun 2024 · OpenVLA · arXiv · Code

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.

What a submission needs

01 A public checkpoint or API endpoint
02 A reproduction script with frozen commit + seed
03 Declared evaluation environment (Python, deps)
04 One row per metric declared by this dataset
05 A contact so we can follow up on discrepancies
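One way to check a submission against the five items above before sending it is a small completeness check. The sketch below is illustrative only: the manifest layout, every field name, and the example URL are hypothetical and are not Codesota's submission format, which is defined by the submission guide.

```python
# Hypothetical field names, one or more per requirement in the list above.
REQUIRED_FIELDS = {
    "checkpoint_url",  # 01: public checkpoint or API endpoint
    "repro_commit",    # 02: frozen commit of the reproduction script
    "seed",            # 02: seed used by the reproduction script
    "environment",     # 03: declared Python version and dependencies
    "metrics",         # 04: one entry per metric declared by the dataset
    "contact",         # 05: contact for follow-up on discrepancies
}
DATASET_METRICS = {"success_rate"}  # the single metric listed on this page


def missing_items(manifest):
    """Return a list of problems found in a hypothetical manifest dict."""
    absent = sorted(REQUIRED_FIELDS - set(manifest))
    problems = [f"missing field: {name}" for name in absent]
    reported = set(manifest.get("metrics", {}))
    problems += [f"missing metric: {m}" for m in sorted(DATASET_METRICS - reported)]
    return problems


# Made-up draft that omits three of the required fields.
draft = {
    "checkpoint_url": "https://example.com/model",  # placeholder URL
    "seed": 0,
    "metrics": {"success_rate": 90.0},  # made-up score
}
print(missing_items(draft))
# ['missing field: contact', 'missing field: environment', 'missing field: repro_commit']
```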
