LIBERO-Long (also called LIBERO-10) is one of four task suites in the LIBERO benchmark for lifelong robot learning. It contains 10 long-horizon manipulation tasks that require multi-step reasoning and draw on diverse object, spatial, and goal knowledge. Results are reported as success rate (%).
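As a rough illustration of how a success-rate score over the 10 tasks could be computed, the sketch below averages per-task success percentages. The function names, task names, and episode counts are hypothetical, and the choice to average per task (rather than pool all episodes) is an assumption; individual papers may follow a different protocol. When every task has the same number of episodes, the two approaches give the same number.

```python
from typing import Dict, List


def success_rate(outcomes: List[bool]) -> float:
    """Return the percentage of successful episodes for one task."""
    if not outcomes:
        raise ValueError("no episodes recorded")
    return 100.0 * sum(outcomes) / len(outcomes)


def suite_success_rate(per_task: Dict[str, List[bool]]) -> float:
    """Return the mean of per-task success rates (an assumed protocol)."""
    if not per_task:
        raise ValueError("no tasks recorded")
    rates = [success_rate(outcomes) for outcomes in per_task.values()]
    return sum(rates) / len(rates)


# Hypothetical outcomes: two tasks with four episodes each.
example = {
    "task_a": [True, True, False, True],   # 3 of 4 succeed -> 75.0
    "task_b": [True, False, False, True],  # 2 of 4 succeed -> 50.0
}
print(suite_success_rate(example))  # 62.5
```

A score such as 88.75 in the tables below would correspond to this kind of percentage, although the number of episodes behind each published result is not stated here.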
Success Rate is the reported evaluation metric for LIBERO-Long. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Score (%) | Year | Links |
|---|---|---|---|---|---|
| 01 | MolmoAct2-Think | unverified | 98.1 | 2026 | Paper, Code |
| 02 | MolmoAct2 | unverified | 97.2 | 2026 | Paper, Code |
| 03 | UD-VLA | unverified | 92.7 | 2025 | Paper, Code |
| 04 | SmolVLA (2.25B) | unverified | 88.75 | 2025 | Paper, Code |
| 05 | OpenVLA | unverified | 76.5 | 2024 | Paper, Code |
The results below use the same success-rate metric and carry the trust label "paper". Higher is better.
| Rank | Model | Trust | Score (%) | Year | Links |
|---|---|---|---|---|---|
| 01 | π0 (Pi-Zero) | paper | 85.2 | 2026 | Source |
| 02 | OpenVLA | paper | 53.7 | 2026 | Source |
| 03 | Octo-Base | paper | 51.1 | 2026 | Source |