Model card

OpenVLA.

Stanford / Google DeepMind / TRI · open-source · 7B vision-language-action model based on Llama-2 + fused DINOv2/SigLIP vision encoders

Kim et al. 2024. Open-source VLA model fine-tuned on Open X-Embodiment data.

GitHub ↗

§ 02 · Benchmarks

Every benchmark OpenVLA has a recorded score for.

#	Benchmark	Area · Task	Metric	Value	Rank	Date	Source
01	LIBERO-Long	Robots · Robot Manipulation	success-rate	76.5%	#5/5	—	source ↗

The Rank column shows this model’s position against all other models scored on the same benchmark and metric (competitors after the slash). #1 marks the current SOTA. Rows are sorted by rank, then by newest result.

§ 03 · Strengths by area

Where OpenVLA actually performs.

Robots · benchmark · avg rank #5.0

§ 04 · Papers

1 paper with results for OpenVLA.

2024-06-13 · 1 result
OpenVLA: An Open-Source Vision-Language-Action Model

§ 06 · Sources & freshness

Where these numbers come from.

pwc-dump · result

0 of 1 rows marked verified.
