Benchmark detail
Grasp benchmark · Parallel-jaw

Dex-Net 2.0

parallel-jaw · GQ-CNN · synthetic training · sim-to-real

Dex-Net 2.0 introduced the Grasp-Quality CNN (GQ-CNN), a network that scores candidate parallel-jaw grasps from a single depth image. It was trained on 6.7 million synthetic point clouds paired with grasps, each labeled with an analytic robustness metric and generated from thousands of 3D models.

It was a landmark demonstration that a model trained entirely in simulation could transfer to a physical robot with high reliability, anchoring the synthetic-training paradigm that still dominates grasp learning.
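
The "score candidates, pick the best" pattern described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Grasp` fields, `select_grasp`, and especially `toy_score` are hypothetical names introduced here, and `toy_score` is a trivial stand-in for a learned quality network such as GQ-CNN.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

DepthImage = Sequence[Sequence[float]]  # row-major depth values in meters


@dataclass(frozen=True)
class Grasp:
    """A planar parallel-jaw grasp candidate (hypothetical layout)."""
    row: int      # pixel row of the grasp center
    col: int      # pixel column of the grasp center
    angle: float  # gripper axis angle in radians
    depth: float  # gripper depth from the camera in meters


def select_grasp(
    depth_image: DepthImage,
    candidates: Sequence[Grasp],
    score_fn: Callable[[DepthImage, Grasp], float],
) -> Tuple[Grasp, float]:
    """Score every candidate and return the highest-scoring one with its score."""
    if not candidates:
        raise ValueError("no grasp candidates to rank")
    scored = [(score_fn(depth_image, g), g) for g in candidates]
    best_score, best_grasp = max(scored, key=lambda pair: pair[0])
    return best_grasp, best_score


def toy_score(depth_image: DepthImage, grasp: Grasp) -> float:
    """Stand-in scorer (NOT GQ-CNN): prefers grasps near the image center."""
    rows, cols = len(depth_image), len(depth_image[0])
    dr = (grasp.row - (rows - 1) / 2) / rows
    dc = (grasp.col - (cols - 1) / 2) / cols
    return max(0.0, 1.0 - (dr * dr + dc * dc) ** 0.5)


# Usage: a flat 5x5 depth image and three hand-written candidates.
image = [[0.7] * 5 for _ in range(5)]
candidates = [
    Grasp(row=0, col=0, angle=0.0, depth=0.65),
    Grasp(row=2, col=2, angle=1.57, depth=0.65),
    Grasp(row=4, col=1, angle=0.5, depth=0.65),
]
best, score = select_grasp(image, candidates, toy_score)
```

In a real system the scorer would be a trained network evaluated on an image crop aligned to each candidate, and candidate generation would come from a sampler rather than a hand-written list; both are outside the scope of this sketch.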

At a glance
Source: Mahler et al., RSS 2017
Year: 2017
Scale: 6.7M synthetic point clouds + grasps from thousands of 3D models
Gripper: Parallel-jaw
Modality: Depth
Best-known: HW: 93% on adversarial · 99% precision on 40 novel objects (YuMi)
Key results
  • HW: 93% success on a set of known adversarial objects (ABB YuMi)
  • HW: 99% precision on 40 novel objects
  • Introduced the GQ-CNN grasp-quality network

SIM = simulation result · HW = physical hardware. Image-wise accuracy is detection quality, not real-robot pick success. Figures cited from Mahler et al., RSS 2017.
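
The two hardware figures above use different metrics, and a short sketch may help show how they can diverge. It assumes, as a labeled assumption rather than a definition given on this page, that success rate counts all attempted grasps while precision counts only grasps whose predicted robustness exceeds a threshold (0.5 here). The trial data are invented for illustration and are not results from the paper.

```python
from typing import Iterable, Optional, Tuple

# (predicted robustness in [0, 1], whether the physical grasp succeeded)
Trial = Tuple[float, bool]


def success_rate(trials: Iterable[Trial]) -> float:
    """Fraction of all attempted grasps that succeeded."""
    outcomes = [ok for _, ok in trials]
    if not outcomes:
        raise ValueError("no trials recorded")
    return sum(outcomes) / len(outcomes)


def precision(trials: Iterable[Trial], threshold: float = 0.5) -> Optional[float]:
    """Success rate restricted to grasps predicted above `threshold`.

    Returns None when no grasp clears the threshold. The 0.5 default is an
    assumption made for this sketch.
    """
    confident = [ok for quality, ok in trials if quality > threshold]
    if not confident:
        return None
    return sum(confident) / len(confident)


# Invented trial log: five attempts, three of them predicted above 0.5.
trials = [(0.9, True), (0.8, True), (0.6, True), (0.4, False), (0.3, True)]
overall = success_rate(trials)    # 4 successes out of 5 attempts
confident_only = precision(trials)  # 3 successes out of 3 confident attempts
```

Under this reading, precision can exceed the overall success rate because low-confidence attempts are excluded from its denominator, so the two percentages are not directly comparable.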

Related benchmarks

  • GraspNet-1Billion (parallel-jaw): De-facto clutter benchmark · AnyGrasp current SOTA (AP)
  • Grasp-Anything (parallel-jaw): Language-driven grasp synthesis · open-vocabulary scenes
  • Jacquard (parallel-jaw): ~95% image-wise (GR-ConvNet-class)
  • Cornell Grasp (parallel-jaw): ~99% image-wise accuracy, saturated benchmark