Cornell Grasp
The Cornell Grasp Dataset is the classic planar-grasp benchmark: 885 RGB-D images of 240 graspable objects annotated with 8,019 ground-truth grasp rectangles. It defined the 5-parameter planar grasp representation (x, y, θ, width, height).
It is effectively saturated: modern detectors report ~99% image-wise accuracy. That figure measures detection quality on images, not physical pick success, so it should not be read as a measure of real-robot reliability.
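To make the 5-parameter representation concrete, here is a minimal Python sketch that turns (x, y, θ, width, height) into the four corners of a grasp rectangle. The names `Grasp` and `grasp_to_corners` are hypothetical and not part of any dataset tooling. The sketch assumes θ is in radians and that width runs along the rectangle's rotated x-axis; which side corresponds to the gripper opening varies between papers.

```python
import math
from dataclasses import dataclass


@dataclass
class Grasp:
    """Planar grasp in image coordinates (hypothetical container)."""
    x: float       # rectangle centre, pixels
    y: float       # rectangle centre, pixels
    theta: float   # in-plane rotation, radians (assumed convention)
    width: float   # extent along the rotated x-axis (assumed convention)
    height: float  # extent along the rotated y-axis (assumed convention)


def grasp_to_corners(g: Grasp) -> list[tuple[float, float]]:
    """Return the four rectangle corners, in order around the rectangle."""
    c, s = math.cos(g.theta), math.sin(g.theta)
    hw, hh = g.width / 2.0, g.height / 2.0
    corners = []
    # Offsets of the corners in the rectangle's own (unrotated) frame.
    for dx, dy in [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)]:
        # Rotate the offset by theta, then translate to the centre.
        corners.append((g.x + dx * c - dy * s, g.y + dx * s + dy * c))
    return corners


if __name__ == "__main__":
    print(grasp_to_corners(Grasp(x=100.0, y=50.0, theta=0.0, width=40.0, height=20.0)))
```

Storing five numbers rather than four corner points keeps the rectangle constraint implicit, which is why detectors can regress the grasp directly.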
Primary source →
- Source: Lenz et al., IJRR / RSS 2013–15
- Year: 2011–13
- Scale: 885 RGB-D images · 240 objects · 8,019 labeled grasp rectangles
- Gripper: Parallel-jaw
- Modality: RGB-D
- Best-known: ~99% image-wise accuracy, saturated benchmark
- ~99% image-wise accuracy (saturated)
- Defined the planar grasp-rectangle representation
- Detection accuracy, not physical pick success
SIM = simulation result · HW = physical hardware. Image-wise accuracy is detection quality, not real-robot pick success. Figures cited from Lenz et al., IJRR / RSS 2013–15.
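To show why image-wise accuracy is a detection score, here is a hedged sketch of the rectangle metric commonly used on this dataset: a prediction counts as correct if it is within 30° of some ground-truth rectangle and overlaps it with IoU above 0.25. Those thresholds are the customary ones, not values taken from this page, and every function name below is hypothetical. Grasps are plain (x, y, θ, width, height) tuples with θ in radians, and the overlap uses a pure-Python Sutherland-Hodgman clip rather than a geometry library.

```python
import math


def rect_corners(x, y, theta, w, h):
    """Corners of a rotated rectangle centred at (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    offs = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(x + dx * c - dy * s, y + dx * s + dy * c) for dx, dy in offs]


def signed_area(poly):
    """Shoelace formula; the sign encodes the vertex orientation."""
    total = 0.0
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def side(a, b, p):
    """Cross product: >= 0 when p is on the inner side of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip_convex(subject, clipper):
    """Sutherland-Hodgman: intersect polygon `subject` with convex `clipper`."""
    if signed_area(clipper) < 0:
        clipper = clipper[::-1]  # normalise orientation so `side` is consistent
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inp, output = output, []
        if not inp:
            break
        for j in range(len(inp)):
            p, q = inp[j - 1], inp[j]
            dp, dq = side(a, b, p), side(a, b, q)
            if (dp >= 0) != (dq >= 0):
                # Edge p->q crosses the clip line: keep the crossing point.
                t = dp / (dp - dq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if dq >= 0:
                output.append(q)
    return output


def iou(g1, g2):
    """Intersection over union of two grasp rectangles."""
    r1, r2 = rect_corners(*g1), rect_corners(*g2)
    inter = abs(signed_area(clip_convex(r1, r2)))
    union = abs(signed_area(r1)) + abs(signed_area(r2)) - inter
    return inter / union if union > 0 else 0.0


def angle_diff(t1, t2):
    """Smallest angle between two grasp orientations (180-degree symmetric)."""
    return abs((t1 - t2 + math.pi / 2) % math.pi - math.pi / 2)


def is_correct(pred, ground_truths, iou_thresh=0.25, angle_thresh=math.radians(30)):
    """True if `pred` matches any ground-truth grasp under both thresholds."""
    return any(
        angle_diff(pred[2], gt[2]) < angle_thresh and iou(pred, gt) > iou_thresh
        for gt in ground_truths
    )


if __name__ == "__main__":
    prediction = (2.0, 1.0, 0.0, 4.0, 2.0)
    labels = [(4.0, 1.0, 0.0, 4.0, 2.0)]
    print(is_correct(prediction, labels))
```

Because a one-third overlap with a labeled rectangle is enough to pass, a high score under this kind of check says little about whether the gripper would actually hold the object.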
Related benchmarks
GraspNet-1Billion →
De-facto clutter benchmark · AnyGrasp current SOTA (AP)
Dex-Net 2.0 →
HW: 93% on adversarial · 99% precision on 40 novel objects (YuMi)
Grasp-Anything →
Language-driven grasp synthesis · open-vocabulary scenes
Jacquard →
~95% image-wise (GR-ConvNet-class)