GraspNet-1Billion
GraspNet-1Billion is the de-facto benchmark for general object grasping in clutter. It pairs 97,280 real RGB-D images of 190 cluttered tabletop scenes, built from 88 objects and captured with two cameras, with roughly 1.1 billion densely annotated parallel-jaw grasp poses generated analytically and verified in simulation.
Its lasting contribution is a uniform evaluation protocol: an Average Precision (AP) metric over predicted 6-DoF grasps that lets the field compare grasp-detection models on the same footing. Most modern point-cloud grasp detectors, including Graspness, GSNet, and AnyGrasp, report on it.
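As a rough illustration of how such an AP score is assembled, the sketch below follows the protocol as described in Fang et al. (CVPR 2020): grasps are ranked by predicted confidence, Precision@k is averaged over k = 1..50 for each friction coefficient μ, and the result is averaged over μ from 0.2 to 1.2 in steps of 0.2. The function name `scene_ap` and the input `min_friction` (the smallest μ at which a grasp counts as successful, or -1 for a failed or colliding grasp) are assumptions made for illustration. The official evaluation toolkit also applies pose NMS and its own collision and force-closure checks, which are omitted here.

```python
import numpy as np

# Friction thresholds and ranking depth as described in the paper (assumed here).
FRICTIONS = (0.2, 0.4, 0.6, 0.8, 1.0, 1.2)
TOP_K = 50


def scene_ap(confidences, min_friction, frictions=FRICTIONS, top_k=TOP_K):
    """Return (AP, {mu: AP_mu}) for one image's predicted grasps.

    confidences  -- predicted score per grasp (higher = more confident)
    min_friction -- hypothetical per-grasp label: smallest friction
                    coefficient at which the grasp succeeds, or -1 if it fails
    """
    confidences = np.asarray(confidences, dtype=float)
    min_friction = np.asarray(min_friction, dtype=float)

    # Rank grasps by descending confidence and keep the top_k.
    order = np.argsort(-confidences, kind="stable")[:top_k]
    ranked = min_friction[order]

    ap_per_mu = {}
    for mu in frictions:
        # A grasp is a true positive at mu if it succeeds with friction <= mu.
        positive = (ranked > 0) & (ranked <= mu)
        hits = np.cumsum(positive)
        # Assumption: if fewer than top_k grasps are predicted,
        # the missing ranks contribute no further hits.
        if len(hits) < top_k:
            last = hits[-1] if len(hits) else 0
            hits = np.concatenate([hits, np.full(top_k - len(hits), last)])
        precision_at_k = hits / np.arange(1, top_k + 1)
        ap_per_mu[mu] = float(precision_at_k.mean())

    ap = float(np.mean(list(ap_per_mu.values())))
    return ap, ap_per_mu
```

A benchmark-level figure would then be the mean of this per-image value over all images in a test split. Stricter thresholds (small μ) reward only grasps that hold with little friction, so AP at low μ is typically well below the averaged AP.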
Primary source →
- Source: Fang et al., CVPR 2020
- Year: 2020
- Scale: 97,280 RGB-D images · 190 cluttered scenes · 88 objects · ~1.1B grasp poses
- Gripper: Parallel-jaw
- Modality: RGB-D · point cloud
- Best-known: De-facto clutter benchmark · AnyGrasp current SOTA (AP)
- Standardized AP metric over predicted 6-DoF grasps in clutter
- AnyGrasp (Fang et al., T-RO 2023) is the current state of the art
- Shares its 190 scenes and 88 objects with SuctionNet-1Billion
SIM = simulation result · HW = physical hardware. Image-wise accuracy is detection quality, not real-robot pick success. Figures cited from Fang et al., CVPR 2020.
Related benchmarks
Dex-Net 2.0 →
HW: 93% success on known objects with adversarial geometry · 99% precision on 40 novel household objects (ABB YuMi)
Grasp-Anything →
Language-driven grasp synthesis · open-vocabulary scenes
Jacquard →
~95% image-wise (GR-ConvNet-class)
Cornell Grasp →
~99% image-wise accuracy — saturated benchmark