GraspNet Baseline

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020).

[paper] [dataset] [API] [doc]


[Teaser figure: top 50 grasps detected by our baseline model.]

Requirements

  • Python 3
  • PyTorch 1.6
  • Open3D >= 0.8
  • TensorBoard 2.3
  • NumPy
  • SciPy
  • Pillow
  • tqdm

Installation

Get the code.

git clone https://github.com/graspnet/graspnet-baseline.git
cd graspnet-baseline

Install packages via Pip.

pip install -r requirements.txt

Compile and install pointnet2 operators (code adapted from votenet).

cd pointnet2
python setup.py install

Compile and install knn operator (code adapted from pytorch_knn_cuda).

cd knn
python setup.py install

Install graspnetAPI for evaluation.

git clone https://github.com/graspnet/graspnetAPI.git
cd graspnetAPI
pip install .
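
As a quick sanity check after installation, you can import one of the library's main classes. GraspGroup is part of graspnetAPI's public interface; this two-line snippet is only a convenience check, not part of the official setup.

from graspnetAPI import GraspGroup   # should import without errors if the install succeeded
print(GraspGroup)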

Tolerance Label Generation

Tolerance labels are not included in the original dataset and need to be generated separately. Make sure you have downloaded the original dataset from GraspNet. The generation code is in dataset/generate_tolerance_label.py. You can generate the tolerance labels by running the script below (--dataset_root and --num_workers should be specified according to your settings):

cd dataset
sh command_generate_tolerance_label.sh

Or you can download the tolerance labels from Google Drive/Baidu Pan and run:

mv tolerance.tar dataset/
cd dataset
tar -xvf tolerance.tar

Training and Testing

Training examples are shown in command_train.sh. --dataset_root, --camera and --log_dir should be specified according to your settings. You can use TensorBoard to visualize the training process.

Testing examples are shown in command_test.sh, which covers both inference and result evaluation. --dataset_root, --camera, --checkpoint_path and --dump_dir should be specified according to your settings. Set --collision_thresh to -1 to skip collision detection for fast inference.
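
For reference, the evaluation step wrapped by command_test.sh can also be driven from Python via graspnetAPI. The sketch below is a rough outline assuming graspnetAPI's GraspNetEval class and its eval_all method; verify the exact interface against the graspnetAPI documentation, and substitute your own --dataset_root and --dump_dir paths.

# Rough sketch of evaluating dumped predictions with graspnetAPI.
# Class/method names are assumptions based on graspnetAPI's documented API;
# check the graspnetAPI docs before relying on this.
from graspnetAPI import GraspNetEval

dataset_root = '/path/to/graspnet'       # your --dataset_root
dump_dir = '/path/to/dump_dir'           # your --dump_dir with per-scene predictions

ge = GraspNetEval(root=dataset_root, camera='realsense', split='test')
res, ap = ge.eval_all(dump_dir, proc=6)  # proc: number of parallel workers
print('mean AP over the test split:', ap)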

The pretrained weights can be downloaded from:

checkpoint-rs.tar and checkpoint-kn.tar are trained on RealSense data and Kinect data, respectively.
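
A minimal way to inspect a downloaded checkpoint before wiring it into test.py or demo.py is sketched below. The 'model_state_dict' key is an assumption about how the weights were saved, so check the printed keys first.

import torch

# Load the pretrained weights on CPU and inspect what was stored.
checkpoint = torch.load('checkpoint-rs.tar', map_location='cpu')   # or 'checkpoint-kn.tar'
print(list(checkpoint.keys()))

# If the weights live under 'model_state_dict' (an assumption), they can be
# restored into the model built by test.py/demo.py with:
#   net.load_state_dict(checkpoint['model_state_dict'])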

Demo

A demo program is provided for grasp detection and visualization using RGB-D images. You can refer to command_demo.sh to run the program. --checkpoint_path should be specified according to your settings (make sure you have downloaded the pretrained weights). The output is a visualization of the predicted grasps on the scene point cloud, similar to the teaser above.

Try your own data by modifying get_and_process_data() in demo.py. Refer to doc/example_data/ for data preparation. RGB-D images and camera intrinsics are required for inference; factor_depth is the scale factor that converts raw depth values into meters. You can also add a workspace mask for denser output. A minimal preprocessing sketch is given below.
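
To make the roles of the intrinsics and factor_depth concrete, here is a minimal NumPy sketch of the kind of preprocessing get_and_process_data() performs: back-projecting a depth image into a point cloud and optionally masking the workspace. The file name, intrinsic values and mask are placeholders for your own data, not the repo's exact helpers.

import numpy as np
from PIL import Image

# Placeholder inputs: replace with your own depth image and camera intrinsics.
depth = np.array(Image.open('your_depth.png'))   # raw depth image (e.g. uint16)
fx, fy, cx, cy = 631.5, 631.2, 638.4, 366.3      # example intrinsics
factor_depth = 1000.0                            # raw depth / factor_depth -> meters

# Back-project every pixel into camera coordinates.
h, w = depth.shape
u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth.astype(np.float32) / factor_depth      # depth in meters
x = (u - cx) * z / fx
y = (v - cy) * z / fy
cloud = np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Optional workspace mask (here: keep valid-depth points only); a tighter mask
# around the workspace gives a denser sampling of the relevant region.
mask = (z > 0).reshape(-1)
cloud = cloud[mask]
print(cloud.shape)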

Results

Results "In repo" report the model performance with single-view collision detection as post-processing. In evaluation we set --collision_thresh to 0.01.

Evaluation results on RealSense camera:

          Seen                     Similar                  Novel
          AP     AP0.8  AP0.4     AP     AP0.8  AP0.4     AP     AP0.8  AP0.4
In paper  27.56  33.43  16.95     26.11  34.18  14.23     10.55  11.25   3.98
In repo   47.47  55.90  41.33     42.27  51.01  35.40     16.61  20.84   8.30

Evaluation results on Kinect camera:

          Seen                     Similar                  Novel
          AP     AP0.8  AP0.4     AP     AP0.8  AP0.4     AP     AP0.8  AP0.4
In paper  29.88  36.19  19.31     27.84  33.19  16.62     11.51  12.92   3.56
In repo   42.02  49.91  35.34     37.35  44.82  30.40     12.17  15.17   5.51

Citation

Please cite our paper in your publications if it helps your research:

@inproceedings{fang2020graspnet,
  title={GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping},
  author={Fang, Hao-Shu and Wang, Chenxi and Gou, Minghao and Lu, Cewu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={11444--11453},
  year={2020}
}

License

All data, labels, code and models belong to the GraspNet team, MVIG, SJTU. They are freely available for non-commercial use and may be redistributed under these conditions. For commercial queries, please send an email to fhaoshu at gmail_dot_com and cc lucewu at sjtu.edu.cn.
