Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Last update: Dec 09, 2022

Related tags

Deep Learning FusionDepth

Overview

Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

This paper has been accepted by Conference on Robot Learning 2021.

By Ziyue Feng, Longlong Jing, Peng Yin, Yingli Tian, and Bing Li.

Arxiv: Link YouTube: link Slides: Link

Abstract

Self-supervised monocular depth prediction provides a cost-effective solution to obtain the 3D location of each pixel. However, the existing approaches usually lead to unsatisfactory accuracy, which is critical for autonomous robots. In this paper, we propose a novel two-stage network to advance the self-supervised monocular dense depth learning by leveraging low-cost sparse (e.g. 4-beam) LiDAR. Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps. Then, an efficient feed-forward refine network is further designed to correct the errors in these initial depth maps in pseudo-3D space with real-time performance. Extensive experiments show that our proposed model significantly outperforms all the state-of-the-art self-supervised methods, as well as the sparse-LiDAR-based methods on both self-supervised monocular depth prediction and completion tasks. With the accurate dense depth prediction, our model outperforms the state-of-the-art sparse-LiDAR-based method (Pseudo-LiDAR++) by more than 68% for the downstream task monocular 3D object detection on the KITTI Leaderboard.

⚙️ Setup

You can install the dependencies with:

conda create -n depth python=3.6.6
conda activate depth
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge
pip install tensorboardX==1.4
conda install opencv=3.3.1   # just needed for evaluation
pip install open3d
pip install wandb
pip install scikit-image

We ran our experiments with PyTorch 1.8.0, CUDA 11.1, Python 3.6.6 and Ubuntu 18.04.

💾 KITTI Data Prepare

Download Data

You need to first download the KITTI RAW dataset, put in the kitti_data folder.

Our default settings expect that you have converted the png images to jpeg with this command, which also deletes the raw KITTI .png files:

find kitti_data/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'

or you can skip this conversion step and train from raw png files by adding the flag --png when training, at the expense of slower load times.

Preprocess Data

# bash prepare_1beam_data_for_prediction.sh
# bash prepare_2beam_data_for_prediction.sh
# bash prepare_3beam_data_for_prediction.sh
bash prepare_4beam_data_for_prediction.sh
# bash prepare_r100.sh # random sample 100 LiDAR points
# bash prepare_r200.sh # random sample 200 LiDAR points

⏳ Training

By default models and tensorboard event files are saved to log/mdp/.

Depth Prediction:

python trainer.py
python inf_depth_map.py --need_path
python inf_gdc.py
python refiner.py

Depth Completion:

Please first download the KITTI Completion dataset.

python completor.py

Monocular 3D Object Detection:

Please first download the KITTI 3D Detection dataset.

python export_detection.py

Then you can train the PatchNet based on the exported depth maps.

📊 KITTI evaluation

python evaluate_depth.py
python evaluate_completion.py

Citation

@article{feng2021advancing,
  title={Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR},
  author={Feng, Ziyue and Jing, Longlong and Yin, Peng and Tian, Yingli and Li, Bing},
  journal={arXiv preprint arXiv:2109.09628},
  year={2021}
}

Reference

Our code is based on the Monodepth2: https://github.com/nianticlabs/monodepth2

Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Related tags

Overview

Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Abstract

⚙️ Setup

💾 KITTI Data Prepare

⏳ Training

📊 KITTI evaluation

Citation

Reference

Owner

Ziyue Feng

Code used for the results in the paper "ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning"

This is an official implementation for "PlaneRecNet".

Multi-Horizon-Forecasting-for-Limit-Order-Books

Create time-series datacubes for supervised machine learning with ICEYE SAR images.

Metrics to evaluate quality and efficacy of synthetic datasets.

EXplainable Artificial Intelligence (XAI)

Bringing sanity to world of messed-up data

CSD: Consistency-based Semi-supervised learning for object Detection

Advbox is a toolbox to generate adversarial examples that fool neural networks in PaddlePaddle、PyTorch、Caffe2、MxNet、Keras、TensorFlow and Advbox can benchmark the robustness of machine learning models.

Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

An Implementation of Fully Convolutional Networks in Tensorflow.

High performance distributed framework for training deep learning recommendation models based on PyTorch.

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

Unified learning approach for egocentric hand gesture recognition and fingertip detection

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

This is an official source code for implementation on Extensive Deep Temporal Point Process

Code and data (Incidents Dataset) for ECCV 2020 Paper "Detecting natural disasters, damage, and incidents in the wild".

FFCV: Fast Forward Computer Vision (and other ML workloads!)

ByteTrack超详细教程！训练自己的数据集&&摄像头实时检测跟踪

mmdetection version of TinyBenchmark.