Real-time Joint Semantic Reasoning for Autonomous Driving

Overview

MultiNet

MultiNet is able to jointly perform road segmentation, car detection and street classification. The model achieves real-time speed and state-of-the-art performance in segmentation. Check out our paper for a detailed model description.

MultiNet is optimized to perform well at real-time speed. It has two components: KittiSeg, which sets a new state of the art in road segmentation; and KittiBox, which improves over the baseline Faster-RCNN in both inference speed and detection performance.

The model is designed as an encoder-decoder architecture. It uses one VGG encoder and an independent decoder for each task. This repository contains generic code that combines several TensorFlow models in one network. The code for the individual tasks is provided by the KittiSeg, KittiBox, and KittiClass repositories, which are included as submodules in this project. The project is built to be compatible with the TensorVision back end, which allows for organizing experiments in a very clean way.
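The shared-encoder layout can be illustrated with a minimal, framework-free sketch. The names run_joint_model, toy_encoder and toy_decoders are hypothetical stand-ins and not part of the MultiNet code, which builds this structure as a TensorFlow graph. The sketch only shows that the encoder runs once per image and that every task decoder consumes the same features.

```python
def run_joint_model(image, encoder, decoders):
    """Run one shared encoder, then one decoder per task on the same features."""
    features = encoder(image)  # computed once, reused by every decoder
    return {task: decoder(features) for task, decoder in decoders.items()}


# Toy stand-in (hypothetical): a real encoder would be a VGG network.
def toy_encoder(image):
    return [2 * pixel for pixel in image]


# Toy stand-ins (hypothetical) for the three task decoders.
toy_decoders = {
    "segmentation": lambda feats: [f > 2 for f in feats],
    "detection": lambda feats: max(feats),
    "classification": lambda feats: sum(feats) > 5,
}

outputs = run_joint_model([1, 2, 3], toy_encoder, toy_decoders)
```

Because the encoder output is shared, adding a further task in this sketch costs only one extra decoder call.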

Requirements

The code requires Python 2.7, TensorFlow 1.0, and the following Python libraries:

  • matplotlib
  • numpy
  • Pillow
  • scipy
  • runcython
  • commentjson

These modules can be installed using pip install numpy scipy pillow matplotlib runcython commentjson or pip install -r requirements.txt.
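To confirm that the libraries are importable before running anything, a small check along the following lines may help. This is a sketch, not part of the repository: the helper name missing_modules is hypothetical, and the import names in REQUIRED are assumptions (Pillow is imported as PIL; the import name for runcython is assumed to match its package name).

```python
import importlib

# Import names are assumptions; Pillow installs the "PIL" package.
REQUIRED = ["matplotlib", "numpy", "PIL", "scipy", "runcython", "commentjson"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(REQUIRED) returns an empty list when everything is installed; any names it returns still need to be installed with pip.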

Setup

  1. Clone this repository: https://github.com/MarvinTeichmann/MultiNet.git
  2. Initialize all submodules: git submodule update --init --recursive
  3. Build the Cython code: cd submodules/KittiBox/submodules/utils/ && make
  4. [Optional] Download Kitti Road Data:
    1. Retrieve the Kitti data URL here: http://www.cvlibs.net/download.php?file=data_road.zip
    2. Call python download_data.py --kitti_url URL_YOU_RETRIEVED
  5. [Optional] Run cd submodules/KittiBox/submodules/KittiObjective2/ && make to build the Kitti evaluation code (see submodules/KittiBox/submodules/KittiObjective2/README.md for more information)

Running the model using demo.py only requires steps 1-3. Steps 4 and 5 are only required if you want to train your own model using train.py. Note that I recommend using download_data.py instead of downloading the data yourself, because the script also extracts and prepares the data. See the section Manage Data Storage if you would like to control where the data is stored.

To update MultiNet do:
  1. Pull all patches: git pull
  2. Update all submodules: git submodule update --init --recursive

If you forget the second step you might end up with an inconsistent repository state: you will already have the new MultiNet code but run it with old submodule code. This can work, but I do not run any tests to verify this.
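One way to notice such a state is to inspect the output of git submodule status, which prefixes a line with - for an uninitialized submodule, + when the checked-out commit differs from the one recorded in the parent repository, and U for merge conflicts. The helper below is a sketch that parses this text; the function name is hypothetical and it assumes submodule paths contain no spaces.

```python
def out_of_sync_submodules(status_output):
    """Return paths of submodules flagged '-', '+' or 'U' by git submodule status."""
    stale = []
    for line in status_output.splitlines():
        if not line.strip():
            continue
        flag = line[0]            # ' ' means the submodule is in sync
        fields = line[1:].split()  # [commit, path, optional description]
        if flag in "-+U" and len(fields) >= 2:
            stale.append(fields[1])
    return stale
```

The text to parse can be obtained from git submodule status --recursive; if the helper returns any paths, running git submodule update --init --recursive again should bring them in line.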

Tutorial

Getting started

Run: python demo.py --gpus 0 --input data/demo/um_000005.png to obtain a prediction using data/demo/um_000005.png as input.

Run: python evaluate.py to evaluate a trained model.

Run: python train.py --hypes hypes/multinet2.json to train MultiNet2.

If you would like to understand the code, I recommend looking at demo.py first. I have documented each step as thoroughly as possible in this file.

Only training of MultiNet2 (joint detection and segmentation) is supported out of the box. The data to train the classification model is not public and thus cannot be used to train the full MultiNet3 (detection, segmentation and classification). The full code is given here, so you can still train MultiNet3 if you have your own data.

Manage Data Storage

MultiNet allows you to separate data storage from code. This is very useful in many server environments. By default, the data is stored in the folder MultiNet/DATA and the output of runs in MultiNet/RUNS. This behaviour can be changed by setting the bash environment variables $TV_DIR_DATA and $TV_DIR_RUNS.

Include export TV_DIR_DATA="/MY/LARGE/HDD/DATA" in your .profile and all data will be downloaded to /MY/LARGE/HDD/DATA/. Include export TV_DIR_RUNS="/MY/LARGE/HDD/RUNS" in your .profile and all runs will be saved to /MY/LARGE/HDD/RUNS/MultiNet.
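The lookup described above can be sketched as follows. This is an illustration of the documented behaviour, not TensorVision's actual implementation; the function name resolve_storage_dirs and its parameters are hypothetical.

```python
import os


def resolve_storage_dirs(repo_root, env=None):
    """Return (data_dir, runs_dir) following the defaults described in this README."""
    env = os.environ if env is None else env
    # Data: $TV_DIR_DATA if set, otherwise <repo>/DATA.
    data_dir = env.get("TV_DIR_DATA") or os.path.join(repo_root, "DATA")
    # Runs: $TV_DIR_RUNS/MultiNet if set, otherwise <repo>/RUNS.
    runs_root = env.get("TV_DIR_RUNS")
    if runs_root:
        runs_dir = os.path.join(runs_root, "MultiNet")
    else:
        runs_dir = os.path.join(repo_root, "RUNS")
    return data_dir, runs_dir
```

Passing the environment as a dictionary keeps the sketch easy to try out without changing your shell configuration.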

Modifying Model & Train on your own data

The model is controlled by the file hypes/multinet3.json. This file points the code to the implementation of the submodels. The MultiNet code then loads all models provided and integrates the decoders into one neural network. To train on your own data, it should be enough to modify the hype files of the submodels. A good start will be the KittiSeg model, which is very well documented.

    "models": {
        "segmentation" : "../submodules/KittiSeg/hypes/KittiSeg.json",
        "detection" : "../submodules/KittiBox/hypes/kittiBox.json",
        "road" : "../submodules/KittiClass/hypes/KittiClass.json"
    },
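The relative paths in the "models" entry can be turned into concrete file locations as sketched below. The sketch assumes that each path is interpreted relative to the directory containing the hypes file, which fits the ../submodules/ prefix but is not confirmed here; the function name resolve_model_paths is hypothetical.

```python
import os


def resolve_model_paths(hypes, hypes_path):
    """Map each task name to its submodel hypes file, relative to `hypes_path`."""
    base_dir = os.path.dirname(hypes_path)
    return {
        task: os.path.normpath(os.path.join(base_dir, rel_path))
        for task, rel_path in hypes["models"].items()
    }


example_hypes = {
    "models": {
        "segmentation": "../submodules/KittiSeg/hypes/KittiSeg.json",
        "detection": "../submodules/KittiBox/hypes/kittiBox.json",
        "road": "../submodules/KittiClass/hypes/KittiClass.json",
    }
}
resolved = resolve_model_paths(example_hypes, "hypes/multinet3.json")
```

Checking that every resolved path exists before training is a cheap way to catch a submodule that has not been initialized.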

RUNDIR and Experiment Organization

MultiNet helps you to organize a large number of experiments. To do so, the output of each run is stored in its own rundir. Each rundir contains:

  • output.log: a copy of the training output which was printed to your screen
  • tensorflow events: tensorboard can be run in rundir
  • tensorflow checkpoints: the trained model can be loaded from rundir
  • [dir] images: a folder containing example output images. image_iter controls how often the whole validation set is dumped
  • [dir] model_files: a copy of all source code needed to build the model. This can be very useful if you have many versions of the model.

To keep track of all the experiments, you can give each rundir a unique name with the --name flag. The --project flag will store the run in a separate subfolder, allowing you to run different series of experiments. As an example, python train.py --project batch_size_bench --name size_5 will use the following dir as rundir: $TV_DIR_RUNS/KittiSeg/batch_size_bench/size_5_KittiSeg_2017_02_08_13.12.
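The naming pattern in this example can be reproduced with a short sketch. The layout (model folder, project folder, then name, model and timestamp joined by underscores) and the timestamp format are inferred from the single example path above, so treat them as assumptions; build_rundir is a hypothetical helper, not a TensorVision function.

```python
import os
from datetime import datetime


def build_rundir(runs_root, model, project, name, when):
    """Compose a rundir path of the form root/model/project/name_model_timestamp."""
    stamp = when.strftime("%Y_%m_%d_%H.%M")  # e.g. 2017_02_08_13.12
    return os.path.join(runs_root, model, project, "{}_{}_{}".format(name, model, stamp))


rundir = build_rundir(
    "/MY/LARGE/HDD/RUNS", "KittiSeg", "batch_size_bench", "size_5",
    datetime(2017, 2, 8, 13, 12),
)
```

Because the timestamp is part of the directory name, repeating a run with the same --name does not overwrite an earlier rundir unless both start within the same minute.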

The --nosave flag is very useful for debug runs, as it keeps them from cluttering your rundir.

Useful Flags & Variables

Here are some flags which will be useful when working with KittiSeg and TensorVision. All flags are available across all scripts.

--hypes : specify which hype-file to use
--logdir : specify which logdir to use
--gpus : specify on which GPUs to run the code
--name : assign a name to the run
--project : assign a project to the run
--nosave : debug run, logdir will be set to debug

In addition, the following TensorVision environment variables will be useful:

$TV_DIR_DATA: specify meta directory for data
$TV_DIR_RUNS: specify meta directory for output
$TV_USE_GPUS: specify default GPU behaviour.

On a cluster it is useful to set $TV_USE_GPUS=force. This makes the --gpus flag mandatory and ensures that the run is executed on the right GPU.
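The effect of force can be sketched as a small decision function. This illustrates the behaviour described above and is not TensorVision's code: resolve_gpus is a hypothetical name, and treating any other value of TV_USE_GPUS as the default GPU selection is an assumption.

```python
def resolve_gpus(gpus_flag, env):
    """Pick the GPU selection from the --gpus value and the environment."""
    if gpus_flag is not None:
        return gpus_flag  # an explicit --gpus always wins
    default = env.get("TV_USE_GPUS")
    if default == "force":
        # With force, running without --gpus is refused.
        raise ValueError("TV_USE_GPUS=force requires the --gpus flag")
    return default  # assumption: other values act as the default selection
```

Refusing to start is the safer choice on a shared machine, since a job that silently picks a default GPU may collide with someone else's run.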

Citation

If you benefit from this code, please cite our paper:

@article{teichmann2016multinet,
  title={MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving},
  author={Teichmann, Marvin and Weber, Michael and Zoellner, Marius and Cipolla, Roberto and Urtasun, Raquel},
  journal={arXiv preprint arXiv:1612.07695},
  year={2016}
}
Owner
Marvin Teichmann
Germany Phd student. Working on Deep Learning and Computer Vision projects.