Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics.

Overview

By Andres Milioto @ University of Bonn.

(for the new PyTorch version, go here)

[Image: Cityscapes urban scene understanding]

[Image: Person segmentation]

[Image: Crop vs. weed semantic segmentation]

Description

This code provides a framework to easily add architectures and datasets, in order to train and deploy CNNs for a robot. It contains a full training pipeline in Python using Tensorflow and OpenCV, and it also includes some C++ apps to deploy a frozen protobuf in ROS and standalone. The C++ library is designed so that other backends (such as TensorRT and MvNCS) can be added, but only Tensorflow and TensorRT are implemented for now. We will keep it this way for the time being because we are mostly interested in deployment on the Jetson and Drive platforms, but if you have a specific need, we accept pull requests!

The included networks are based on many other architectures (see below), but none is an exact copy of any of them. As seen in the videos, they run very fast on both GPU and CPU, and they are designed with performance in mind, at the cost of a slight accuracy loss. Feel free to use them as a model to implement your own architecture.

All scripts have been tested on the following configurations:

  • x86 Ubuntu 16.04 with an NVIDIA GeForce 940MX GPU (nvidia-384, CUDA9, CUDNN7, TF 1.7, TensorRT3)
  • x86 Ubuntu 16.04 with an NVIDIA GTX1080Ti GPU (nvidia-375, CUDA9, CUDNN7, TF 1.7, TensorRT3)
  • x86 Ubuntu 16.04 and 14.04 with no GPU (TF 1.7, running on CPU in NHWC mode, no TensorRT support)
  • Jetson TX2 (full Jetpack 3.2)

We also provide a Dockerfile to make it easy to run without worrying about the dependencies. It is based on the official nvidia/cuda image containing CUDA 9 and cuDNN 7. To build and run this image with support for X11 (to display the results), run the following in the repo root directory (nvidia-docker should be used instead of vanilla docker):

  $ docker pull tano297/bonnet:cuda9-cudnn7-tf17-trt304
  $ nvidia-docker build -t bonnet .
  $ nvidia-docker run -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v $HOME/.Xauthority:/home/developer/.Xauthority -v /home/$USER/data:/shared --net=host --pid=host --ipc=host bonnet /bin/bash

The mount -v /home/$USER/data:/shared can be changed to point to wherever you store your data and trained models, so that they are available inside the container for inference and training.

Deployment

  • /deploy_cpp contains C++ code for deploying the full pipeline on a robot. It takes an image as input and produces the pixel-wise predictions and the color masks (which depend on the problem) as output. It includes both a standalone application, meant as an example of usage and build, and a ROS node that subscribes to an image topic and publishes 2 topics with the labeled mask and the colored labeled mask.

  • Readme here
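To illustrate the relationship between the pixel-wise predictions and the color masks mentioned above, here is a minimal Python sketch using NumPy. It is not taken from the Bonnet code base: the function name colorize, the COLOR_MAP table, and the BGR color values are assumptions made up for this example, and the real class colors depend on the dataset.

```python
import numpy as np

# Hypothetical class-to-color table (BGR order, as OpenCV would display it).
# The real tables depend on the dataset and are not reproduced here.
COLOR_MAP = {
    0: (0, 0, 0),      # background
    1: (0, 255, 0),    # crop
    2: (0, 0, 255),    # weed
}


def colorize(label_mask, color_map):
    """Turn an HxW array of class indices into an HxWx3 uint8 color mask."""
    # Build a lookup table with one row per class index.
    lut = np.zeros((max(color_map) + 1, 3), dtype=np.uint8)
    for class_id, color in color_map.items():
        lut[class_id] = color
    # Integer-array indexing replaces every class index with its color row.
    return lut[label_mask]


# A tiny 2x2 "prediction" standing in for the network output.
label_mask = np.array([[0, 1], [2, 1]], dtype=np.uint8)
color_mask = colorize(label_mask, COLOR_MAP)
```

The lookup-table approach avoids a Python loop over pixels, which matters when the mask has the size of a full camera image.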

Training

  • /train_py contains Python code to easily build CNN graphs in Tensorflow, train them, and generate the trained models used for deployment. This way the interface with Tensorflow can use the more complete Python API, and we can easily work with files to augment datasets and so on. It also contains some apps for using models, including the ability to save and use a frozen protobuf and to run the network with TensorRT, which reduces inference time on NVIDIA GPUs.

  • Readme here

Pre-trained models

These are models trained on some sample datasets that you can use with the trainer and deployer. If you want to take the time to write the parsers for another dataset (a yaml file with classes and colors, plus a python script to put the data into the standard dataset format), feel free to create a pull request.
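As a rough idea of what such a parser has to do, the sketch below converts a color-coded annotation image into an array of class indices. It is only an illustration under stated assumptions: the CLASSES table stands in for whatever the dataset yaml file would contain, and the function name color_to_label, the ignore value 255, and the class names are hypothetical and not the format Bonnet actually expects.

```python
import numpy as np

# Hypothetical class table, standing in for the contents of a dataset
# yaml file with classes and colors.
CLASSES = {
    "background": {"id": 0, "color": (0, 0, 0)},
    "person": {"id": 1, "color": (255, 255, 255)},
}


def color_to_label(annotation, classes, ignore_id=255):
    """Convert an HxWx3 color annotation into an HxW array of class ids."""
    # Pixels whose color matches no class keep the (assumed) ignore id.
    labels = np.full(annotation.shape[:2], ignore_id, dtype=np.uint8)
    for entry in classes.values():
        color = np.array(entry["color"], dtype=annotation.dtype)
        # A pixel matches only if all three channels are equal.
        matches = np.all(annotation == color, axis=-1)
        labels[matches] = entry["id"]
    return labels


# A tiny 2x2 annotation; the last pixel has a color outside the table.
annotation = np.array(
    [[[0, 0, 0], [255, 255, 255]],
     [[255, 255, 255], [10, 20, 30]]],
    dtype=np.uint8,
)
labels = color_to_label(annotation, CLASSES)
```

Marking unknown colors with a separate ignore value, rather than silently mapping them to a class, makes annotation errors visible before training.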

If you don't have GPUs and the task is interesting for robots to exploit, I will gladly train it whenever I have some free GPU time on our servers.

  • Cityscapes:

    • 512x256 Link
    • 768x384 Link (inception-like model)
    • 768x384 Link (mobilenets-like model)
    • 1024x512 Link
  • Synthia:

  • Persons (+coco people):

  • Crop-Weed (CWC):

License

This software

Bonnet is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Bonnet is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Pretrained models

The pretrained models with a specific dataset keep the copyright of such dataset.

Citation

If you use our framework for any academic work, please cite its paper.

@InProceedings{milioto2019icra,
author = {A. Milioto and C. Stachniss},
title = {{Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics using CNNs}},
booktitle = {Proc. of the IEEE Intl. Conf. on Robotics \& Automation (ICRA)},
year = 2019,
codeurl = {https://github.com/Photogrammetry-Robotics-Bonn/bonnet},
videourl = {https://www.youtube.com/watch?v=tfeFHCq6YJs},
}

Our networks are strongly based on the following architectures, so if you use them for any academic work, please take a look at their papers and cite them as you see fit:

Other useful GitHub repositories:

  • OpenAI Checkpointed Gradients. Useful implementation of checkpointed gradients to be able to fit big models in GPU memory without sacrificing runtime.
  • Queueing tool: Very nice queueing tool to share GPU, CPU and Memory resources in a multi-GPU environment.
  • Tensorflow_cc: Very useful repo to compile Tensorflow either as a shared or static library using CMake, in order to be able to compile our C++ apps against it.

Contributors

Milioto, Andres

Special thanks to Philipp Lottes for all the work shared during the last year, and to Olga Vysotka and Susanne Wenzel for beta testing the framework :)

Acknowledgements

This work has partly been supported by the German Research Foundation under Germany's Excellence Strategy, EXC-2070 - 390732324 (PhenoRob). We also thank NVIDIA Corporation for providing a Quadro P6000 GPU partially used to develop this framework.

TODOs

  • Merge Crop-weed CNN with background knowledge into this repo.
  • Make multi-camera ROS node that exploits batching to make inference faster than sequentially.
  • Movidius Neural Stick C++ backends (plus others as they become available).
  • Inference node to show the classes selectively (e.g. with some qt visual GUI)