[CVPR 2021] Forecasting the panoptic segmentation of future video frames

Overview

Panoptic Segmentation Forecasting

Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021

[Link to paper]

Animated gif showing visual comparison of our model's results compared against the hybrid baseline

We propose to study the novel task of ‘panoptic segmentation forecasting’: given a set of observed frames, the goal is to forecast the panoptic segmentation for a set of unobserved frames. We also propose a first approach to forecasting future panoptic segmentations. In contrast to typical semantic forecasting, we model the motion of individual object instances and the background separately. This makes instance information persistent during forecasting, and allows us to understand the motion of each moving object.

Image presenting the model diagram

⚙️ Setup

Dependencies

Install the code using the following command: pip install -e ./

Data

  • To run this code, the gtFine_trainvaltest dataset will need to be downloaded from the Cityscapes website into the data/ directory.
  • The remainder of the required data can be downloaded using the script download_data.sh. By default, everything is downloaded into the data/ directory.
  • Training the background model requires generating a version of the semantic segmentation annotations where foreground regions have been removed. This can be done by running the script scripts/preprocessing/remove_fg_from_gt.sh.
  • Training the foreground model requires additionally downloading a pretrained MaskRCNN model. This can be found at this link. This should be saved as pretrained_models/fg/mask_rcnn_pretrain.pkl.
  • Training the background model requires additionally downloading a pretrained HarDNet model. This can be found at this link. This should be saved as pretrained_models/bg/hardnet70_cityscapes_model.pkl.

Running our code

The scripts directory contains scripts which can be used to train and evaluate the foreground, background, and egomotion models. Specifically:

  • scripts/odom/run_odom_train.sh trains the egomotion prediction model.
  • scripts/odom/export_odom.sh exports the odometry predictions, which can then be used during evaluation by other models
  • scripts/bg/run_bg_train.sh trains the background prediction model.
  • scripts/bg/run_export_bg_val.sh exports predictions make by the background using input reprojected point clouds which come from using predicted egomotion.
  • scripts/fg/run_fg_train.sh trains the foreground prediction model.
  • scripts/fg/run_fg_eval_panoptic.sh produces final panoptic semgnetation predictions based on the trained foreground model and exported background predictions. This also uses predicted egomotion as input.

We provide our pretrained foreground, background, and egomotion prediction models. The data downloading script additionally downloads these models into the directory pretrained_models/

✏️ 📄 Citation

If you found our work relevant to yours, please consider citing our paper:

@inproceedings{graber-2021-panopticforecasting,
 title   = {Panoptic Segmentation Forecasting},
 author  = {Colin Graber and
            Grace Tsai and
            Michael Firman and
            Gabriel Brostow and
            Alexander Schwing},
 booktitle = {Computer Vision and Pattern Recognition ({CVPR})},
 year = {2021}
}

👩‍⚖️ License

Copyright © Niantic, Inc. 2021. Patent Pending. All rights reserved. Please see the license file for terms.

Owner
Niantic Labs
Building technologies and ideas that move us
Niantic Labs
[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification

Sparse Structure Learning via Graph Neural Networks for inductive document classification Make graph dataset create co-occurrence graph for datasets.

16 Dec 22, 2022
Video Frame Interpolation with Transformer (CVPR2022)

VFIformer Official PyTorch implementation of our CVPR2022 paper Video Frame Interpolation with Transformer Dependencies python = 3.8 pytorch = 1.8.0

DV Lab 63 Dec 16, 2022
Tutorial materials for Part of NSU Intro to Deep Learning with PyTorch.

Intro to Deep Learning Materials are part of North South University (NSU) Intro to Deep Learning with PyTorch workshop series. (Slides) Related materi

Hasib Zunair 9 Jun 08, 2022
Personals scripts using ageitgey/face_recognition

HOW TO USE pip3 install requirements.txt Add some pictures of known people in the folder 'people' : a) Create a folder called by the name of the perso

Antoine Bollengier 1 Jan 06, 2022
Differentiable Abundance Matching With Python

shamnet Differentiable Stellar Population Synthesis Installation You can install shamnet with pip. Installation dependencies are numpy, jax, corrfunc,

5 Dec 17, 2021
*ObjDetApp* deploys a pytorch model for object detection

*ObjDetApp* deploys a pytorch model for object detection

Will Chao 1 Dec 26, 2021
[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.

TBE The source code for our paper "Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Le

Jinpeng Wang 150 Dec 28, 2022
PyTorch implementation of the implicit Q-learning algorithm (IQL)

Implicit-Q-Learning (IQL) PyTorch implementation of the implicit Q-learning algorithm IQL (Paper) Currently only implemented for online learning. Offl

Sebastian Dittert 27 Dec 30, 2022
A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

ManhattanSLAM Authors: Raza Yunus, Yanyan Li and Federico Tombari ManhattanSLAM is a real-time SLAM library for RGB-D cameras that computes the camera

117 Dec 28, 2022
DNA-RECON { Automatic Web Reconnaissance Tool }

ABOUT TOOL : DNA-RECON is an automatic web reconnaissance tool written in python. This tool made for reconnaissance and information gathering with an

NIKUNJ BHATT 25 Aug 11, 2021
Direct Multi-view Multi-person 3D Human Pose Estimation

Implementation of NeurIPS-2021 paper: Direct Multi-view Multi-person 3D Human Pose Estimation [paper] [video-YouTube, video-Bilibili] [slides] This is

Sea AI Lab 251 Dec 30, 2022
A vision library for performing sliced inference on large images/small objects

SAHI: Slicing Aided Hyper Inference A vision library for performing sliced inference on large images/small objects Overview Object detection and insta

Open Business Software Solutions 2.3k Jan 04, 2023
Source code for the plant extraction workflow introduced in the paper “Agricultural Plant Cataloging and Establishment of a Data Framework from UAV-based Crop Images by Computer Vision”

Plant extraction workflow Source code for the plant extraction workflow introduced in the paper "Agricultural Plant Cataloging and Establishment of a

Maurice Günder 0 Apr 22, 2022
BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer

BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer Project Page | Paper | Video State-of-the-art image-to-image translatio

47 Dec 06, 2022
AutoML library for deep learning

Official Website: autokeras.com AutoKeras: An AutoML system based on Keras. It is developed by DATA Lab at Texas A&M University. The goal of AutoKeras

Keras 8.7k Jan 08, 2023
🔀 Visual Room Rearrangement

AI2-THOR Rearrangement Challenge Welcome to the 2021 AI2-THOR Rearrangement Challenge hosted at the CVPR'21 Embodied-AI Workshop. The goal of this cha

AI2 55 Dec 22, 2022
LBBA-boosted WSOD

LBBA-boosted WSOD Summary Our code is based on ruotianluo/pytorch-faster-rcnn and WSCDN Sincerely thanks for your resources. Newer version of our code

Martin Dong 20 Sep 19, 2022
Repository for scripts and notebooks from the book: Programming PyTorch for Deep Learning

Repository for scripts and notebooks from the book: Programming PyTorch for Deep Learning

Ian Pointer 368 Dec 17, 2022
Annealed Flow Transport Monte Carlo

Annealed Flow Transport Monte Carlo Open source implementation accompanying ICML 2021 paper by Michael Arbel*, Alexander G. D. G. Matthews* and Arnaud

DeepMind 30 Nov 21, 2022
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation

DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation This repository is the implementation of DynaTune paper. This folder

4 Nov 02, 2022