PyTorch code for ICPR 2020 paper Future Urban Scene Generation Through Vehicle Synthesis

Last update: Oct 11, 2021

Overview

Future urban scene generation through vehicle synthesis

This repository contains Pytorch code for the ICPR2020 paper "Future Urban Scene Generation Through Vehicle Synthesis" [arXiv]

Model architecture

Our framework is composed by two stages:

Interpretable information extraction: high level interpretable information is gathered from raw RGB frames (bounding boxes, trajectories, keypoints).
Novel view completion: condition a reprojected 3D model with the original 2D appearance.

Abstract

In this work we propose a deep learning pipeline to predict the visual future appearance of an urban scene. Despite recent advances, generating the entire scene in an end-to-end fashion is still far from being achieved. Instead, here we follow a two stage approach, where interpretable information are included in the loop and each actor is modelled independently. We leverage a per-object novel view synthesis paradigm; i.e. generating a synthetic representation of an object undergoing a geometrical roto-translation in the 3D space. Our model can be easily conditioned with constraints (e.g. input trajectories) provided by state-of-the-art tracking methods or by the user.

Code

Code was tested with an Anaconda environment (Python version 3.6) on both Linux and Windows based systems.

Install

Run the following commands to install all requirements in a new virtual environment:

conda create -n <env_name> python=3.6
conda activate <env_name>
pip install -r requirements.txt

Install PyTorch package (version 1.3 or above).

How to run test

To run the demo of our project, please firstly download all the required data at this link and save them in a of your choice. We tested our pipeline on the Cityflow dataset that already have annotated bounding boxes and trajectories of vehicles.

The test script is run_test.py that expects some arguments as mandatory: video, 3D keypoints and checkpoints directories.

python run_test.py <data_dir>/<video_dir> <data_dir>/pascal_cads <data_dir>/checkpoints --det_mode ssd512|yolo3|mask_rcnn --track_mode tc|deepsort|moana --bbox_scale 1.15 --device cpu|cuda

Add the parameter --inpaint to use the inpainting on the vehicle instead of the static background.

Description and GUI usage

If everything went well, you should see the main GUI in which you can choose whichever vehicle you want that was detected in the video frame or change the video frame.

The commands working on this window are:

RIGHT ARROW = go to next frame
LEFT ARROW = go to previous frame
SINGLE MOUSE LEFT BUTTON CLICK = visualize car trajectory
BACKSPACE = delete the drawn trajectories
DOUBLE MOUSE LEFT BUTTON CLICK = select one of the vehicles bounding boxes

Once you selected some vehicles of your chioce by double-clicking in their bounding boxes, you can push the RUN button to start the inference. The resulting frames will be saved in ./results directory.

Cite

If you find this repository useful for your research, please cite the following paper:

@inproceedings{simoni2021future,
  title={Future urban scenes generation through vehicles synthesis},
  author={Simoni, Alessandro and Bergamini, Luca and Palazzi, Andrea and Calderara, Simone and Cucchiara, Rita},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={4552--4559},
  year={2021},
  organization={IEEE}
}

PyTorch code for ICPR 2020 paper Future Urban Scene Generation Through Vehicle Synthesis

Related tags

Overview

Future urban scene generation through vehicle synthesis

Model architecture

Abstract

Code

Install

How to run test

Description and GUI usage

Cite

Owner

Alessandro Simoni

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style

Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.

A micro-game "flappy bird".

Official page of Struct-MDC (RA-L'22 with IROS'22 option); Depth completion from Visual-SLAM using point & line features

Full-featured Decision Trees and Random Forests learner.

MobileNetV1-V2，MobileNeXt，GhostNet，AdderNet，ShuffleNetV1-V2，Mobile+ViT etc.

Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

Face-Recognition-based-Attendance-System - An implementation of Attendance System in python.

Unofficial implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)

Volumetric Correspondence Networks for Optical Flow, NeurIPS 2019.

Learning to Self-Train for Semi-Supervised Few-Shot

[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Code, Models and Datasets for OpenViDial Dataset

HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022

Complete the code of prefix-tuning in low data setting

🇰🇷 Text to Image in Korean

Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

🛠️ Tools for Transformers compression using Lightning ⚡