Predicting future trajectories of people in cameras of novel scenarios and views.

Overview

Pedestrian Trajectory Prediction

Predicting future trajectories of pedestrians in cameras of novel scenarios and views.

This repository contains the code and models for the following ECCV'20 paper:

SimAug: Learning Robust Representatios from Simulation for Trajectory Prediction Junwei Liang, Lu Jiang, Alexander Hauptmann

Our Pipeline

Input: could be from a streaming camera or saved videos.

Detection: we used a pre-trained model called YOLO (You Only Look Once) to perform object detection, it uses convolutional neural networks to provide real-time object detection, it is popular for its speed and accuracy.

Tracking: we used a pre-trained model called Deep SORT (Simple Online and Realtime Tracking), it uses deep learning to perform object tracking in videos. It works by computing deep features for every bounding box and using the similarity between deep features to also factor into the tracking logic. It is known to work perfectly with YOLO and also popular for its speed and accuracy.

Resizing: at this step, we get the frames and resize them to the required shape which is 1920 X 1080.

Semantic Segmentation: we used a pre-trained model called Deep Lab (Deep Labeling) an algorithm made by Google, to perform the semantic segmentation task, this model works by assigning a predicted value for each pixel in an image or video with the help of deep neural network support. It performs a pixel-wise classification where each pixel is labeled by predicted value encoding its semantic class.

SimAug Model: Simulation as Augmentation, is a novel simulation data augmentation method for trajectory prediction. It augments the representation such that it is robust to the variances in semantic scenes and camera views, it predicts the trajectory in unseen camera views.

Predicted Trajectory: The output of the proposed pipeline.

Code

Fisrt you need to install packages according to the configuration file:

$ pip install -r requirements.txt

Running on video

Then download the deeplab ADE20k model(used for Semantic Segmentation):

$ wget http://download.tensorflow.org/models/deeplabv3_xception_ade20k_train_2018_05_29.tar.gz
$ tar -zxvf deeplabv3_xception_ade20k_train_2018_05_29.tar.gz

Then download SimAug-trained model:

$ wget https://next.cs.cmu.edu/data/packed_models_eccv2020.tgz
$ tar -zxvf packed_models_eccv2020.tgz

Run the pretrained YOLOv5 & DEEPSORT

get the annotations on a sample video many_people.mp4 from yolo and deepsort + resized images to 1920 x 1080

dataset_resize,changelst , annotation = detect('many_people.mp4')

Prepare the annotation

  • get box centre x,y for each person (traj_data)
  • person_box_data : boxes coordinates for all persons
  • other_box_data : boxes of other objects in the same frame with each targeted person
traj_data, person_box_data, other_box_data  = prepared_data_sdd(annotation,changelst)

Run the segmentation model

model_path= 'deeplabv3_xception_ade20k_train/frozen_inference_graph.pb'
seg_output= extract_scene_seg(dataset_resize,model_path,every =100)

Prepare all data for the SimAug model

making npz which contanins arrays for details of the segmentation with annotations and person ids
data=To_npz(8,12,traj_data,seg_output)
np.savez("prepro_fold1/data_test.npz", **data)

Test SimAug-Trained Model

!python Code/test.py prepro_fold1/ packed_models/ best_simaug_model \
--wd 0.001 --runId 0 --obs_len 8 --pred_len 12 --emb_size 32 --enc_hidden_size 256 \
--dec_hidden_size 256 --activation_func tanh --keep_prob 1.0 --num_epochs 30 \
--batch_size 12 --init_lr 0.3 --use_gnn --learning_rate_decay 0.95 --num_epoch_per_decay 5.0 \
--grid_loss_weight 1.0 --grid_reg_loss_weight 0.5 --save_period 3000 \
--scene_h 36 --scene_w 64 --scene_conv_kernel 3 --scene_conv_dim 64 \
--scene_grid_strides 2,4 --use_grids 1,0 --val_grid_num 0 --gpuid 0 --load_best \
--save_output sdd_out.p
To Run the pipeline from here

Demo

ITI.Moving.vehicle.mp4

Results

We capture streaming video that contains 1628 frames, processing time for stages is

• Yolo & Deep SORT: 20.7 f/s

• DeepLabv3: 4.66 f/s

• SimAug: 12.8 f/s

Video_Name Grid_acc minADE minFDE
Moving-ITI 0.6098 22.132 39.271

Dependencies

• Python 3.6 ; TensorFlow 1.15.0 ; Pytorch 1.7 ; Cuda 10

Code Contributors

References

@inproceedings{liang2020simaug,
  title={SimAug: Learning Robust Representations from Simulation for Trajectory Prediction},
  author={Liang, Junwei and Jiang, Lu and Hauptmann, Alexander},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  month = {August},
  year={2020}
}
Unsupervised Attributed Multiplex Network Embedding (AAAI 2020)

Unsupervised Attributed Multiplex Network Embedding (DMGI) Overview Nodes in a multiplex network are connected by multiple types of relations. However

Chanyoung Park 114 Dec 06, 2022
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

Multipath RefineNet A MATLAB based framework for semantic image segmentation and general dense prediction tasks on images. This is the source code for

Guosheng Lin 575 Dec 06, 2022
Deep learning based hand gesture recognition using LSTM and MediaPipie.

Hand Gesture Recognition Deep learning based hand gesture recognition using LSTM and MediaPipie. Demo video using PingPong Robot Files Pretrained mode

Brad 24 Nov 11, 2022
Code, environments, and scripts for the paper: "How Private Is Your RL Policy? An Inverse RL Based Analysis Framework"

Privacy-Aware Inverse RL (PRIL) Analysis Framework Code, environments, and scripts for the paper: "How Private Is Your RL Policy? An Inverse RL Based

1 Dec 06, 2021
Survival analysis in Python

What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu

Cameron Davidson-Pilon 2k Jan 08, 2023
UpChecker is a simple opensource project to host it fast on your server and check is server up, view statistic, get messages if it is down. UpChecker - just run file and use project easy

UpChecker UpChecker is a simple opensource project to host it fast on your server and check is server up, view statistic, get messages if it is down.

Yan 4 Apr 07, 2022
Code for the paper SphereRPN: Learning Spheres for High-Quality Region Proposals on 3D Point Clouds Object Detection, ICIP 2021.

SphereRPN Code for the paper SphereRPN: Learning Spheres for High-Quality Region Proposals on 3D Point Clouds Object Detection, ICIP 2021. Authors: Th

Thang Vu 15 Dec 02, 2022
This project provides the proof of the uniqueness of the equilibrium and the global asymptotic stability.

Delayed-cellular-neural-network This project provides the proof of the uniqueness of the equilibrium and the global asymptotic stability. There is als

4 Apr 28, 2022
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ⚖️ 🏆 🧑‍🎓 👩‍⚖️ Dataset Summary Inspired by the recent widespread use of th

95 Dec 08, 2022
Misc YOLOL scripts for use in the Starbase space sandbox videogame

starbase-misc Misc YOLOL scripts for use in the Starbase space sandbox videogame. Each directory contains standalone YOLOL scripts. They don't really

4 Oct 17, 2021
Suite of 500 procedurally-generated NLP tasks to study language model adaptability

TaskBench500 The TaskBench500 dataset and code for generating tasks. Data The TaskBench dataset is available under wget http://web.mit.edu/bzl/www/Tas

Belinda Li 20 May 17, 2022
A Re-implementation of the paper "A Deep Learning Framework for Character Motion Synthesis and Editing"

What is This This is a simple re-implementation of the paper "A Deep Learning Framework for Character Motion Synthesis and Editing"(1). Only Sections

102 Dec 14, 2022
CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum

CO-PILOT CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum, NeurIPS 2021, Shuang Ao, Tianyi Zhou, Guodong Long, Qingh

Shuang Ao 1 Feb 18, 2022
Adds timm pretrained backbone to pytorch's FasterRcnn model

Operating Systems Lab (ETCS-352) Experiments for Operating Systems Lab (ETCS-352) performed by me in 2021 at uni. All codes are written by me except t

Mriganka Nath 12 Dec 03, 2022
An original implementation of "MetaICL Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi

MetaICL: Learning to Learn In Context This includes an original implementation of "MetaICL: Learning to Learn In Context" by Sewon Min, Mike Lewis, Lu

Meta Research 141 Jan 07, 2023
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
Code for "Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans" CVPR 2021 best paper candidate

News 05/17/2021 To make the comparison on ZJU-MoCap easier, we save quantitative and qualitative results of other methods at here, including Neural Vo

ZJU3DV 748 Jan 07, 2023
Auto HMM: Automatic Discrete and Continous HMM including Model selection

Auto HMM: Automatic Discrete and Continous HMM including Model selection

Chess_champion 29 Dec 07, 2022
[NeurIPS'21] Shape As Points: A Differentiable Poisson Solver

Shape As Points (SAP) Paper | Project Page | Short Video (6 min) | Long Video (12 min) This repository contains the implementation of the paper: Shape

394 Dec 30, 2022
Official repository for "Intriguing Properties of Vision Transformers" (2021)

Intriguing Properties of Vision Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, & Ming-Hsuan Yang P

Muzammal Naseer 155 Dec 27, 2022