Tracking People with 3D Representations

Code repository for the paper "Tracking People with 3D Representations" (paper link) (project site).
Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik.
Neural Information Processing Systems (NeurIPS), 2021.

This repository provides the implementation of our paper T3DP, including installation instructions, dataset preparation, evaluation on the supported datasets, and demo code to run on any YouTube video.

Abstract: We present a novel approach for tracking multiple people in video. Unlike past approaches which employ 2D representations, we focus on using 3D representations of people, located in three-dimensional space. To this end, we develop a method, Human Mesh and Appearance Recovery (HMAR), which in addition to extracting the 3D geometry of the person as a SMPL mesh, also extracts appearance as a texture map on the triangles of the mesh. This serves as a 3D representation for appearance that is robust to viewpoint and pose changes. Given a video clip, we first detect bounding boxes corresponding to people, and for each one, we extract 3D appearance, pose, and location information using HMAR. These embedding vectors are then sent to a transformer, which performs spatio-temporal aggregation of the representations over the duration of the sequence. The similarity of the resulting representations is used to solve for associations that assign each person to a tracklet. We evaluate our approach on the PoseTrack, MuPoTs and AVA datasets. We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.
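
The association flow described above can be summarized with a minimal sketch. All names below (SpatioTemporalAggregator, the random placeholder embeddings, the Hungarian matching step) are illustrative assumptions and do not correspond to the actual modules in this repository.

# Sketch: per-detection embeddings from HMAR (appearance + pose + 3D location)
# are aggregated by a transformer over the clip, and pairwise similarities are
# solved for tracklet assignments. Names are illustrative, not the repo's code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

class SpatioTemporalAggregator(nn.Module):
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, x):            # x: (batch, num_detections, dim)
        return self.encoder(x)

dim = 256
emb_frame_t  = torch.randn(1, 5, dim)   # placeholder HMAR embeddings, frame t
emb_frame_t1 = torch.randn(1, 5, dim)   # placeholder HMAR embeddings, frame t+1

agg = SpatioTemporalAggregator(dim)(torch.cat([emb_frame_t, emb_frame_t1], dim=1))[0]
sim = F.cosine_similarity(agg[:5, None, :], agg[None, 5:, :], dim=-1)   # (5, 5)
rows, cols = linear_sum_assignment(-sim.detach().numpy())  # associate detections across frames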

Installation

We recommend creating a clean conda environment and installing all dependencies in it. You can do this as follows:

conda env create -f _environment.yml

After the installation is complete you can activate the conda environment by running:

conda activate T3DP

Install PyOpenGL from this repository:

pip uninstall pyopengl
git clone https://github.com/mmatl/pyopengl.git
pip install ./pyopengl

Additionally, install Detectron2 from the official repository if you need to run the demo code on a local machine. We provide precomputed detections inside the _DATA folder, so you do not need Detectron2 to run the tracker on PoseTrack or MuPoTs.
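
If you do install Detectron2 for the demo, person detections can be produced roughly as in the sketch below. The config choice and score threshold are assumptions for illustration, not necessarily the ones used in demo.py.

# Sketch: obtaining person bounding boxes with Detectron2 (illustrative config).
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
# cfg.MODEL.DEVICE = "cpu"  # uncomment if no GPU is available

predictor = DefaultPredictor(cfg)
frame = cv2.imread("frame.jpg")                      # any video frame
instances = predictor(frame)["instances"]
person_boxes = instances.pred_boxes[instances.pred_classes == 0]  # COCO class 0 = person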

Download Data

We provide preprocessed files for the PoseTrack and MuPoTs datasets (AVA files will be released soon!). Please download this folder and extract it inside the main repository.

Training

To train the transformer model on PoseTrack data, run:

python train_t3dp.py \
    --learning_rate 0.001 \
    --lr_decay_epochs 10000,20000 \
    --epochs 100000 \
    --tags T3PO \
    --train_dataset posetrack_2018 \
    --test_dataset posetrack_2018 \
    --train_batch_size 32 \
    --feature APK \
    --train
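
The --lr_decay_epochs flag above suggests a stepwise learning-rate decay at the listed milestones. A minimal sketch of such a schedule using PyTorch's MultiStepLR, with an assumed decay factor of 0.1 (the actual factor is defined in train_t3dp.py):

# Sketch only: stepwise LR decay at the milestones passed via --lr_decay_epochs.
# The decay factor (gamma) below is an assumption, not taken from this repository.
import torch

model = torch.nn.Linear(256, 256)                       # stand-in for the transformer
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10000, 20000], gamma=0.1)

for epoch in range(100000):
    # ... run one training epoch (forward, backward, optimizer.step()) ...
    scheduler.step()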

WandB will create a unique name for each run and save the model checkpoints under that name. Use this name for evaluation. We have also provided pretrained weights inside the _DATA folder.
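
For reference, the auto-generated run name can be retrieved with the standard wandb API as in the minimal sketch below; the project name is an assumption.

# Sketch: retrieving the auto-generated WandB run name used to identify checkpoints.
import wandb

run = wandb.init(project="T3DP")   # project name is an assumption
print(run.name)                    # the unique run name to reuse at evaluation time
wandb.finish()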

Testing

Once the PoseTrack dataset is downloaded to "_DATA/Posetrack_2018/", run the following command to run our tracker on all validation videos.

python test_t3dp.py \
    --dataset "posetrack" \
    --dataset_path "_DATA/Posetrack_2018/" \
    --storage_folder "Videos_Final" \
    --render True \
    --save True

Evaluation

To evaluate the tracking performance on ID switches, MOTA, and IDF1 metrics, please run the following command.

python3 evaluate_t3dp.py out/Videos_Final/results/ t3dp posetrack
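
The ID switch, MOTA, and IDF1 metrics reported here can be computed with the py-motmetrics library. The sketch below only illustrates the metric definitions and is not necessarily how evaluate_t3dp.py is implemented.

# Sketch: computing IDs / MOTA / IDF1 with py-motmetrics (illustrative only).
import numpy as np
import motmetrics as mm

acc = mm.MOTAccumulator(auto_id=True)
# One frame: ground-truth IDs, hypothesis IDs, and a GT-by-hypothesis distance matrix.
acc.update([1, 2], [1, 2], np.array([[0.1, 0.9],
                                     [0.8, 0.2]]))

mh = mm.metrics.create()
summary = mh.compute(acc, metrics=["num_switches", "mota", "idf1"], name="t3dp")
print(summary)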

Demo

Please run the following command to run our method on a YouTube video. Given a video ID, it downloads the video, extracts the frames, runs Detectron2, runs HMAR, and finally runs our tracker and renders the output video.

python3 demo.py
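
For reference, the frame-extraction step of the demo can be approximated with OpenCV as below; the file paths, and the assumption that the video has already been downloaded, are illustrative.

# Sketch: extracting frames from an already-downloaded video with OpenCV.
# Paths are illustrative; demo.py handles the full download-to-render pipeline.
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("video.mp4")
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(f"frames/{idx:06d}.jpg", frame)
    idx += 1
cap.release()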

Results (Project site)

We evaluated our method on the PoseTrack, MuPoTs, and AVA datasets. Our results show significant improvements over state-of-the-art methods for person tracking. For more results, please visit our website.

Acknowledgements

Parts of the code are taken or adapted from the following repos:

Contact

Jathushan Rajasegaran - [email protected] or [email protected]
To ask questions or report issues, please open an issue on the issues tracker.
Discussions, suggestions and questions are welcome!

Citation

If you find this code useful for your research, or use the data generated by our method, please consider citing the following paper:

@inproceedings{rajasegaran2021tracking,
  title     = {Tracking People with 3D Representations},
  author    = {Rajasegaran, Jathushan and Pavlakos, Georgios and Kanazawa, Angjoo and Malik, Jitendra},
  booktitle = {NeurIPS},
  year      = {2021}
}
