Discovering Dynamic Salient Regions with Spatio-Temporal Graph Neural Networks

Overview

Discovering Dynamic Salient Regions with Spatio-Temporal Graph Neural Networks

This is the official code for DyReg model inroduced in Discovering Dynamic Salient Regions with Spatio-Temporal Graph Neural Networks

Citation

Please use the following BibTeX to cite our work.

@incollection{duta2021dynamic_dyreg_gnn_neurips2021,
title = {Discovering Dynamic Salient Regions with Spatio-Temporal Graph
Neural Networks},
author = {Duta, Iulia and Nicolicioiu, Andrei and Leordeanu, Marius},
booktitle = {Advances in Neural Information Processing Systems 34},
year = {2021}
}

@article{duta2020dynamic_dyreg,
title = {Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning},
author = {Duta, Iulia and Nicolicioiu, Andrei and Leordeanu, Marius},
journal = {NeurIPS 2020 Workshop on Object Representations for Learning and Reasoning},
year = {2020},
}

Requirements

The code was developed using:

- python 3.7
- matplotlib
- torch 1.7.1
- script
- pandas
- torchvision
- moviepy
- ffmpeg

Overview:

The repository contains the Pytorch implementation of the DyReg-GNN model. The model is defined and trained in the following files:

  • ops/dyreg.py - code for our DyReg module

  • ops/rstg.py - code for the Spatio-temporal GNN (RSTG) used to process the graph extracted using DyReg

  • create_model.py - two examples how to integrate the DyReg-GNN module inside an existing backbone

  • main_standard.py - code to train a model on Smt-Smt dataset

  • test_models.py - code for multi-clip evaluation

Scripts for preparing the data, training and testing the model:

Prepare dataset

For Something Something dataset:

  • the json files containing meta-data should be stored in ./data/smt-smt-V2/tsm_data
  • the zip files containing the videos should be stored in ./data/smt-smt-V2/

  1. To extract the videos from the zip files run:

cat 20bn-something-something-v2-?? | tar zx

  1. To extract the frames from videos run:

python tools/vid2img_sthv2.py

→ The videos will be stored in $FRAME_ROOT (default './data/smt-smt-V2/tmp_smt-smt-V2-frames')

💡 If you already have the dataset as frames, place them under ./data/smt-smt-V2/smt-smt-V2-frames/, one folder for each video
💡 💡 If you need to change the path for datasets modify $ROOT_DATASET in dataset_config.py

  1. To generate the labels file in the required format please run:

python tools/gen_label_sthv2.py

→ The resulting txt files, for each split, will be stored in $DATA_UTILS_ROOT (default './data/smt-smt-V2/tsm_data/')

How to run the model

DyReg-GNN module can be simply inserted into any space-time model.

import torch
from torch.nn import functional as F
from ops.dyreg import DynamicGraph, dyregParams

class SpaceTimeModel(torch.nn.Module):
    def __init__(self):
        super(SpaceTimeModel, self).__init__()
        dyreg_params = dyregParams()
        dyregParams.offset_lstm_dim = 32
        self.dyreg = DynamicGraph(dyreg_params,
                    backbone_dim=32, node_dim=32, out_num_ch=32,
                    H=16, W=16, 
                    iH=16, iW=16,
                    project_i3d=False,
                    name='lalalal')


        self.fc = torch.nn.Linear(32, 10)

    def forward(self, x):
        dx = self.dyreg(x)
        # you can initialize the dyreg branch as identity function by normalisation, 
        #   as done in DynamicGraphWrapper found in ./ops/dyreg.py 
        x = x + dx
        # average over time and space: T, H, W
        x = x.mean(-1).mean(-1).mean(-2)
        x = self.fc(x)
        return x


B = 8
T = 10
C = 32
H = 16
W = 16
x = torch.ones(B,T,C,H,W)
st_model = SpaceTimeModel()
out = st_model(x)

For another example of how to integrate DyReg (DynamicGraph module) inside your model please look at create_model.py or run:

python create_model.py

Something-Something experiments

Training a model

To train a model on smt-smt v2 dataset please run

./start_main_standard.sh model_name

For default hyperparameters check opts.py. For example, place_graph flag controls how many DyReg-GNN modules to use and where to place them inside the backbone:

# for a model with 3 DyReg-GNN modules placed after layer 2-block 2, layer 3-block 4 and layer 4-block 1 of the backbone
--place_graph=layer2.2_layer3.4_layer4.1 
# for a model with 1 dyreg module placed after layer 3 block 4 of the backbone
--place_graph=layer3.4                   

Single clip evaluation

Train a model with the above script or download a pre-trained DyReg-GNN model from here and put the checkpoint in ./ckeckpoints/

To evaluate a model on smt-smt v2 dataset on a single 224 x 224 central crop, run:

./start_main_standard_test.sh model_name

The flag $RESUME_CKPT indicate the the checkpoint used for evaluation.

Multi clips evaluation

To evaluate a model in the multi-clips setup (3 spatials clips x 2 temporal samplings) on Smt-Smt v2 dataset please run

./evaluate_model.sh model_name

The flag $RESUME_CKPT indicate the the checkpoint used for evaluation.

TSM Baseline

This repository adds DyReg-GNN modules to a TSM backbone based on code from here.

Owner
Bitdefender Machine Learning
Machine Learning Research @ Bitdefender
Bitdefender Machine Learning
Implementation of parameterized soft-exponential activation function.

Soft-Exponential-Activation-Function: Implementation of parameterized soft-exponential activation function. In this implementation, the parameters are

Shuvrajeet Das 1 Feb 23, 2022
ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees This repository is the official implementation of the empirica

Kuan-Lin (Jason) Chen 2 Oct 02, 2022
Jupyter notebooks for using & learning Keras

deep-learning-with-keras-notebooks 這個github的repository主要是個人在學習Keras的一些記錄及練習。希望在學習過程中發現到一些好的資訊與範例也可以對想要學習使用 Keras來解決問題的同好,或是對深度學習有興趣的在學學生可以有一些方便理解與上手範例

ErhWen Kuo 2.1k Dec 27, 2022
Python port of R's Comprehensive Dynamic Time Warp algorithm package

Welcome to the dtw-python package Comprehensive implementation of Dynamic Time Warping algorithms. DTW is a family of algorithms which compute the loc

Dynamic Time Warping algorithms 154 Dec 26, 2022
Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel

Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel This repository is the official PyTorch implementation of BSRDM w

Zongsheng Yue 69 Jan 05, 2023
A Collection of LiDAR-Camera-Calibration Papers, Toolboxes and Notes

A Collection of LiDAR-Camera-Calibration Papers, Toolboxes and Notes

443 Jan 06, 2023
PyTorch implementation of Graph Convolutional Networks in Feature Space for Image Deblurring and Super-resolution, IJCNN 2021.

GCResNet PyTorch implementation of Graph Convolutional Networks in Feature Space for Image Deblurring and Super-resolution, IJCNN 2021. The code will

11 May 19, 2022
Supporting code for short YouTube series Neural Networks Demystified.

Neural Networks Demystified Supporting iPython notebooks for the YouTube Series Neural Networks Demystified. I've included formulas, code, and the tex

Stephen 1.3k Dec 23, 2022
The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation"

RegSeg The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation" Paper: arxiv D block Decoder Setup Install the

Roland 61 Dec 27, 2022
Code release for the paper “Worldsheet Wrapping the World in a 3D Sheet for View Synthesis from a Single Image”, ICCV 2021.

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image This repository contains the code for the following paper: R. Hu,

Meta Research 37 Jan 04, 2023
DiSECt: Differentiable Simulator for Robotic Cutting

DiSECt: Differentiable Simulator for Robotic Cutting Website | Paper | Dataset | Video | Blog post DiSECt is a simulator for the cutting of deformable

NVIDIA Research Projects 73 Oct 29, 2022
Code for "Offline Meta-Reinforcement Learning with Advantage Weighting" [ICML 2021]

Offline Meta-Reinforcement Learning with Advantage Weighting (MACAW) MACAW code used for the experiments in the ICML 2021 paper. Installing the enviro

Eric Mitchell 28 Jan 01, 2023
I3-master-layout - Simple master and stack layout script

Simple master and stack layout script | ------ | ----- | | | | | Ma

Tobias S 18 Dec 05, 2022
MatchGAN: A Self-supervised Semi-supervised Conditional Generative Adversarial Network

MatchGAN: A Self-supervised Semi-supervised Conditional Generative Adversarial Network This repository is the official implementation of MatchGAN: A S

Justin Sun 12 Dec 27, 2022
A free, multiplatform SDK for real-time facial motion capture using blendshapes, and rigid head pose in 3D space from any RGB camera, photo, or video.

mocap4face by Facemoji mocap4face by Facemoji is a free, multiplatform SDK for real-time facial motion capture based on Facial Action Coding System or

Facemoji 591 Dec 27, 2022
A simplified framework and utilities for PyTorch

Here is Poutyne. Poutyne is a simplified framework for PyTorch and handles much of the boilerplating code needed to train neural networks. Use Poutyne

GRAAL/GRAIL 534 Dec 17, 2022
Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors, CVPR 2021

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors Human POSEitioning System (H

Aymen Mir 66 Dec 21, 2022
The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"

Swin-Unet The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"(https://arxiv.org/abs/2105.05537). A validatio

869 Jan 07, 2023
Code repository for EMNLP 2021 paper 'Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods'

Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods This is the code repository to accompany the EMNLP 2021 paper on ad

Peru Bhardwaj 7 Sep 25, 2022
a generic C++ library for image analysis

VIGRA Computer Vision Library Copyright 1998-2013 by Ullrich Koethe This file is part of the VIGRA computer vision library. You may use,

Ullrich Koethe 378 Dec 30, 2022