Official PyTorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

Last update: Dec 07, 2022

This repository is the official implementation of "Relational Self-Attention: What's Missing in Attention for Video Understanding" by Manjin Kim*, Heeseung Kwon*, Chunyu Wang, Suha Kwak, and Minsu Cho (*equal contribution).

Requirements

Python: 3.7.9
PyTorch: 1.6.0
TorchVision: 0.2.1
CUDA: 10.1
Conda environment: environment.yml

To install requirements:

    conda env create -f environment.yml
    conda activate rsa

Dataset Preparation

Download the Something-Something v1 and v2 (SSv1 and SSv2) datasets and extract the RGB frames. Download URLs: SSv1, SSv2
Create txt files that define the training and validation splits. Each line of a txt file is formatted as [video_path] [#frames] [class_label]; see the txt files in the ./data directory for examples.
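
The split-file format above can be sketched in Python as follows. This is a minimal illustration, not part of the repository: the helper names (count_frames, build_split_lines, parse_split_line) are hypothetical, and it assumes each video's extracted frames live in their own directory as image files. Whether [video_path] should be absolute or relative to a dataset root is not stated here, so check the txt files in ./data before relying on it.

```python
import os
from typing import Dict, List, Tuple


def count_frames(video_dir: str, exts=(".jpg", ".jpeg", ".png")) -> int:
    """Count the extracted frame images inside one video directory."""
    return sum(1 for name in os.listdir(video_dir) if name.lower().endswith(exts))


def build_split_lines(frame_root: str, labels: Dict[str, int]) -> List[str]:
    """Build '[video_path] [#frames] [class_label]' lines.

    `labels` maps a video directory name to its integer class label
    (hypothetical input; derive it from the official annotation files).
    """
    lines = []
    for video_id in sorted(labels):
        video_dir = os.path.join(frame_root, video_id)
        lines.append(f"{video_dir} {count_frames(video_dir)} {labels[video_id]}")
    return lines


def parse_split_line(line: str) -> Tuple[str, int, int]:
    """Split one line from the right so paths containing spaces still parse."""
    path, num_frames, label = line.strip().rsplit(" ", 2)
    return path, int(num_frames), int(label)
```

Writing the result is then a matter of joining the lines with newlines and saving them to a txt file, one file per split.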

Training

To train RSANet-R50 on SSv1 or SSv2 as in the paper, run the corresponding command:

    # For SSv1
    ./scripts/train_Something_v1.sh
    # example: ./scripts/train_Something_v1.sh RSA_R50_SSV1_16frames 16

    # For SSv2
    ./scripts/train_Something_v2.sh
    # example: ./scripts/train_Something_v2.sh RSA_R50_SSV2_16frames 16

Evaluation

To evaluate RSANet-R50 on SSv1 or SSv2 as in the paper, run the corresponding command:

    # For SSv1
    ./scripts/test_Something_v1.sh
    # example: ./scripts/test_Something_v1.sh RSA_R50_SSV1_16frames resnet_rgb_model_best.pth.tar 16

    # For SSv2
    ./scripts/test_Something_v2.sh
    # example: ./scripts/test_Something_v2.sh RSA_R50_SSV2_16frames resnet_rgb_model_best.pth.tar 16

Results

Our model achieves the following performance on Something-Something-V1 and Something-Something-V2:

model	dataset	frames	top-1 / top-5	logs	checkpoints
RSANet-R50	SSV1	16	54.0 % / 81.1 %	[log]	[checkpoint]
RSANet-R50	SSV2	16	66.0 % / 89.9 %	[log]	[checkpoint]
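
For readers unfamiliar with the top-1 / top-5 columns, the metric can be sketched as below. This is a generic, pure-Python definition of top-k accuracy, not the repository's evaluation code: the function name topk_accuracy is hypothetical, and the actual test scripts may, for example, average scores over several clips or crops before ranking.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  targets: Sequence[int],
                  k: int) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices sorted by descending score, truncated to the best k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += target in topk
    return 100.0 * hits / len(targets)


if __name__ == "__main__":
    # Three samples, three classes (toy numbers, not real model outputs).
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    targets = [1, 1, 0]
    print(f"top-1: {topk_accuracy(scores, targets, 1):.1f} %")
    print(f"top-2: {topk_accuracy(scores, targets, 2):.1f} %")
```

Top-5 accuracy is always at least as high as top-1 on the same predictions, since the top-5 set contains the top-1 prediction.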
