Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Related tags

Deep LearningMIGCN
Overview

Multi-modal Interaction Graph Convolutioal Network for Temporal Language Localization in Videos

Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Model Pipeline

model-pipeline

Usage

Environment Settings

We use the PyTorch framework.

  • Python version: 3.7.0
  • PyTorch version: 1.4.0

Get Code

Clone the repository:

git clone https://github.com/zmzhang2000/MIGCN.git
cd MIGCN

Data Preparation

Charades-STA

ActivityNet

  • Download the preprocessed annotations of ActivityNet.
  • Download the C3D features of ActivityNet.
  • Process the C3D feature according to process_activitynet_c3d() in data/preprocess/preprocess.py.
  • Save them in data/activitynet.

Pre-trained Models

  • Download the checkpoints of Charades-STA and ActivityNet.
  • Save them in checkpoints

Data Generation

We provide the generation procedure of all MIGCN data.

  • The raw data is listed in data/raw_data/download.sh.
  • The preprocess code is in data/preprocess.

Training

Train MIGCN on Charades-STA with I3D feature:

python main.py --dataset charades --feature i3d

Train MIGCN on ActivityNet with C3D feature:

python main.py --dataset activitynet --feature c3d

Testing

Test MIGCN on Charades-STA with I3D feature:

python main.py --dataset charades --feature i3d --test --model_load_path checkpoints/$MODEL_CHECKPOINT

Test MIGCN on ActivityNet with C3D feature:

python main.py --dataset activitynet --feature c3d --test --model_load_path checkpoints/$MODEL_CHECKPOINT

Other Hyper-parameters

List other hyper-parameters by:

python main.py -h

Reference

Please cite the following paper if MIGCN is helpful for your research

@ARTICLE{9547801,
  author={Zhang, Zongmeng and Han, Xianjing and Song, Xuemeng and Yan, Yan and Nie, Liqiang},
  journal={IEEE Transactions on Image Processing}, 
  title={Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos}, 
  year={2021},
  volume={30},
  number={},
  pages={8265-8277},
  doi={10.1109/TIP.2021.3113791}}
Owner
Zongmeng Zhang
Zongmeng Zhang
Individual Tree Crown classification on WorldView-2 Images using Autoencoder -- Group 9 Weak learners - Final Project (Machine Learning 2020 Course)

Created by Olga Sutyrina, Sarah Elemili, Abduragim Shtanchaev and Artur Bille Individual Tree Crown classification on WorldView-2 Images using Autoenc

2 Dec 08, 2022
Offical implementation for "Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation".

Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation (NeurIPS 2021) by Qiming Hu, Xiaojie Guo. Dependencies P

Qiming Hu 31 Dec 20, 2022
Hardware accelerated, batchable and differentiable optimizers in JAX.

JAXopt Installation | Examples | References Hardware accelerated (GPU/TPU), batchable and differentiable optimizers in JAX. Installation JAXopt can be

Google 621 Jan 08, 2023
Implementation for Panoptic-PolarNet (CVPR 2021)

Panoptic-PolarNet This is the official implementation of Panoptic-PolarNet. [ArXiv paper] Introduction Panoptic-PolarNet is a fast and robust LiDAR po

Zixiang Zhou 126 Jan 01, 2023
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li, Ali Zare, Nima Mesgarani We pres

Aaron (Yinghao) Li 282 Jan 01, 2023
Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV

Realtime Face Anti-Spoofing Detection 🤖 Realtime Face Anti Spoofing Detection with Face Detector to detect real and fake faces Please star this repo

Prem Kumar 86 Aug 03, 2022
a dnn ai project to classify which food people are eating on audio recordings

Deep Learning - EAT Challenge About This project is part of an AI challenge of the DeepLearning course 2021 at the University of Augsburg. The objecti

Marco Tröster 1 Oct 24, 2021
Code for the Shortformer model, from the paper by Ofir Press, Noah A. Smith and Mike Lewis.

Shortformer This repository contains the code and the final checkpoint of the Shortformer model. This file explains how to run our experiments on the

Ofir Press 138 Apr 15, 2022
[NeurIPS 2021] Official implementation of paper "Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization".

Code for Coordinated Policy Optimization Webpage | Code | Paper | Talk (English) | Talk (Chinese) Hi there! This is the source code of the paper “Lear

DeciForce: Crossroads of Machine Perception and Autonomy 81 Dec 19, 2022
VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech

Disong Wang 262 Dec 31, 2022
Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools

Deep-rPPG: Camera-based pulse estimation using deep learning tools Deep learning (neural network) based remote photoplethysmography: how to extract pu

Terbe Dániel 138 Dec 17, 2022
Awesome Weak-Shot Learning

Awesome Weak-Shot Learning In weak-shot learning, all categories are split into non-overlapped base categories and novel categories, in which base cat

BCMI 162 Dec 30, 2022
MRI reconstruction (e.g., QSM) using deep learning methods

deepMRI: Deep learning methods for MRI Authors: Yang Gao, Hongfu Sun This repo is devloped based on Pytorch (1.8 or later) and matlab (R2019a or later

Hongfu Sun 17 Dec 18, 2022
Patch-Diffusion Code (AAAI2022)

Patch-Diffusion This is an official PyTorch implementation of "Patch Diffusion: A General Module for Face Manipulation Detection" in AAAI2022. Require

H 7 Nov 02, 2022
TF2 implementation of knowledge distillation using the "function matching" hypothesis from the paper Knowledge distillation: A good teacher is patient and consistent by Beyer et al.

FunMatch-Distillation TF2 implementation of knowledge distillation using the "function matching" hypothesis from the paper Knowledge distillation: A g

Sayak Paul 67 Dec 20, 2022
Pytorch implementation of the paper SPICE: Semantic Pseudo-labeling for Image Clustering

SPICE: Semantic Pseudo-labeling for Image Clustering By Chuang Niu and Ge Wang This is a Pytorch implementation of the paper. (In updating) SOTA on 5

Chuang Niu 154 Dec 15, 2022
Code for Paper Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning

Predicting Osteoarthritis Progression via Unsupervised Adversarial Representation Learning (c) Tianyu Han and Daniel Truhn, RWTH Aachen University, 20

Tianyu Han 7 Nov 22, 2022
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Build Type Linux MacOS Windows Build Status OpenPose has represented the first real-time multi-person system to jointly detect human body, hand, facia

25.7k Jan 09, 2023
Hybrid Neural Fusion for Full-frame Video Stabilization

FuSta: Hybrid Neural Fusion for Full-frame Video Stabilization Project Page | Video | Paper | Google Colab Setup Setup environment for [Yu and Ramamoo

Yu-Lun Liu 430 Jan 04, 2023