VLG-Net: Video-Language Graph Matching Networks for Video Grounding

Introduction

Official repository for VLG-Net: Video-Language Graph Matching Networks for Video Grounding. [ArXiv Preprint]

The paper was accepted to the first edition of the ICCV workshop: AI for Creative Video Editing and Understanding (CVEU).

Installation

Clone the repository and move into its folder:

git clone https://github.com/Soldelli/VLG-Net.git
cd VLG-Net

Install the environment:

conda env create -f environment.yml

If the installation fails, follow the instructions in doc/environment.md.

Data

Download the following resources and extract their content into the destination folder listed in the table.

Resource                 Download Link   File Size   Destination Folder
StandfordCoreNLP-4.0.0   link            ~0.5GB      ./datasets/
TACoS                    link            ~0.5GB      ./datasets/
ActivityNet-Captions     link            ~29GB       ./datasets/
DiDeMo                   link            ~13GB       ./datasets/
GCNeXt warmup            link            ~0.1GB      ./datasets/
Pretrained Models        link            ~0.1GB      ./models/

The folder structure should be as follows:

.
├── configs
│
├── datasets
│   ├── activitynet1.3
│   │    ├── annotations
│   │    └── features
│   ├── didemo
│   │    ├── annotations
│   │    └── features
│   ├── tacos
│   │    ├── annotations
│   │    └── features
│   ├── gcnext_warmup
│   └── standford-corenlp-4.0.0
│
├── doc
│
├── lib
│   ├── config
│   ├── data
│   ├── engine
│   ├── modeling
│   ├── structures
│   └── utils
│
├── models
│   ├── activitynet
│   └── tacos
│
├── outputs
│
└── scripts
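
Before training, it can help to confirm that the extracted resources landed where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper `missing_dirs` and the list `EXPECTED_DIRS` are hypothetical names, and the list only mirrors the data and model folders shown in the tree.

```python
from pathlib import Path

# Data and model folders copied from the tree above (a hypothetical checklist,
# not something the repository itself defines).
EXPECTED_DIRS = [
    "datasets/activitynet1.3/annotations",
    "datasets/activitynet1.3/features",
    "datasets/didemo/annotations",
    "datasets/didemo/features",
    "datasets/tacos/annotations",
    "datasets/tacos/features",
    "datasets/gcnext_warmup",
    "datasets/standford-corenlp-4.0.0",
    "models/activitynet",
    "models/tacos",
]


def missing_dirs(root="."):
    """Return the expected folders that do not exist under `root`."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing folders:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected folders are present.")
```

Run it from the repository root; an empty result means every listed folder was found.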

Training

Copy and paste the following commands into the terminal.

Load environment:

conda activate vlg

  • For ActivityNet-Captions dataset, run:
    python train_net.py --config-file configs/activitynet.yml OUTPUT_DIR outputs/activitynet
  • For TACoS dataset, run:
    python train_net.py --config-file configs/tacos.yml OUTPUT_DIR outputs/tacos
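
The two training commands differ only in the dataset name, so they can be generated from a single template. The sketch below is an illustration under that assumption: `build_train_command` is a hypothetical helper, not part of the repository, and it only reproduces the command lines shown above.

```python
import shlex

# Dataset names for which the commands above reference a config file.
DATASETS = ("activitynet", "tacos")


def build_train_command(dataset):
    """Build the training command shown above as an argument list."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        "python", "train_net.py",
        "--config-file", f"configs/{dataset}.yml",
        "OUTPUT_DIR", f"outputs/{dataset}",
    ]


if __name__ == "__main__":
    for name in DATASETS:
        # shlex.join renders the list as a copy-pastable shell command.
        print(shlex.join(build_train_command(name)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the `vlg` environment is active.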

Evaluation

For simplicity, we provide scripts that automatically run inference on the pretrained models. To run inference on a different model, see the details inside the scripts.

Load environment:

conda activate vlg

Then run one of the following scripts to launch the evaluation.

  • For ActivityNet-Captions dataset, run:
    bash scripts/activitynet.sh
  • For TACoS dataset, run:
    bash scripts/tacos.sh

Expected results:

After cleaning the code and fixing a couple of minor bugs, the performance changed slightly with respect to the numbers reported in the paper. See the tables below.

ActivityNet   Rank1@0.5   Rank1@0.7   Rank5@0.5   Rank5@0.7
Paper         46.32       29.82       77.15       63.33
Current       46.32       29.79       77.19       63.36

TACoS     Rank1@0.1   Rank1@0.3   Rank1@0.5   Rank5@0.1   Rank5@0.3   Rank5@0.5
Paper     57.21       45.46       34.19       81.80       70.38       56.56
Current   57.16       45.56       34.14       81.48       70.13       56.34
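
To see how far the current numbers drift from the paper, the two rows of each table can be subtracted column by column. This is a small illustrative sketch: the values are copied from the tables above in column order, and `RESULTS` and `deltas` are hypothetical names, not part of the repository.

```python
# Scores copied from the tables above, in the order the columns appear.
RESULTS = {
    "ActivityNet": {
        "paper": [46.32, 29.82, 77.15, 63.33],
        "current": [46.32, 29.79, 77.19, 63.36],
    },
    "TACoS": {
        "paper": [57.21, 45.46, 34.19, 81.80, 70.38, 56.56],
        "current": [57.16, 45.56, 34.14, 81.48, 70.13, 56.34],
    },
}


def deltas(dataset):
    """Return current minus paper for each column, rounded to 2 decimals."""
    rows = RESULTS[dataset]
    return [round(c - p, 2) for p, c in zip(rows["paper"], rows["current"])]


if __name__ == "__main__":
    for name in RESULTS:
        d = deltas(name)
        print(name, d, "max |delta| =", max(abs(x) for x in d))
```

The largest absolute difference is 0.32 points, in the fourth TACoS column (81.80 in the paper against 81.48 now).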

Citation

If any part of our paper and code is helpful to your work, please cite with:

@inproceedings{soldan2021vlg,
  title={VLG-Net: Video-Language Graph Matching Network for Video Grounding},
  author={Soldan, Mattia and Xu, Mengmeng and Qu, Sisi and Tegner, Jesper and Ghanem, Bernard},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3224--3234},
  year={2021}
}