Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral).

Related tags

Deep LearningPSVL
Overview

Zero-shot Natural Language Video Localization (ZSNLVL) by Pseudo-Supervised Video Localization (PSVL)

This repository is for Zero-shot Natural Language Video Localization. (ICCV 2021, Oral)

We first propose a novel task of zero-shot natural language video localization. The proposed task setup does not require any paired annotation cost for NLVL task but only requires easily available text corpora, off-the-shelf object detector, and a collection of videos to localize. To address the task, we propose a Pseudo-Supervised Video Localization method, called PSVL, that can generate pseudo-supervision for training an NLVL model. Benchmarked on two widely used NLVL datasets, the proposed method exhibits competitive performance and performs on par or outperforms the models trained with stronger supervision.

task_nlvl


Environment

This repository is implemented base on PyTorch with Anaconda.
Refer to below instruction or use Docker (dcahn/psvl:latest).

Get the code

  • Clone this repo with git, please use:
git clone https://github.com/gistvision/PSVL.git
  • Make your own environment (If you use docker envronment, you just clone the code and execute it.)
conda create --name PSVL --file requirements.txt
conda activate PSVL

Working environment

  • RTX2080Ti (11G)
  • Ubuntu 18.04.5
  • pytorch 1.5.1

Download

Dataset & Pretrained model

  • This link is connected for downloading video features used in this paper.
    : After downloading the video feature, you need to set the data path in a config file.

  • This link is connected for downloading pre-trained model.

Evaluating pre-trained models

If you want to evaluate the pre-trained model, you can use below command.

python inference.py --model CrossModalityTwostageAttention --config "YOUR CONFIG PATH" --pre_trained "YOUR MODEL PATH"

Training models from scratch

To train PSVL, run train.py with below command.

# Training from scratch
python train.py --model CrossModalityTwostageAttention --config "YOUR CONFIG PATH"
# Evaluation
python inference.py --model CrossModalityTwostageAttention --config "YOUR CONFIG PATH" --pre_trained "YOUR MODEL PATH"

Lisence

MIT Lisence

Citation

If you use this code, please cite:

@inproceedings{nam2021zero,
  title={Zero-shot Natural Language Video Localization},
  author={Nam, Jinwoo and Ahn, Daechul and Kang, Dongyeop and Ha, Seong Jong and Choi, Jonghyun},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1470-1479},
  year={2021}
}

Contact

If you have any questions, please send e-mail to me ([email protected], [email protected])

Owner
Computer Vision Lab. @ GIST
Some useful codes for computer vision and machine learning.
Computer Vision Lab. @ GIST
Code repository for the work "Multi-Domain Incremental Learning for Semantic Segmentation", accepted at WACV 2022

Multi-Domain Incremental Learning for Semantic Segmentation This is the Pytorch implementation of our work "Multi-Domain Incremental Learning for Sema

Pgxo20 24 Jan 02, 2023
Deep Dual Consecutive Network for Human Pose Estimation (CVPR2021)

Beanie - is an asynchronous ODM for MongoDB, based on Motor and Pydantic. It uses an abstraction over Pydantic models and Motor collections to work wi

295 Dec 29, 2022
ISBI 2022: Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image.

Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Introduction This repository contains the PyTorch implem

25 Nov 09, 2022
Personalized Federated Learning using Pytorch (pFedMe)

Personalized Federated Learning with Moreau Envelopes (NeurIPS 2020) This repository implements all experiments in the paper Personalized Federated Le

Charlie Dinh 226 Dec 30, 2022
Diagnostic tests for linguistic capacities in language models

LM diagnostics This repository contains the diagnostic datasets and experimental code for What BERT is not: Lessons from a new suite of psycholinguist

61 Jan 02, 2023
Code for the paper "Implicit Representations of Meaning in Neural Language Models"

Implicit Representations of Meaning in Neural Language Models Preliminaries Create and set up a conda environment as follows: conda create -n state-pr

Belinda Li 39 Nov 03, 2022
Graph Attention Networks

GAT Graph Attention Networks (Veličković et al., ICLR 2018): https://arxiv.org/abs/1710.10903 GAT layer t-SNE + Attention coefficients on Cora Overvie

Petar Veličković 2.6k Jan 05, 2023
Convert Pytorch model to onnx or tflite, and the converted model can be visualized by Netron

Convert Pytorch model to onnx or tflite, and the converted model can be visualized by Netron

Roxbili 5 Nov 19, 2022
A package to predict protein inter-residue geometries from sequence data

trRosetta This package is a part of trRosetta protein structure prediction protocol developed in: Improved protein structure prediction using predicte

Ivan Anishchenko 185 Jan 07, 2023
Pytorch implementation of Generative Models as Distributions of Functions 🌿

Generative Models as Distributions of Functions This repo contains code to reproduce all experiments in Generative Models as Distributions of Function

Emilien Dupont 117 Dec 29, 2022
SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab

CORNELLSASLAB SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab Instructions: This python code can be used to convert SAS out

2 Jan 26, 2022
Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)

NBFNet: Neural Bellman-Ford Networks This is the official codebase of the paper Neural Bellman-Ford Networks: A General Graph Neural Network Framework

MilaGraph 136 Dec 21, 2022
PyTorch implementation of the paper: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features

Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features Estimate the noise transition matrix with f-mutual information. This co

<a href=[email protected]"> 1 Jun 05, 2022
Line-level Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Line-level Handwritten Text Recognition with TensorFlow This model is an extended version of the Simple HTR system implemented by @Harald Scheidl and

Hoàng Tùng Lâm (Linus) 72 May 07, 2022
EdiBERT, a generative model for image editing

EdiBERT, a generative model for image editing EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation. The

16 Dec 07, 2022
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy

Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a

ImagePy 1.2k Dec 29, 2022
Transfer Learning Shootout for PyTorch's model zoo (torchvision)

pytorch-retraining Transfer Learning shootout for PyTorch's model zoo (torchvision). Load any pretrained model with custom final layer (num_classes) f

Alexander Hirner 169 Jun 29, 2022
This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset.

FACT This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset. To cite, please use:

105 Dec 17, 2022
DP-CL(Continual Learning with Differential Privacy)

DP-CL(Continual Learning with Differential Privacy) This is the official implementation of the Continual Learning with Differential Privacy. If you us

Phung Lai 3 Nov 04, 2022
Neural Koopman Lyapunov Control

Neural-Koopman-Lyapunov-Control Code for our paper: Neural Koopman Lyapunov Control Requirements dReal4: v4.19.02.1 PyTorch: 1.2.0 The learning framew

Vrushabh Zinage 6 Dec 24, 2022