VIsually-Pivoted Audio and(N) Text

Last update: Nov 04, 2022

Overview

VIP-ANT: VIsually-Pivoted Audio and(N) Text

Code for the paper "Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer".

Data

AudioSet can be downloaded and preprocessed via this tool.

Vision-Audio (VA) Pre-training

Check out the running script bash/run_bimodal_va.sh.

Audio-Text (AT) Fine-tuning

Check out the running script bash/run_bimodal_at.sh.
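The two stages are meant to run in order: VA pre-training first, then AT fine-tuning. A minimal sketch of a wrapper is shown here. It assumes you are at the root of a checkout and that each script already has its data paths and options set inside it. `run_stage` is a hypothetical helper, not part of the repo, and the scripts may need editing before they run.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): run one stage script if it
# exists, otherwise report the missing path and return a non-zero status.
run_stage() {
  local script="$1"
  if [ ! -f "$script" ]; then
    echo "missing script: $script" >&2
    return 1
  fi
  bash "$script"
}

# Intended order, to be uncommented inside a checkout of the repo:
# run_stage bash/run_bimodal_va.sh && run_stage bash/run_bimodal_at.sh
```

Chaining with `&&` means fine-tuning starts only if pre-training exits successfully.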

Dependencies

The Dockerfile defines the minimum dependencies of the repo.

Citing VIP-ANT

@misc{vip-ant,
      title={Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer},
      author={Yanpeng Zhao and Jack Hessel and Youngjae Yu and Ximing Lu and Rowan Zellers and Yejin Choi},
      url={https://arxiv.org/abs/2112.08995},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      eprint={2112.08995},
      year={2021},
}

License

MIT

Owner

Yän.PnG

Paper: https://arxiv.org/abs/2112.08995
