Deep Multimodal Neural Architecture Search

Related tags

Deep Learningmmnas
Overview

MMNas: Deep Multimodal Neural Architecture Search

This repository corresponds to the PyTorch implementation of the MMnas for visual question answering (VQA), visual grounding (VGD), and image-text matching (ITM) tasks.

example-image

Prerequisites

Software and Hardware Requirements

You may need a machine with at least 4 GPU (>= 8GB), 50GB memory for VQA and VGD and 150GB for ITM and 50GB free disk space. We strongly recommend to use a SSD drive to guarantee high-speed I/O.

You should first install some necessary packages.

  1. Install Python >= 3.6

  2. Install Cuda >= 9.0 and cuDNN

  3. Install PyTorch >= 0.4.1 with CUDA (Pytorch 1.x is also supported).

  4. Install SpaCy and initialize the GloVe as follows:

    $ pip install -r requirements.txt
    $ wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
    $ pip install en_vectors_web_lg-2.1.0.tar.gz

Dataset Preparations

Please follow the instructions in dataset_setup.md to download the datasets and features.

Search

To search an optimal architecture for a specific task, run

$ python3 search_[vqa|vgd|vqa].py

At the end of each searching epoch, we will output the optimal architecture (choosing operators with largest architecture weight for every block) accroding to current architecture weights. When the optimal architecture doesn't change for several continuous epochs, you can kill the searching process manually.

Training

The following script will start training network with the optimal architecture that we've searched by MMNas:

$ python3 train_[vqa|vgd|itm].py --RUN='train' --ARCH_PATH='./arch/train_vqa.json'

To add:

  1. --VERSION=str, e.g.--VERSION='mmnas_vqa' to assign a name for your this model.

  2. --GPU=str, e.g.--GPU='0, 1, 2, 3' to train the model on specified GPU device.

  3. --NW=int, e.g.--NW=8 to accelerate I/O speed.

  1. --RESUME to start training with saved checkpoint parameters.

  2. --ARCH_PATH can use the different searched architectures.

If you want to evaluate an architecture that you got from seaching stage, for example, it's the output architecture at the 50-th searching epoch for vqa model, you can run

$ python3 train_vqa.py --RUN='train' --ARCH_PATH='[PATH_TO_YOUR_SEARCHING_LOG]' --ARCH_EPOCH=50

Validation and Testing

Offline Evaluation

It's convenient to modify follows args: --RUN={'val', 'test'} --CKPT_PATH=[Your Model Path] to Run val or test Split.

Example:

$ python3 train_vqa.py --RUN='test' --CKPT_PATH=[Your Model Path] --ARCH_PATH=[Searched Architecture Path]

Online Evaluation (ONLY FOR VQA)

Test Result files will stored in ./logs/ckpts/result_test/result_train_[Your Version].json

You can upload the obtained result file to Eval AI to evaluate the scores on test-dev and test-std splits.

Pretrained Models

We provide the pretrained models in pretrained_models.md to reproduce the experimental results in our paper.

Citation

If this repository is helpful for your research, we'd really appreciate it if you could cite the following paper:

@article{yu2020mmnas,
  title={Deep Multimodal Neural Architecture Search},
  author={Yu, Zhou and Cui, Yuhao and Yu, Jun and Wang, Meng and Tao, Dacheng and Tian, Qi},
  journal={Proceedings of the 28th ACM International Conference on Multimedia},
  pages = {3743--3752},
  year={2020}
}
Owner
Vision and Language Group@ MIL
Hangzhou Dianzi University
Vision and Language Group@ MIL
Code accompanying "Evolving spiking neuron cellular automata and networks to emulate in vitro neuronal activity," accepted to IEEE SSCI ICES 2021

Evolving-spiking-neuron-cellular-automata-and-networks-to-emulate-in-vitro-neuronal-activity Code accompanying "Evolving spiking neuron cellular autom

SOCRATES: Self-Organizing Computational substRATES 2 Dec 02, 2022
NumQMBasic - A mini-course offered to Undergrad physics students

The best way to use this material is by forking it by click the Fork button at the top, right corner. Then you will get your own copy to play with! Th

Raghu 35 Dec 05, 2022
Adaptive FNO transformer - official Pytorch implementation

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers This repository contains PyTorch implementation of the Adaptive Fourier Neu

NVIDIA Research Projects 77 Dec 29, 2022
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth

Instance segmentation by jointly optimizing spatial embeddings and clustering bandwidth This codebase implements the loss function described in: Insta

209 Dec 07, 2022
Bayesian Inference Tools in Python

BayesPy Bayesian Inference Tools in Python Our goal is, given the discrete outcomes of events, estimate the distribution of categories. Using gradient

Max Sklar 99 Dec 14, 2022
Pytorch-diffusion - A basic PyTorch implementation of 'Denoising Diffusion Probabilistic Models'

PyTorch implementation of 'Denoising Diffusion Probabilistic Models' This reposi

Arthur Juliani 76 Jan 07, 2023
Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper

Continual Learning With Filter Atom Swapping Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper If find t

11 Aug 29, 2022
NIMA: Neural IMage Assessment

PyTorch NIMA: Neural IMage Assessment PyTorch implementation of Neural IMage Assessment by Hossein Talebi and Peyman Milanfar. You can learn more from

Kyryl Truskovskyi 293 Dec 30, 2022
Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

Christopher T. Chubb 35 Dec 21, 2022
Weakly- and Semi-Supervised Panoptic Segmentation (ECCV18)

Weakly- and Semi-Supervised Panoptic Segmentation by Qizhu Li*, Anurag Arnab*, Philip H.S. Torr This repository demonstrates the weakly supervised gro

Qizhu Li 159 Dec 20, 2022
A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

70 Jul 12, 2022
PyTorch implementation of Glow

glow-pytorch PyTorch implementation of Glow, Generative Flow with Invertible 1x1 Convolutions (https://arxiv.org/abs/1807.03039) Usage: python train.p

Kim Seonghyeon 433 Dec 27, 2022
Lucid Sonic Dreams syncs GAN-generated visuals to music.

Lucid Sonic Dreams Lucid Sonic Dreams syncs GAN-generated visuals to music. By default, it uses NVLabs StyleGAN2, with pre-trained models lifted from

731 Jan 02, 2023
RRxIO - Robust Radar Visual/Thermal Inertial Odometry: Robust and accurate state estimation even in challenging visual conditions.

RRxIO - Robust Radar Visual/Thermal Inertial Odometry RRxIO offers robust and accurate state estimation even in challenging visual conditions. RRxIO c

Christopher Doer 64 Dec 29, 2022
Official code release for ICCV 2021 paper SNARF: Differentiable Forward Skinning for Animating Non-rigid Neural Implicit Shapes.

Official code release for ICCV 2021 paper SNARF: Differentiable Forward Skinning for Animating Non-rigid Neural Implicit Shapes.

235 Dec 26, 2022
Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

Natural Posterior Network This repository provides the official implementation o

Oliver Borchert 54 Dec 06, 2022
Official code for "Maximum Likelihood Training of Score-Based Diffusion Models", NeurIPS 2021 (spotlight)

Maximum Likelihood Training of Score-Based Diffusion Models This repo contains the official implementation for the paper Maximum Likelihood Training o

Yang Song 84 Dec 12, 2022
Facebook Research 605 Jan 02, 2023
GitHub repository for the ICLR Computational Geometry & Topology Challenge 2021

ICLR Computational Geometry & Topology Challenge 2022 Welcome to the ICLR 2022 Computational Geometry & Topology challenge 2022 --- by the ICLR 2022 W

42 Dec 13, 2022
Multiple-Object Tracking with Transformer

TransTrack: Multiple-Object Tracking with Transformer Introduction TransTrack: Multiple-Object Tracking with Transformer Models Training data Training

Peize Sun 537 Jan 04, 2023