[NeurIPS'20] Multiscale Deep Equilibrium Models

Related tags

Deep Learningmdeq
Overview

Multiscale Deep Equilibrium Models

💥 💥 💥 💥

This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simpler & more efficient) implementation of MDEQ with the same set of tasks as here is now available in the DEQ repo.

We STRONGLY recommend using with the MDEQ-Vision code in the DEQ repo (which also supports Jacobian-related analysis).

💥 💥 💥 💥


This repository contains the code for the multiscale deep equilibrium (MDEQ) model proposed in the paper Multiscale Deep Equilibrium Models by Shaojie Bai, Vladlen Koltun and J. Zico Kolter.

Is implicit deep learning relevant for general, large-scale pattern recognition tasks? We propose the multiscale deep equilibrium (MDEQ) model, which expands upon the DEQ formulation substantially to introduce simultaneous equilibrium modeling of multiple signal resolutions. Specifically, MDEQ solves for and backpropagates through synchronized equilibria of multiple feature representation streams. Such structure rectifies one of the major drawbacks of DEQ, and provide natural hierarchical interfaces for auxiliary losses and compound training procedures (e.g., pretraining and finetuning). Our experiment demonstrate for the first time that "shallow" implicit models can scale to and achieve near-SOTA results on practical computer vision tasks (e.g., megapixel images on Cityscapes segmentation).

We provide in this repo the implementation and the links to the pretrained classification & segmentation MDEQ models.

If you find thie repository useful for your research, please consider citing our work:

@inproceedings{bai2020multiscale,
    author    = {Shaojie Bai and Vladlen Koltun and J. Zico Kolter},
    title     = {Multiscale Deep Equilibrium Models},
    booktitle   = {Advances in Neural Information Processing Systems (NeurIPS)},
    year      = {2020},
}

Overview

The structure of a multiscale deep equilibrium model (MDEQ) is shown below. All components of the model are shown in this figure (in practice, we use n=4).

Examples

Some examples of MDEQ segmentation results on the Cityscapes dataset.

Requirements

PyTorch >=1.4.0, torchvision >= 0.4.0

Datasets

  • CIFAR-10: We download the CIFAR-10 dataset using PyTorch's torchvision package (included in this repo).
  • ImageNet We follow the implementation from the PyTorch ImageNet Training repo.
  • Cityscapes: We download the Cityscapes dataset from its official website and process it according to this repo. Cityscapes dataset additionally require a list folder that aligns each original image with its corresponding labeled segmented image. This list folder can be downloaded here.

All datasets should be downloaded, processed and put in the respective data/[DATASET_NAME] directory. The data/ directory should look like the following:

data/
  cityscapes/
  imagenet/
  ...          (other datasets)
  list/        (see above)

Usage

All experiment settings are provided in the .yaml files under the experiments/ folder.

To train an MDEQ classification model on ImageNet/CIFAR-10, do

python tools/cls_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

To train an MDEQ segmentation model on Cityscapes, do

python -m torch.distributed.launch --nproc_per_node=4 tools/seg_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

where you should provide the pretrained ImageNet model path in the corresponding configuration (.yaml) file. We provide a sample pretrained model extractor in pretrained_models/, but you can also write your own script.

Similarly, to test the model and generate segmentation results on Cityscapes, do

python tools/seg_test.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

You can (and probably should) initiate the Cityscapes training with an ImageNet-pretrained MDEQ. You need to extract the state dict from the ImageNet checkpointed model, and set the MODEL.PRETRAINED entry in Cityscapes yaml file to this state dict on disk.

The model implementation and MDEQ's algorithmic components (e.g., L-Broyden's method) can be found in lib/.

Pre-trained Models

We provide some reasonably good pre-trained weights here so that one can quickly play with DEQs without training from scratch.

Description Task Dataset Model
MDEQ-XL ImageNet Classification ImageNet download (.pkl)
MDEQ-XL Cityscapes(val) Segmentation Cityscapes download (.pkl)
MDEQ-Small ImageNet Classification ImageNet download (.pkl)
MDEQ-Small Cityscapes(val) Segmentation Cityscapes download (.pkl)

I. Example of how to evaluate the pretrained ImageNet model:

  1. Download the pretrained ImageNet .pkl file. (I recommend using the gdown command!)
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. Run the MDEQ classification validation command:
python tools/cls_valid.py --testModel pretrained_models/[FILENAME] --cfg experiments/imagenet/cls_mdeq_[SIZE].yaml

For example, for MDEQ-Small, you should get >75% top-1 accuracy.

II. Example of how to use the pretrained ImageNet model to train on Cityscapes:

  1. Download the pretrained ImageNet .pkl file.
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. In the corresponding experiments/cityscapes/seg_MDEQ_[SIZE].yaml (where SIZE is typically SMALL, LARGE or XL), set MODEL.PRETRAINED to "pretrained_models/[FILENAME]".
  4. Run the MDEQ segmentation training command (see the "Usage" section above):
python -m torch.distributed.launch --nproc_per_node=[N_GPUS] tools/seg_train.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml

III. Example of how to use the pretrained Cityscapes model for inference:

  1. Download the pretrained Cityscapes .pkl file
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. In the corresponding experiments/cityscapes/seg_MDEQ_[SIZE].yaml (where SIZE is typically SMALL, LARGE or XL), set TEST.MODEL_FILE to "pretrained_models/[FILENAME]".
  4. Run the MDEQ segmentation testing command (see the "Usage" section above):
python tools/seg_test.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml

Tips:

  • To load the Cityscapes pretrained model, download the .pkl file and specify the path in config.[TRAIN/TEST].MODEL_FILE (which is '' by default) in the .yaml files. This is different from setting MODEL.PRETRAINED, see the point below.
  • The difference between [TRAIN/TEST].MODEL_FILE and MODEL.PRETRAINED arguments in the yaml files: the former is used to load all of the model parameters; the latter is for compound training (e.g., when transferring from ImageNet to Cityscapes, we want to discard the final classifier FC layers).
  • The repo supports checkpointing of models at each epoch. One can resume from a previously saved checkpoint by turning on the TRAIN.RESUME argument in the yaml files.
  • Just like DEQs, the MDEQ models can be slower than explicit deep networks, and even more so as the image size increases (because larger images typically require more Broyden iterations to converge well; see Figure 5 in the paper). But one can play with the forward and backward thresholds to adjust the runtime.

Acknowledgement

Some utilization code (e.g., model summary and yaml processing) of this repo were modified from the HRNet repo and the DEQ repo.

Owner
CMU Locus Lab
Zico Kolter's Research Group
CMU Locus Lab
SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)

SuMa++: Efficient LiDAR-based Semantic SLAM This repository contains the implementation of SuMa++, which generates semantic maps only using three-dime

Photogrammetry & Robotics Bonn 701 Dec 30, 2022
InsTrim: Lightweight Instrumentation for Coverage-guided Fuzzing

InsTrim The paper: InsTrim: Lightweight Instrumentation for Coverage-guided Fuzzing Build Prerequisite llvm-8.0-dev clang-8.0 cmake = 3.2 Make git cl

75 Dec 23, 2022
Collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

Jun Chen 139 Dec 21, 2022
Accelerate Neural Net Training by Progressively Freezing Layers

FreezeOut A simple technique to accelerate neural net training by progressively freezing layers. This repository contains code for the extended abstra

Andy Brock 203 Jun 19, 2022
Deep Learning to Create StepMania SM FIles

StepCOVNet Running Audio to SM File Generator Currently only produces .txt files. Use SMDataTools to convert .txt to .sm python stepmania_note_generat

Chimezie Iwuanyanwu 8 Jan 08, 2023
Ensemble Visual-Inertial Odometry (EnVIO)

Ensemble Visual-Inertial Odometry (EnVIO) Authors : Jae Hyung Jung, Yeongkwon Choe, and Chan Gook Park 1. Overview This is a ROS package of Ensemble V

Jae Hyung Jung 95 Jan 03, 2023
Source code for our paper "Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures"

Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures Code for the Multiplex Molecular Graph Neural Network (M

shzhang 59 Dec 10, 2022
Action Recognition for Self-Driving Cars

Action Recognition for Self-Driving Cars This repo contains the codes for the 2021 Fall semester project "Action Recognition for Self-Driving Cars" at

VITA lab at EPFL 3 Apr 07, 2022
Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

End-to-End Optimization of Scene Layout Code release for: End-to-End Optimization of Scene Layout CVPR 2020 (Oral) Project site, Bibtex For help conta

Andrew Luo 41 Dec 09, 2022
FwordCTF 2021 Infrastructure and Source code of Web/Bash challenges

FwordCTF 2021 You can find here the source code of the challenges I wrote (Web and Bash) in FwordCTF 2021 and the source code of the platform with our

Kahla 5 Nov 25, 2022
Pseudo-Visual Speech Denoising

Pseudo-Visual Speech Denoising This code is for our paper titled: Visual Speech Enhancement Without A Real Visual Stream published at WACV 2021. Autho

Sindhu 94 Oct 22, 2022
The FIRST GANs-based omics-to-omics translation framework

OmiTrans Please also have a look at our multi-omics multi-task DL freamwork 👀 : OmiEmbed The FIRST GANs-based omics-to-omics translation framework Xi

Xiaoyu Zhang 6 Dec 14, 2022
Label Mask for Multi-label Classification

LM-MLC 一种基于完型填空的多标签分类算法 1 前言 本文主要介绍本人在全球人工智能技术创新大赛【赛道一】设计的一种基于完型填空(模板)的多标签分类算法:LM-MLC,该算法拟合能力很强能感知标签关联性,在多个数据集上测试表明该算法与主流算法无显著性差异,在该比赛数据集上的dev效果很好,但是由

52 Nov 20, 2022
Official Pytorch Implementation of GraphiT

GraphiT: Encoding Graph Structure in Transformers This repository implements GraphiT, described in the following paper: Grégoire Mialon*, Dexiong Chen

Inria Thoth 80 Nov 27, 2022
Read number plates with https://platerecognizer.com/

HASS-plate-recognizer Read vehicle license plates with https://platerecognizer.com/ which offers free processing of 2500 images per month. You will ne

Robin 69 Dec 30, 2022
Open AI's Python library

OpenAI Python Library The OpenAI Python library provides convenient access to the OpenAI API from applications written in the Python language. It incl

Pavan Ananth Sharma 3 Jul 10, 2022
Code for "Localization with Sampling-Argmax", NeurIPS 2021

Localization with Sampling-Argmax [Paper] [arXiv] [Project Page] Localization with Sampling-Argmax Jiefeng Li, Tong Chen, Ruiqi Shi, Yujing Lou, Yong-

JeffLi 71 Dec 17, 2022
A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains (IJCV submission)

wsss-analysis The code of: A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, arXiv pre-print 2019 paper.

Lyndon Chan 48 Dec 18, 2022
Huawei Hackathon 2021 - Sweden (Stockholm)

huawei-hackathon-2021 Contributors DrakeAxelrod Challenge Requirements: python=3.8.10 Standard libraries (no importing) Important factors: Data depend

Drake Axelrod 32 Nov 08, 2022
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

VAENAR-TTS This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis". Sa

THUHCSI 138 Oct 28, 2022