[ICLR 2021 Spotlight] Pytorch implementation for "Long-tailed Recognition by Routing Diverse Distribution-Aware Experts."

Overview

RIDE: Long-tailed Recognition by Routing Diverse Distribution-Aware Experts.

by Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu and Stella X. Yu at UC Berkeley/ICSI and NTU

International Conference on Learning Representations (ICLR), 2021. Spotlight Presentation

Project Page | PDF | Preprint | OpenReview | Slides | Citation

This repository contains an official re-implementation of RIDE from the authors, while also has plans to support other works on long-tailed recognition. Further information please contact Xudong Wang and Long Lian.

Citation

If you find our work inspiring or use our codebase in your research, please consider giving a star and a citation.

@inproceedings{wang2021longtailed,
  title={Long-tailed Recognition by Routing Diverse Distribution-Aware Experts},
  author={Xudong Wang and Long Lian and Zhongqi Miao and Ziwei Liu and Stella Yu},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=D9I3drBz4UC}
}

Supported Methods for Long-tailed Recognition:

  • RIDE
  • Cross-Entropy (CE) Loss
  • Focal Loss
  • LDAM Loss
  • Decouple: cRT (limited support for now)
  • Decouple: tau-normalization (limited support for now)

Updates

[04/2021] Pre-trained models are avaliable in model zoo.

[12/2020] We added an approximate GFLops counter. See usages below. We also refactored the code and fixed a few errors.

[12/2020] We have limited support on cRT and tau-norm in load_stage1 option and t-normalization.py, please look at the code comments for instructions while we are still working on it.

[12/2020] Initial Commit. We re-implemented RIDE in this repo. LDAM/Focal/Cross-Entropy loss is also re-implemented (instruction below).

Table of contents

Requirements

Packages

  • Python >= 3.7, < 3.9
  • PyTorch >= 1.6
  • tqdm (Used in test.py)
  • tensorboard >= 1.14 (for visualization)
  • pandas
  • numpy

Hardware requirements

8 GPUs with >= 11G GPU RAM are recommended. Otherwise the model with more experts may not fit in, especially on datasets with more classes (the FC layers will be large). We do not support CPU training, but CPU inference could be supported by slight modification.

Dataset Preparation

CIFAR code will download data automatically with the dataloader. We use data the same way as classifier-balancing. For ImageNet-LT and iNaturalist, please prepare data in the data directory. ImageNet-LT can be found at this link. iNaturalist data should be the 2018 version from this repo (Note that it requires you to pay to download now). The annotation can be found at here. Please put them in the same location as below:

data
├── cifar-100-python
│   ├── file.txt~
│   ├── meta
│   ├── test
│   └── train
├── cifar-100-python.tar.gz
├── ImageNet_LT
│   ├── ImageNet_LT_open.txt
│   ├── ImageNet_LT_test.txt
│   ├── ImageNet_LT_train.txt
│   ├── ImageNet_LT_val.txt
│   ├── test
│   ├── train
│   └── val
└── iNaturalist18
    ├── iNaturalist18_train.txt
    ├── iNaturalist18_val.txt
    └── train_val2018

How to get pretrained checkpoints

We have a model zoo available.

Training and Evaluation Instructions

Imbalanced CIFAR 100/CIFAR100-LT

RIDE Without Distill (Stage 1)
python train.py -c "configs/config_imbalance_cifar100_ride.json" --reduce_dimension 1 --num_experts 3

Note: --reduce_dimension 1 means set reduce dimension to True. The template has an issue with bool arguments so int argument is used here. However, any non-zero value will be equivalent to bool True.

RIDE With Distill (Stage 1)
python train.py -c "configs/config_imbalance_cifar100_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint

Distillation is not required but could be performed if you'd like further improvements.

RIDE Expert Assignment Module Training (Stage 2)
python train.py -c "configs/config_imbalance_cifar100_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3

Note: different runs will result in different EA modules with different trade-off. Some modules give higher accuracy but require higher FLOps. Although the only difference is not underlying ability to classify but the "easiness to satisfy and stop". You can tune the pos_weight if you think the EA module consumes too much compute power or is using too few expert.

ImageNet-LT

RIDE Without Distill (Stage 1)

ResNet 10
python train.py -c "configs/config_imagenet_lt_resnet10_ride.json" --reduce_dimension 1 --num_experts 3
ResNet 50
python train.py -c "configs/config_imagenet_lt_resnet50_ride.json" --reduce_dimension 1 --num_experts 3
ResNeXt 50
python train.py -c "configs/config_imagenet_lt_resnext50_ride.json" --reduce_dimension 1 --num_experts 3

RIDE With Distill (Stage 1)

ResNet 10
python train.py -c "configs/config_imagenet_lt_resnet10_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint
ResNet 50
python train.py -c "configs/config_imagenet_lt_resnet50_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint
ResNeXt 50
python train.py -c "configs/config_imagenet_lt_resnext50_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint

RIDE Expert Assignment Module Training (Stage 2)

ResNet 10
python train.py -c "configs/config_imagenet_lt_resnet10_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3
ResNet 50
python train.py -c "configs/config_imagenet_lt_resnet50_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3
ResNeXt 50
python train.py -c "configs/config_imagenet_lt_resnext50_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3

iNaturalist

RIDE Without Distill (Stage 1)

python train.py -c "configs/config_iNaturalist_resnet50_ride.json" --reduce_dimension 1 --num_experts 3

RIDE With Distill (Stage 1)

python train.py -c "configs/config_iNaturalist_resnet50_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint

RIDE Expert Assignment Module Training (Stage 2)

python train.py -c "configs/config_iNaturalist_resnet50_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3

Using Other Methods with RIDE

  • Focal Loss: switch the loss to Focal Loss
  • Cross Entropy: switch the loss to Cross Entropy Loss

Test

To test a checkpoint, please put it with the corresponding config file.

python test.py -r path_to_checkpoint

Please see the pytorch template that we use for additional more general usages of this project (e.g. loading from a checkpoint, etc.).

GFLops calculation

We provide an experimental support for approximate GFLops calculation. Please open an issue if you encounter any problem or meet inconsistency in GFLops.

You need to install thop package first. Then, according to your model, run python -m utils.gflops (args) in the project directory.

Examples and explanations

Use python -m utils.gflops to see the documents as well as explanations for this calculator.

ImageNet-LT
python -m utils.gflops ResNeXt50Model 0 --num_experts 3 --reduce_dim True --use_norm False

To change model, switch ResNeXt50Model to the ones used in your config. use_norm comes with LDAM-based methods (including RIDE). reduce_dim is used in default RIDE models. The 0 in the command line indicates the dataset.

All supported datasets:

  • 0: ImageNet-LT
  • 1: iNaturalist
  • 2: Imbalance CIFAR 100
iNaturalist
python -m utils.gflops ResNet50Model 1 --num_experts 3 --reduce_dim True --use_norm True
Imbalance CIFAR 100
python -m utils.gflops ResNet32Model 2 --num_experts 3 --reduce_dim True --use_norm True
Special circumstances: calculate the approximate GFLops in models with expert assignment module

We provide a ea_percentage for specifying the percentage of data that pass each expert. Note that you need to switch to the EA model as well since you actually use EA model instead of the original model in training and inference.

An example:

python -m utils.gflops ResNet32EAModel 2 --num_experts 3 --reduce_dim True --use_norm True --ea_percentage 40.99,9.47,49.54

FAQ

See FAQ.

How to get support from us?

If you have any general questions, feel free to email us at longlian at berkeley.edu and xdwang at eecs.berkeley.edu. If you have code or implementation-related questions, please feel free to send emails to us or open an issue in this codebase (We recommend that you open an issue in this codebase, because your questions may help others).

Pytorch template

This is a project based on this pytorch template. The readme of the template explains its functionality, although we try to list most frequently used ones in this readme.

License

This project is licensed under the MIT License. See LICENSE for more details. The parts described below follow their original license.

Acknowledgements

This is a project based on this pytorch template. The pytorch template is inspired by the project Tensorflow-Project-Template by Mahmoud Gemy

The ResNet and ResNeXt in fb_resnets are based on from Classifier-Balancing/Decouple. The ResNet in ldam_drw_resnets/LDAM loss/CIFAR-LT are based on LDAM-DRW. KD implementation takes references from CRD/RepDistiller.

Owner
Xudong (Frank) Wang
Ph.D. Student @ EECS, UC Berkeley; Graduate Student Researcher @ International Computer Science Institute, Berkeley, USA
Xudong (Frank) Wang
Accurately generate all possible forms of an English word e.g "election" --> "elect", "electoral", "electorate" etc.

Accurately generate all possible forms of an English word Word forms can accurately generate all possible forms of an English word. It can conjugate v

Dibya Chakravorty 570 Dec 31, 2022
A CSRankings-like index for speech researchers

Speech Rankings This project mimics CSRankings to generate an ordered list of researchers in speech/spoken language processing along with their possib

Mutian He 19 Nov 26, 2022
NLP-SentimentAnalysis - Coursera Course ( Duration : 5 weeks ) offered by DeepLearning.AI

Coursera Natural Language Processing Specialization This repository contains material related to Coursera Natural Language Processing Specialization.

Nishant Sharma 1 Jun 05, 2022
A raytrace framework using taichi language

ti-raytrace The code use Taichi programming language Current implement acceleration lvbh disney brdf How to run First config your anaconda workspace,

蕉太狼 73 Dec 11, 2022
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding

Wav2Vec2CTC With KenLM Using KenLM ARPA language model with beam search to decode audio files and show the most probable transcription. Assuming you'v

farisalasmary 65 Sep 21, 2022
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.

CTC Decoding Algorithms Update 2021: installable Python package Python implementation of some common Connectionist Temporal Classification (CTC) decod

Harald Scheidl 736 Jan 03, 2023
A Fast Command Analyser based on Dict and Pydantic

Alconna Alconna 隶属于ArcletProject, 在Cesloi内有内置 Alconna 是 Cesloi-CommandAnalysis 的高级版,支持解析消息链 一般情况下请当作简易的消息链解析器/命令解析器 文档 暂时的文档 Example from arclet.alcon

19 Jan 03, 2023
List of GSoC organisations with number of times they have been selected.

Welcome to GSoC Organisation Frequency And Details 👋 List of GSoC organisations with number of times they have been selected, techonologies, topics,

Shivam Kumar Jha 41 Oct 01, 2022
This converter will create the exact measure for your cappuccino recipe from the grandiose Rafaella Ballerini!

About CappuccinoJs This converter will create the exact measure for your cappuccino recipe from the grandiose Rafaella Ballerini! Este conversor criar

Arthur Ottoni Ribeiro 48 Nov 15, 2022
Lattice methods in TensorFlow

TensorFlow Lattice TensorFlow Lattice is a library that implements constrained and interpretable lattice based models. It is an implementation of Mono

504 Dec 20, 2022
NLPShala , the best IDE for all Natural language processing tasks.

The revolutionary IDE for all NLP (Natural language processing) stuffs on the internet.

Abhi 3 Aug 08, 2021
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par

Computational Linguistics Research Group 8.4k Dec 30, 2022
PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit.

PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit. It provides easy-to-use, low-overhead, first-class Python wrappers for t

922 Dec 31, 2022
AI_Assistant - This is a Python based Voice Assistant.

This is a Python based Voice Assistant. This was programmed to increase my understanding of python and also how the in-general Voice Assistants work.

1 Jan 06, 2022
🤕 spelling exceptions builder for lazy people

🤕 spelling exceptions builder for lazy people

Vlad Bokov 3 May 12, 2022
In this Notebook I've build some machine-learning and deep-learning to classify corona virus tweets, in both multi class classification and binary classification.

Hello, This Notebook Contains Example of Corona Virus Tweets Multi Class Classification. - Classes is: Extremely Positive, Positive, Extremely Negativ

Khaled Tofailieh 3 Dec 06, 2022
华为商城抢购手机的Python脚本 Python script of Huawei Store snapping up mobile phones

HUAWEI STORE GO 2021 说明 基于Python3+Selenium的华为商城抢购爬虫脚本,修改自近两年没更新的项目BUY-HW,为女神抢Nova 8(什么时候华为开始学小米玩饥饿营销了?) 原项目的登陆以及抢购部分已经不可用,本项目对原项目进行了改正以适应新华为商城,并增加一些功能

ZhangLiang 111 Dec 22, 2022
Host your own GPT-3 Discord bot

GPT3 Discord Bot Host your own GPT-3 Discord bot i'd host and make the bot invitable myself, however GPT3 terms of service prohibit public use of GPT3

[something hillarious here] 8 Jan 07, 2023
History Aware Multimodal Transformer for Vision-and-Language Navigation

History Aware Multimodal Transformer for Vision-and-Language Navigation This repository is the official implementation of History Aware Multimodal Tra

Shizhe Chen 46 Nov 23, 2022