Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition" (CVPR 2021)

Overview

[Paper][Code]

We implement the classification, object detection, and instance segmentation tasks based on our cvpods. Users should install cvpods first and then run the experiments in this repo.

Changelog

  • 4.23.2021: Update DisAlign on LVIS v0.5 (Mask R-CNN + Res50)
  • 4.12.2021: Update the README

0. How to Use

  • Step-1: Install the latest cvpods.
  • Step-2: cd cvpods
  • Step-3: Prepare dataset for different tasks.
  • Step-4: git clone https://github.com/Megvii-BaseDetection/DisAlign playground_disalign
  • Step-5: Enter one folder and run pods_train --num-gpus 8
  • Step-6: Use pods_test --num-gpus 8 to evaluate the last checkpoint

1. Image Classification

We support the following three datasets:

  • ImageNet-LT Dataset
  • iNaturalist-2018 Dataset
  • Place-LT Dataset

We refer the user to CLS_README for more details.

2. Object Detection/Instance Segmentation

We support both versions of the LVIS dataset:

  • LVIS v0.5
  • LVIS v1.0

Highlight

  1. To speed up evaluation on the LVIS dataset, we provide a C++ optimized evaluation API by modifying the coco_eval (C++) in cvpods.
     • The C++ version of the lvis_eval API saves ~30% of the time when calculating the mAP.
  2. We provide support for the AP_fixed and AP_pool metrics proposed in large-vocab-devil.
  3. We will support more recent works on long-tail detection (e.g., EQLv2, CenterNet2, etc.) in this project in the future.

We refer the user to DET_README for more details.

3. Semantic Segmentation

We adopt mmsegmentation as the codebase for running all DisAlign semantic segmentation experiments. Currently, users should use DisAlign_Seg for these experiments. We will add support for them in cvpods in the future.

Acknowledgement

Thanks for the following projects:

Citing DisAlign

If you use DisAlign in your research or wish to refer to the baseline results published in this repo, please use the following BibTeX entry.

@inproceedings{zhang2021disalign,
  title={Distribution Alignment: A Unified Framework for Long-tail Visual Recognition.},
  author={Zhang, Songyang and Li, Zeming and Yan, Shipeng and He, Xuming and Sun, Jian},
  booktitle={CVPR},
  year={2021}
}

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Comments
  • scale in cosine classifier


    Hi, thanks for your great work! I notice you use the cosine classifier in many experiments and it can get a better baseline. The formula is as follows

    [image: formula of the cosine classifier with scale s]

    I am wondering what the value of s is?

    opened by L1aoXingyu 5
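For readers following this thread, here is a minimal sketch of a cosine classifier of the kind discussed above: logits are the scaled cosine similarity between L2-normalized features and L2-normalized class weights. The exact scale s used in this repo is precisely what the question asks and is not stated on this page, so the value (and the weight initialization) below are only placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    """Sketch of a cosine classifier: z_j = s * cos(theta_j), where theta_j is
    the angle between the input feature and the j-th class weight vector.
    The scale `s` and the weight init are placeholder assumptions here."""

    def __init__(self, in_features: int, num_classes: int, scale: float = 16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, in_features))
        nn.init.normal_(self.weight, std=0.01)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.normalize(x, dim=1)            # L2-normalize features
        w = F.normalize(self.weight, dim=1)  # L2-normalize class weights
        return self.scale * F.linear(x, w)   # scaled cosine similarity
```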
  •  Is it correct to freeze the weight and bias of the DisAlign Linear Layer as well?


    Hello. Thank you for your project! I'm testing your code on my custom dataset. My task is classification. I have a question about your code implementation.

    https://github.com/Megvii-BaseDetection/DisAlign/blob/a2fc3500a108cb83e3942293a5675c97ab3a2c6e/classification/imagenetlt/resnext50/resx50.scratch.imagenet_lt.224size.90e.disalign.10e/net.py#L56-L62

    From my understanding, in stage 2 the linear layer used in stage 1 is removed and a DisAlign Linear Layer is added, and all parts are frozen except for logit_scale, logit_bias, and confidence_layer. At this point, the weight and bias of DisAlignLinear (self.weight, self.bias) are also frozen. Is my understanding correct?

    If so, are the weight and bias of DisAlignLinearLayer fixed after the initialization? (The weight and bias of the linear layer in stage 1 are not copied either)

    If my understanding is correct, why is the weight of DisAlignLinear also frozen?

    I will wait for your reply. thanks!

    opened by jeongHwarr 4
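The thread above describes the stage-2 setup as freezing everything except logit_scale, logit_bias, and confidence_layer. As a reading aid only, here is a hedged sketch of what a DisAlignLinear-style layer with those three components could look like; it is not the official cvpods implementation, and details such as initialization, how (or whether) the stage-1 classifier weights are copied, and the exact calibration formula may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisAlignLinearSketch(nn.Module):
    """Illustrative sketch (not the official cvpods code) of a calibrated linear
    classifier for stage 2. Raw logits z come from a frozen linear projection;
    only the per-class logit_scale/logit_bias and the confidence layer train.
    Calibrated logits are blended as sigma(x) * (scale * z + bias) + (1 - sigma(x)) * z,
    one plausible form of the adaptive calibration discussed in this thread."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        # Base classifier (frozen in stage 2, per the question above).
        self.weight = nn.Parameter(torch.empty(num_classes, in_features))
        self.bias = nn.Parameter(torch.zeros(num_classes))
        nn.init.normal_(self.weight, std=0.01)
        # Calibration parameters: the only trainable parts in stage 2.
        self.logit_scale = nn.Parameter(torch.ones(num_classes))
        self.logit_bias = nn.Parameter(torch.zeros(num_classes))
        self.confidence_layer = nn.Linear(in_features, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = F.linear(x, self.weight, self.bias)          # raw logits
        sigma = torch.sigmoid(self.confidence_layer(x))  # (N, 1) confidence score
        z_cal = self.logit_scale * z + self.logit_bias   # per-class calibration
        return sigma * z_cal + (1.0 - sigma) * z         # adaptive blend

def freeze_for_stage2(model: DisAlignLinearSketch) -> None:
    """Freeze the base weight/bias, as the question above assumes."""
    model.weight.requires_grad_(False)
    model.bias.requires_grad_(False)
```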
  • Where is the DisAlignLinear module?


    Hello. Thank you for your impressive project!

    I want to apply DisAlign to classification. However, an error occurs in the import part. https://github.com/Megvii-BaseDetection/DisAlign/blob/a2fc3500a108cb83e3942293a5675c97ab3a2c6e/classification/imagenetlt/resnext50/resx50.scratch.imagenet_lt.224size.90e.disalign.10e/net.py#L7 I couldn't find DisAlignLinear in cvpods.layers, and it does not exist at https://github.com/Megvii-BaseDetection/cvpods/tree/master/cvpods/layers either. How can I solve this problem?

    Thank you!

    opened by jeongHwarr 4
  • Can someone kindly share their codes of Classification task on ImageNet_LT?


    I tried to train the proposed method on ImageNet_LT, but I can only get an average testing accuracy of about 49%, which is far from the accuracy reported in the paper (52.9%). Some details of my implementation are as follows: (1) The feature extractor is ResNeXt-50 and the head classifier is a linear classifier. The testing accuracy in Stage-One is 43.9%, which is OK.

    (2) The testing accuracy when adopting the cRT method in Stage-Two is 49.6%, which is identical to the one reported in other papers. (3) When fine-tuning the model in Stage-Two, both the feature extractor and the head classifier are frozen, and a DisAlignLinear model (which is implemented in cvpods) is retrained. The testing accuracy can only reach 48.8%, which is far from the one reported in your paper.

    opened by smallcube 4
  • The code for semantic segmentation is missing


    Hi, thank you for the nice work, but the code for semantic segmentation is missing and the URL for it in the README could not be opened. Could you please fix this issue?

    opened by curiosity654 3
  • About the reference Distribution p_r in Eq. (10)


    Hi, thank you for providing your code. I was wondering about Equation (10) in your paper (the definition of p_r), which does not seem to be a distribution. Since every x_i can only have one label, the reference distribution p_r(y|x_i) will look like (0, 0, 0, ..., w_c, 0, 0, ..., 0), and the sum of this distribution is w_c, not 1.

    Could you help me understand this equation? Thanks in advance.

    opened by Kevinz-code 3
  • import error


    Hi, thanks for the great work. Maybe I missed it, but it seems that the code for this project has been incorporated into cvpods. However, I couldn't launch any experiments because "from cvpods.layers import DisAlignLinear" raises "ImportError: cannot import name 'DisAlignLinear' from 'cvpods.layers'". Also, I didn't find the corresponding functions in cvpods.

    Any help will be appreciated. Thanks.

    opened by YUE-FAN 2
  • about the confidence score σ(x)


    In the paper, σ(x) is implemented as a linear layer followed by a non-linear activation function (e.g., a sigmoid) for all inputs x. How should the input x be understood: as the raw image matrix, the extracted features, or even the cls_score? Thank you!

    opened by lzed2399 2
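Illustrating only the construction mentioned in this comment (a linear layer followed by a sigmoid), here is a tiny sketch. Treating x as the pooled backbone feature vector rather than the raw image or the class scores is an assumption in this snippet, consistent with the confidence_layer discussed in the earlier threads, not something confirmed on this page.

```python
import torch
import torch.nn as nn

feat_dim, batch = 2048, 4
features = torch.randn(batch, feat_dim)  # assumption: pooled backbone features, not the raw image

confidence_head = nn.Sequential(
    nn.Linear(feat_dim, 1),  # linear layer ...
    nn.Sigmoid(),            # ... followed by a non-linear activation
)
sigma = confidence_head(features)
print(sigma.shape)  # torch.Size([4, 1]): one confidence score per sample
```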
  • exp_reweight = exp_reweight / np.sum(exp_reweight) * num_foreground


    Dear author, I have some questions about the code and paper:

    1. exp_reweight = exp_reweight / np.sum(exp_reweight) * num_foreground: Why is "exp_reweight" multiplied by the coefficient "num_foreground"? It is not mentioned in the paper.
    2. Is the "K" in the empirical class frequencies r = [r1, · · · , rK] on the training set the same as the number of classes C of the training set?
    opened by Liu-wanbing 2
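On question 1 above, a purely arithmetic note: dividing by the sum and then multiplying by num_foreground rescales the weights so that they sum exactly to num_foreground while preserving their relative ratios; what num_foreground denotes (e.g., the number of foreground classes) has to be checked in the surrounding code. A toy numerical sketch with made-up numbers:

```python
import numpy as np

# Toy numbers only (not values from the repo).
exp_reweight = np.array([8.0, 4.0, 2.0, 1.0, 0.5])
num_foreground = 5  # placeholder; its actual meaning must be read from the code

# The quoted normalization: the weights keep their relative ratios but are
# rescaled so that their sum equals num_foreground.
exp_reweight = exp_reweight / np.sum(exp_reweight) * num_foreground
print(exp_reweight, exp_reweight.sum())  # sums to 5.0
```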
  • The DisAlign_Seg page can't open


    opened by Kittywyk 1
  • Do you use validation dataset?


    https://github.com/Megvii-BaseDetection/DisAlign/blob/main/classification/imagenetlt/resnext50/resx50.scratch.imagenet_lt.224size.90e.disalign.10e/config.py#L31

    It seems that you only use the test dataset. What is the reason for that?

    opened by qianlanwyd 1
  • How can I test and augtest the trained semseg DisAlign model?


    opened by jh151170 0
  • the code question in semantic_seg


    Hi, I have a question about the logit_scale and logit_bias in semantic_seg. The shape of these parameters is (1, num_classes, 1, 1); why is it not (1, num_classes, 512, 512), which would match the input image size for semantic segmentation?

    opened by Ianresearch 8
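On the shape question above: a (1, num_classes, 1, 1) tensor broadcasts over the spatial dimensions of an (N, num_classes, H, W) logit map, so one scale and one bias per class are applied at every pixel, whereas a (1, num_classes, 512, 512) tensor would learn a separate value for every pixel location and be tied to a single input resolution. A small broadcasting sketch with toy sizes (not the real config):

```python
import torch

num_classes, h, w = 19, 64, 64                  # toy sizes, not the real config
seg_logits = torch.randn(2, num_classes, h, w)  # (N, C, H, W) per-pixel class logits

logit_scale = torch.ones(1, num_classes, 1, 1)  # one scale per class
logit_bias = torch.zeros(1, num_classes, 1, 1)  # one bias per class

# Broadcasting applies the same per-class scale/bias at every spatial location,
# so the parameters do not need to match the input resolution.
calibrated = logit_scale * seg_logits + logit_bias
print(calibrated.shape)  # torch.Size([2, 19, 64, 64])
```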
  • Value of the learned scale and bias vector?


    Hi, did you check how the values of the learned scale and bias vectors change throughout the training process? I find that their values change in the first few iterations and then remain stable for the rest of training on my own classification dataset. I wonder what the learned vectors look like in your paper? Thanks!

    opened by Jacobew 1
Owner
BaseDetection Team of Megvii