TailCalibX : Feature Generation for Long-tail Classification
by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi
[arXiv] [Code] [pip Package] [Video] 
Table of contents
- Easy Usage (Recommended way to use our method)
- Advanced Usage
- Trained weights
- Results on a Toy Dataset
- Directory Tree
- Citation
- Contributing
- About me
- Extras
- License
Easy Usage (Recommended way to use our method)
- Collect all the features from your dataloader.
- Use the tailcalib package to balance the features by generating samples.
- Train the classifier.
- Repeat.
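The four steps above can be sketched as a small loop. This is only an illustration of the control flow, not the method from the paper: `collect_features` and `oversample_balance` are hypothetical helpers, the "backbone" is an identity function, and the balancing step is plain random oversampling used as a stand-in for the sample generation that the tailcalib package performs.

```python
import random
from collections import Counter, defaultdict


def collect_features(loader, backbone):
    """Step 1: run every sample through the backbone and gather features and labels."""
    feats, labels = [], []
    for batch_x, batch_y in loader:
        feats.extend(backbone(x) for x in batch_x)
        labels.extend(batch_y)
    return feats, labels


def oversample_balance(feats, labels, seed=0):
    """Step 2 stand-in: duplicate features until every class matches the largest class.

    This is NOT TailCalib. The tailcalib package generates new samples instead.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for f, y in zip(feats, labels):
        by_class[y].append(f)
    target = max(len(items) for items in by_class.values())
    out_feats, out_labels = list(feats), list(labels)
    for y, items in by_class.items():
        for _ in range(target - len(items)):
            out_feats.append(rng.choice(items))
            out_labels.append(y)
    return out_feats, out_labels


# Toy "dataloader": three batches with a 6:2:1 class imbalance.
loader = [
    ([[0.1], [0.2], [0.3]], [0, 0, 0]),
    ([[0.4], [0.5], [0.6]], [0, 0, 0]),
    ([[1.1], [1.2], [2.1]], [1, 1, 2]),
]
feats, labels = collect_features(loader, backbone=lambda x: x)
bal_feats, bal_labels = oversample_balance(feats, labels)
print("Before:", Counter(labels))
print("After:", Counter(bal_labels))
# Step 3 would train the classifier on (bal_feats, bal_labels); step 4 repeats the loop.
```

In real use, the stand-in balancing function would be replaced by the package's own generation call.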
Installation
Use the package manager pip to install tailcalib.
pip install tailcalib
Example Code
See the package instructions for more detailed information about the Python package.
# Import
from tailcalib import tailcalib
# Initialize
a = tailcalib(base_engine="numpy")   # Options: "numpy", "pytorch"
# Imbalanced random fake data
import numpy as np
X = np.random.rand(200,100)
y = np.random.randint(0,10, (200,))
# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)
# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")
Advanced Usage
Things to do before you run the code from this repo
- Change the data_root for your dataset in main.py.
- If you are using wandb logging (Weights & Biases), make sure to change the wandb.init call in main.py accordingly.
How to use?
- For just the methods proposed in this paper:
  - For CIFAR100-LT: run_TailCalibX_CIFAR100-LT.sh
  - For mini-ImageNet-LT: run_TailCalibX_mini-ImageNet-LT.sh
- For all the results shown in the paper:
  - For CIFAR100-LT: run_all_CIFAR100-LT.sh
  - For mini-ImageNet-LT: run_all_mini-ImageNet-LT.sh
How to create the mini-ImageNet-LT dataset?
Check Notebooks/Create_mini-ImageNet-LT.ipynb for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.
Arguments
- --seed: Random seed to fix for reproducibility.
  - Default: 1
- --gpu: Select the GPUs to be used.
  - Default: "0,1,2,3"
- --experiment: Experiment number (see 'libs/utils/experiment_maker.py').
  - Default: 0.1
- --dataset: Dataset number.
  - Choices: 0 - CIFAR100, 1 - mini-imagenet
  - Default: 0
- --imbalance: Select the imbalance factor.
  - Choices: 0: 1, 1: 100, 2: 50, 3: 10
  - Default: 1
- --type_of_val: Choose which dataset split to use.
  - Choices: "vt": val_from_test, "vtr": val_from_train, "vit": val_is_test
  - Default: "vit"
- --cv1 to --cv9: Custom variables to use in experiments. Their purpose changes according to the experiment.
  - Default: "1"
- --train: Run the training sequence.
  - Default: False
- --generate: Run the generation sequence.
  - Default: False
- --retraining: Run the retraining sequence.
  - Default: False
- --resume: Resume from 'latest_model_checkpoint.pth' and from wandb, if applicable.
  - Default: False
- --save_features: Collect feature representations.
  - Default: False
- --save_features_phase: Dataset split of representations to collect.
  - Choices: "train", "val", "test"
  - Default: "train"
- --config: Path to a yaml file with an appropriate config, if you have one. Overrides the 'experiment_maker'.
  - Default: None
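To make the index-style arguments above concrete, here is a minimal `argparse` sketch that mirrors the documented flags and defaults. It is not the parser from main.py: the argument types, the use of `store_true` for the boolean switches, and the helper names `build_parser`, `DATASETS`, and `IMBALANCE_FACTORS` are assumptions made for illustration.

```python
import argparse

# Lookup tables taken from the "Choices" listed above (index -> meaning).
DATASETS = {0: "CIFAR100", 1: "mini-imagenet"}
IMBALANCE_FACTORS = {0: 1, 1: 100, 2: 50, 3: 10}


def build_parser():
    """Hypothetical parser mirroring the documented flags; not the repo's own."""
    p = argparse.ArgumentParser(description="Sketch of the documented arguments")
    p.add_argument("--seed", type=int, default=1)
    p.add_argument("--gpu", type=str, default="0,1,2,3")
    p.add_argument("--experiment", type=float, default=0.1)  # type is an assumption
    p.add_argument("--dataset", type=int, choices=[0, 1], default=0)
    p.add_argument("--imbalance", type=int, choices=[0, 1, 2, 3], default=1)
    p.add_argument("--type_of_val", choices=["vt", "vtr", "vit"], default="vit")
    for i in range(1, 10):  # --cv1 ... --cv9
        p.add_argument(f"--cv{i}", type=str, default="1")
    # Assumption: boolean switches are modeled as store_true flags (default False).
    for flag in ("train", "generate", "retraining", "resume", "save_features"):
        p.add_argument(f"--{flag}", action="store_true")
    p.add_argument("--save_features_phase", choices=["train", "val", "test"], default="train")
    p.add_argument("--config", type=str, default=None)
    return p


args = build_parser().parse_args(["--dataset", "1", "--imbalance", "2", "--train"])
print(DATASETS[args.dataset], IMBALANCE_FACTORS[args.imbalance], args.train)
# prints: mini-imagenet 50 True
```

Note that `--imbalance` takes an index rather than the factor itself, so the default value 1 selects an imbalance factor of 100.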
Trained weights
| Experiment | CIFAR100-LT (ResNet32, seed 1, Imb 100) | mini-ImageNet-LT (ResNeXt50) | 
|---|---|---|
| TailCalib | Git-LFS | Git-LFS | 
| TailCalibX | Git-LFS | Git-LFS | 
| CBD + TailCalibX | Git-LFS | Git-LFS | 
Results on a Toy Dataset
The higher the Imb ratio, the more imbalanced the dataset is. Imb ratio = maximum_sample_count / minimum_sample_count.
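As a quick illustration of the definition above, the ratio can be computed directly from a list of labels. The helper name `imbalance_ratio` is made up for this sketch and is not part of the repo or the pip package.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return maximum_sample_count / minimum_sample_count over the classes present."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


balanced = [0, 0, 1, 1, 2, 2]
long_tailed = [0] * 100 + [1] * 20 + [2] * 1

print(imbalance_ratio(balanced))     # 1.0
print(imbalance_ratio(long_tailed))  # 100.0
```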
Check this notebook to play with the toy example from which the plot below was generated. 
Directory Tree
TailCalibX
├── libs
│   ├── core
│   │   ├── ce.py
│   │   ├── core_base.py
│   │   ├── ecbd.py
│   │   ├── modals.py
│   │   ├── TailCalib.py
│   │   └── TailCalibX.py
│   ├── data
│   │   ├── dataloader.py
│   │   ├── ImbalanceCIFAR.py
│   │   └── mini-imagenet
│   │       ├── 0.01_test.txt
│   │       ├── 0.01_train.txt
│   │       └── 0.01_val.txt
│   ├── loss
│   │   ├── CosineDistill.py
│   │   └── SoftmaxLoss.py
│   ├── models
│   │   ├── CosineDotProductClassifier.py
│   │   ├── DotProductClassifier.py
│   │   ├── ecbd_converter.py
│   │   ├── ResNet32Feature.py
│   │   ├── ResNext50Feature.py
│   │   └── ResNextFeature.py
│   ├── samplers
│   │   └── ClassAwareSampler.py
│   └── utils
│       ├── Default_config.yaml
│       ├── experiments_maker.py
│       ├── globals.py
│       ├── logger.py
│       └── utils.py
├── LICENSE
├── main.py
├── Notebooks
│   ├── Create_mini-ImageNet-LT.ipynb
│   └── toy_example.ipynb
├── readme_assets
│   ├── method.svg
│   └── toy_example_output.svg
├── README.md
├── run_all_CIFAR100-LT.sh
├── run_all_mini-ImageNet-LT.sh
├── run_TailCalibX_CIFAR100-LT.sh
└── run_TailCalibX_mini-imagenet-LT.sh
The tailcalib_pip directory is omitted from this tree because it only holds the tailcalib pip package.
Citation
@inproceedings{rahul2021tailcalibX,
    title   = {{Feature Generation for Long-tail Classification}},
    author  = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
About me
Extras
