ImageNet-CoG is a benchmark for concept generalization. It provides a full evaluation framework for pre-trained visual representations that measures how well they generalize to unseen concepts.

Overview

The ImageNet-CoG Benchmark

Project Website Paper (arXiv)

Code repository for the ImageNet-CoG Benchmark introduced in the paper "Concept Generalization in Visual Representation Learning" (ICCV 2021). It contains code for reproducing all the experiments reported in the paper, as well as instructions on how to evaluate any custom model on the ImageNet-CoG Benchmark.

@InProceedings{sariyildiz2021conceptgeneralization,
    title={Concept Generalization in Visual Representation Learning},
    author={Sariyildiz, Mert Bulent and Kalantidis, Yannis and Larlus, Diane and Alahari, Karteek},
    booktitle={International Conference on Computer Vision},
    year={2021}
}

Prerequisites: Benchmark data and code

Installation

We developed the benchmark using:

  • GCC 7.5.0
  • Python 3.7.9
  • PyTorch 1.7.1
  • Torchvision 0.8.2
  • CUDA 10.2.89
  • Optuna 2.3.0
  • YACS 0.1.8
  • termcolor 1.1.0

We recommend creating a separate conda environment for the benchmark:

conda create -n cog python=3.7.9
conda activate cog
conda install -c pytorch pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2
conda install -c conda-forge optuna=2.3
conda install termcolor
pip install yacs

Note: To reproduce the results for SimCLR-v2 and BYOL, you will also need TensorFlow, as well as the repos that contain code for converting the SimCLR-v2 and BYOL pre-trained models into PyTorch. More info can be found in the corresponding scripts under prepare_models/.

Data

To evaluate a model on the ImageNet-CoG benchmark you will need the following data:

The full ImageNet dataset (IN-21K)

The levels of ImageNet-CoG consist of a selection of 5K synsets in the full ImageNet. To download the full ImageNet, you need to create an account on the ImageNet website. Then, under the "Download" section, you will find the "Winter 2021" release (a tar file of size 1.1TB). Once you download it, extract it to folder <imagenet_images_root> (you will need this path later when extracting features). After extracting you should see the following folder structure, i.e., a separate folder of images per synset:

<imagenet_images_root>
...
|--- n07696728
|   |--- *.JPEG
|--- n11944954
|   |--- *.JPEG
...

The ILSVRC-2012 dataset (IN-1K)

To evaluate models on the pretraining dataset, you will need the ubiquitous ILSVRC-2012 subset of ImageNet, which we also refer to as IN-1K. It is also available on the ImageNet website. You will download two tar files for training (138GB) and validation (6.3GB) images, and extract them under <in1k_images_root> (again, you will need this path later). We expect the following folder structure for IN-1K:

<in1k_images_root>
...
|--- train
|   |--- n11939491
|       |--- *.JPEG
|   |--- n07836838
|       |--- *.JPEG
|--- val
|   |--- n11939491
|       |--- *.JPEG
|   |--- n07836838
|       |--- *.JPEG
...

The CoG level files

These are files that contain the concepts and data splits for ImageNet-CoG, and can be directly downloaded from the links below:

(Note: If clicking on the file names does not open a pop-up window for download, try 1) entering the file URLs directly on the address bar of your browser, or 2) using wget by giving the file URLs as arguments.)

Evaluating a model on the ImageNet-CoG benchmark

After installing the required packages and downloading the ImageNet data, you can follow the steps below:

  1. CoG step-1: Prepare the model you want to evaluate.
  2. CoG step-2: Extract image features for IN-1K and the CoG levels using the frozen backbone of the model.
  3. CoG step-3: Train linear classifiers on the pre-extracted features and measure accuracy.

CoG step-1: Model preparation

The ImageNet-CoG benchmark is designed to evaluate models that are pre-trained on IN-1K (the ILSVRC-2012 dataset).

Models evaluated in the paper

To reproduce the results for any of the models evaluated in our paper, you can follow the instructions for downloading their checkpoints. You will end up with all the checkpoints under the folder <models_root_dir>. Note that every model has a unique name that should be passed with the --model argument in all the scripts below. You can find the model names we used for all the models evaluated in our paper in this table.

Custom models

Follow these three steps to prepare your model for evaluation on the CoG benchmark.

  1. Give a name to your model: Every model has a unique name that is passed with the --model=<model_name> argument in all the scripts below. The model name consists of two parts <model_name>="<model_title>_<architecture_name>". You need to give such a name to your model, e.g., <myModel_myArchitecture>.

  2. Write a model loader function: To extract features from your custom model, the load_pretrained_backbone() function must be able to load your pretrained model correctly. To do so, add a custom model loader function in model_loaders.py that takes as input an arguments dict init_args and returns the backbone module and its forward function, e.g., backbone, forward = load_my_model(init_args).

  3. Register your model: Add a new element in MODEL_LOADER_DICT in the model_loader_dict.py script specifying the name and loader function for your model, e.g.:

[...]
"myModel_myArchitecture": load_my_model,
[...]
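
Steps 2 and 3 above might look as follows for a torchvision ResNet-50. This is a minimal sketch under assumptions, not the repository's code: the `ckpt_file` key of `init_args`, the checkpoint being a plain state dict, and the signature of the returned forward function are all hypothetical. Check model_loaders.py for what load_pretrained_backbone() actually passes and expects.

```python
def load_my_model(init_args):
    """Hypothetical loader returning (backbone, forward) as described above."""
    # Imported lazily so that registering the loader does not require PyTorch.
    import torch
    import torchvision

    backbone = torchvision.models.resnet50()
    # Drop the classification head so the backbone outputs pooled features.
    backbone.fc = torch.nn.Identity()

    # ASSUMPTION: init_args carries the checkpoint path under "ckpt_file"
    # and the file stores a plain state dict of backbone weights.
    state_dict = torch.load(init_args["ckpt_file"], map_location="cpu")
    # strict=False tolerates classifier weights that no longer have a target;
    # it also hides genuine key mismatches, so inspect the return value.
    backbone.load_state_dict(state_dict, strict=False)
    backbone.eval()

    def forward(model, images):
        # ASSUMPTION: the forward function receives the backbone and a batch.
        with torch.no_grad():
            return model(images)

    return backbone, forward


# Registration, mirroring the MODEL_LOADER_DICT entry shown above.
# A separate name is used so this is not mistaken for the real dict.
EXAMPLE_MODEL_LOADER_DICT = {
    "myModel_myArchitecture": load_my_model,
}
```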

CoG step-2: Feature extraction

In this step, you will extract image features for the 6 datasets (IN-1K and our CoG levels L1, L2, L3, L4, L5). The bash script ./bash-scripts/extract_features.sh automates feature extraction for a given model. Set the variables in this script before running it; the script then calls extract_features.py with the provided arguments for each of the 6 datasets. On our GPU cluster, each feature extraction experiment takes ~60min using a V100 GPU and 8 CPU cores.

Please see below for the documentation of the arguments and several examples for extract_features.py.

Documentation of the arguments for extract_features.py:
  • Model arguments:

    • --model=<model_name>: The name of the model using which you would like to extract features. It can be one of the models we report in the paper (see the table with model names for those) or your custom model. This argument will point the code to the proper checkpoint loader function for your model.
    • For loading checkpoint files you have two options (it is enough to set one of them):
      • --models_root_dir=<models_root_dir>: If set, the script will look for a model checkpoint at path <models_root_dir>/<model_title>/<architecture_name>/model.ckpt,
      • --ckpt_file=<ckpt_file>: The full path of any valid checkpoint of the model <model_name>.
  • Dataset and split arguments: You can specify the dataset and split by using the following arguments:

    • --dataset=<dataset>: The dataset to extract features from ("in1k", "cog_l1", "cog_l2", "cog_l3", "cog_l4", "cog_l5")
    • --split=<split>: The split ("train" or "test"). Note that for the IN-1K dataset, "test" will extract features from the official validation set.
  • Arguments for the dataset (IN-1K and IN-21K) folders, and the CoG level files:

    • --in1k_images_root=<in1k_images_root>: The full path to the folder you downloaded IN-1K. This argument is required only when extracting features from IN-1K.
    • For the CoG levels (required only when extracting features from any of the CoG levels):
      • --imagenet_images_root=<imagenet_images_root>: The full path to the folder with IN-21K.
      • --cog_levels_mapping_file=cog_levels_mapping_file.pkl: The full path to the CoG concept-to-level mapping file that you can download from the section above.
      • --cog_concepts_split_file=cog_concepts_split_file.pkl: The full path to the CoG concepts split file that you can download from the section above.
  • Arguments for an output folder for features: For specifying where to save features you have two options (it is enough to set one of them):

    • --models_root_dir=<models_root_dir>: If set, features will be saved to the path <models_root_dir>/<model_title>/<architecture_name>/<dataset>/features_<split>/X_Y.pth.
    • --output_dir=<output_dir>: The full path to the folder where features will be stored (i.e., <output_dir>/X_Y.pth)
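
The path conventions used with --models_root_dir can be sketched as a small helper. This is an illustration only: the function names are hypothetical, and splitting the model name at the first underscore assumes that the model title itself contains no underscore.

```python
from pathlib import PurePosixPath


def split_model_name(model_name):
    """Split "<model_title>_<architecture_name>" at the first underscore.

    ASSUMPTION: the model title itself contains no underscore.
    """
    title, sep, architecture = model_name.partition("_")
    if not sep or not title or not architecture:
        raise ValueError(
            f"expected '<model_title>_<architecture_name>', got {model_name!r}"
        )
    return title, architecture


def checkpoint_path(models_root_dir, model_name):
    """Where the script looks for the checkpoint when --models_root_dir is set."""
    title, architecture = split_model_name(model_name)
    return PurePosixPath(models_root_dir) / title / architecture / "model.ckpt"


def features_path(models_root_dir, model_name, dataset, split):
    """Where features are saved when --models_root_dir is set."""
    title, architecture = split_model_name(model_name)
    return (PurePosixPath(models_root_dir) / title / architecture
            / dataset / f"features_{split}" / "X_Y.pth")


print(checkpoint_path("/models", "sup_resnet50"))
# /models/sup/resnet50/model.ckpt
print(features_path("/models", "swav_resnet50", "cog_l5", "test"))
# /models/swav/resnet50/cog_l5/features_test/X_Y.pth
```
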
Examples for running extract_features.py:

To extract features for the training set of IN-1K from the "ResNet-50" model (which is the supervised baseline we use in the paper), run this command:

python extract_features.py \
    --model="sup_resnet50" \
    --dataset="in1k" \
    --split="train" \
    --models_root_dir=<models_root_dir> \
    --in1k_images_root=<in1k_images_root>

Note that --models_root_dir is set. Therefore, this command will load the pretrained weights of "ResNet-50" from <models_root_dir>/sup/resnet50/model.ckpt and save the features into the file <models_root_dir>/sup/resnet50/in1k/features_train/X_Y.pth.

To extract the features for the test set of L5 from the "SwAV" model, run this command:

python extract_features.py \
    --model="swav_resnet50" \
    --dataset="cog_l5" \
    --split="test" \
    --models_root_dir=<models_root_dir> \
    --imagenet_images_root=<imagenet_images_root> \
    --cog_levels_mapping_file=cog_levels_mapping_file.pkl \
    --cog_concepts_split_file=cog_concepts_split_file.pkl

This script will load the pretrained weights of "SwAV" from <models_root_dir>/swav/resnet50/model.ckpt and save the features into the file <models_root_dir>/swav/resnet50/cog_l5/features_test/X_Y.pth

Also don't forget to check the bash script ./bash-scripts/extract_features.sh.
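
As an alternative view of what such a driver has to do, the sketch below builds the extract_features.py command line for every dataset and split. It is not the provided bash script; the function name is hypothetical, and it uses only the arguments documented above.

```python
import itertools
import shlex

DATASETS = ["in1k", "cog_l1", "cog_l2", "cog_l3", "cog_l4", "cog_l5"]
SPLITS = ["train", "test"]


def build_extraction_commands(model, models_root_dir, in1k_images_root,
                              imagenet_images_root, cog_levels_mapping_file,
                              cog_concepts_split_file):
    """Return one argv list per (dataset, split) pair, 12 in total."""
    commands = []
    for dataset, split in itertools.product(DATASETS, SPLITS):
        cmd = [
            "python", "extract_features.py",
            f"--model={model}",
            f"--dataset={dataset}",
            f"--split={split}",
            f"--models_root_dir={models_root_dir}",
        ]
        if dataset == "in1k":
            # IN-1K only needs the ILSVRC-2012 image folder.
            cmd.append(f"--in1k_images_root={in1k_images_root}")
        else:
            # CoG levels need IN-21K plus the two CoG level files.
            cmd += [
                f"--imagenet_images_root={imagenet_images_root}",
                f"--cog_levels_mapping_file={cog_levels_mapping_file}",
                f"--cog_concepts_split_file={cog_concepts_split_file}",
            ]
        commands.append(cmd)
    return commands


def as_shell_line(cmd):
    """Render an argv list as a copy-pastable shell command."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`, or printed with `as_shell_line` for inspection first.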

CoG step-3: Learning linear classifiers and testing

After extracting features for the 6 datasets (for both their training and test sets), we train linear classifiers on them. We divide the classification experiments into two settings:

  1. Learning with all available images: Training classifiers with all available training data for the concepts. (This setting reproduces the scores we report in Section 4.2.1 - Generalization to unseen concepts of our paper.) We train 5 classifiers with different seeds on each CoG level and on IN-1K separately, then report their average score. This setting therefore requires training 30 (6 datasets x 5 seeds) logistic regression classifiers. The bash script ./bash-scripts/logreg.sh automates running the 5 classifiers on a specific dataset (or on specific training and test set features). Set the variables in this script before running it; the script then calls logreg.py with the provided arguments for each of the 5 seeds and prints the average score of these 5 experiments. On our GPU cluster, each experiment takes ~75min using a V100 GPU and 8 CPU cores.

  2. Few-shot: Training classifiers with N in {1, 2, 4, 8, 16, 32, 64, 128} training samples per concept. (This setting reproduces the scores we report in Section 4.2.2 - How fast can models adapt to unseen concepts? of our paper.) We again train 5 classifiers with different seeds on each dataset separately and report their average score, but this time we also vary the number of training samples per concept. This setting therefore requires training 240 (8 x 6 x 5) logistic regression classifiers. The bash script ./bash-scripts/fewshot.sh automates running the 5 classifiers on a specific dataset for every N in {1, 2, 4, 8, 16, 32, 64, 128}. Set the variables in this script before running it; the script then calls logreg.py with the provided arguments for each of the 5 seeds and each N value, and prints the average score for each N.
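
The run counts in the two settings above can be checked by enumerating the experiment grid. This is a sketch for illustration: the five seed values are placeholders (only the default seed 22 is documented), the function names are hypothetical, and passing EVAL.SEED and CLF.N_SHOT together in one command is an assumption based on both being config entries appended to the arguments list.

```python
import itertools

DATASETS = ["in1k", "cog_l1", "cog_l2", "cog_l3", "cog_l4", "cog_l5"]
N_SHOTS = [1, 2, 4, 8, 16, 32, 64, 128]
# ASSUMPTION: placeholder seed values; the text only says 5 seeds are used.
SEEDS = [22, 23, 24, 25, 26]


def all_data_runs():
    """(dataset, seed) pairs for the setting with all training images."""
    return list(itertools.product(DATASETS, SEEDS))


def few_shot_runs():
    """(n_shot, dataset, seed) triples for the few-shot setting."""
    return list(itertools.product(N_SHOTS, DATASETS, SEEDS))


def logreg_command(model, dataset, models_root_dir, seed, n_shot=None):
    """Build the logreg.py argv list for one run."""
    cmd = [
        "python", "logreg.py",
        f"--model={model}",
        f"--dataset={dataset}",
        f"--models_root_dir={models_root_dir}",
        "EVAL.SEED", str(seed),
    ]
    if n_shot is not None:
        # Config entries are appended as "KEY value" pairs.
        cmd += ["CLF.N_SHOT", str(n_shot)]
    return cmd


print(len(all_data_runs()))  # 30
print(len(few_shot_runs()))  # 240
```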

Note on hyper-parameter tuning. To minimize performance differences due to sub-optimal hyper-parameters, we use the Optuna hyper-parameter optimization framework to tune the learning rate and weight decay when training a classifier. We sample 30 learning rate and weight decay pairs and perform hyper-parameter tuning by partitioning the training set into two temporary subsets, one used for training and one for validation. Once the optimal hyper-parameters are found, we train the final classifier with them on the complete training set and report the accuracy on the test set. Hyper-parameter tuning, classifier training and testing are combined in a single script; one just needs to run the logreg.py script.
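
The overall shape of this procedure (hold-out split, 30 sampled pairs, final training on the full training set) can be sketched without any dependencies. Note that plain random search stands in for Optuna's sampler here, so this is not what logreg.py does internally. The search ranges, the hold-out fraction, and all function names are hypothetical.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def tune_and_evaluate(train_set, test_set, train_and_score,
                      n_trials=30, holdout_fraction=0.2, seed=22):
    """Random-search lr/weight decay on a hold-out split, then test.

    train_and_score(train, evaluation, lr, wd) must train a classifier on
    `train` and return its accuracy on `evaluation` (higher is better).
    """
    rng = random.Random(seed)
    indices = list(range(len(train_set)))
    rng.shuffle(indices)
    n_val = max(1, int(len(indices) * holdout_fraction))
    # Two temporary subsets of the training set, used only for tuning.
    val_part = [train_set[i] for i in indices[:n_val]]
    train_part = [train_set[i] for i in indices[n_val:]]

    best = None
    for _ in range(n_trials):
        # ASSUMPTION: illustrative search ranges, not the benchmark's.
        lr = sample_log_uniform(rng, 1e-4, 1e1)
        wd = sample_log_uniform(rng, 1e-8, 1e-2)
        score = train_and_score(train_part, val_part, lr, wd)
        if best is None or score > best[0]:
            best = (score, lr, wd)

    val_score, lr, wd = best
    # Final classifier: complete training set, evaluated on the test set.
    test_score = train_and_score(train_set, test_set, lr, wd)
    return {"lr": lr, "wd": wd, "val_score": val_score, "test_score": test_score}
```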

Please see below for the documentation of the arguments and several examples for logreg.py.

Documentation of the arguments for logreg.py:
  • Arguments for training and test set features: To train a classifier, you need to load training and test set features extracted for a particular dataset (i.e., "X_Y.pth" files you obtained in the feature extraction step above). You have two options for specifying from where to load pre-extracted features (it is enough to set the arguments for one option):

    • Option-1: Loading pre-extracted features for a particular dataset extracted from a particular model located under <models_root_dir>. When the three arguments below are set, the script looks for the feature files <models_root_dir>/<model_title>/<architecture_name>/<dataset>/features_{train and test}/X_Y.pth. Please refer to the feature extraction part above to see how to set these arguments correctly.
      • --model=<model_name>: The name of the model.
      • --dataset=<dataset>: The name of the dataset.
      • --models_root_dir=<models_root_dir>: The full path to the models directory.
    • Option-2: Loading pre-extracted features from arbitrary paths. With these two arguments set, the script will look for the training and test set feature files <train_features_dir>/X_Y.pth and <test_features_dir>/X_Y.pth, respectively.
      • --train_features_dir=<train_features_dir>: The full path to the folder containing training set features.
      • --test_features_dir=<test_features_dir>: The full path to the folder containing test set features.
  • Arguments for an output folder for logs: logreg.py produces output logs that can be read by the print_results.py script to print the average of the classification results obtained with multiple seeds. For specifying where to save logs you have two options (it is enough to set one of them):

    • --models_root_dir=<models_root_dir>: If set, the output logs will be saved under the folder <models_root_dir>/<model_title>/<architecture_name>/<dataset>/eval_logreg/seed-<seed>/.
    • --output_dir=<output_dir>: The full path to the output folder where logs will be saved.
  • Seed for reproducibility:

    • EVAL.SEED <seed_number>: Seed for Python's random module, NumPy and PyTorch. Its default value is 22. Note that this is not an argument but a config entry, so you need to append EVAL.SEED <seed_number> to the arguments list to overwrite the default config value (see the example below). We run all classification experiments 5 times and report the mean and variance of these runs; we strongly recommend that you do the same.
  • Number of training samples per concept (for few-shot learning):

    • CLF.N_SHOT <img_per_class>: The number of training samples per concept to use for training a few-shot classifier. Note that this is not an argument but a config entry. So you need to append CLF.N_SHOT <img_per_class> to the arguments list to overwrite the default config value (see the example below).

Averaging results over multiple seeds. The logreg.py script stores top-1 and top-5 accuracies per epoch in a Python list and saves them into a pickle file named final-*/logs.pkl under the output logs folder. For getting the final mean and variance over multiple runs, i.e., after running the logreg.py script with multiple seeds, one can run the print_results.py script by providing the output logs directory with the --logreg_root_dir argument.
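
The aggregation itself is simple arithmetic. The sketch below is not print_results.py: it takes already-extracted final accuracies rather than reading logs.pkl (whose exact layout is not described here), the accuracy values are made up for illustration, and the use of population variance is an assumption.

```python
import statistics


def summarize_runs(final_accuracies):
    """Mean and population variance of per-seed final accuracies."""
    if len(final_accuracies) < 2:
        raise ValueError("need at least two runs to report a variance")
    mean = statistics.mean(final_accuracies)
    # ASSUMPTION: population variance; the script may use the sample variance.
    variance = statistics.pvariance(final_accuracies)
    return mean, variance


# Made-up top-1 accuracies for five seeds, for illustration only.
mean, variance = summarize_runs([70.0, 71.0, 72.0, 71.0, 71.0])
print(f"top-1: {mean:.2f} (variance {variance:.2f})")
# top-1: 71.00 (variance 0.40)
```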

Examples for running logreg.py:

To train a linear classifier (including its hyper-parameter tuning phase) on top of L5 features extracted by SwAV, run:

python logreg.py \
    --model="swav_resnet50" \
    --dataset="cog_l5" \
    --models_root_dir=<models_root_dir> \
    EVAL.SEED 55  # --> training the classifier with seed 55

This script will load pre-extracted features from <models_root_dir>/swav/resnet50/cog_l5/features_{train and test}/X_Y.pth and save output logs under <models_root_dir>/swav/resnet50/cog_l5/eval_logreg/seed-55. It will also print the final top1/top5 accuracies for this run. To print the averaged scores over 5 seeds for SwAV for L5, and assuming one used the default output paths, the command is:

python print_results.py \
    --logreg_root_dir="<models_root_dir>/swav/resnet50/cog_l5/eval_logreg"

To simulate few-shot learning scenarios, you can overwrite the config entry:

python logreg.py \
    --model="swav_resnet50" \
    --dataset="cog_l5" \
    --models_root_dir=<models_root_dir> \
    CLF.N_SHOT 8 # --> training the classifier with 8 random training samples per concept

This script will use 8 random training samples per concept to train linear classifiers and save output logs under <models_root_dir>/swav/resnet50/cog_l5/eval_logreg_N8/seed-22.
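
The default log folders seen in the two examples can be summarized in one helper. This is a sketch: the function name is hypothetical, and the "_N<n_shot>" suffix rule is inferred from the single eval_logreg_N8 example above rather than taken from the code.

```python
from pathlib import PurePosixPath


def logreg_log_dir(models_root_dir, model_name, dataset, seed=22, n_shot=None):
    """Default output log folder of logreg.py when --models_root_dir is set."""
    # ASSUMPTION: the model title contains no underscore.
    title, _, architecture = model_name.partition("_")
    # ASSUMPTION: few-shot runs append "_N<n_shot>" to the folder name.
    eval_dir = "eval_logreg" if n_shot is None else f"eval_logreg_N{n_shot}"
    return (PurePosixPath(models_root_dir) / title / architecture
            / dataset / eval_dir / f"seed-{seed}")


print(logreg_log_dir("/models", "swav_resnet50", "cog_l5", seed=55))
# /models/swav/resnet50/cog_l5/eval_logreg/seed-55
print(logreg_log_dir("/models", "swav_resnet50", "cog_l5", n_shot=8))
# /models/swav/resnet50/cog_l5/eval_logreg_N8/seed-22
```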

Also don't forget to check the bash scripts ./bash-scripts/logreg.sh and ./bash-scripts/fewshot.sh.
