[ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain

Overview

Amplitude-Phase Recombination (ICCV'21)

Official PyTorch implementation of "Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain", Guangyao Chen, Peixi Peng, Li Ma, Jia Li, Lin Du, and Yonghong Tian.

Paper: https://arxiv.org/abs/2108.08487

Abstract: Recently, the generalization behavior of Convolutional Neural Networks (CNN) is gradually transparent through explanation techniques with the frequency components decomposition. However, the importance of the phase spectrum of the image for a robust vision system is still ignored. In this paper, we notice that the CNN tends to converge at the local optimum which is closely related to the high-frequency components of the training images, while the amplitude spectrum is easily disturbed such as noises or common corruptions. In contrast, more empirical studies found that humans rely on more phase components to achieve robust recognition. This observation leads to more explanations of the CNN's generalization behaviors in both adversarial attack and out-of-distribution detection, and motivates a new perspective on data augmentation designed by re-combing the phase spectrum of the current image and the amplitude spectrum of the distracter image. That is, the generated samples force the CNN to pay more attention on the structured information from phase components and keep robust to the variation of the amplitude. Experiments on several image datasets indicate that the proposed method achieves state-of-the-art performances on multiple generalizations and calibration tasks, including adaptability for common corruptions and surface variations, out-of-distribution detection and adversarial attack.

Highlights

Fig. 1: More empirical studies found that humans rely on more phase components to achieve robust recognition. However, CNN without effective training restrictions tends to converge at the local optimum related to the amplitude spectrum of the image, leading to generalization behaviors counter-intuitive to humans (the sensitive to various corruptions and the overconfidence of OOD). main hypothesis of the paper

Examples of the importance of phase spectrum to explain the counter-intuitive behavior of CNN

Fig. 2: Four pairs of testing sampless selected from in-distribution CIFAR-10 and OOD SVHN that help explain that CNN capture more amplitude specturm than phase specturm for classification: First, in (a) and (b), the model correctly predicts the original image (1st column in each panel), but the predicts are also exchanged after switching amplitude specturm (3rd column in each panel) while the human eye can still give the correct category through the contour information. Secondly, the model is overconfidence for the OOD samples in (c) and (d). Similarly, after the exchange of amplitude specturm, the label with high confidence is also exchanged.

Fig. 3: Two ways of the proposed Amplitude-Phase Recombination: APR-P and APR-S. Motivated by the powerful generalizability of the human, we argue that reducing the dependence on the amplitude spectrum and enhancing the ability to capture phase spectrum can improve the robustness of CNN.

Citation

If you find our work, this repository and pretrained adversarial generators useful. Please consider giving a star and citation.

@inproceedings{chen2021amplitude,
    title={Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain},
    author={Chen, Guangyao and Peng, Peixi and Ma, Li and Li, Jia and Du, Lin and Tian, Yonghong},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision},
    year={2021}
}

1. Requirements

Environments

Currently, requires following packages

  • python 3.6+
  • torch 1.7.1+
  • torchvision 0.5+
  • CUDA 10.1+
  • scikit-learn 0.22+

Datasets

For even quicker experimentation, there is CIFAR-10-C and CIFAR-100-C. please download these datasets to ./data/CIFAR-10-C and ./data/CIFAR-100-C.

2. Training & Evaluation

To train the models in paper, run this command:

python main.py --aug <augmentations>

Option --aug can be one of None/APR-S. The default training method is APR-P. To evaluate the model, add --eval after this command.

APRecombination for APR-S and mix_data for APR-P can plug and play in other training codes.

3. Results

Fourier Analysis

The standard trained model is highly sensitive to additive noise in all but the lowest frequencies. APR-SP could substantially improve robustness to most frequency perturbations. The code of Heat maps is developed upon the following project FourierHeatmap.

ImageNet-C

  • Results of ResNet-50 models on ImageNet-C:
+(APR-P) +(APR-S) +(APR-SP) +DeepAugMent+(ARP-SP)
mCE 70.5 69.3 65.0 57.5
Owner
Guangyao Chen
Ph.D student @ PKU
Guangyao Chen
Unofficial PyTorch implementation of the Adaptive Convolution architecture for image style transfer

AdaConv Unofficial PyTorch implementation of the Adaptive Convolution architecture for image style transfer from "Adaptive Convolutions for Structure-

65 Dec 22, 2022
"Graph Neural Controlled Differential Equations for Traffic Forecasting", AAAI 2022

Graph Neural Controlled Differential Equations for Traffic Forecasting Setup Python environment for STG-NCDE Install python environment $ conda env cr

Jeongwhan Choi 55 Dec 28, 2022
Nodule Generation Algorithm Baseline and template code for node21 generation track

Nodule Generation Algorithm This codebase implements a simple baseline model, by following the main steps in the paper published by Litjens et al. for

node21challenge 10 Apr 21, 2022
An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification The repostiory consists of the code, results and data set links for

12 Dec 26, 2022
Feature extraction made simple with torchextractor

torchextractor: PyTorch Intermediate Feature Extraction Introduction Too many times some model definitions get remorselessly copy-pasted just because

Antoine Broyelle 89 Oct 31, 2022
Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework via Self-Supervised Multi-Task Learning. Code will be available soon.

Official-PyTorch-Implementation-of-TransMEF Official PyTorch implementation of our AAAI22 paper: TransMEF: A Transformer-Based Multi-Exposure Image Fu

117 Dec 27, 2022
Walk with fastai

Shield: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Walk with fastai What is this p

Walk with fastai 124 Dec 10, 2022
Tools for computational pathology

A toolkit for computational pathology and machine learning. View documentation Please cite our paper Installation There are several ways to install Pa

254 Dec 12, 2022
fastgradio is a python library to quickly build and share gradio interfaces of your trained fastai models.

fastgradio is a python library to quickly build and share gradio interfaces of your trained fastai models.

Ali Abdalla 34 Jan 05, 2023
Long Expressive Memory (LEM)

Long Expressive Memory for Sequence Modeling This repository contains the implementation to reproduce the numerical experiments of the paper Long Expr

Konstantin Rusch 47 Dec 17, 2022
🐾 Semantic segmentation of paws from cute pet images (PyTorch)

🐾 paw-segmentation 🐾 Semantic segmentation of paws from cute pet images 🐾 Semantic segmentation of paws from cute pet images (PyTorch) 🐾 Paw Segme

Zabir Al Nazi Nabil 3 Feb 01, 2022
PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transfromers.

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
Facial Expression Detection In The Realtime

The human's facial expressions is very important to detect thier emotions and sentiment. It can be very efficient to use to make our computers make interviews. Furthermore, we have robots now can det

Adel El-Nabarawy 4 Mar 01, 2022
The Simplest DCGAN Implementation

DCGAN in TensorLayer This is the TensorLayer implementation of Deep Convolutional Generative Adversarial Networks. Looking for Text to Image Synthesis

TensorLayer Community 310 Dec 13, 2022
The implementation of "Bootstrapping Semantic Segmentation with Regional Contrast".

ReCo - Regional Contrast This repository contains the source code of ReCo and baselines from the paper, Bootstrapping Semantic Segmentation with Regio

Shikun Liu 128 Dec 30, 2022
This is an official pytorch implementation of Fast Fourier Convolution.

Fast Fourier Convolution (FFC) for Image Classification This is the official code of Fast Fourier Convolution for image classification on ImageNet. Ma

pkumi 199 Jan 03, 2023
A package, and script, to perform imaging transcriptomics on a neuroimaging scan.

Imaging Transcriptomics Imaging transcriptomics is a methodology that allows to identify patterns of correlation between gene expression and some prop

Alessio Giacomel 10 Dec 27, 2022
Wav2Vec for speech recognition, classification, and audio classification

Soxan در زبان پارسی به نام سخن This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your

Mehrdad Farahani 140 Dec 15, 2022
Tutorial page of the Climate Hack, the greatest hackathon ever

Tutorial page of the Climate Hack, the greatest hackathon ever

UCL Artificial Intelligence Society 12 Jul 02, 2022
Bling's Object detection tool

BriVL for Building Applications This repo is used for illustrating how to build applications by using BriVL model. This repo is re-implemented from fo

chuhaojin 47 Nov 01, 2022