4th place solution for the PBVS 2022 Multi-modal Aerial View Object Classification Challenge - Track 1 (SAR) at PBVS2022

Last update: Nov 09, 2022

A Two-Stage Shake-Shake Network for Long-tailed Recognition of SAR Aerial View Objects

Challenge Site

Overview

Synthetic Aperture Radar (SAR) has received increasing attention in remote sensing because of its complementary strengths in capturing significant information. However, in the Aerial View Object Classification (AVOC) task, SAR images still suffer from a long-tailed distribution of aerial view objects. This imbalance degrades the performance of classification methods, especially data-sensitive deep learning models. In this paper, we propose a two-stage shake-shake network to tackle the long-tailed learning problem. Specifically, it decouples learning into a representation learning stage and a classification learning stage. In addition, we apply test-time augmentation (TTA) and a post-processing approach (CAN) to improve accuracy. In the PBVS 2022 Multi-modal Aerial View Object Classification Challenge Track 1, our method achieves 21.82% accuracy in the development phase and 27.97% in the testing phase, placing it in the top tier among all participants.
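The decoupled procedure described above can be sketched in PyTorch as follows. This is only an illustration under assumptions: `TinyNet`, `prepare_second_stage`, the single-channel input, the 10 classes, and the optimizer settings are hypothetical stand-ins, not the repository's shake-shake model or its training code. The sketch assumes the second stage keeps the learned features fixed and re-trains only the final classifier; the authors' exact second-stage recipe may differ.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a backbone followed by a linear classifier."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.fc(self.features(x))


def prepare_second_stage(model: TinyNet) -> torch.optim.Optimizer:
    """Freeze the backbone, reset the classifier, and optimize the classifier only."""
    for param in model.features.parameters():
        param.requires_grad = False
    model.fc.reset_parameters()  # assumption: the classifier is re-trained from scratch
    return torch.optim.SGD(model.fc.parameters(), lr=0.01, momentum=0.9)


torch.manual_seed(0)
model = TinyNet()
# Stage 1 (not shown) would train every parameter of the model end to end.
optimizer = prepare_second_stage(model)

# One second-stage step on random data, just to show which weights move.
images = torch.randn(4, 1, 16, 16)
labels = torch.randint(0, 10, (4,))
backbone_before = [p.clone() for p in model.features.parameters()]
fc_before = model.fc.weight.clone()

loss = nn.functional.cross_entropy(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = all(
    torch.equal(old, new)
    for old, new in zip(backbone_before, model.features.parameters())
)
fc_changed = not torch.equal(fc_before, model.fc.weight)
```

After the step, only the classifier weights have been updated, which is the point of separating representation learning from classifier learning.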

Requirements

Ubuntu (It's only tested on Ubuntu, so it may not work on Windows.)
Python >= 3.7
PyTorch >= 1.4.0
torchvision
```
pip install -r requirements.txt
```

Usage

The first stage training

python train.py --config ./configs/sar10/shake_shake.yaml

You need to set “dataset_dir” and “dataset_dir_val” under the “dataset” field, and “output_dir” under the “train” field, in the file “./configs/sar10/shake_shake.yaml”.

The second stage training

python train.py --config ./configs/sar10/shake_shake_fc.yaml

You need to set “dataset_dir” and “dataset_dir_val” under the “dataset” field, and “output_dir” and “checkpoint” under the “train” field, in the file “./configs/sar10/shake_shake_fc.yaml”. The “checkpoint” value should point to the model saved by the first stage.
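Since both training stages depend on a few hand-edited config fields, a small check before launching can catch a forgotten path. The sketch below is a hypothetical helper, not part of this repository: `REQUIRED_FIELDS` and `missing_fields` are illustrative names, the field list is taken from the instructions above, and the paths are placeholders.

```python
# Fields that must be filled in before each stage, written as "section.key".
REQUIRED_FIELDS = {
    "stage1": ["dataset.dataset_dir", "dataset.dataset_dir_val", "train.output_dir"],
    "stage2": [
        "dataset.dataset_dir",
        "dataset.dataset_dir_val",
        "train.output_dir",
        "train.checkpoint",
    ],
}


def missing_fields(config: dict, dotted_keys: list) -> list:
    """Return the dotted keys that are absent or empty in a nested config dict."""
    missing = []
    for dotted_key in dotted_keys:
        node = config
        for name in dotted_key.split("."):
            node = node.get(name) if isinstance(node, dict) else None
            if node is None:
                break
        if not node:  # covers a missing key, None, and an empty string
            missing.append(dotted_key)
    return missing


# Placeholder config: the checkpoint has not been filled in yet.
example_config = {
    "dataset": {
        "dataset_dir": "/path/to/train",
        "dataset_dir_val": "/path/to/val",
    },
    "train": {"output_dir": "/path/to/output", "checkpoint": ""},
}

stage1_missing = missing_fields(example_config, REQUIRED_FIELDS["stage1"])
stage2_missing = missing_fields(example_config, REQUIRED_FIELDS["stage2"])
print("stage 1 missing:", stage1_missing)
print("stage 2 missing:", stage2_missing)
```

The dict could come from parsing the YAML file, for example with PyYAML's `yaml.safe_load`. Here the example config is ready for the first stage but would be flagged for the second stage because “checkpoint” is empty.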

Test

python predict_TTA.py

You need to set “dataset_dir” and “checkpoint” under the “test” field in the file “./configs/sar10/shake_shake.yaml”. The results are then written to the file “.result/results.csv”.
You can download the trained model here.
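For readers unfamiliar with test-time augmentation, the general idea can be sketched as below. This is a minimal illustration, not the contents of `predict_TTA.py`: the `predict_with_tta` helper, the toy linear model, and the choice of a single horizontal flip are assumptions, and the script may use different augmentations.

```python
import torch
import torch.nn as nn


def predict_with_tta(model: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Average softmax probabilities over the original and a flipped view."""
    model.eval()
    # Images are assumed to be NCHW, so dim 3 is the width axis.
    views = [images, torch.flip(images, dims=[3])]
    with torch.no_grad():
        probs = [torch.softmax(model(view), dim=1) for view in views]
    return torch.stack(probs).mean(dim=0)


torch.manual_seed(0)
# Hypothetical toy classifier standing in for the trained network.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16, 10))
batch = torch.randn(4, 1, 16, 16)

avg_probs = predict_with_tta(toy_model, batch)
predicted_classes = avg_probs.argmax(dim=1)
```

Averaging probabilities rather than logits keeps each row a valid distribution, and the prediction becomes invariant to the augmentations included in `views`.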

Acknowledgement

The code borrows heavily from hysts/pytorch_image_classification.

Owner

LinpengPan
