SwinTransformer + OBBDet

The sixth-place solution (6/220) in the Fine-grained Object Recognition in High-Resolution Optical Images track of the 2021 Gaofen Challenge on Automated High-Resolution Earth Observation Image Interpretation.

Members

Qi Ming, Junjie Song, Yunpeng Dong.

Solution

Offline data augmentation
We use random combinations of affine transformations, flips, scaling, and optical distortion for data augmentation.
Multi-scale training and testing
Images are resized to scales of 600, 800, and 1024 for both training and testing.
Strong backbone
Swin Transformer is adopted as the backbone in ORCNN and RoI Transformer for better performance.
Model ensemble
We merge the results from RoI Transformer, ORCNN, S2ANet, and ReDet.
Lower confidence
Set the output score threshold to 0.005.
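
Flips and scaling have to be applied to the oriented box annotations as well as to the image. The following is a minimal sketch of that step, assuming boxes are stored as (cx, cy, w, h, angle) with the angle in radians. This box format and the helper names obb_corners, hflip_obb, and rescale_obb are illustrative assumptions, not taken from the team's code.

```python
import math


def obb_corners(cx, cy, w, h, angle):
    """Return the four corners of an oriented box (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    # Rotate each half-extent offset by `angle`, then translate to the center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def hflip_obb(box, img_w):
    """Mirror an oriented box about the vertical axis of an image of width img_w."""
    cx, cy, w, h, angle = box
    # Mirroring x negates the rotation; width and height are unchanged.
    return (img_w - cx, cy, w, h, -angle)


def rescale_obb(box, scale):
    """Apply a uniform image rescale to an oriented box."""
    cx, cy, w, h, angle = box
    # A uniform scale changes position and size but not the angle.
    return (cx * scale, cy * scale, w * scale, h * scale, angle)
```

For multi-scale training, rescale_obb would be called with the ratio between the target size (600, 800, or 1024) and the original image size, however that size is defined in the actual pipeline.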

Tried but didn't work

Soft-NMS.
Adjust the NMS threshold.
Class-agnostic NMS.
Mosaic and MixUp for data augmentation.
Oversample the categories with fewer instances.
Train the detectors for specific classes with low AP.
Multi-scale training and testing on SwinTransformer-based detectors (mAP even dropped by about 1%).
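
For reference, Soft-NMS decays the scores of overlapping boxes instead of discarding them. The sketch below shows the Gaussian variant under simplifying assumptions: it uses axis-aligned (x1, y1, x2, y2) boxes for brevity, whereas an oriented detector would pass a rotated IoU as iou_fn. The function names, the sigma value, and the reuse of 0.005 as the cut-off are illustrative choices, not the settings used in the competition.

```python
import math


def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(dets, sigma=0.5, score_thr=0.005, iou_fn=iou_xyxy):
    """Gaussian Soft-NMS over a list of (box, score) pairs.

    Returns the kept (box, score) pairs in selection order, with decayed scores.
    """
    remaining = [(box, float(score)) for box, score in dets]
    kept = []
    while remaining:
        # Select the highest-scoring box still in the pool.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        if score < score_thr:
            break  # every other remaining box scores even lower
        kept.append((box, score))
        # Decay the other scores according to their overlap with the selected box.
        remaining = [
            (b, s * math.exp(-(iou_fn(box, b) ** 2) / sigma)) for b, s in remaining
        ]
    return kept
```

With this scheme a duplicate of a selected box is kept with a much lower score rather than removed, so the final ranking depends on both sigma and the score threshold.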


Owner

ming71
