Conformer: Local Features Coupling Global Representations for Visual Recognition (arxiv)

This repository is built upon DeiT and timm.

Usage

First, install PyTorch 1.7.0+, torchvision 0.8.1+, and pytorch-image-models (timm) 0.3.2:

conda install -c pytorch pytorch torchvision
pip install timm==0.3.2
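After installing, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names (parse_version, satisfies, check_environment) are hypothetical, and it assumes the packages are published under the distribution names torch, torchvision, and timm.

```python
import re
from importlib import metadata

# Versions taken from the install instructions above.
# The boolean marks an exact pin (timm) versus a minimum (torch, torchvision).
REQUIREMENTS = {
    "torch": ("1.7.0", False),
    "torchvision": ("0.8.1", False),
    "timm": ("0.3.2", True),
}


def parse_version(text):
    """Return the leading numeric components, e.g. '1.7.0+cu110' -> (1, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies(installed, required, exact=False):
    """Compare two version strings numerically, padding the shorter one with zeros."""
    have, want = parse_version(installed), parse_version(required)
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have == want if exact else have >= want


def check_environment():
    """Return a per-package status string for every requirement."""
    report = {}
    for package, (required, exact) in REQUIREMENTS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        status = "ok" if satisfies(installed, required, exact) else f"expected {required}"
        report[package] = f"{installed} ({status})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Local build suffixes such as +cu110 are ignored by the comparison, so a CUDA build of PyTorch is treated the same as the plain release.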

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout expected by torchvision's datasets.ImageFolder, with the training data in the train/ folder and the validation data in the val/ folder:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
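Before starting a long training run, the layout can be checked with a few lines of standard-library Python. This is a minimal sketch under stated assumptions, not code from the repository: check_imagenet_layout and list_classes are hypothetical helpers, and the class-to-index mapping simply mirrors the way datasets.ImageFolder assigns labels from the sorted class-folder names.

```python
import os


def list_classes(split_dir):
    """Return the class-folder names in split_dir, sorted as ImageFolder indexes them."""
    return sorted(entry.name for entry in os.scandir(split_dir) if entry.is_dir())


def check_imagenet_layout(root):
    """Verify that root/train and root/val exist and hold the same class folders.

    Returns a class-name -> label-index mapping built from the sorted folder names.
    """
    classes_per_split = {}
    for split in ("train", "val"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        classes = list_classes(split_dir)
        if not classes:
            raise ValueError(f"no class folders found in {split_dir}")
        classes_per_split[split] = classes

    if classes_per_split["train"] != classes_per_split["val"]:
        mismatch = sorted(set(classes_per_split["train"]) ^ set(classes_per_split["val"]))
        raise ValueError(f"class folders differ between train/ and val/: {mismatch}")

    return {name: index for index, name in enumerate(classes_per_split["train"])}


if __name__ == "__main__":
    # Placeholder path taken from the tree above; replace it with the real dataset root.
    mapping = check_imagenet_layout("/path/to/imagenet/")
    print(f"{len(mapping)} classes found")
```

The check only inspects folder names. It does not open or validate the image files themselves.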

Training

To train Conformer-S on ImageNet on a single node with 8 GPUs for 300 epochs, run:

Conformer-S

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
OUTPUT='./output/Conformer_small_patch16_batch_1024_lr1e-3_300epochs'

python -m torch.distributed.launch --master_port 50130 --nproc_per_node=8 --use_env main.py \
                                   --model Conformer_small_patch16 \
                                   --data-set IMNET \
                                   --batch-size 128 \
                                   --lr 0.001 \
                                   --num_workers 4 \
                                   --data-path /data/user/Dataset/ImageNet_ILSVRC2012/ \
                                   --output_dir ${OUTPUT} \
                                   --epochs 300
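The output directory name mentions a batch size of 1024, while the command passes --batch-size 128. The two agree if --batch-size is read as the per-GPU value, as it is in DeiT: 8 processes times 128 images gives 1024. The sketch below works through that arithmetic. The linearly_scaled_lr helper illustrates the DeiT-style rule of scaling the learning rate by the global batch size divided by 512. Whether this repository applies the same rule is an assumption, so check main.py before relying on it.

```python
def global_batch_size(per_gpu_batch, num_gpus):
    """Images processed per optimizer step across all GPUs."""
    return per_gpu_batch * num_gpus


def linearly_scaled_lr(base_lr, per_gpu_batch, num_gpus, reference_batch=512):
    """DeiT-style linear scaling: base_lr * global_batch / reference_batch.

    Assumption: this rule is not confirmed for this repository.
    """
    return base_lr * global_batch_size(per_gpu_batch, num_gpus) / reference_batch


if __name__ == "__main__":
    # Values from the training command above: --batch-size 128, --nproc_per_node=8, --lr 0.001
    print(global_batch_size(128, 8))          # 1024
    print(linearly_scaled_lr(0.001, 128, 8))  # 0.002
```

When training on fewer GPUs, the same helpers show how far the global batch size drifts from 1024, which is useful when deciding whether to raise --batch-size to compensate.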

Model Zoo

Model	Parameters	MACs	Top-1 Acc	Link
Conformer-Ti	23.5 M	5.2 G	81.3 %	baidu(code: hzhm) google
Conformer-S	37.7 M	10.6 G	83.4 %	baidu(code: qvu8) google
Conformer-B	83.3 M	23.3 G	84.1 %	baidu(code: b4z9) google

Citation

@article{peng2021conformer,
      title={Conformer: Local Features Coupling Global Representations for Visual Recognition}, 
      author={Zhiliang Peng and Wei Huang and Shanzhi Gu and Lingxi Xie and Yaowei Wang and Jianbin Jiao and Qixiang Ye},
      journal={arXiv preprint arXiv:2105.03889},
      year={2021},
}

Owner

Zhiliang Peng
