CMT: Convolutional Neural Networks Meet Vision Transformers

Overview

CMT: Convolutional Neural Networks Meet Vision Transformers

[arxiv]

1. Introduction

model This repo is the CMT model which impelement with pytorch, no reference source code so this is a non-official version.

2. Enveriments

  • python 3.7+
  • pytorch 1.7.1
  • pillow
  • apex
  • opencv-python

You can see this repo to find how to install the apex

3. DataSet

  • Trainig
    /data/home/imagenet/train/xxx.jpeg, 0
    /data/home/imagenet/train/xxx.jpeg, 1
    ...
    /data/home/imagenet/train/xxx.jpeg, 999
    
  • Testing
    /data/home/imagenet/test/xxx.jpeg, 0
    /data/home/imagenet/test/xxx.jpeg, 1
    ...
    /data/home/imagenet/test/xxx.jpeg, 999
    

4. Training & Inference

  1. Training

    CMT-Tiny

    #!/bin/bash
    OMP_NUM_THREADS=1
    MKL_NUM_THREADS=1
    export OMP_NUM_THREADS
    export MKL_NUM_THREADS
    cd CMT-pytorch;
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -W ignore -m torch.distributed.launch --nproc_per_node 8 train.py --batch_size 512 --num_workers 48 --lr 6e-3 --optimizer_name "adamw" --tf_optimizer 1 --cosine 1 --model_name cmtti --max_epochs 300 \
    --warmup_epochs 5 --num-classes 1000 --input_size 184 \ --crop_size 160 --weight_decay 1e-1 --grad_clip 0 --repeated-aug 0 --max_grad_norm 5.0 
    --drop_path_rate 0.1 --FP16 0 --qkv_bias 1 
    --ape 0 --rpe 1 --pe_nd 0 --mode O2 --amp 1 --apex 0 \ 
    --train_file $file_folder$/train.txt \
    --val_file $file_folder$/val.txt \
    --log-dir $save_folder$/log_dir \
    --checkpoints-path $save_folder$/checkpoints
    

    Note: If you use the bs 128 * 8 may be get more accuracy, balance the acc & speed.

  2. Inference

    #!/bin/bash
    cd CMT-pytorch;
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -W ignore test.py \
    --dist-url 'tcp://127.0.0.1:9966' --dist-backend 'nccl' --multiprocessing-distributed=1 --world-size=1  --rank=0 
    --batch-size 128 --num-workers 48 --num-classes 1000 --input_size 184 --crop_size 160 \
    --ape 0 --rpe 1 --pe_nd 0 --qkv_bias 1 --swin 0 --model_name cmtti --dropout 0.1 --emb_dropout 0.1 \
    --test_file $file_folder$/val.txt \
    --checkpoints-path $save_folder$/checkpoints/xxx.pth.tar \
    --save_folder $save_folder$/acc_logits/
  3. calculate acc

    python utils/calculate_acc.py --logits_file $save_folder$/acc_logits/

5. Imagenet Result

model-name input_size FLOPs Params [email protected]_crop(ours) acc(papers) weights
CMT-T 160x160 516M 11.3M 75.124% 79.2% weights
CMT-T 224x224 1.01G 11.3M 78.4% - weights
CMT-XS 192x192 - - - 81.8% -
CMT-S 224x224 - - - 83.5% -
CMT-L 256x256 - - - 84.5% -

6. TODO

  • Other result may comming sonn if someone need.
  • Release the CMT-XS result on the imagenet.
  • Check the diff with papers, author give the hyparameters on the issue
  • Adjusting the best hyperparameters for CMT or transformers

Supplementary

If you want to know more, I give the CMT explanation, as well as the tuning and training process on here.

Owner
FlyEgle
JOYY AI GROUP - Machine Learning Engineer(Computer Vision)
FlyEgle
Pyramid addon for OpenAPI3 validation of requests and responses.

Validate Pyramid views against an OpenAPI 3.0 document Peace of Mind The reason this package exists is to give you peace of mind when providing a REST

Pylons Project 79 Dec 30, 2022
7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

kaggle-hpa-2021-7th-place-solution Code for 7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle. A description of the met

8 Jul 09, 2021
Model-based reinforcement learning in TensorFlow

Bellman Website | Twitter | Documentation (latest) What does Bellman do? Bellman is a package for model-based reinforcement learning (MBRL) in Python,

46 Nov 09, 2022
PyTorch implementation of ICLR 2022 paper PiCO: Contrastive Label Disambiguation for Partial Label Learning

PiCO: Contrastive Label Disambiguation for Partial Label Learning This is a PyTorch implementation of ICLR 2022 paper PiCO: Contrastive Label Disambig

王皓波 147 Jan 07, 2023
DeepSTD: Mining Spatio-temporal Disturbances of Multiple Context Factors for Citywide Traffic Flow Prediction

DeepSTD: Mining Spatio-temporal Disturbances of Multiple Context Factors for Citywide Traffic Flow Prediction This is the implementation of DeepSTD in

5 Sep 26, 2022
Self-Supervised Methods for Noise-Removal

SSMNR | Self-Supervised Methods for Noise Removal Image denoising is the task of removing noise from an image, which can be formulated as the task of

1 Jan 16, 2022
Convolutional Neural Networks

Darknet Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. D

Joseph Redmon 23.7k Jan 05, 2023
A Python library for Deep Probabilistic Modeling

Abstract DeeProb-kit is a Python library that implements deep probabilistic models such as various kinds of Sum-Product Networks, Normalizing Flows an

DeeProb-org 46 Dec 26, 2022
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement

USAF - MIT Artificial Intelligence Accelerator 46 Dec 15, 2022
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

TL;DR Ignite is a high-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently. Click on the image to

4.2k Jan 01, 2023
The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Easy-to-use toolkit for retrieval-based Chatbot Recent Activity Our released RRS corpus can be found here. Our released BERT-FP post-training checkpoi

GMFTBY 32 Nov 13, 2022
Cycle Consistent Adversarial Domain Adaptation (CyCADA)

Cycle Consistent Adversarial Domain Adaptation (CyCADA) A pytorch implementation of CyCADA. If you use this code in your research please consider citi

Hyunwoo Ko 2 Jan 10, 2022
FairyTailor: Multimodal Generative Framework for Storytelling

FairyTailor: Multimodal Generative Framework for Storytelling

Eden Bens 172 Dec 30, 2022
Unofficial PyTorch implementation of Fastformer based on paper "Fastformer: Additive Attention Can Be All You Need"."

Fastformer-PyTorch Unofficial PyTorch implementation of Fastformer based on paper Fastformer: Additive Attention Can Be All You Need. Usage : import t

Hong-Jia Chen 126 Dec 06, 2022
Pytorch implementation of NeurIPS 2021 paper: Geometry Processing with Neural Fields.

Geometry Processing with Neural Fields Pytorch implementation for the NeurIPS 2021 paper: Geometry Processing with Neural Fields Guandao Yang, Serge B

Guandao Yang 162 Dec 16, 2022
Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather

LiDAR fog simulation Created by Martin Hahner at the Computer Vision Lab of ETH Zurich. This is the official code release of the paper Fog Simulation

Martin Hahner 110 Dec 30, 2022
The code for our paper CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention.

CrossFormer This repository is the code for our paper CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention. Introduction Existin

cheerss 238 Jan 06, 2023
CMP 414/765 course repository for Spring 2022 semester

CMP414/765: Artificial Intelligence Spring2021 This is the GitHub repository for course CMP 414/765: Artificial Intelligence taught at The City Univer

ch00226855 4 May 16, 2022
Official Pytorch implementation for 2021 ICCV paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes" and trained models / data

Learning Motion Priors for 4D Human Body Capture in 3D Scenes (LEMO) Official Pytorch implementation for 2021 ICCV (oral) paper "Learning Motion Prior

165 Dec 19, 2022
Text-Based Ideal Points

Text-Based Ideal Points Source code for the paper: Text-Based Ideal Points by Keyon Vafa, Suresh Naidu, and David Blei (ACL 2020). Update (June 29, 20

Keyon Vafa 37 Oct 09, 2022