Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Last update: Jan 01, 2023

Overview

Vision Transformer with Progressive Sampling

This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Installation Instructions

Clone this repo:

git clone git@github.com:yuexy/PS-ViT.git
cd PS-ViT

Create a conda virtual environment and activate it:

conda create -n ps_vit python=3.7 -y
conda activate ps_vit

Install CUDA==10.1 with cuDNN 7, following the official installation instructions.
Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.4, einops, pyyaml:

pip3 install timm==0.3.4 einops pyyaml
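
Once the pinned packages are installed, a quick check can confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the PS-ViT repository: the helper names (installed_version, check_versions) are hypothetical, it assumes the distributions are registered as "torch", "torchvision" and "timm", and on Python 3.7 it assumes the importlib-metadata backport is available.

```python
"""Sanity-check that installed packages match the versions pinned above."""
try:
    from importlib import metadata as importlib_metadata  # Python >= 3.8
except ImportError:
    # Python 3.7 (as in the conda env above): assumes the
    # `importlib-metadata` backport has been installed.
    import importlib_metadata

# Versions taken from the installation steps above.
# Distribution names are an assumption about how the packages register.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.3.4"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return importlib_metadata.version(package)
    except importlib_metadata.PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Return a list of (package, wanted, found) tuples for every mismatch."""
    problems = []
    for package, wanted in expected.items():
        found = lookup(package)
        # Some builds append a local tag such as "+cu101"; ignore it.
        if found is None or found.split("+")[0] != wanted:
            problems.append((package, wanted, found))
    return problems


if __name__ == "__main__":
    mismatches = check_versions(EXPECTED)
    for package, wanted, found in mismatches:
        print(f"{package}: expected {wanted}, found {found}")
    if not mismatches:
        print("All pinned packages match.")
```

An empty list from check_versions means every pinned package was found at the expected version.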

Install Apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Install PS-ViT (run this from the PS-ViT repository root, not from inside the apex directory):

python setup.py build_ext --inplace

Results and Models

All models listed below are evaluated with an input size of 224x224.

Model	Top-1 Acc (%)	#Params	FLOPs	Download
PS-ViT-Ti/14	75.6	4.8M	1.6G	Coming Soon
PS-ViT-B/10	80.6	21.3M	3.1G	Coming Soon
PS-ViT-B/14	81.7	21.3M	5.4G	Google Drive
PS-ViT-B/18	82.3	21.3M	8.8G	Google Drive
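
To make the accuracy versus compute trade-off in the table easier to read, the rows can be put into a small data structure and queried. This is only an illustrative sketch: the numbers are copied from the table above, while the function name best_under_budget and the tuple layout are hypothetical and not part of the repository.

```python
# (model, top-1 accuracy in %, parameters in millions, GFLOPs),
# copied from the results table above.
RESULTS = [
    ("PS-ViT-Ti/14", 75.6, 4.8, 1.6),
    ("PS-ViT-B/10", 80.6, 21.3, 3.1),
    ("PS-ViT-B/14", 81.7, 21.3, 5.4),
    ("PS-ViT-B/18", 82.3, 21.3, 8.8),
]


def best_under_budget(results, max_gflops):
    """Return the name of the most accurate model within a GFLOPs budget.

    Returns None when no model fits the budget.
    """
    candidates = [row for row in results if row[3] <= max_gflops]
    if not candidates:
        return None
    # Pick the candidate with the highest top-1 accuracy.
    return max(candidates, key=lambda row: row[1])[0]


if __name__ == "__main__":
    for budget in (2.0, 6.0, 10.0):
        print(budget, best_under_budget(RESULTS, budget))
```

For example, with a budget of 6.0 GFLOPs the only rows that fit are PS-ViT-Ti/14, PS-ViT-B/10 and PS-ViT-B/14, so the function returns PS-ViT-B/14.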

Evaluation

To evaluate a pre-trained PS-ViT on ImageNet val, run:

python3 main.py <data-root> --model <model-name> -b <batch-size> --eval_checkpoint <path-to-checkpoint>

Training from scratch

To train a PS-ViT on ImageNet from scratch, run:

bash ./scripts/train_distributed.sh <job-name> <config-path> <num-gpus>
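
The evaluation and training commands above can also be assembled from Python, which helps when scripting several runs. The sketch below only rebuilds the two command lines shown in this README; the helper names are hypothetical, and every path, model name, job name and config path in the usage section is a placeholder rather than a value taken from the repository. By default it prints the command instead of running it.

```python
import shlex
import subprocess


def build_eval_command(data_root, model_name, batch_size, checkpoint):
    """Mirror: python3 main.py <data-root> --model <model-name> -b <batch-size> --eval_checkpoint <path>."""
    return [
        "python3", "main.py", data_root,
        "--model", model_name,
        "-b", str(batch_size),
        "--eval_checkpoint", checkpoint,
    ]


def build_train_command(job_name, config_path, num_gpus):
    """Mirror: bash ./scripts/train_distributed.sh <job-name> <config-path> <num-gpus>."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["bash", "./scripts/train_distributed.sh", job_name, config_path, str(num_gpus)]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(shlex.quote(part) for part in command))
    if not dry_run:
        # Expects to be started from the PS-ViT repository root.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # All values below are placeholders; replace them with real ones.
    run(build_eval_command("/path/to/imagenet", "MODEL_NAME", 64, "/path/to/checkpoint"))
    run(build_train_command("JOB_NAME", "/path/to/config", 8))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.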

Citing PS-ViT

@article{psvit,
  title={Vision Transformer with Progressive Sampling},
  author={Yue, Xiaoyu and Sun, Shuyang and Kuang, Zhanghui and Wei, Meng and Torr, Philip and Zhang, Wayne and Lin, Dahua},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Contact

If you have any questions, don't hesitate to contact Xiaoyu Yue. You can easily reach him by sending an email to [email protected].

