SwinTransformerV2-TensorFlow

A TensorFlow implementation of SwinTransformerV2, the architecture from Microsoft Research Asia, based on their official implementation of SwinTransformerV1 and their paper on V2.

Paper on Version 2 (18/11/2021): [arXiv]

Paper on Version 1 (17/08/2021): [arXiv]

Features:

TensorFlow 2 implementation of versions 1 and 2 of the SwinTransformer, a state-of-the-art backbone for many contemporary computer vision tasks. A brief overview of the architectural changes made in version 2:

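A residual post-norm configuration replaces the previous pre-norm configuration: layer normalization is applied to the output of each attention and MLP sub-layer before it is added back to the residual branch. This is meant to improve training stability in larger models.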
Scaled cosine attention replaces the dot-product attention of V1: attention logits are the cosine similarity between queries and keys, divided by a learnable scalar.
A continuous, log-spaced relative position bias replaces the previous learned bias table. It is implemented here as a small MLP applied to log-transformed relative coordinates.
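
The attention changes listed above can be sketched in plain NumPy. This is a minimal illustration of the formulas in the V2 paper, not the code in this repository: the function names, the two-layer ReLU MLP, its hidden size, and the random weights are all assumptions made for the example, and details such as coordinate normalization may differ in the actual implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length (eps avoids division by zero)."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def log_spaced_coords(window_size):
    """Relative (dy, dx) offsets in a square window, mapped by sign(d) * log(1 + |d|)."""
    r = np.arange(-(window_size - 1), window_size, dtype=np.float64)
    dy, dx = np.meshgrid(r, r, indexing="ij")
    coords = np.stack([dy, dx], axis=-1)            # (2W-1, 2W-1, 2)
    return np.sign(coords) * np.log1p(np.abs(coords))


def continuous_position_bias(window_size, w1, b1, w2):
    """Hypothetical small MLP turning log-spaced offsets into a per-head bias."""
    coords = log_spaced_coords(window_size)
    hidden = np.maximum(coords @ w1 + b1, 0.0)      # ReLU, (2W-1, 2W-1, hidden)
    table = hidden @ w2                             # (2W-1, 2W-1, heads)
    ys, xs = np.meshgrid(np.arange(window_size), np.arange(window_size), indexing="ij")
    pos = np.stack([ys.ravel(), xs.ravel()], axis=-1)            # (N, 2)
    rel = pos[:, None, :] - pos[None, :, :] + (window_size - 1)  # (N, N, 2), >= 0
    bias = table[rel[..., 0], rel[..., 1]]          # (N, N, heads)
    return np.transpose(bias, (2, 0, 1))            # (heads, N, N)


def scaled_cosine_attention(q, k, v, tau, bias=None):
    """q, k, v: (heads, tokens, dim). tau: (heads, 1, 1), learnable in a real model."""
    sim = l2_normalize(q) @ np.swapaxes(l2_normalize(k), -1, -2)  # cosine similarity
    logits = sim / np.maximum(tau, 0.01)            # the paper keeps tau above 0.01
    if bias is not None:
        logits = logits + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


# Usage with random stand-ins for learned parameters (illustrative sizes only).
rng = np.random.default_rng(0)
window, heads, dim, hidden_units = 3, 2, 4, 8
tokens = window * window
q, k, v = (rng.normal(size=(heads, tokens, dim)) for _ in range(3))
tau = np.full((heads, 1, 1), 0.1)
bias = continuous_position_bias(
    window,
    w1=rng.normal(size=(2, hidden_units)),
    b1=np.zeros(hidden_units),
    w2=rng.normal(size=(hidden_units, heads)),
)
out = scaled_cosine_attention(q, k, v, tau, bias)   # (heads, tokens, dim)
```

Because the cosine similarity is bounded to [-1, 1], the logits stay in a fixed range set by the temperature, and because the bias is a function of continuous coordinates rather than a lookup table, the same MLP can be evaluated for a different window size.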

Requirements:

numpy==1.21.4
tensorflow==2.7.0
tensorflow_addons==0.15.0

Getting started

Usage instructions are still being written.

License

This project is licensed under the MIT license.

Citation

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}


Owner

Phan Nguyen
