Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch

Overview

Segformer - Pytorch

Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch.

Install

$ pip install segformer-pytorch

Usage

For example, MiT-B0

import torch
from segformer_pytorch import Segformer

model = Segformer(
    patch_size = 4,                 # patch size
    dims = (32, 64, 160, 256),      # dimensions of each stage
    heads = (1, 2, 5, 8),           # heads of each stage
    ff_expansion = (8, 8, 4, 4),    # feedforward expansion factor of each stage
    reduction_ratio = (8, 4, 2, 1), # reduction ratio of each stage for efficient attention
    num_layers = 2,                 # num layers of each stage
    decoder_dim = 256,              # decoder dimension
    num_classes = 4                 # number of segmentation classes
)

x = torch.randn(1, 3, 256, 256)
pred = model(x) # (1, 4, 64, 64)  # output is (H/4, W/4) map of the number of segmentation classes

Make sure the keywords are at most a tuple of 4, as this repository is hard-coded to give the MiT 4 stages as done in the paper.

Citations

@misc{xie2021segformer,
    title   = {SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers}, 
    author  = {Enze Xie and Wenhai Wang and Zhiding Yu and Anima Anandkumar and Jose M. Alvarez and Ping Luo},
    year    = {2021},
    eprint  = {2105.15203},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}
You might also like...
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Pytorch implementation of MLP-Mixer with loading pre-trained models.

MLP-Mixer-Pytorch PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision with the function of loading official ImageNet pre-trained p

PyTorch code for our paper "Attention in Attention Network for Image Super-Resolution"

Under construction... Attention in Attention Network for Image Super-Resolution (A2N) This repository is an PyTorch implementation of the paper "Atten

MLP-Like Vision Permutator for Visual Recognition (PyTorch)
MLP-Like Vision Permutator for Visual Recognition (PyTorch)

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition (arxiv) This is a Pytorch implementation of our paper. We present Vision

Pytorch implementation of
Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

RNN-for-Joint-NLU Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

Unofficial Implementation of MLP-Mixer in TensorFlow
Unofficial Implementation of MLP-Mixer in TensorFlow

mlp-mixer-tf Unofficial Implementation of MLP-Mixer [abs, pdf] in TensorFlow. Note: This project may have some bugs in it. I'm still learning how to i

Implementation of
Implementation of "A MLP-like Architecture for Dense Prediction"

A MLP-like Architecture for Dense Prediction (arXiv) Updates (22/07/2021) Initial release. Model Zoo We provide CycleMLP models pretrained on ImageNet

Unofficial Implementation of MLP-Mixer, Image Classification Model
Unofficial Implementation of MLP-Mixer, Image Classification Model

MLP-Mixer Unoffical Implementation of MLP-Mixer, easy to use with terminal. Train and test easly. https://arxiv.org/abs/2105.01601 MLP-Mixer is an arc

MLP-Numpy - A simple modular implementation of Multi Layer Perceptron in pure Numpy.

MLP-Numpy A simple modular implementation of Multi Layer Perceptron in pure Numpy. I used the Iris dataset from scikit-learn library for the experimen

Comments
  • Something is wrong with your implementation.

    Something is wrong with your implementation.

    Hello!

    First of all, I really like the repo. The implementation is clean and so much easier to understand than the official repo. But after doing some digging, I realized that the number of parameters and layers (especially conv2d) is quite different from the official implementation. This is the case for all variants I have tested (B0 and B5).

    Check out the README in my repo here, and you'll see what I mean. I also included images of the execution graphs of the two different implementations in the 'src' folder, which could help to debug.

    I don't quite have time to dig into the source of the problem, but I just thought I'd share my observations with you.

    opened by camlaedtke 0
  • Models weights + model output HxW

    Models weights + model output HxW

    Hi,

    Could you please add the models weights so we can start training from them?

    Also, why you choose to train models with an output of size (H/4,W/4) and not the original (HxW) size?

    Great job for the paper, very interesting :)

    opened by isega24 2
  • The model configurations for all the SegFormer B0 ~ B5

    The model configurations for all the SegFormer B0 ~ B5

    Hello How are you? Thanks for contributing to this project. Is the model configuration in README MiT-B0 correctly? That's because the total number of params for the model is 36M. Could u provide all the model configurations for SegFormer B0 ~ B5?

    opened by rose-jinyang 5
  • a question about kv reshape in Efficient Self-Attention

    a question about kv reshape in Efficient Self-Attention

    Thanks for sharing your work, your code is so elegant, and inspired me a lot. Here is a question about the implementation of Efficient Self-Attention

    It seems you use a "mean op" to reshape k,v. and the official implementation uses a (learnable) linear mapping to reshape k,v

    may I ask, whether this difference significantly matters in your experiment ?

    in your code:

    k, v = map(lambda t: reduce(t, 'b c (h r1) (w r2) -> b c h w', 'mean', r1 = r, r2 = r), (k, v))
    

    the original implementation uses:

    self.kv = nn.Linear(dim, dim * 2, bias=qkv_bias)
    self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
    self.norm = nn.LayerNorm(dim)
    
    x_ = x.permute(0, 2, 1).reshape(B, C, H, W)
    x_ = self.sr(x_).reshape(B, C, -1).permute(0, 2, 1)
    x_ = self.norm(x_)
    kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
    k, v = kv[0], kv[1]
    
    opened by masszhou 1
Releases(0.0.6)
Owner
Phil Wang
Working with Attention
Phil Wang
This repo contains code to reproduce all experiments in Equivariant Neural Rendering

Equivariant Neural Rendering This repo contains code to reproduce all experiments in Equivariant Neural Rendering by E. Dupont, M. A. Bautista, A. Col

Apple 83 Nov 16, 2022
2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案

2020CCF-NER 2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案 bert base + flat + crf + fgm + swa + pu learning策略 + clue数据集 = test1单模0.906 词向量

67 Oct 19, 2022
Sequential model-based optimization with a `scipy.optimize` interface

Scikit-Optimize Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions. It implements

Scikit-Optimize 2.5k Jan 04, 2023
Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution

unfoldedVBA Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution This repository contains the Pytorch implementation of the unrolled

Yunshi HUANG 2 Jul 10, 2022
Code repository for the paper: Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild (ICCV 2021)

Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild Akash Sengupta, Ignas Budvytis, Robert

Akash Sengupta 149 Dec 14, 2022
Official repository of ICCV21 paper "Viewpoint Invariant Dense Matching for Visual Geolocalization"

Viewpoint Invariant Dense Matching for Visual Geolocalization: PyTorch implementation This is the implementation of the ICCV21 paper: G Berton, C. Mas

Gabriele Berton 44 Jan 03, 2023
Rocket-recycling with Reinforcement Learning

Rocket-recycling with Reinforcement Learning Developed by: Zhengxia Zou I have long been fascinated by the recovery process of SpaceX rockets. In this

Zhengxia Zou 202 Jan 03, 2023
Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

Photographic Image Synthesis with Cascaded Refinement Networks-Pytorch (https://arxiv.org/abs/1707.09405) This is a Pytorch implementation of cascaded

Soumya Tripathy 63 Mar 27, 2022
Source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated Recurrent Memory Network

KaGRMN-DSG_ABSA This repository contains the PyTorch source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated

XingBowen 4 May 20, 2022
deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.

deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.

63 Oct 17, 2022
Real-time analysis of intracranial neurophysiology recordings.

py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro

Interventional Cognitive Neuromodulation - Neumann Lab Berlin 15 Nov 03, 2022
A simple pytorch pipeline for semantic segmentation.

SegmentationPipeline -- Pytorch A simple pytorch pipeline for semantic segmentation. Requirements : torch=1.9.0 tqdm albumentations=1.0.3 opencv-pyt

petite7 4 Feb 22, 2022
This is a simple plugin for Vim that allows you to use OpenAI Codex.

🤖 Vim Codex An AI plugin that does the work for you. This is a simple plugin for Vim that will allow you to use OpenAI Codex. To use this plugin you

Tom Dörr 195 Dec 28, 2022
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning Tensorflow code and models for the paper: Large Scale Fine-Grained Categ

Yin Cui 187 Oct 01, 2022
Steerable discovery of neural audio effects

Steerable discovery of neural audio effects Christian J. Steinmetz and Joshua D. Reiss Abstract Applications of deep learning for audio effects often

Christian J. Steinmetz 182 Dec 29, 2022
Python scripts form performing stereo depth estimation using the HITNET model in Tensorflow Lite.

TFLite-HITNET-Stereo-depth-estimation Python scripts form performing stereo depth estimation using the HITNET model in Tensorflow Lite. Stereo depth e

Ibai Gorordo 22 Oct 20, 2022
Gym environments used in the paper: "Developmental Reinforcement Learning of Control Policy of a Quadcopter UAV with Thrust Vectoring Rotors"

gym_multirotor Gym to train reinforcement learning agents on UAV platforms Quadrotor Tiltrotor Requirements This package has been tested on Ubuntu 18.

Aditya M. Deshpande 19 Dec 29, 2022
Implementation of light baking system for ray tracing based on Activision's UberBake

Vulkan Light Bakary MSU Graphics Group Student's Diploma Project Treefonov Andrey [GitHub] [LinkedIn] Project Goal The goal of the project is to imple

Andrey Treefonov 7 Dec 27, 2022
Source code for Fixed-Point GAN for Cloud Detection

FCD: Fixed-Point GAN for Cloud Detection PyTorch source code of Nyborg & Assent (2020). Abstract The detection of clouds in satellite images is an ess

Joachim Nyborg 8 Dec 22, 2022
A parametric soroban written with CADQuery.

A parametric soroban written in CADQuery The purpose of this project is to demonstrate how "code CAD" can be intuitive to learn. See soroban.py for a

Lee 4 Aug 13, 2022