Implementation of "A MLP-like Architecture for Dense Prediction"

Related tags

Deep LearningCycleMLP
Overview

A MLP-like Architecture for Dense Prediction (arXiv)

License: MIT Python 3.8

    

Updates

  • (22/07/2021) Initial release.

Model Zoo

We provide CycleMLP models pretrained on ImageNet 2012.

Model Parameters FLOPs Top 1 Acc. Download
CycleMLP-B1 15M 2.1G 78.9% model
CycleMLP-B2 27M 3.9G 81.6% model
CycleMLP-B3 38M 6.9G 82.4% model
CycleMLP-B4 52M 10.1G 83.0% model
CycleMLP-B5 76M 12.3G 83.2% model

Usage

Install

  • PyTorch 1.7.0+ and torchvision 0.8.1+
  • timm:
pip install 'git+https://github.com/rwightman/[email protected]'

or

git clone https://github.com/rwightman/pytorch-image-models
cd pytorch-image-models
git checkout c2ba229d995c33aaaf20e00a5686b4dc857044be
pip install -e .
  • fvcore (optional, for FLOPs calculation)
  • mmcv, mmdetection, mmsegmentation (optional)

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is:

│path/to/imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......

Evaluation

To evaluate a pre-trained CycleMLP-B5 on ImageNet val with a single GPU run:

python main.py --eval --model CycleMLP_B5 --resume path/to/CycleMLP_B5.pth --data-path /path/to/imagenet

Training

To train CycleMLP-B5 on ImageNet on a single node with 8 gpus for 300 epochs run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model CycleMLP_B5 --batch-size 128 --data-path /path/to/imagenet --output_dir /path/to/save

Acknowledgement

This code is based on DeiT and pytorch-image-models. Thanks for their wonderful works

Citing

@article{chen2021cyclemlp,
  title={CycleMLP: A MLP-like Architecture for Dense Prediction},
  author={Chen, Shoufa and Xie, Enze and Ge, Chongjian and Liang, Ding and Luo, Ping},
  journal={arXiv preprint arXiv:2107.10224},
  year={2021}
}

License

CycleMLP is released under MIT License.

Comments
  • detection result

    detection result

    Applying PVT detection framework, I tried a CycleMLP-B1 based detector with RetinaNet 1x. I got AP=27.1, fairly inferior to the reported 38.6. Could you give some advices to reproduce the reported result?

    The specific configure is as follows

    base = [ 'base/models/retinanet_r50_fpn.py', 'base/datasets/coco_detection.py', 'base/schedules/schedule_1x.py', 'base/default_runtime.py' ] #optimizer model = dict( pretrained='./pretrained/CycleMLP_B1.pth', backbone=dict( type='CycleMLP_B1_feat', style='pytorch'), neck=dict( type='FPN', in_channels=[64, 128, 320, 512], out_channels=256, start_level=1, add_extra_convs='on_input', num_outs=5)) #optimizer optimizer = dict(delete=True, type='AdamW', lr=0.0001, weight_decay=0.0001) optimizer_config = dict(grad_clip=None)

    find_unused_parameters = True

    opened by mountain111 6
  • Compiling CycleMLP

    Compiling CycleMLP

    Thank you for this great repo and interesting paper.

    I tried compiling CycleMLP to onnx and not surpassingly the process failed since CycleMLP include dynamic offset creation in https://github.com/ShoufaChen/CycleMLP/blob/main/cycle_mlp.py#L132 and as such cannot be converted to a frozen graph. Were you able to convert CycleMLP to onnx or any other frozen graph framework?

    Thanks in advance.

    opened by shairoz-deci 6
  • Questions about offset calculation

    Questions about offset calculation

    Hi, thanks for your wonderful work.

    I'm currently studying your work, and come up with some question about the offset calculations.

    I understood the offset calculation mentioned on the paper, but can't understand about how generated offset is being used in the code.

    For ex) if $S_H \times S_W : 3 \times 1$; I understood how the offset is applied in this figure 스크린샷 2022-06-13 오후 9 18 20

    by calculate like this: 스크린샷 2022-06-13 오후 9 19 57

    However, when I run the offset generating code, I can't figure out how this offset is being used in deform_conv2d 스크린샷 2022-06-13 오후 9 21 57

    Can you provide more detailed information about this??

    And also, the paper contains how $S_H \times S_W: 3 \times 3$ works, but in the code, it seems like either one ofkernel_size[0] or kernel_size[1] has to be 1. So, if I want to use $S_H \times S_W : 3 \times 3$, do I have to make $3 \times 1$ and $1 \times 3$ offsets and add those together?

    Thank you again for your work. I really learned a lot.

    opened by tae-mo 5
  • Example of CycleMLP Configuration for Dense Prediction

    Example of CycleMLP Configuration for Dense Prediction

    Hello.

    First of all, thank you for curating this interesting work. I was wondering, are there any working examples of how I can use CycleMLP for dense prediction while maintaining the original input size (e.g., predict a 0 or 1 value for each pixel in an input image)? In addition, I am interested in only a single ("annotated") output image, although I noticed the model definitions given in this repository output multiple downsampled versions of the original input image. Any thoughts on this?

    Thank you in advance for your time.

    opened by amorehead 2
  • Swin-B vs CycleMLP-B on image classification

    Swin-B vs CycleMLP-B on image classification

    For classificaion on ImageNet-1k, the acuracy of Swin-B is 83.5, which is 0.1 higher than the proposed CycleMLP-B. But, in this paper, the authors reprot that the accuracy of Swin-B is 83.3, which is 0.1 lower than the proposed CycleMLP-B. Why are these accuracies different?

    opened by hkzhang91 1
  • question about the offset

    question about the offset

    Thanks for your work!

    The implementation of this code inspired me. But the calculation of offset here is confusing. Although this issue (https://github.com/ShoufaChen/CycleMLP/issues/10) has asked similar questions, I haven't found a reasonable explanation.

    https://github.com/ShoufaChen/CycleMLP/blob/2f76a1f6e3cc6672143fdac46e3db5f9a7341253/cycle_mlp.py#L127-L136

    kernel_size = (1, 3)
    start_idx = (kernel_size[0] * kernel_size[1]) // 2
    for i in range(num_channels):
        offset[0, 2 * i + 0, 0, 0] = 0
        # relative offset
        offset[0, 2 * i + 1, 0, 0] = (i + start_idx) % kernel_size[1] - (kernel_size[1] // 2)
    offset.reshape(num_channels, 2)
    
    tensor([[ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.],
            [ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.]])
    

    the results are different with the figure in paper:

    image

    Some codes for verification:

    import torch
    from torchvision.ops import deform_conv2d
    
    num_channels = 6
    
    data = torch.arange(1, 6).reshape(1, 1, 1, 5).expand(-1, num_channels, -1, -1)
    data
    """
    tensor([[[[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]]]])
    """
    
    weight = torch.eye(num_channels).reshape(num_channels, num_channels, 1, 1)
    weight.reshape(num_channels, num_channels)
    """
    tensor([[1., 0., 0., 0., 0., 0.],
            [0., 1., 0., 0., 0., 0.],
            [0., 0., 1., 0., 0., 0.],
            [0., 0., 0., 1., 0., 0.],
            [0., 0., 0., 0., 1., 0.],
            [0., 0., 0., 0., 0., 1.]])
    """
    
    offset = torch.empty(1, 2 * num_channels * 1 * 1, 1, 1)
    kernel_size = (1, 3)
    start_idx = (kernel_size[0] * kernel_size[1]) // 2
    for i in range(num_channels):
        offset[0, 2 * i + 0, 0, 0] = 0
        # relative offset
        offset[0, 2 * i + 1, 0, 0] = (
            (i + start_idx) % kernel_size[1] - (kernel_size[1] // 2)
        )
    offset.reshape(num_channels, 2)
    """
    tensor([[ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.],
            [ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.]])
    """
    
    deform_conv2d(
        data.float(), 
        offset=offset.expand(-1, -1, -1, 5).float(), 
        weight=weight.float(), 
        bias=None,
    )
    """
    tensor([[[[1., 2., 3., 4., 5.]],
             [[2., 3., 4., 5., 0.]],
             [[0., 1., 2., 3., 4.]],
             [[1., 2., 3., 4., 5.]],
             [[2., 3., 4., 5., 0.]],
             [[0., 1., 2., 3., 4.]]]])
    """
    
    opened by lartpang 1
  • question about the offset

    question about the offset

    Hi, thank you very much for your excellent work. In Fig.4 of your paper, you show the pseudo-kernel when kernel size is 1x3. But I when I find that function "gen_offset" does not generate the same offset as Fig.4. The offset it generates is "0,1,0,-1,0,0,0,1..." instead of "0,1,0,-1,0,1,0,-1', which is shown in Fig.4. So could you please tell me the reason? image image

    opened by linjing7 1
  • About

    About "crop_pct"

    Hi, thanks for your great work and code. I wonder the parameter crop_pct actually works in which part of code. When I go throught the timm, I can't find out how this crop_pct is loaded.

    opened by ggjy 1
  • How to deploy CycleMLP-T for training?

    How to deploy CycleMLP-T for training?

    Thank you very much for such a wonderful work!

    After learning the cycle_mlp source code in the repository, I am very confused to deploy CycleMLP Block based on Swin Transformer. Is it convenient for you to release swin-based CycleMLP? Looking forward to your reply, Thanks!

    opened by Pak287 0
Owner
Shoufa Chen
Shoufa Chen
Supplementary code for TISMIR paper "Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form"

Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form This is supplementary code for the TISMIR paper Sliding-Window Pitch-Class H

1 Nov 27, 2021
Security evaluation module with onnx, pytorch, and SecML.

🚀 🐼 🔥 PandaVision Integrate and automate security evaluations with onnx, pytorch, and SecML! Installation Starting the server without Docker If you

Maura Pintor 11 Apr 12, 2022
Very Deep Convolutional Networks for Large-Scale Image Recognition

pytorch-vgg Some scripts to convert the VGG-16 and VGG-19 models [1] from Caffe to PyTorch. The converted models can be used with the PyTorch model zo

Justin Johnson 217 Dec 05, 2022
Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation

Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation Introduction 📋 Official implementation of Explainable Robust Learnin

JeongEun Park 6 Apr 19, 2022
Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease

Heart_Disease_Classification Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease Dataset

Ashish 1 Jan 30, 2022
Table-Extractor 表格抽取

(t)able-(ex)tractor 本项目旨在实现pdf表格抽取。 Models 版面分析模块(Yolo) 表格结构抽取(ResNet + Transformer) 文字识别模块(CRNN + CTC Loss) Acknowledgements TableMaster attention-i

2 Jan 15, 2022
Deep Learning for Human Part Discovery in Images - Chainer implementation

Deep Learning for Human Part Discovery in Images - Chainer implementation NOTE: This is not official implementation. Original paper is Deep Learning f

Shintaro Shiba 63 Sep 25, 2022
Official PyTorch implementation of "VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization" (CVPR 2021)

VITON-HD — Official PyTorch Implementation VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization Seunghwan Choi*1, Sunghyun Pa

Seunghwan Choi 250 Jan 06, 2023
This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

Facebook Research 281 Dec 22, 2022
Auxiliary data to the CHIIR paper Searching to Learn with Instructional Scaffolding

Searching to Learn with Instructional Scaffolding This is the data and analysis code for the paper "Searching to Learn with Instructional Scaffolding"

Arthur Câmara 2 Mar 02, 2022
ColossalAI-Examples - Examples of training models with hybrid parallelism using ColossalAI

ColossalAI-Examples This repository contains examples of training models with Co

HPC-AI Tech 185 Jan 09, 2023
Reporting and Visualization for Hazardous Events

Reporting and Visualization for Hazardous Events

Jv Kyle Eclarin 2 Oct 03, 2021
Machine learning notebooks in different subjects optimized to run in google collaboratory

Notebooks Name Description Category Link Training pix2pix This notebook shows a simple pipeline for training pix2pix on a simple dataset. Most of the

Zaid Alyafeai 363 Dec 06, 2022
Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective

Unofficial pytorch implementation of the paper "Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective"

16 Nov 21, 2022
YOLOv3 in PyTorch > ONNX > CoreML > TFLite

This repository represents Ultralytics open-source research into future object detection methods, and incorporates lessons learned and best practices

Ultralytics 9.3k Jan 07, 2023
Using Clinical Drug Representations for Improving Mortality and Length of Stay Predictions

Using Clinical Drug Representations for Improving Mortality and Length of Stay Predictions Usage Clone the code to local. https://github.com/tanlab/MI

Computational Biology and Machine Learning lab @ TOBB ETU 3 Oct 18, 2022
List of all dependencies affected by node-ipc malicious commit

node-ipc-dependencies-list List of all dependencies affected by node-ipc malicious commit as of 17/3/2022 - 19/3/2022 (timestamp) Please improve upon

99 Oct 15, 2022
nnFormer: Interleaved Transformer for Volumetric Segmentation Code for paper "nnFormer: Interleaved Transformer for Volumetric Segmentation "

nnFormer: Interleaved Transformer for Volumetric Segmentation Code for paper "nnFormer: Interleaved Transformer for Volumetric Segmentation ". Please

jsguo 610 Dec 28, 2022
On Generating Extended Summaries of Long Documents

ExtendedSumm This repository contains the implementation details and datasets used in On Generating Extended Summaries of Long Documents paper at the

Georgetown Information Retrieval Lab 76 Sep 05, 2022
Car Parking Tracker Using OpenCv

Car Parking Vacancy Tracker Using OpenCv I used basic image processing methods i

Adwait Kelkar 30 Dec 03, 2022