Implementation of ConvMixer in TensorFlow and Keras

Last update: Oct 03, 2022

Overview

ConvMixer

ConvMixer is an extremely simple model that is similar in spirit to the ViT and the even more basic MLP-Mixer: it operates directly on patches as input, separates the mixing of spatial and channel dimensions, and maintains equal size and resolution throughout the network. In contrast, however, ConvMixer uses only standard convolutions to achieve the mixing steps. Despite its simplicity, the authors report that ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and dataset sizes, in addition to outperforming classical vision models such as ResNet.

Official GitHub Link: https://github.com/tmp-iclr/convmixer

Paper Link: https://openreview.net/pdf?id=TVHS5Y4dNvM

Note: The paper is under review for ICLR 2022.

Model Architecture
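
The description in the overview (a patch embedding, followed by blocks that mix spatial and channel information separately at a constant resolution) can be sketched in Keras roughly as follows. This is a minimal sketch, not necessarily the code in this repository: the function names, default hyperparameters, and the use of the built-in "gelu" activation (instead of the TensorFlow Addons one) are assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def activation_block(x):
    # GELU followed by batch normalization, applied after every convolution.
    x = layers.Activation("gelu")(x)
    return layers.BatchNormalization()(x)


def patch_embedding(x, filters, patch_size):
    # A convolution whose kernel size equals its stride splits the image
    # into non-overlapping patches and projects each one to `filters` channels.
    x = layers.Conv2D(filters, kernel_size=patch_size, strides=patch_size)(x)
    return activation_block(x)


def conv_mixer_block(x, filters, kernel_size):
    # Spatial mixing: depthwise convolution with a residual connection.
    residual = x
    x = layers.DepthwiseConv2D(kernel_size=kernel_size, padding="same")(x)
    x = layers.Add()([activation_block(x), residual])
    # Channel mixing: pointwise (1x1) convolution.
    x = layers.Conv2D(filters, kernel_size=1)(x)
    return activation_block(x)


def build_conv_mixer(image_size=32, filters=256, depth=8,
                     kernel_size=5, patch_size=2, num_classes=10):
    # All default values here are illustrative assumptions.
    inputs = tf.keras.Input((image_size, image_size, 3))
    x = patch_embedding(inputs, filters, patch_size)
    for _ in range(depth):
        # Every block keeps the same spatial size and channel count.
        x = conv_mixer_block(x, filters, kernel_size)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes)(x)  # unnormalized class logits
    return tf.keras.Model(inputs, outputs)
```

Calling `build_conv_mixer()` returns a `tf.keras.Model` that can be inspected with `model.summary()`; because the head outputs logits, a loss configured with `from_logits=True` would be the matching choice.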

Installation

pip install -q tensorflow-addons

Note: TensorFlow Addons is used for the AdamW optimizer and the GELU activation function.

Results

TensorBoard Link: https://tensorboard.dev/experiment/bkhqOz0RQ1Cv5dwrDQySMQ/

Note: Trained for 25 epochs, reaching a top-5 accuracy of 64.41%.
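
Top-5 accuracy counts a prediction as correct when the true class is among the five highest-scoring classes. A minimal pure-Python sketch of that metric is shown below; the function name `top_k_accuracy` and the toy scores are illustrative and are not taken from this repository's training code.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with 3 classes and 2 samples (made-up numbers).
toy_scores = [[0.1, 0.5, 0.4], [0.7, 0.2, 0.1]]
toy_labels = [2, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

In the toy data the first sample's true class has only the second-highest score, so it counts as a hit for k=2 but not for k=1.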

Future Work

Train for 150 epochs
Train the model on the ImageNet dataset

Citation

@inproceedings{
anonymous2022patches,
title={Patches Are All You Need?},
author={Anonymous},
booktitle={Submitted to The Tenth International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=TVHS5Y4dNvM},
note={under review}
}

License

MIT License

Copyright (c) 2021 Sayan Nath

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Releases(0.0.1)

0.0.1(Oct 15, 2021)

View the TensorBoard here.

Note: Trained for 25 epochs.
Source code(tar.gz)
Source code(zip)
convmixer-model.h5(6.94 MB)
convmixer.zip(6.21 MB)
train-logs.csv(2.94 KB)

Owner

Sayan Nath