This repository implements and evaluates convolutional networks on the Möbius strip as toy model instantiations of Coordinate Independent Convolutional Networks.

Overview

Orientation independent Möbius CNNs





Background (tl;dr)

All derivations and a detailed description of the models are found in Section 5 of our paper. What follows is an informal tl;dr, summarizing the central aspects of Möbius CNNs.

Feature fields on the Möbius strip: A key characteristic of the Möbius strip is its topological twist, making it a non-orientable manifold. Convolutional weight sharing on the Möbius strip is therefore only well defined up to a reflection of kernels. To account for the ambiguity of kernel orientations, one needs to demand that the kernel responses (feature vectors) transform in a predictable way when different orientations are chosen. Mathematically, this transformation is specified by a group representation ρ of the reflection group. We implement three different feature field types, each characterized by a choice of group representation:

  • scalar fields are modeled by the trivial representation. Scalars stay invariant under reflective gauge transformations: f ↦ f.

  • sign-flip fields transform according to the sign-flip representation of the reflection group. Reflective gauge transformations negate the single numerical coefficient of a sign-flip feature: f ↦ −f.

  • regular feature fields are associated with the regular representation. For the reflection group, this implies 2-dimensional features whose two values (channels) are swapped by gauge transformations: (f₁, f₂) ↦ (f₂, f₁).
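
The three transformation laws can be written as small matrices acting on a single feature vector. The following is a minimal NumPy sketch of that idea; the dictionary RHO and the helper gauge_transform are illustrative names and are not taken from this repository.

```python
import numpy as np

# Representation matrices rho(s) of the reflection s for each field type.
# The names used here are hypothetical and chosen for illustration only.
RHO = {
    "scalar": np.array([[1.0]]),                     # trivial: f -> f
    "sign_flip": np.array([[-1.0]]),                 # sign-flip: f -> -f
    "regular": np.array([[0.0, 1.0], [1.0, 0.0]]),   # regular: (f1, f2) -> (f2, f1)
}


def gauge_transform(feature, field_type):
    """Apply the reflective gauge transformation to one feature vector."""
    return RHO[field_type] @ np.asarray(feature, dtype=float)


scalar = gauge_transform([3.0], "scalar")
sign_flip = gauge_transform([3.0], "sign_flip")
regular = gauge_transform([1.0, 2.0], "regular")
```

Since reflecting twice is the identity, each matrix squares to the identity matrix, which is what makes these maps valid representations of the reflection group.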

Reflection steerable kernels (gauge equivariance):

Convolution kernels on the Möbius strip are parameterized maps K: ℝ² → ℝ^(c_out × c_in), whose numbers of input and output channels c_in and c_out depend on the types of feature fields between which they map. Since a reflection of a kernel should result in a corresponding transformation of its output feature field, the kernel has to obey certain symmetry constraints. Specifically, kernels have to be reflection steerable (or gauge equivariant), i.e. they should satisfy K(s.v) = ρ_out(s) K(v) ρ_in(s)⁻¹ for all v ∈ ℝ², where s denotes the reflection and ρ_in and ρ_out are the representations of the input and output field types.

The following table visualizes this symmetry constraint for any pair of input and output field types that we implement:

Similar equivariance constraints are imposed on biases and nonlinearities; see the paper for more details.
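
To make the kernel constraint concrete, the sketch below checks the condition K(s.v) = ρ_out(s) K(v) ρ_in(s)⁻¹ numerically on a discretized kernel and obtains a steerable kernel by averaging an unconstrained one with its reflected and steered copy. This averaging is a generic projection and is not how the repository parameterizes its kernels. The function names and the choice of axis 2 as the reflected axis are assumptions made for illustration.

```python
import numpy as np

# Representation matrices rho(s) of the reflection s (illustrative names).
RHO = {
    "scalar": np.array([[1.0]]),
    "sign_flip": np.array([[-1.0]]),
    "regular": np.array([[0.0, 1.0], [1.0, 0.0]]),
}


def reflect(kernel):
    """Spatially reflect a kernel of shape (c_out, c_in, size, size).

    Assumption: axis 2 is the direction across the width of the strip.
    """
    return kernel[:, :, ::-1, :]


def steer(kernel, type_in, type_out):
    """Return rho_out(s) K(v) rho_in(s)^-1 at every kernel pixel v."""
    rho_out = RHO[type_out]
    rho_in_inv = np.linalg.inv(RHO[type_in])
    return np.einsum("ab,bcxy,cd->adxy", rho_out, kernel, rho_in_inv)


def is_steerable(kernel, type_in, type_out):
    """Check the constraint K(s.v) == rho_out(s) K(v) rho_in(s)^-1."""
    return bool(np.allclose(reflect(kernel), steer(kernel, type_in, type_out)))


def project_steerable(kernel, type_in, type_out):
    """Average a kernel with its reflected and steered copy."""
    return 0.5 * (kernel + steer(reflect(kernel), type_in, type_out))


rng = np.random.default_rng(0)
raw = rng.normal(size=(1, 2, 3, 3))  # regular input field, sign-flip output field
steerable = project_steerable(raw, "regular", "sign_flip")
symmetric = project_steerable(rng.normal(size=(1, 1, 3, 3)), "scalar", "scalar")
antisymmetric = project_steerable(rng.normal(size=(1, 1, 3, 3)), "scalar", "sign_flip")
```

For scalar inputs and outputs the constraint reduces to a kernel that is symmetric under the reflection, while a kernel mapping scalars to sign-flip fields comes out antisymmetric, so its central row vanishes.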

Isometry equivariance: Shifts of the Möbius strip along itself are isometries. After one revolution (a shift by 2π), points on the strip do not return to themselves, but end up reflected along the width of the strip:

Such reflections of patterns are explained away by the reflection equivariance of the convolution kernels. Orientation independent convolutions are therefore automatically equivariant w.r.t. the action of such isometries on feature fields. Our empirical results, shown in the table below, confirm that this theoretical guarantee holds in practice. Conventional CNNs, on the other hand, are explicitly coordinate dependent, and are therefore in particular not isometry equivariant.
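
The action of a shift isometry on a discretized scalar field can be sketched in a few lines of NumPy. The sketch assumes an array of shape (width, length) with the cut between the last and the first column; the function name is hypothetical. Sign-flip and regular fields would additionally need their representation applied to the columns that cross the cut.

```python
import numpy as np


def shift_scalar_field(field, k):
    """Shift a scalar field of shape (width, length) by k pixels along the strip.

    Columns that cross the cut are reflected along the width of the strip.
    """
    length = field.shape[1]
    revolutions, k = divmod(k, length)
    if revolutions % 2 == 1:
        # An odd number of full revolutions reflects the whole field.
        field = field[::-1, :]
    shifted = np.roll(field, k, axis=1)
    # The first k columns have crossed the cut once more than the rest.
    shifted[:, :k] = shifted[::-1, :k].copy()
    return shifted


field = np.arange(12.0).reshape(3, 4)
once_around = shift_scalar_field(field, 4)
twice_around = shift_scalar_field(field, 8)
```

One revolution returns the field reflected along its width, and two revolutions return the original field.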

Implementation

Neural network layers are implemented in nn_layers.py while the models are found in models.py. All individual layers and all models are unit tested in unit_tests.py.

Feature fields: We assume Möbius strips with a locally flat geometry, i.e. strips which can be thought of as being constructed by gluing two opposite ends of a flat rectangular strip together in a twisted way. Feature fields are therefore discretized on a regular sampling grid on a rectangular domain of pixels. Note that this choice induces a global gauge (frame field), which is discontinuous at the cut.

In practice, a neural network operates on multiple feature fields which are stacked in the channel dimension (a direct sum). Feature spaces are therefore characterized by their feature field multiplicities. For instance, one could have 10 scalar fields, 4 sign-flip fields and 8 regular feature fields, which consume 10 + 4 + 2·8 = 30 channels in total. Denoting the batch size by N and the numbers of pixels along the two grid axes by H and W, a feature space is encoded by a tensor of shape (N, 30, H, W).
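
The channel count follows directly from the multiplicities, since scalar and sign-flip fields occupy one channel each and regular fields occupy two. A small sketch, assuming the multiplicities are given as a tuple (scalar, sign-flip, regular); the helper name, the batch size and the spatial grid size are placeholders chosen for illustration.

```python
import numpy as np


def num_channels(fields):
    """Total number of channels for multiplicities (scalar, sign_flip, regular)."""
    n_scalar, n_sign_flip, n_regular = fields
    return n_scalar + n_sign_flip + 2 * n_regular


fields = (10, 4, 8)
channels = num_channels(fields)

# Hypothetical batch of 16 feature spaces on a 28 x 56 pixel grid.
batch = np.zeros((16, channels, 28, 56))
```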

The correct transformation law of the feature fields is guaranteed by the coordinate independence (steerability) of the network layers operating on it.

Orientation independent convolutions and bias summation: The class MobiusConv implements orientation independent convolutions and bias summations between input and output feature spaces as specified by the multiplicity constructor arguments in_fields and out_fields, respectively. Kernels are as usual discretized by a grid of size*size pixels. The steerability constraints on convolution kernels and biases are implemented by allocating a reduced number of parameters, from which the symmetric (steerable) kernels and biases are expanded during the forward pass.
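
The idea of expanding a steerable kernel from a reduced set of parameters can be illustrated for the two simplest cases. A kernel between scalar fields has to be symmetric under the reflection, and a kernel from scalar to sign-flip fields has to be antisymmetric. The sketch below assumes an odd kernel size and a reflection along axis 0; the parameter layout and the function names are assumptions and may differ from what MobiusConv does internally.

```python
import numpy as np


def expand_symmetric(params):
    """Expand params of shape (size // 2 + 1, size) to a symmetric kernel.

    The last parameter row is the central row and is not duplicated.
    """
    return np.concatenate([params, params[-2::-1]], axis=0)


def expand_antisymmetric(params):
    """Expand params of shape (size // 2, size) to an antisymmetric kernel.

    The central row is forced to zero by the antisymmetry.
    """
    zero_row = np.zeros((1, params.shape[1]))
    return np.concatenate([params, zero_row, -params[::-1]], axis=0)


sym = expand_symmetric(np.arange(6.0).reshape(2, 3))        # 6 parameters, 3 x 3 kernel
anti = expand_antisymmetric(np.arange(1.0, 4.0).reshape(1, 3))  # 3 parameters, 3 x 3 kernel
```

In this sketch a 3×3 kernel needs 6 free parameters in the symmetric case and 3 in the antisymmetric case, instead of 9 for an unconstrained kernel.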

Coordinate independent convolutions furthermore rely on parallel transporters of feature vectors, which are implemented as a transport padding operation. This operation pads both sides of the cut with size//2 columns of pixels which are 1) spatially reflected and 2) reflection-steered according to the field types. The strips are furthermore zero-padded along their width.
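
The transport padding described above could be sketched as follows for a single feature space. The sketch assumes an array of shape (channels, width, length), channels ordered as scalar, sign-flip, then regular fields, and regular fields stored as consecutive channel pairs. These layout choices and the function names are assumptions for illustration, not the repository's implementation.

```python
import numpy as np


def steer_channels(x, fields):
    """Apply the reflection's representation to the channels of x.

    fields = (n_scalar, n_sign_flip, n_regular); x has shape (channels, width, length).
    """
    n_s, n_f, n_r = fields
    spatial = x.shape[1:]
    scalars = x[:n_s]                                  # unchanged
    sign_flips = -x[n_s:n_s + n_f]                     # negated
    regular = x[n_s + n_f:].reshape(n_r, 2, *spatial)  # pairs of channels
    regular = regular[:, ::-1].reshape(2 * n_r, *spatial)  # swapped within each pair
    return np.concatenate([scalars, sign_flips, regular], axis=0)


def transport_pad(x, fields, size):
    """Pad both sides of the cut with size // 2 transported columns, then zero-pad the width."""
    p = size // 2
    if p == 0:
        return x.copy()
    # 1) reflect along the width, 2) steer according to the field types
    transported = steer_channels(x[:, ::-1, :], fields)
    left = transported[:, :, -p:]    # columns arriving from the far end of the strip
    right = transported[:, :, :p]    # columns arriving from the near end of the strip
    padded = np.concatenate([left, x, right], axis=2)
    return np.pad(padded, ((0, 0), (p, p), (0, 0)))  # zeros along the width


x = np.arange(4 * 3 * 5, dtype=float).reshape(4, 3, 5)
out = transport_pad(x, fields=(1, 1, 1), size=3)
```

With a kernel of size 3, a feature space of shape (4, 3, 5) is padded to (4, 5, 7), so that a subsequent convolution without further padding restores the original resolution.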

The forward pass then proceeds by:

  • expanding steerable kernels and biases from their non-redundant parameter arrays
  • transport padding the input field array
  • running a conventional Euclidean convolution

As the padding added size//2 pixels around the strip, the spatial resolution of the output field agrees with that of the input field.

Orientation independent nonlinearities: Scalar fields and regular feature fields are acted on by conventional ELU nonlinearities, which are equivariant for these field types. Sign-flip fields are processed by applying ELU nonlinearities to their absolute value after summing a learnable bias parameter. To ensure that the resulting fields are again transforming according to the sign-flip representation, we multiply them subsequently with the signs of the input features. See the paper and the class EquivNonlin for more details.
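
The sign-flip nonlinearity described above can be written compactly. The following NumPy sketch uses a fixed scalar in place of the learnable bias and hypothetical function names; it is meant to illustrate the equivariance property rather than to mirror the class EquivNonlin.

```python
import numpy as np


def elu(x, alpha=1.0):
    """Conventional ELU nonlinearity."""
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))


def sign_flip_nonlin(x, bias):
    """ELU applied to the biased absolute value, multiplied by the input's sign."""
    return np.sign(x) * elu(np.abs(x) + bias)


x = np.array([-2.0, -0.5, 1.0, 3.0])
bias = -0.75  # stands in for a learnable parameter

out = sign_flip_nonlin(x, bias)
out_flipped = sign_flip_nonlin(-x, bias)
```

Negating the input negates the output, so the result is again a sign-flip field. A plain ELU applied directly to a sign-flip field does not have this property, since elu(-x) generally differs from -elu(x).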

Feature field pooling: The module MobiusPool implements an orientation independent pooling operation with a stride and kernel size of two pixels, thus halving the fields' spatial resolution. Scalar and regular feature fields are pooled with a conventional max pooling operation, which is for these field types coordinate independent. As the coefficients of sign-flip fields negate under gauge transformations, they are pooled based on their (gauge invariant) absolute value.
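
Pooling a sign-flip field by absolute value can be sketched as selecting, in every 2×2 window, the entry with the largest magnitude and keeping its sign. The function below is a hypothetical NumPy illustration for a single channel with even spatial dimensions, not the code of MobiusPool.

```python
import numpy as np


def pool_by_abs(x):
    """2 x 2 pooling with stride 2 that keeps the entry of largest absolute value.

    x has shape (height, width) with even height and width.
    """
    h, w = x.shape
    # Gather the four entries of every 2 x 2 window along the last axis.
    windows = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)
    idx = np.abs(windows).argmax(axis=-1)
    return np.take_along_axis(windows, idx[..., None], axis=-1)[..., 0]


x = np.array([[1.0, -3.0, 0.0, 2.0],
              [2.0, 0.0, -1.0, 1.0]])
pooled = pool_by_abs(x)
pooled_flipped = pool_by_abs(-x)
```

Because the selection depends only on the gauge invariant absolute values, negating the input negates the output. A conventional max pooling would instead select different entries after the sign change.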

While the pooling operation is tested to be exactly gauge equivariant, its spatial subsampling interferes inevitably with its isometry equivariance. Specifically, the pooling operation is only isometry equivariant w.r.t. shifts by an even number of pixels. Note that the same issue applies to conventional Euclidean CNNs as well; see e.g. (Azulay and Weiss, 2019) or (Zhang, 2019).
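
The restriction to even shifts can be seen already in one dimension. The sketch below uses a conventional max pooling with window and stride two and, as a simplification, a periodic shift instead of the shift on the Möbius strip; the function name is hypothetical.

```python
import numpy as np


def max_pool_1d(x):
    """Max pooling with window and stride 2 on a 1D array of even length."""
    return x.reshape(-1, 2).max(axis=1)


x = np.array([0.0, 9.0, 0.0, 0.0, 0.0, 0.0, 5.0, 0.0])
pooled = max_pool_1d(x)

# Shifting the input by two pixels shifts the pooled output by one pixel.
pooled_even_shift = max_pool_1d(np.roll(x, 2))

# Shifting the input by one pixel changes which entries share a window.
pooled_odd_shift = max_pool_1d(np.roll(x, 1))
odd_matches_some_shift = any(
    np.array_equal(pooled_odd_shift, np.roll(pooled, j)) for j in range(len(pooled))
)
```

For this input the output after the odd shift is not a shifted copy of the original pooled output, whereas the even shift commutes with the pooling.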

Models: All models are implemented in models.py. The orientation independent models, which differ only in their field type multiplicities but agree in their total number of channels, are implemented as class MobiusGaugeCNN. We furthermore implement conventional CNN baselines, one with the same number of channels and thus more parameters (α=1) and one with the same number of parameters but fewer channels (α=2). Since conventional CNNs are explicitly coordinate dependent, they utilize a naive padding operation (MobiusPadNaive), which performs a spatial reflection of feature maps but does not apply the unspecified gauge transformation. The following table gives an overview of the different models:

Data - Möbius MNIST

We benchmark our models on Möbius MNIST, a simple classification dataset which consists of MNIST digits that are projected on the Möbius strip. Since MNIST digits are gray-scale images, they are geometrically identified as scalar fields. The size of the training set is by default set to 12000 digits, which agrees with the rotated MNIST dataset.

There are two versions of the training and test sets which consist of centered and shifted digits. All digits in the centered datasets occur at the same location (and the same orientation) of the strip. The isometry shifted digits appear at uniformly sampled locations. Recall that shifts once around the strip lead to a reflection of the digits as visualized above. The following digits show isometry shifted digits (note the reflection at the cut):

To generate the datasets it is sufficient to call convert_mnist.py, which downloads the original MNIST dataset via torchvision and saves the Möbius MNIST datasets in data/mobius_MNIST.npz.

Results

The models can then be trained by calling, for instance,

python train.py --model mobius_regular

For more options and further model types, consult the help message: python train.py -h

The following table gives an overview of the performance of all models in two different settings, averaged over 32 runs:

The setting "shifted train digits" trains and evaluates on isometry shifted digits. To test the isometry equivariance of the models, we train them furthermore on "centered train digits", testing them then out-of-distribution on shifted digits. As one can see, the orientation independent models generalize well over these unseen variations while the conventional coordinate dependent CNNs' performance deteriorates.

Dependencies

This library is based on Python 3.7. It requires the following packages:

numpy
torch>=1.1
torchvision>=0.3

Logging via tensorboard is optional.

Owner
Maurice Weiler
AI researcher with a focus on geometric and equivariant deep learning. PhD candidate under the supervision of Max Welling. Master's degree in Physics.