A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

Last update: Nov 30, 2021

CNN from scratch

The most interesting part is the file neural_networks/layers.py: code for a convolutional neural network based only on numpy (no PyTorch or TensorFlow). It is therefore very foundational and illustrates how CNNs work mathematically.

The CNN is compatible with colour images (3-channel RGB), includes pooling layers (class Pool2D) and works with any given (valid) stride.
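As a rough illustration of what a "valid" stride could mean, here is a minimal sketch of an output-size check together with a max-pooling forward pass. It is not the repository's Pool2D: the names output_size and max_pool2d_forward, the (batch, rows, cols, channels) layout, and the rule that the stride must tile the input evenly are all assumptions made for this example.

```python
import numpy as np

def output_size(in_size, kernel_size, stride, pad=0):
    """Number of window positions along one axis (hypothetical helper).

    Assumption: a stride counts as valid only if the windows tile the
    padded input exactly, with nothing left over.
    """
    span = in_size + 2 * pad - kernel_size
    if span < 0 or span % stride != 0:
        raise ValueError("stride does not tile the padded input evenly")
    return span // stride + 1

def max_pool2d_forward(X, k, s):
    """Max pooling over k x k windows with stride s.

    X is assumed to have shape (batch, rows, cols, channels).
    """
    n, H, W, c = X.shape
    out_rows = output_size(H, k, s)
    out_cols = output_size(W, k, s)
    out = np.empty((n, out_rows, out_cols, c))
    for x in range(out_rows):
        for y in range(out_cols):
            # One slice covers all samples and channels at once.
            window = X[:, x*s:x*s+k, y*s:y*s+k, :]
            out[:, x, y, :] = window.max(axis=(1, 2))
    return out

# Usage: a batch of two 4x4 images with 3 channels, pooled 2x2 with stride 2.
X = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)
pooled = max_pool2d_forward(X, k=2, s=2)
```

Each channel is pooled independently, so a colour image keeps its three channels after pooling.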

neural_networks/activations.py contains basic activation functions such as ReLU and SoftMax, each with the forward and backward implementations (including the Jacobian where required) needed for backpropagation.
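To make the forward / backward pairing concrete, the following is a minimal sketch of ReLU and softmax with their backward passes. The function names and the (batch, classes) input layout are assumptions for this example and may differ from the actual code in activations.py. The softmax backward pass builds the per-sample Jacobian explicitly and contracts it with the upstream gradient.

```python
import numpy as np

def relu_forward(Z):
    return np.maximum(Z, 0)

def relu_backward(Z, dY):
    # The ReLU Jacobian is diagonal, so an elementwise mask is enough.
    return dY * (Z > 0)

def softmax_forward(Z):
    # Subtract the row maximum for numerical stability; the result is unchanged.
    Z_shift = Z - Z.max(axis=1, keepdims=True)
    e = np.exp(Z_shift)
    return e / e.sum(axis=1, keepdims=True)

def softmax_backward(Z, dY):
    S = softmax_forward(Z)
    k = S.shape[1]
    # Per-sample Jacobian: J[n, i, j] = S[n, i] * (delta_ij - S[n, j])
    J = S[:, :, None] * (np.eye(k)[None] - S[:, None, :])
    # Chain rule: dZ[n, j] = sum_i dY[n, i] * J[n, i, j]
    return np.einsum("ni,nij->nj", dY, J)

# Usage with a batch of two samples and three classes.
Z = np.array([[1.0, 2.0, 0.5], [0.0, -1.0, 3.0]])
dY = np.array([[0.1, -0.2, 0.3], [0.5, 0.0, -0.4]])
S = softmax_forward(Z)
dZ = softmax_backward(Z, dY)
```

A finite-difference comparison against softmax_forward is a quick way to check a backward pass like this one.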

Many functions rely heavily on numpy slicing and broadcasting, which speeds up training significantly. See e.g. Conv2D.forward:

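for x in range(out_rows):
    for y in range(out_cols):
        # Patch of the padded input under the kernel at output position (x, y)
        window = X_pad[:, x*s:x*s+kernel_height, y*s:y*s+kernel_width, :]
        # Broadcast against W, then sum over kernel height, kernel width and input channels
        out[:, x, y, :] = np.apply_over_axes(np.sum, W[None] * window[..., None], [1, 2, 3])[:, 0, 0, 0, :]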

which is the sliced version of a depth-6 nested for loop and thus allows for a significant speedup (on my computer, more than 20x for the given training data).
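The equivalence between the sliced form and plain loops can be checked with a small self-contained sketch. The shapes are inferred from the broadcasting in the snippet above and are assumptions: X_pad as (batch, rows, cols, in_channels) and W as (kernel_height, kernel_width, in_channels, out_channels). The explicit-loop reference is written for this example only; the loop nesting of the original unsliced version is not shown in the source.

```python
import numpy as np

def conv2d_forward_loops(X_pad, W, s):
    """Reference version with one explicit loop per index (illustrative)."""
    n, H, Wd, c_in = X_pad.shape
    kh, kw, _, c_out = W.shape
    out_rows = (H - kh) // s + 1
    out_cols = (Wd - kw) // s + 1
    out = np.zeros((n, out_rows, out_cols, c_out))
    for i in range(n):
        for x in range(out_rows):
            for y in range(out_cols):
                for f in range(c_out):
                    for a in range(kh):
                        for b in range(kw):
                            for c in range(c_in):
                                out[i, x, y, f] += X_pad[i, x*s + a, y*s + b, c] * W[a, b, c, f]
    return out

def conv2d_forward_sliced(X_pad, W, s):
    """Same computation, looping only over output positions."""
    n, H, Wd, c_in = X_pad.shape
    kh, kw, _, c_out = W.shape
    out_rows = (H - kh) // s + 1
    out_cols = (Wd - kw) // s + 1
    out = np.zeros((n, out_rows, out_cols, c_out))
    for x in range(out_rows):
        for y in range(out_cols):
            window = X_pad[:, x*s:x*s+kh, y*s:y*s+kw, :]   # (n, kh, kw, c_in)
            prod = W[None] * window[..., None]             # (n, kh, kw, c_in, c_out)
            out[:, x, y, :] = np.apply_over_axes(np.sum, prod, [1, 2, 3])[:, 0, 0, 0, :]
    return out

# Usage: two 7x7 RGB inputs, four 3x3 filters, stride 2.
rng = np.random.default_rng(0)
X_pad = rng.normal(size=(2, 7, 7, 3))
W = rng.normal(size=(3, 3, 3, 4))
out_loops = conv2d_forward_loops(X_pad, W, 2)
out_sliced = conv2d_forward_sliced(X_pad, W, 2)
```

Both functions return an array of shape (batch, out_rows, out_cols, out_channels), and the two results agree up to floating-point rounding.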

In losses.py, CrossEntropy is the most important function. To speed it up, we simplified the expression mathematically as much as possible, yielding

loss = -1.0/m * np.trace(np.matmul(Y, np.log(Y_hat.T)))

for the forward pass and

-1.0/m * np.divide(Y, Y_hat)

for the backward pass.
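The two expressions can be tried out with the short sketch below. It assumes that Y holds one-hot labels and Y_hat holds predicted probabilities, both of shape (m, classes), with m the batch size; the source does not define m or the shapes, and the function names are illustrative. Under that assumption, the trace of Y times the transposed log-probabilities equals the elementwise sum of Y * log(Y_hat).

```python
import numpy as np

def cross_entropy_forward(Y, Y_hat):
    m = Y.shape[0]  # assumed: number of samples in the batch
    return -1.0 / m * np.trace(np.matmul(Y, np.log(Y_hat.T)))

def cross_entropy_backward(Y, Y_hat):
    m = Y.shape[0]
    # Gradient of the loss with respect to Y_hat, elementwise.
    return -1.0 / m * np.divide(Y, Y_hat)

# Usage: two samples, three classes.
Y = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
Y_hat = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])

loss = cross_entropy_forward(Y, Y_hat)
# The same value written as an elementwise sum, for comparison.
loss_elementwise = -np.sum(Y * np.log(Y_hat)) / Y.shape[0]
grad = cross_entropy_backward(Y, Y_hat)
```

The gradient is nonzero only at the entries of the true classes, because every other entry of Y is zero.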

This is based on a project for CS289 at UC Berkeley.
