simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

Overview

Summary

This simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset with several common and useful features:

  • Choose between two different neural network architectures
  • Make architectures parametrizable
  • Read input arguments from config file or command line
    • (command line arguments override config file ones)
  • Download FashionMNIST dataset if not already downloaded
  • Monitor training progress on the terminal and/or with TensorBoard logs
    • Accuracy, loss, confusion matrix

More details about FashionMNIST can be found here.

It may be useful as a starting point for people who are starting to learn about PyTorch and neural networks.

Prerequisites

We assume that most users will have a GPU driver correctly configured, although the script can also be run on the CPU.

The project should work with your preferred python environment, but I have only tested it with conda (MiniConda 3) local environments. To create a local environment for this project,

conda create --name simple_pytorch_example python=3.9

and then activate it with

conda activate simple_pytorch_example

Installation on Ubuntu Linux

(Tested on Ubuntu Linux Focal 20.04.3 LTS)

Go to the directory where you want to have the project, e.g.

cd Software

Clone the simple_pytorch_example github repository

git clone https://github.com/rcasero/simple_pytorch_example.git

Install the python dependencies

cd simple_pytorch_example
python setup.py install

train_simple_pytorch_example.py: Main script to train the neural network

You can run the script train_simple_pytorch_example.py as

./train_simple_pytorch_example.py [options]

or

python train_simple_pytorch_example.py [options]

Usage summary

usage: train_simple_pytorch_example.py [-h] [-c CONFIG_FILE] [-v] [--workdir DIR] [-d STR] [-e N] [-b N] [-l F] [--validation_ratio F] [-n STR] [--conv_out_features N [N ...]]
                                       [--conv_kernel_size N] [--maxpool_kernel_size N]

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config CONFIG_FILE
                        config file path
  -v, --verbose         verbose output for debugging
  --workdir DIR         working directory to place data, logs, weights, etc subdirectories (def .)
  -d STR, --device STR  device to train on (def 'cuda', 'cpu')
  -e N, --epochs N      number of epochs for training (def 10)
  -b N, --batch_size N  batch size for training (def 64)
  -l F, --learning_rate F
                        learning rate for training (def 1e-3)
  --validation_ratio F  ratio of training dataset reserved for validation (def 0.0)
  -n STR, --nn STR      neural network architecture (def 'SimpleCNN', 'SimpleLinearNN')
  --conv_out_features N [N ...]
                        (SimpleCNN only) number of output features for each convolutional block (def 8 16)
  --conv_kernel_size N  (SimpleCNN only) kernel size of convolutional layers (def 3)
  --maxpool_kernel_size N
                        (SimpleCNN only) kernel size of max pool layers (def 2)

Args that start with '--' (eg. -v) can also be set in a config file (specified via -c). Config file syntax allows: key=value, flag=true, stuff=[a,b,c]
(for details, see syntax at https://goo.gl/R74nmi). If an arg is specified in more than one place, then commandline values override config file values
which override defaults.

Options not provided to the script take default values, e.g. running ./train_simple_pytorch_example.py -v produces the output

** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v
Defaults:
  --workdir:         .
  --device:          cuda
  --epochs:          10
  --batch_size:      64
  --learning_rate:   0.001
  --validation_ratio:0.0
  --nn:              SimpleCNN
  --conv_out_features:[8, 16]
  --conv_kernel_size:3
  --maxpool_kernel_size:2

Arguments that start with -- can have their default values overridden using a configuration file (-c CONFIG_FILE). A configuration file is just a text file (e.g. config.txt) that looks like this:

device = cuda
epochs = 20
batch_size = 64
learning_rate = 1e-3
validation_ratio = 0.2
nn = SimpleCNN
conv_out_features = [8, 16]
conv_kernel_size = 3
maxpool_kernel_size = 2

Note that when running ./train_simple_pytorch_example.py -v -c config.txt the defaults have been replaced by the arguments provided in the config file:

** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v -c config.txt
Config File (config.txt):
  device:            cuda
  epochs:            20
  batch_size:        64
  learning_rate:     1e-3
  validation_ratio:  0.2
  nn:                SimpleCNN
  conv_out_features: [8, 16]
  conv_kernel_size:  3
  maxpool_kernel_size:2
Defaults:
  --workdir:         .

Command line arguments override both defaults and configuration file arguments, e.g.

./train_simple_pytorch_example.py --nn SimpleCNN -v --conv_out_features 8 16 32 -e 5

FashionMNIST data download

When train_simple_pytorch_example.py runs, it checks whether the FashionMNIST data has already been downloaded to WORKDIR/data, and if not, it downloads it automatically.

Network architectures

We provide two neural network architectures that can be selected with option --nn SimpleLinearNN or --nn SimpleCNN.

SimpleLinearNN is a network with fully connected layers

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleLinearNN                           --                        --
├─Flatten: 1-1                           [1, 784]                  --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-1                       [1, 512]                  401,920
│    └─ReLU: 2-2                         [1, 512]                  --
│    └─Linear: 2-3                       [1, 512]                  262,656
│    └─ReLU: 2-4                         [1, 512]                  --
│    └─Linear: 2-5                       [1, 10]                   5,130
==========================================================================================

SimpleCNN is a traditional convolutional neural network (CNN) formed by concatenation of convolutional blocks (Conv2d + ReLU + MaxPool2d + BatchNorm2d). Those blocks are followed by a 1x1 convolution and a fully connected layer with 10 outputs. The hyperparameters that the user can configure are (they are ignored for the other network):

  • --conv_kernel_size N: Size of the convolutional kernels (NxN, dafault 3x3).
  • --maxpool_kernel_size N: Size of the maxpool kernels (NxN, dafault 2x2).
  • --conv_out_features N1 [N2 ...]: Each number adds a convolutional block with the corresponding number of output features. E.g. --conv_out_features 8 16 32 creates a network with 3 blocks
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleCNN                                --                        --
├─ModuleList: 1-1                        --                        --
│    └─Conv2d: 2-1                       [1, 8, 28, 28]            80
│    └─ReLU: 2-2                         [1, 8, 28, 28]            --
│    └─MaxPool2d: 2-3                    [1, 8, 14, 14]            --
│    └─BatchNorm2d: 2-4                  [1, 8, 14, 14]            16
│    └─Conv2d: 2-5                       [1, 16, 14, 14]           1,168
│    └─ReLU: 2-6                         [1, 16, 14, 14]           --
│    └─MaxPool2d: 2-7                    [1, 16, 7, 7]             --
│    └─BatchNorm2d: 2-8                  [1, 16, 7, 7]             32
│    └─Conv2d: 2-9                       [1, 32, 7, 7]             4,640
│    └─ReLU: 2-10                        [1, 32, 7, 7]             --
│    └─MaxPool2d: 2-11                   [1, 32, 3, 3]             --
│    └─BatchNorm2d: 2-12                 [1, 32, 3, 3]             64
│    └─Conv2d: 2-13                      [1, 1, 3, 3]              289
│    └─Flatten: 2-14                     [1, 9]                    --
│    └─Linear: 2-15                      [1, 10]                   100
==========================================================================================

General training options

Currently, the loss (torch.nn.CrossEntropyLoss) and optimizer (torch.optim.SGD) are fixed.

Parameters common to both architectures are

  • --epochs N: number of training epochs.
  • --batch_size N: size of the training batch (if the dataset size is not a multiple of the batch size, the last batch will be smaller).
  • --learning_rate F: learning rate.
  • --validation_ratio F: by default, the script uses all the training data in FashionMNIST for training. But the user can choose to split the training data between training and validation. (The test data is a separate dataset in FashionMNIST).

Output network parameters

Once the network is trained, the model.state_dict() is saved to WORKDIR/models/LOGFILENAME.state_dict.

Monitoring

Option --verbose outputs detailed information about the script arguments, datasets, network architecture and training progress.

** Training:
Epoch 1/10
-------------------------------
train mean loss: 2.3913  [     0/ 60000]
train mean loss: 2.1813  [  6400/ 60000]
train mean loss: 2.1227  [ 12800/ 60000]
train mean loss: 2.0780  [ 19200/ 60000]
train mean loss: 1.9196  [ 25600/ 60000]
train mean loss: 1.6919  [ 32000/ 60000]
train mean loss: 1.4112  [ 38400/ 60000]
train mean loss: 1.2632  [ 44800/ 60000]
train mean loss: 1.0215  [ 51200/ 60000]
train mean loss: 0.8559  [ 57600/ 60000]
Training: Mean loss: 1.6672
Test: Accuracy: 63.8%, Mean loss: 0.9794
Validation: Accuracy: nan%, Mean loss:    nan
Epoch 2/10
-------------------------------
train mean loss: 1.0026  [     0/ 60000]
train mean loss: 0.8822  [  6400/ 60000]
...

Training progress can also be monitored with TensorBoard. The script saves TensorBoard logs to WORKDIR/runs, with a filename formed by the date (YYYY-MM-DD), time (HH-MM-SS), hostname and network architecture (e.g. 2021-11-25_01-15-49_marcel_SimpleCNN). To monitor the logs either during training or afterwards, run

tensorboard --logdir=runs &

and browse the URL displayed on the terminal, e.g. http://localhost:6006/.

If you are working remotely on the GPU server, you need to forward the remote server's port to your local machine

ssh -L 6006:localhost:6006 [email protected]_IP 

We provide plots for Accuracy (%), Mean loss and the Confusion Matrix

Accuracy and loss plots Confusion matrix

Results

SimpleLinearNN

Experiment 2021-11-26_01-33-52_marcel_SimpleLinearNN run with parameters:

./train_simple_pytorch_example.py -v --nn SimpleLinearNN --validation_ratio 0.2 -e 100

** All args:
Namespace(config_file=None, verbose=True, workdir='.', device='cuda', epochs=100, batch_size=64, learning_rate=0.001, validation_ratio=0.2, nn='SimpleLinearNN', conv_out_features=[8, 16], conv_kernel_size=3, maxpool_kernel_size=2)
** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v --nn SimpleLinearNN --validation_ratio 0.2 -e 100
Defaults:
  --workdir:         .
  --device:          cuda
  --batch_size:      64
  --learning_rate:   0.001
  --conv_out_features:[8, 16]
  --conv_kernel_size:3
  --maxpool_kernel_size:2

** GPU found:
NVIDIA GeForce GTX 1050
** Datasets:
Image size (H, W): (28, 28)
Training samples: 48000
Validation samples: 12000
Testing samples: 10000
Classes: {'T-shirt/top': 0, 'Trouser': 1, 'Pullover': 2, 'Dress': 3, 'Coat': 4, 'Sandal': 5, 'Shirt': 6, 'Sneaker': 7, 'Bag': 8, 'Ankle boot': 9}
** Neural network architecture:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleLinearNN                           --                        --
├─Flatten: 1-1                           [1, 784]                  --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-1                       [1, 512]                  401,920
│    └─ReLU: 2-2                         [1, 512]                  --
│    └─Linear: 2-3                       [1, 512]                  262,656
│    └─ReLU: 2-4                         [1, 512]                  --
│    └─Linear: 2-5                       [1, 10]                   5,130
==========================================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
Total mult-adds (M): 0.67
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.01
Params size (MB): 2.68
Estimated Total Size (MB): 2.69
==========================================================================================

The final metrics (after 100 epochs) are shown under each corresponding figure:

Mean loss plots

  • Mean loss:
    • Training (brown): 0.4125
    • Test (dark blue): 0.4571
    • Validation (cyan): 0.4478

Accuracy plots

  • Accuracy:
    • Test (pink): 83.8%
    • Validation (green): 84.3%

SimpleCNN

Experiment 2021-11-26_02-17-18_marcel_SimpleCNN run with parameters:

./train_simple_pytorch_example.py -v --nn SimpleCNN --validation_ratio 0.2 -e 100 --conv_out_features 8 16 --conv_kernel_size 3 --maxpool_kernel_size 2

** All args:
Namespace(config_file=None, verbose=True, workdir='.', device='cuda', epochs=100, batch_size=64, learning_rate=0.001, validation_ratio=0.2, nn='SimpleCNN', conv_out_features=[8, 16], conv_kernel_size=3, maxpool_kernel_size=2)
** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v --nn SimpleCNN --validation_ratio 0.2 -e 100 --conv_out_features 8 16 --conv_kernel_size 3 --maxpool_kernel_size 2
Defaults:
  --workdir:         .
  --device:          cuda
  --batch_size:      64
  --learning_rate:   0.001

** GPU found:
NVIDIA GeForce GTX 1050
** Datasets:
Image size (H, W): (28, 28)
Training samples: 48000
Validation samples: 12000
Testing samples: 10000
Classes: {'T-shirt/top': 0, 'Trouser': 1, 'Pullover': 2, 'Dress': 3, 'Coat': 4, 'Sandal': 5, 'Shirt': 6, 'Sneaker': 7, 'Bag': 8, 'Ankle boot': 9}
** Neural network architecture:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleCNN                                --                        --
├─ModuleList: 1-1                        --                        --
│    └─Conv2d: 2-1                       [1, 8, 28, 28]            80
│    └─ReLU: 2-2                         [1, 8, 28, 28]            --
│    └─MaxPool2d: 2-3                    [1, 8, 14, 14]            --
│    └─BatchNorm2d: 2-4                  [1, 8, 14, 14]            16
│    └─Conv2d: 2-5                       [1, 16, 14, 14]           1,168
│    └─ReLU: 2-6                         [1, 16, 14, 14]           --
│    └─MaxPool2d: 2-7                    [1, 16, 7, 7]             --
│    └─BatchNorm2d: 2-8                  [1, 16, 7, 7]             32
│    └─Conv2d: 2-9                       [1, 1, 7, 7]              145
│    └─Flatten: 2-10                     [1, 49]                   --
│    └─Linear: 2-11                      [1, 10]                   500
==========================================================================================
Total params: 1,941
Trainable params: 1,941
Non-trainable params: 0
Total mult-adds (M): 0.30
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.09
Params size (MB): 0.01
Estimated Total Size (MB): 0.11
==========================================================================================

Mean loss plots

  • Mean loss:
    • Training (dark blue): 0.3186
    • Test (orange): 0.3686
    • Validation (brown): 0.3372

Accuracy plots

  • Accuracy:
    • Test (cyan): 87.2%
    • Validation (pink): 88.1%
You might also like...
A python-image-classification web application project, written in Python and served through the Flask Microframework. This Project implements the VGG16 covolutional neural network, through Keras and Tensorflow wrappers, to make predictions on uploaded images. Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E. Evaluated on benchmark dataset Office31.
Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E. Evaluated on benchmark dataset Office31.

Deep-Unsupervised-Domain-Adaptation Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E.

In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.
In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

Contrastive Learning of Object Representations Supervisor: Prof. Dr. Gemma Roig Institutions: Goethe University CVAI - Computational Vision & Artifici

This is a model made out of Neural Network specifically a Convolutional Neural Network model
This is a model made out of Neural Network specifically a Convolutional Neural Network model

This is a model made out of Neural Network specifically a Convolutional Neural Network model. This was done with a pre-built dataset from the tensorflow and keras packages. There are other alternative libraries that can be used for this purpose, one of which is the PyTorch library.

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CNPs), Neural Processes (NPs), Attentive Neural Processes (ANPs).

The Neural Process Family This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CN

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayesian-Torch is designed to be flexible and seamless in extending a deterministic deep neural network architecture to corresponding Bayesian form by simply replacing the deterministic layers with Bayesian layers.

An implementation of quantum convolutional neural network with MindQuantum. Huawei, classifying MNIST dataset

关于实现的一点说明 山东大学 2020级 苏博南 www.subonan.com 文件说明 tools.py 这里面主要有两个函数: resize(a, lenb) 这其实是我找同学写的一个小算法hhh。给出一个$28\times 28$的方阵a,返回一个$lenb\times lenb$的方阵。因

This is the official repo for TransFill:  Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations at CVPR'21. According to some product reasons, we are not planning to release the training/testing codes and models. However, we will release the dataset and the scripts to prepare the dataset.
Releases(v1.0.0)
  • v1.0.0(Jan 7, 2022)

    Toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset with several common and useful features:

    • Choose between two different neural network architectures
    • Make architectures parametrizable
    • Read input arguments from config file or command line
      • (command line arguments override config file ones)
    • Download FashionMNIST dataset if not already downloaded
    • Monitor training progress on the terminal and/or with TensorBoard logs
      • Accuracy, loss, confusion matrix
    Source code(tar.gz)
    Source code(zip)
Owner
Ramón Casero
Ramón Casero
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations

AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-l

Facebook Research 4.6k Jan 09, 2023
Resco: A simple python package that report the effect of deep residual learning

resco Description resco is a simple python package that report the effect of dee

Pierre-Arthur Claudé 1 Jun 28, 2022
MonoRCNN is a monocular 3D object detection method for automonous driving

MonoRCNN MonoRCNN is a monocular 3D object detection method for automonous driving, published at ICCV 2021. This project is an implementation of MonoR

87 Dec 27, 2022
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch

Retrieval-Augmented Denoising Diffusion Probabilistic Models (wip) Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in P

Phil Wang 55 Jan 01, 2023
Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, as a standalone package for Pytorch

Triangle Multiplicative Module - Pytorch Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or c

Phil Wang 22 Oct 28, 2022
Pytorch implementation of the paper "Optimization as a Model for Few-Shot Learning"

Optimization as a Model for Few-Shot Learning This repo provides a Pytorch implementation for the Optimization as a Model for Few-Shot Learning paper.

Albert Berenguel Centeno 238 Jan 04, 2023
NOD: Taking a Closer Look at Detection under Extreme Low-Light Conditions with Night Object Detection Dataset

NOD (Night Object Detection) Dataset NOD: Taking a Closer Look at Detection under Extreme Low-Light Conditions with Night Object Detection Dataset, BM

Igor Morawski 17 Nov 05, 2022
moving object detection for satellite videos.

DSFNet: Dynamic and Static Fusion Network for Moving Object Detection in Satellite Videos Algorithm Introduction DSFNet: Dynamic and Static Fusion Net

xiaochao 39 Dec 16, 2022
A Factor Model for Persistence in Investment Manager Performance

Factor-Model-Manager-Performance A Factor Model for Persistence in Investment Manager Performance I apply methods and processes similar to those used

Omid Arhami 1 Dec 01, 2021
Code related to the manuscript "Averting A Crisis In Simulation-Based Inference"

Abstract We present extensive empirical evidence showing that current Bayesian simulation-based inference algorithms are inadequate for the falsificat

Montefiore Artificial Intelligence Research 3 Nov 14, 2022
This is an early in-development version of training CLIP models with hivemind.

A transformer that does not hog your GPU memory This is an early in-development codebase: if you want a stable and documented hivemind codebase, look

<a href=[email protected]"> 4 Nov 06, 2022
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback

CoSMo.pytorch Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung

Seung Min Lee 54 Dec 08, 2022
Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020

XDVioDet Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020. The proj

peng 64 Dec 12, 2022
Official Pytorch implementation of C3-GAN

Official pytorch implemenation of C3-GAN Contrastive Fine-grained Class Clustering via Generative Adversarial Networks [Paper] Authors: Yunji Kim, Jun

NAVER AI 114 Dec 02, 2022
iNAS: Integral NAS for Device-Aware Salient Object Detection

iNAS: Integral NAS for Device-Aware Salient Object Detection Introduction Integral search design (jointly consider backbone/head structures, design/de

顾宇超 77 Dec 02, 2022
A Pytorch implementation of the multi agent deep deterministic policy gradients (MADDPG) algorithm

Multi-Agent-Deep-Deterministic-Policy-Gradients A Pytorch implementation of the multi agent deep deterministic policy gradients(MADDPG) algorithm This

Phil Tabor 159 Dec 28, 2022
Main Results on ImageNet with Pretrained Models

This repository contains Pytorch evaluation code, training code and pretrained models for the following projects: SPACH (A Battle of Network Structure

Microsoft 151 Dec 14, 2022
This package contains deep learning models and related scripts for RoseTTAFold

RoseTTAFold This package contains deep learning models and related scripts to run RoseTTAFold This repository is the official implementation of RoseTT

1.6k Jan 03, 2023
Adabelief-Optimizer - Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"

AdaBelief Optimizer NeurIPS 2020 Spotlight, trains fast as Adam, generalizes well as SGD, and is stable to train GANs. Release of package We have rele

Juntang Zhuang 998 Dec 29, 2022