Convolutional networks using PyTorch
This is a complete training example for Deep Convolutional Networks on various datasets (ImageNet, Cifar10, Cifar100, MNIST).
Available models include:
'alexnet', 'amoebanet', 'darts', 'densenet', 'googlenet', 'inception_resnet_v2', 'inception_v2', 'mnist', 'mobilenet', 'mobilenet_v2', 'nasnet', 'resnet', 'resnet_se', 'resnet_zi', 'resnet_zi_se', 'resnext', 'resnext_se'
It is based on the ImageNet example in PyTorch, with helpful additions such as:
 Training on several datasets other than imagenet
 Complete logging of trained experiment
 Graph visualization of the training/validation loss and accuracy
 Definition of preprocessing and optimization regime for each model
 Distributed training
To clone:
git clone --recursive https://github.com/eladhoffer/convNet.pytorch
Example of efficient multi-GPU training of ResNet50 (4 GPUs, label smoothing):
python -m torch.distributed.launch --nproc_per_node=4 main.py --model resnet --model-config "{'depth': 50}" --eval-batch-size 512 --save resnet50_ls --label-smoothing 0.1
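As a rough illustration of what a label smoothing value of 0.1 does to the loss, here is a minimal pure-Python sketch. It assumes the common convention of spreading the smoothing mass uniformly over all classes; the function names are hypothetical and are not taken from this repository.

```python
import math


def smoothed_targets(label, num_classes, smoothing=0.1):
    """Return the smoothed target distribution for one integer label."""
    uniform = smoothing / num_classes
    targets = [uniform] * num_classes
    targets[label] += 1.0 - smoothing
    return targets


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def label_smoothing_loss(logits, label, smoothing=0.1):
    """Cross-entropy between the smoothed targets and softmax(logits)."""
    targets = smoothed_targets(label, len(logits), smoothing)
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


# With smoothing=0.0 this reduces to the usual cross-entropy.
example_logits = [2.0, 0.5, -1.0, 0.0]
plain_loss = label_smoothing_loss(example_logits, label=0, smoothing=0.0)
smooth_loss = label_smoothing_loss(example_logits, label=0, smoothing=0.1)
```

When the true class already has the largest logit, the smoothed loss is larger than the plain one, which discourages over-confident predictions.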
This code can be used to implement several recent papers:

Hoffer et al. (2018): Fix your classifier: the marginal value of training the last weight layer

Hoffer et al. (2018): Norm matters: efficient and accurate normalization schemes in deep networks
For example, training ResNet18 with L1 norm (instead of batchnorm):
python main.py --model resnet --model-config "{'depth': 18, 'bn_norm': 'L1'}" --save resnet18_l1 -b 128
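To give an intuition for L1 normalization, the sketch below normalizes a batch of scalars by the mean absolute deviation rather than the standard deviation. It is a simplified illustration, not this repository's layer: it works on a flat list, omits the learnable scale and shift, and the function name is hypothetical. The sqrt(pi / 2) factor makes the L1 statistic match the standard deviation for Gaussian inputs.

```python
import math


def l1_batch_norm(values, eps=1e-5):
    """Normalize a batch of scalars using the L1 deviation instead of the std."""
    n = len(values)
    mean = sum(values) / n
    # Mean absolute deviation from the batch mean (the L1 statistic).
    mean_abs_dev = sum(abs(v - mean) for v in values) / n
    # For Gaussian data, std = sqrt(pi / 2) * E|x - mean|.
    scale = math.sqrt(math.pi / 2.0) * mean_abs_dev
    return [(v - mean) / (scale + eps) for v in values]


normalized = l1_batch_norm([1.0, 2.0, 3.0, 4.0])
```

Because only absolute values and sums are needed, this avoids the squares and square roots of standard batch normalization.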

Banner et al. (2018): Scalable Methods for 8-bit Training of Neural Networks
For example, training ResNet18 with 8-bit quantization:
python main.py --model resnet --model-config "{'depth': 18, 'quantize':True}" --save resnet18_8bit -b 64
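For readers new to quantization, here is a generic min-max uniform quantizer to 8 bits. This is only a sketch of the basic idea of mapping floats to 256 integer levels; it is not the scheme from the paper or the code behind the `quantize` option, and the function names are hypothetical.

```python
def quantize_uint8(values):
    """Map floats to integer codes in [0, 255] with a min-max affine scheme."""
    lo, hi = min(values), max(values)
    # Fall back to a scale of 1.0 when all values are equal.
    scale = (hi - lo) / 255.0 or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float values from integer codes."""
    return [c * scale + lo for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`).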

Hoffer et al. (2020): Augment Your Batch: Improving Generalization Through Instance Repetition
For example, training the ResNet44 + cutout example from the paper:
python main.py --dataset cifar10 --model resnet --model-config "{'depth': 44}" --duplicates 40 --cutout -b 64 --epochs 100 --save resnet44_cutout_m40
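The idea behind instance repetition can be sketched as follows: every sample in a batch is repeated several times, and each copy gets its own random augmentation. The code below is an illustrative assumption, not the repository's data pipeline. Images are plain 2D lists, the cutout square is always kept fully inside the image for simplicity, and all names are hypothetical.

```python
import random


def cutout(image, size, rng):
    """Return a copy of a 2D image (list of rows) with a size x size square zeroed."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    out = [row[:] for row in image]  # copy so the input stays untouched
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = 0.0
    return out


def augment_batch(images, labels, duplicates, size, seed=0):
    """Repeat every (image, label) pair `duplicates` times, each with its own cutout."""
    rng = random.Random(seed)
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        for _ in range(duplicates):
            out_images.append(cutout(image, size, rng))
            out_labels.append(label)
    return out_images, out_labels


batch = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
aug_images, aug_labels = augment_batch(batch, [3, 7], duplicates=5, size=2)
```

The augmented batch holds `duplicates` times as many samples as the input batch, with labels repeated accordingly.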

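Hoffer et al. (2019): Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency
For example, training the ResNet44 with mixed sizes example from the paper: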
python main.py --model resnet --dataset cifar10 --save cifar10_mixsize_d -b 64 --model-config "{'regime': 'sampled_D+'}" --epochs 200
Then, calibrate for a specific input size and evaluate using:
python evaluate.py ./results/cifar10_mixsize_d/checkpoint.pth.tar --dataset cifar10 -b 64 --input-size 32 --calibrate-bn
Pretrained models (ResNet50, ImageNet) are also available here
Dependencies
 pytorch
 torchvision to load the datasets, perform image transforms
 pandas for logging to csv
 bokeh for training visualization
Data
 Configure your dataset path with the --datasets-dir argument
 To get the ILSVRC data, you should register on their site for access: http://www.image-net.org/
Model configuration
A network model is defined by writing a .py file in the models folder and selecting it using the --model flag. The model function must be registered in models/__init__.py
The model function must return a trainable network. It can also specify additional training options, such as an optimization regime (either a dictionary or a function) and input transform modifications.
For example, a model definition:
import torch.nn as nn


class Model(nn.Module):

    def __init__(self, num_classes=1000):
        super(Model, self).__init__()
        self.model = nn.Sequential(...)  # network layers go here

        # Optimization regime: each entry applies from its 'epoch' onward
        self.regime = [
            {'epoch': 0, 'optimizer': 'SGD', 'lr': 1e-2,
             'weight_decay': 5e-4, 'momentum': 0.9},
            {'epoch': 15, 'lr': 1e-3, 'weight_decay': 0}
        ]

        # Data regime: input size and batch size per training phase
        self.data_regime = [
            {'epoch': 0, 'input_size': 128, 'batch_size': 256},
            {'epoch': 15, 'input_size': 224, 'batch_size': 64}
        ]

    def forward(self, inputs):
        return self.model(inputs)


def model(**kwargs):
    return Model()
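To make the regime lists more concrete, the sketch below shows one plausible way such an epoch-keyed list could be resolved into the settings active at a given epoch. It assumes that later entries override earlier ones and that unspecified keys carry over. The helper `settings_at_epoch` is hypothetical and is not the repository's implementation.

```python
def settings_at_epoch(regime, epoch):
    """Merge all regime entries whose 'epoch' is <= the given epoch, in order."""
    settings = {}
    for step in sorted(regime, key=lambda s: s['epoch']):
        if step['epoch'] <= epoch:
            # Later entries override earlier values; other keys carry over.
            settings.update({k: v for k, v in step.items() if k != 'epoch'})
    return settings


regime = [
    {'epoch': 0, 'optimizer': 'SGD', 'lr': 1e-2,
     'weight_decay': 5e-4, 'momentum': 0.9},
    {'epoch': 15, 'lr': 1e-3, 'weight_decay': 0},
]

early = settings_at_epoch(regime, 3)
late = settings_at_epoch(regime, 20)
```

Under these assumptions, `late` keeps the SGD optimizer and momentum from the first entry while taking the lower learning rate and zero weight decay from the second.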
Citation
If you use the code in your paper, consider citing one of the implemented works.
@inproceedings{hoffer2018fix,
title={Fix your classifier: the marginal value of training the last weight layer},
author={Elad Hoffer and Itay Hubara and Daniel Soudry},
booktitle={International Conference on Learning Representations},
year={2018},
url={https://openreview.net/forum?id=S1Dh8Tg0-},
}
@inproceedings{hoffer2018norm,
title={Norm matters: efficient and accurate normalization schemes in deep networks},
author={Hoffer, Elad and Banner, Ron and Golan, Itay and Soudry, Daniel},
booktitle={Advances in Neural Information Processing Systems},
year={2018}
}
@inproceedings{banner2018scalable,
title={Scalable Methods for 8-bit Training of Neural Networks},
author={Banner, Ron and Hubara, Itay and Hoffer, Elad and Soudry, Daniel},
booktitle={Advances in Neural Information Processing Systems},
year={2018}
}
@inproceedings{Hoffer_2020_CVPR,
author = {Hoffer, Elad and Ben-Nun, Tal and Hubara, Itay and Giladi, Niv and Hoefler, Torsten and Soudry, Daniel},
title = {Augment Your Batch: Improving Generalization Through Instance Repetition},
booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
@article{hoffer2019mix,
title={Mix \& Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency},
author={Hoffer, Elad and Weinstein, Berry and Hubara, Itay and Ben-Nun, Tal and Hoefler, Torsten and Soudry, Daniel},
journal={arXiv preprint arXiv:1908.08986},
year={2019}
}