Deep Residual Networks with 1K Layers

By Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

Microsoft Research Asia (MSRA).

Introduction
Notes
Usage

Introduction

This repository contains re-implemented code for the paper "Identity Mappings in Deep Residual Networks" (http://arxiv.org/abs/1603.05027). This work enables training quality 1k-layer neural networks in a super simple way.

Acknowledgement: This code is re-implemented by Xiang Ming from Xi'an Jiaotong Univeristy for the ease of release.

Seel Also: Re-implementations of ResNet-200 [a] on ImageNet from Facebook AI Research (FAIR): https://github.com/facebook/fb.resnet.torch/tree/master/pretrained

Notes

This code is based on the implementation of Torch ResNets (https://github.com/facebook/fb.resnet.torch).
The experiments in the paper were conducted in Caffe, whereas this code is re-implemented in Torch. We observed similar results within reasonable statistical variations.
To fit the 1k-layer models into memory without modifying much code, we simply reduced the mini-batch size to 64, noting that results in the paper were obtained with a mini-batch size of 128. Less expectedly, the results with the mini-batch size of 64 are slightly better:

mini-batch CIFAR-10 test error (%): (median (mean+/-std))

128 (as in [a]) 4.92 (4.89+/-0.14)

64 (as in this code) 4.62 (4.69+/-0.20)
Curves obtained by running this code with a mini-batch size of 64 (training loss: y-axis on the left; test error: y-axis on the right):

mini-batch	CIFAR-10 test error (%): (median (mean+/-std))
128 (as in [a])	4.92 (4.89+/-0.14)
64 (as in this code)	4.62 (4.69+/-0.20)

Usage

Install Torch ResNets (https://github.com/facebook/fb.resnet.torch) following instructions therein.
Add the file resnet-pre-act.lua from this repository to ./models.
To train ResNet-1001 as of the form in [a]:

th main.lua -netType resnet-pre-act -depth 1001 -batchSize 64 -nGPU 2 -nThreads 4 -dataset cifar10 -nEpochs 200 -shareGradInput false

Note: ``shareGradInput=true'' is not valid for this model yet.

Deep Residual Networks with 1K Layers

Related tags

Overview

Deep Residual Networks with 1K Layers

Table of Contents

Introduction

Notes

Usage

Owner

Kaiming He

My tensorflow implementation of "A neural conversational model", a Deep learning based chatbot

Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

LWCC: A LightWeight Crowd Counting library for Python that includes several pretrained state-of-the-art models.

fklearn: Functional Machine Learning

Bravia core script for python

Repository of continual learning papers

A library for finding knowledge neurons in pretrained transformer models.

ML-Decoder: Scalable and Versatile Classification Head

BabelCalib: A Universal Approach to Calibrating Central Cameras. In ICCV (2021)

Discretized Integrated Gradients for Explaining Language Models (EMNLP 2021)

A simple rest api that classifies pneumonia infection weather it is Normal, Pneumonia Virus or Pneumonia Bacteria from a chest-x-ray image.

Sequential GCN for Active Learning

RLBot Python bindings for the Rust crate rl_ball_sym

Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection

Cross-Document Coreference Resolution

Code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition"

Joint Detection and Identification Feature Learning for Person Search

Unofficial TensorFlow implementation of the Keyword Spotting Transformer model

Yolox-bytetrack-sample - Python sample of MOT (Multiple Object Tracking) using YOLOX and ByteTrack

OpenMMLab Detection Toolbox and Benchmark