Hyperopt for solving CIFAR-100 with a convolutional neural network (CNN) built with Keras and TensorFlow, GPU backend

Overview

Hyperopt for solving CIFAR-100 with a convolutional neural network (CNN) built with Keras and TensorFlow, GPU backend

This project acts as both a tutorial and a demo to using Hyperopt with Keras, TensorFlow and TensorBoard. Not only we try to find the best hyperparameters for the given hyperspace, but also we represent the neural network architecture as hyperparameters that can be tuned. This automates the process of searching for the best neural architecture configuration and hyperparameters.

Here, we are meta-optimizing a neural net and its architecture on the CIFAR-100 dataset (100 fine labels), a computer vision task. This code could be easily transferred to another vision dataset or even to another machine learning task.

How Hyperopt works

First off, to learn how hyperopt works and what it is for, read the hyperopt tutorial.

Meta-optimize the neural network with Hyperopt

To run the hyperparameter search vy yourself, do: python3 hyperopt_optimize.py. You might want to look at requirements.py and install some of them manually to acquire GPU acceleration (e.g.: installing TensorFlow and Keras especially by yourself).

Optimization results will continuously be saved in the results/ folder (sort files to take best result as human-readable text). Also, the results are pickled to results.pkl to be able to resume the TPE meta-optimization process later simply by running the program again with python3 hyperopt_optimize.py.

If you want to learn more about Hyperopt, you'll probably want to watch that video made by the creator of Hyperopt. Also, if you want to run the model on the CIFAR-10 dataset, you must edit the file neural_net.py.

It is possible that you get better results than there are already here. Pull requests / contributions are welcome. Suggestion: trying many different initializers for the layers would be an interesting thing to try. Adding SELU activations would be interesting too. To restart the training with new or removed hyperparameters, it is recommended to delete existing results with ./delete_results.sh.

The Deep Convolutional Neural Network Model

Here is a basic overview of the model. I implemented it in such a way that Hyperopt will try to change the shape of the layers and remove or replace some of them according to some pre-parametrized ideas that I have got. Therefore, not only the learning rate is changed with hyperopt, but a lot more parameters.

Analysis of the hyperparameters

Here is an analysis of the results regarding the effect of every hyperparameters. Here is an excerpt:

This could help to redefine the hyperparameters and to narrow them down successively, relaunching the meta-optimization on refined spaces.

Best result

The best model is this one: results/model_0.676100010872_6066e.txt.json.

The final accuracy is of 67.61% in average on the 100 fine labels, and is of 77.31% in average on the 20 coarse labels. My results are comparable to the ones in the middle of that list, under the CIFAR-100 section. The only image preprocessing that I do is a random flip left-right.

Best hyperspace found:

space_best_model = {
    "coarse_best_accuracy": 0.7731000242233277,
    "coarse_best_loss": 0.8012041954994201,
    "coarse_end_accuracy": 0.7565,
    "coarse_end_loss": 0.9019438380718231,
    "fine_best_accuracy": 0.6761000108718872,
    "fine_best_loss": 1.3936876878738402,
    "fine_end_accuracy": 0.6549,
    "fine_end_loss": 1.539645684337616,
    "history": {...},
    "loss": -0.6761000108718872,
    "model_name": "model_0.676100010872_6066e",
    "real_loss": 3.018656848526001,
    "space": {
        "activation": "elu",
        "batch_size": 320.0,
        "coarse_labels_weight": 0.3067103474295116,
        "conv_dropout_drop_proba": 0.25923531175521264,
        "conv_hiddn_units_mult": 1.5958302613876916,
        "conv_kernel_size": 3.0,
        "conv_pool_res_start_idx": 0.0,
        "fc_dropout_drop_proba": 0.4322253354921089,
        "fc_units_1_mult": 1.3083964454436132,
        "first_conv": 3,
        "l2_weight_reg_mult": 0.41206755600055983,
        "lr_rate_mult": 0.6549347353077412,
        "nb_conv_pool_layers": 3,
        "one_more_fc": null,
        "optimizer": "Nadam",
        "pooling_type": "avg",
        "res_conv_kernel_size": 2.0,
        "residual": 3.0,
        "use_BN": true
    },
    "status": "ok"
}

Plotting this best hyperspace's model:

TensorBoard

TensorBoard can be used to inspect the best result (or all results in case you retrain and edit the code to enable TensorBoard on everything.)

It is possible to run python3 retrain_best_with_tensorboard.py to retrain the model and save TensorBoard logs, as well as saving the weights at their best state during training for a potential reuse. The instructions to run TensorBoard will be printed in the console at the end of the retraining.

Every training's TensorBoard log will be in a new folder under the "TensorBoard/" directory with an unique name (the model ID).

Here is the command to run TensorBoard once located in the root directory of the project:

tensorboard --logdir=TensorBoard/

Logs for the best model can be downloaded manually (approximately 7 GB). Refer to the text file under the folder TensorBoard for directions on how to download the logs from Google Drive before running the TensorBoard client with the tensorboard --logdir=TensorBoard/ command.

Just as an example, here is what can be seen in TensorBoard for the histograms related to the first convolutional layer, conv2d_1:

It suggests that better weights and biases initialization schemes could be used.

It is also possible to see in TensorBoard more statistics and things, such as the distribution tab, the graphs tab, and the the scalars tab. See printscreens of all the statistics available under the TensorBoard/previews/ folder of this project.

Visualizing what activates certain filters

We use the method of gradient ascent in the input space. This consists of generating images that activate certain filters in layers. This consists of using a loss on the filters' activation to then derive and apply gradients in the input space to gradually form input images that activate the given filters maximally. This is done for each filter separately.

To run the visualization, one must edit conv_filters_visualization.py to make it load the good weights (in case a retraining was done) and then run python3 conv_filters_visualization.py. The images for layers will be seen under the folder layers/ of this project.

Here is an example for a low level layer, the one named add_1:

License

The MIT License (MIT)

Copyright (c) 2017 Vooban Inc.

For more information on sublicensing and the use of other parts of open-source code, see: https://github.com/Vooban/Hyperopt-Keras-CNN-CIFAR-100/blob/master/LICENSE

Owner
Guillaume Chevalier
e^(πi) + 1 = 0
Guillaume Chevalier
Selfplay In MultiPlayer Environments

This project allows you to train AI agents on custom-built multiplayer environments, through self-play reinforcement learning.

200 Jan 08, 2023
[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars Fangzhou Hong1*  Mingyuan Zhang1*  Liang Pan1  Zhongang Cai1,2,3  Lei Yang2 

Fangzhou Hong 749 Jan 04, 2023
🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

Realcat 270 Jan 07, 2023
MoCap-Solver: A Neural Solver for Optical Motion Capture Data

MoCap-Solver is a data-driven-based robust marker denoising method, which takes raw mocap markers as input and outputs corresponding clean markers and skeleton motions.

55 Dec 28, 2022
For IBM Quantum Challenge Africa 2021, 9 September (07:00 UTC) - 20 September (23:00 UTC).

IBM Quantum Challenge Africa 2021 To ensure Africa is able to apply quantum computing to solve problems relevant to the continent, the IBM Research La

Qiskit Community 48 Dec 25, 2022
torchsummaryDynamic: support real FLOPs calculation of dynamic network or user-custom PyTorch ops

torchsummaryDynamic Improved tool of torchsummaryX. torchsummaryDynamic support real FLOPs calculation of dynamic network or user-custom PyTorch ops.

Bohong Chen 1 Jan 07, 2022
Pytorch implementation of the paper Progressive Growing of Points with Tree-structured Generators (BMVC 2021)

PGpoints Pytorch implementation of the paper Progressive Growing of Points with Tree-structured Generators (BMVC 2021) Hyeontae Son, Young Min Kim Pre

Hyeontae Son 9 Jun 06, 2022
Code and Experiments for ACL-IJCNLP 2021 Paper Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering.

Code and Experiments for ACL-IJCNLP 2021 Paper Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering.

Sidd Karamcheti 50 Nov 16, 2022
PyTorch implementation of Pay Attention to MLPs

gMLP PyTorch implementation of Pay Attention to MLPs. Quickstart Clone this repository. git clone https://github.com/jaketae/g-mlp.git Navigate to th

Jake Tae 34 Dec 13, 2022
Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation

VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation Created by Zeyu HU Introduction This work is based on our paper VMNet: Voxel-Mes

HU Zeyu 82 Dec 27, 2022
Human Dynamics from Monocular Video with Dynamic Camera Movements

Human Dynamics from Monocular Video with Dynamic Camera Movements Ri Yu, Hwangpil Park and Jehee Lee Seoul National University ACM Transactions on Gra

215 Jan 01, 2023
Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning"

VANET Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning" Introduction This is the implementation of article VAN

EMDATA-AILAB 23 Dec 26, 2022
Repo for "Physion: Evaluating Physical Prediction from Vision in Humans and Machines" submission to NeurIPS 2021 (Datasets & Benchmarks track)

Physion: Evaluating Physical Prediction from Vision in Humans and Machines This repo contains code and data to reproduce the results in our paper, Phy

Cognitive Tools Lab 38 Jan 06, 2023
Normalization Calibration (NorCal) for Long-Tailed Object Detection and Instance Segmentation

NorCal Normalization Calibration (NorCal) for Long-Tailed Object Detection and Instance Segmentation On Model Calibration for Long-Tailed Object Detec

Tai-Yu (Daniel) Pan 24 Dec 25, 2022
PyTorch implementation for the ICLR 2020 paper "Understanding the Limitations of Variational Mutual Information Estimators"

Smoothed Mutual Information ``Lower Bound'' Estimator PyTorch implementation for the ICLR 2020 paper Understanding the Limitations of Variational Mutu

50 Nov 09, 2022
A NSFW content filter.

Project_Nfilter A NSFW content filter. With a motive of minimizing the spreads and leakage of NSFW contents on internet and access to others devices ,

1 Jan 20, 2022
Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Oriented RepPoints for Aerial Object Detection The code for the implementation of “Oriented RepPoints + Swin Transformer/ReResNet”. Introduction Based

96 Dec 13, 2022
x-transformers-paddle 2.x version

x-transformers-paddle x-transformers-paddle 2.x version paddle 2.x版本 https://github.com/lucidrains/x-transformers 。 requirements paddlepaddle-gpu==2.2

yujun 7 Dec 08, 2022
pytorch implementation of dftd2 & dftd3

torch-dftd pytorch implementation of dftd2 [1] & dftd3 [2, 3] Install # Install from pypi pip install torch-dftd # Install from source (for developer

33 Nov 28, 2022
Tutorial page of the Climate Hack, the greatest hackathon ever

Tutorial page of the Climate Hack, the greatest hackathon ever

UCL Artificial Intelligence Society 12 Jul 02, 2022