DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks

Overview


What is DeepHyper?

DeepHyper is a software package that uses learning, optimization, and parallel computing to automate the design and development of machine learning (ML) models for scientific and engineering applications. DeepHyper lowers the barrier to entry for AI/ML model development by reducing the manually intensive trial-and-error effort of building predictive models. The package performs four key functions:

  1. pipeline optimization for ML (DeepHyper/POPT)
  2. neural architecture search (DeepHyper/NAS)
  3. hyperparameter search (DeepHyper/HPS)
  4. ensemble uncertainty quantification (DeepHyper/AutoDEUQ)

Pipeline optimization for ML (DeepHyper/POPT)

Predictive modeling with classical ML methods typically requires a pipeline of methods such as data preprocessing, data balancing, data splitting, variable importance analysis, variable selection, classification/regression algorithm selection, and cross-validation methods. Because of the myriad choices available for each method, employing an effective ML pipeline is beyond most scientists and engineers; therefore, they tend to resort to rules of thumb, often resulting in non-robust models. DeepHyper provides an interface to model the search space of the pipeline. It uses an intelligent search algorithm that samples a small number of pipeline configurations and progressively fits a surrogate model over the configuration-performance space until it exhausts the user-defined maximum number of evaluations. The asynchronous aspect allows the search to avoid waiting for all the evaluation results before proceeding to the next iteration. When an evaluation is finished, the data are used to retrain the surrogate model, which is then used to bias the search toward the promising configurations. The framework uses a master/worker computational paradigm, where one master node fits the surrogate model and generates promising pipeline configurations and worker nodes perform the computationally expensive evaluations and return the outputs to the master node.
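
To make this concrete, here is a minimal, illustrative sketch (not the DeepHyper/POPT module itself) of how a small scikit-learn pipeline could be expressed as a search space and evaluated with the same HpProblem/run-function pattern shown in the Quickstart below; the dataset and the pipeline choices are placeholders.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

from deephyper.problem import HpProblem

problem = HpProblem()
problem.add_hyperparameter(["std", "minmax"], "scaler")  # preprocessing choice
problem.add_hyperparameter(["logreg", "rf"], "clf")      # classifier choice
problem.add_hyperparameter((10, 200), "n_estimators")    # only used by "rf"

def run(config: dict):
    X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset
    scaler = StandardScaler() if config["scaler"] == "std" else MinMaxScaler()
    if config["clf"] == "logreg":
        clf = LogisticRegression(max_iter=1000)
    else:
        clf = RandomForestClassifier(n_estimators=config["n_estimators"])
    # the returned cross-validation score is the objective maximized by the search
    return cross_val_score(make_pipeline(scaler, clf), X, y, cv=3).mean()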

Hyperparameter search (DeepHyper/HPS)

ML methods used for predictive modeling typically require user-specified values for hyperparameters, which include the number of hidden layers and units per layer, sparsity/overfitting regularization parameters, batch size, learning rate, type of initialization, optimizer, and activation function specification. Traditionally, to find performance-optimizing hyperparameter settings, researchers have used a trial-and-error process or a brute-force grid/random search. Such approaches lead to far-from-optimal performance, however, or are impractical for addressing large numbers of hyperparameters. DeepHyper provides a set of scalable hyperparameter search methods for automatically searching for high-performing hyperparameters for a given DNN architecture. DeepHyper uses an asynchronous model-based search that relies on fitting a dynamically updated surrogate model that tries to learn the relationship between the hyperparameter configurations (input) and their validation errors (output). The surrogate model is cheap to evaluate and can be used to prune the search space and identify promising regions, where the model then is iteratively refined by obtaining new outputs at inputs that are predicted by the model to be high-performing.
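
As a rough sketch of what such a hyperparameter search space can look like, the snippet below uses the HpProblem API from the Quickstart; the log-uniform tuple syntax is assumed from the release notes, and train_and_evaluate is a hypothetical user-defined helper standing in for the actual training code.

from deephyper.problem import HpProblem

problem = HpProblem()
problem.add_hyperparameter((8, 128), "units")                      # units per hidden layer
problem.add_hyperparameter((1, 8), "num_layers")                   # number of hidden layers
problem.add_hyperparameter((1e-4, 1e-1, "log-uniform"), "lr")      # learning rate
problem.add_hyperparameter((8, 256, "log-uniform"), "batch_size")  # batch size
problem.add_hyperparameter(["relu", "tanh", "sigmoid"], "activation")

def run(config: dict):
    # placeholder: train a DNN with `config` and return the validation accuracy
    return train_and_evaluate(config)  # hypothetical user-defined helper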

Neural architecture search (DeepHyper/NAS)

Scientific data sets are diverse and often require data-set-specific DNN models. Nevertheless, designing high-performing DNN architecture for a given data set is an expert-driven, time-consuming, trial-and-error manual task. To that end, DeepHyper provides a NAS for automatically identifying high-performing DNN architectures for a given set of training data. DeepHyper adopts an evolutionary algorithm that generates a population of DNN architectures, trains them concurrently by using multiple nodes, and improves the population by performing mutations on the existing architectures within a population. To reduce the training time of each architecture evaluation, DeepHyper adopts a distributed data-parallel training technique, splitting the training data and distributing the shards to multiple processing units. Multiple models with the same architecture are trained on different data shards, and the gradients from each model are averaged and used to update the weights of all the models. To maintain accuracy and reduce training time, DeepHyper combines aging evolution and an asynchronous Bayesian optimization method for tuning the hyperparameters of the data-parallel training simultaneously.
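
The aging-evolution loop itself is simple; the toy sketch below illustrates the idea (it is not the DeepHyper implementation): architectures are encoded as lists of operation indices, a random subset competes to become the parent, one node of the parent is mutated, and the oldest member of the population is evicted.

import random

def aging_evolution(evaluate, num_ops=5, arch_len=8, population_size=20, steps=100, sample_size=5):
    # population is a FIFO queue of (architecture, score) pairs, oldest first
    population = []
    for _ in range(population_size):
        arch = [random.randrange(num_ops) for _ in range(arch_len)]
        population.append((arch, evaluate(arch)))
    for _ in range(steps):
        candidates = random.sample(population, sample_size)
        parent, _ = max(candidates, key=lambda p: p[1])
        child = parent[:]
        i = random.randrange(arch_len)
        child[i] = random.choice([op for op in range(num_ops) if op != parent[i]])  # mutate one node
        population.append((child, evaluate(child)))
        population.pop(0)  # aging: evict the oldest architecture
    return max(population, key=lambda p: p[1])

# example with a dummy objective (higher is better)
best_arch, best_score = aging_evolution(lambda arch: -sum(arch))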

Ensemble uncertainty quantification (DeepHyper/AutoDEUQ)

Uncertainty quantification in DNN predictions is of paramount importance for confident scientific utilization of DL. Prediction with uncertainty quantification is vital when DNN learning deployments are performed on unseen datasets that may not be from the distribution of the training data. In such cases, confidence estimates are essential for deciding when to discard predictions from neural networks, because of their proclivity to extrapolation. More importantly, DeepHyper/AutoDEUQ sheds light on learning systems that are frequently dismissed as black-box within the scientific community, and it paves the way for greater model trustworthiness. To that end, DeepHyper/AutoDEUQ employs a scalable deep-ensemble approach for uncertainty quantification. The approach involves constructing several models with varying architectures, independently, on the training dataset. Parallel independent runs of DeepHyper/NAS are used, wherein each NAS run starts with different initialization and randomization of datasets. The best model candidates suggested by the parallel search are then utilized in an ensemble setting for quantifying the model uncertainty. Furthermore, each generated model can be configured to use various data likelihood options. The quantiles of these function values can be used to compute calibrated prediction intervals to capture data uncertainty.
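
As a sketch of how such an ensemble's predictions can be combined, assuming each member predicts a mean and a variance (a Gaussian data likelihood), the snippet below shows the standard mixture decomposition into aleatoric and epistemic parts; it is generic NumPy code, not the DeepHyper/AutoDEUQ API.

import numpy as np

def ensemble_uq(means, variances):
    """means, variances: arrays of shape (n_models, n_samples)."""
    means, variances = np.asarray(means), np.asarray(variances)
    mixture_mean = means.mean(axis=0)
    aleatoric = variances.mean(axis=0)  # average predicted data noise
    epistemic = means.var(axis=0)       # disagreement between ensemble members
    return mixture_mean, aleatoric + epistemic, aleatoric, epistemic

# example with 3 ensemble members and 4 test points
mu, total_var, alea, epis = ensemble_uq(
    means=[[0.9, 1.1, 2.0, 0.5], [1.0, 1.0, 2.2, 0.4], [1.1, 0.9, 1.9, 0.6]],
    variances=[[0.01, 0.02, 0.05, 0.01]] * 3,
)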

Install instructions

From PyPI:

pip install deephyper

From GitHub:

git clone https://github.com/deephyper/deephyper.git
pip install -e deephyper/

If you want to install DeepHyper with the test and documentation packages:

From PyPI:

pip install 'deephyper[dev]'

From GitHub:

git clone https://github.com/deephyper/deephyper.git
pip install -e 'deephyper/[dev]'

Quickstart


The black-box function named run is defined by taking an input dictionary named config which contains the different variables to optimize. The run function is then bound to an Evaluator in charge of distributing the computation of multiple evaluations. Finally, a Bayesian search named AMBS is created and executed to find the values of config which maximize the return value of run(config).

def run(config: dict):
    return -config["x"]**2


# This if statement is necessary; otherwise the script enters an infinite loop
# when loading the 'run' function from a subprocess
if __name__ == "__main__":
    from deephyper.problem import HpProblem
    from deephyper.search.hps import AMBS
    from deephyper.evaluator import Evaluator

    # define the variable you want to optimize
    problem = HpProblem()
    problem.add_hyperparameter((-10.0, 10.0), "x")

    # define the evaluator to distribute the computation
    evaluator = Evaluator.create(
        run,
        method="subprocess",
        method_kwargs={
            "num_workers": 2,
        },
    )

    # define your search and execute it
    search = AMBS(problem, evaluator)

    results = search.search(max_evals=100)
    print(results)

This outputs the following results, where the best x found is clearly around 0.

            x  id  objective  elapsed_sec  duration
0   1.667375   1  -2.780140     0.124388  0.071422
1   9.382053   2 -88.022911     0.124440  0.071465
2   0.247856   3  -0.061433     0.264603  0.030261
3   5.237798   4 -27.434527     0.345482  0.111113
4   5.168073   5 -26.708983     0.514158  0.175257
..       ...  ..        ...          ...       ...
94  0.024265  95  -0.000589     9.261396  0.117477
95 -0.055000  96  -0.003025     9.367814  0.113984
96 -0.062223  97  -0.003872     9.461532  0.101337
97 -0.016222  98  -0.000263     9.551584  0.096401
98  0.009660  99  -0.000093     9.638016  0.092450
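
Since results is printed like a pandas DataFrame above, the best configuration can be recovered by sorting on the objective column, for example:

best = results.sort_values("objective", ascending=False).iloc[0]
print(best["x"], best["objective"])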

How do I learn more?

Who is responsible?

Currently, the core DeepHyper team is at Argonne National Laboratory (do not hesitate to reach out if we forgot you in the list!):

Modules, patches (code, documentation, etc.) contributed by:

Citing DeepHyper

Find all our publications on the Research & Publication page of the Documentation.

How can I participate?

Questions, comments, feature requests, bug reports, etc. can be directed to:

  • Issues on GitHub

Patches submitted through pull requests are much appreciated, for both the software itself and the documentation. Optionally, please include in your first patch a credit for yourself in the list above.

The DeepHyper team uses git-flow to organize development (see the Git-Flow cheatsheet). For tests, we use Pytest.

Acknowledgements

  • Scalable Data-Efficient Learning for Scientific Domains, U.S. Department of Energy 2018 Early Career Award funded by the Advanced Scientific Computing Research program within the DOE Office of Science (2018--Present)
  • Argonne Leadership Computing Facility: This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
  • SLIK-D: Scalable Machine Learning Infrastructures for Knowledge Discovery, Argonne Computing, Environment and Life Sciences (CELS) Laboratory Directed Research and Development (LDRD) Program (2016--2018)

Copyright and license

Copyright © 2019, UChicago Argonne, LLC

DeepHyper is distributed under the terms of BSD License. See LICENSE

Argonne Patent & Intellectual Property File Number: SF-19-007

Comments
  • could not install the deephyper.

    I am installing DeepHyper with pip install deephyper, but I am getting an error:

    Running setup.py clean for yarl
     Building wheel for multidict (PEP 517) ... error
     Complete output from command /usr/bin/python3 /usr/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py build_wheel /tmp/tmp0rmzbrl6:
     *********************
     * Accelerated build *
     *********************
     running bdist_wheel
     running build
     running build_py
     creating build
     creating build/lib.linux-x86_64-3.6
     creating build/lib.linux-x86_64-3.6/multidict
     copying multidict/_compat.py -> build/lib.linux-x86_64-3.6/multidict
     copying multidict/__init__.py -> build/lib.linux-x86_64-3.6/multidict
     copying multidict/_multidict_py.py -> build/lib.linux-x86_64-3.6/multidict
     copying multidict/_abc.py -> build/lib.linux-x86_64-3.6/multidict
     copying multidict/_multidict_base.py -> build/lib.linux-x86_64-3.6/multidict
     running egg_info
     writing multidict.egg-info/PKG-INFO
     writing dependency_links to multidict.egg-info/dependency_links.txt
     writing top-level names to multidict.egg-info/top_level.txt
     adding license file 'LICENSE' (matched pattern 'LICENSE')
     reading manifest file 'multidict.egg-info/SOURCES.txt'
     reading manifest template 'MANIFEST.in'
     warning: no previously-included files matching '*.pyc' found anywhere in distribution
     warning: no previously-included files found matching 'multidict/_multidict.html'
     warning: no previously-included files found matching 'multidict/*.so'
     warning: no previously-included files found matching 'multidict/*.pyd'
     warning: no previously-included files found matching 'multidict/*.pyd'
     no previously-included directories found matching 'docs/_build'
     writing manifest file 'multidict.egg-info/SOURCES.txt'
     copying multidict/__init__.pyi -> build/lib.linux-x86_64-3.6/multidict
     copying multidict/_multidict.c -> build/lib.linux-x86_64-3.6/multidict
     copying multidict/py.typed -> build/lib.linux-x86_64-3.6/multidict
     creating build/lib.linux-x86_64-3.6/multidict/_multilib
     copying multidict/_multilib/defs.h -> build/lib.linux-x86_64-3.6/multidict/_multilib
     copying multidict/_multilib/dict.h -> build/lib.linux-x86_64-3.6/multidict/_multilib
     copying multidict/_multilib/istr.h -> build/lib.linux-x86_64-3.6/multidict/_multilib
     copying multidict/_multilib/iter.h -> build/lib.linux-x86_64-3.6/multidict/_multilib
     copying multidict/_multilib/pair_list.h -> build/lib.linux-x86_64-3.6/multidict/_multilib
     copying multidict/_multilib/views.h -> build/lib.linux-x86_64-3.6/multidict/_multilib
     running build_ext
     building 'multidict._multidict' extension
     creating build/temp.linux-x86_64-3.6
     creating build/temp.linux-x86_64-3.6/multidict
     gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.6m -c multidict/_multidict.c -o build/temp.linux-x86_64-3.6/multidict/_multidict.o -O2 -std=c99 -Wall -Wsign-compare -Wconversion -fno-strict-aliasing -pedantic
     multidict/_multidict.c:1:20: fatal error: Python.h: No such file or directory
      #include "Python.h"
                         ^
     compilation terminated.
     error: command 'gcc' failed with exit status 1
     
     ----------------------------------------
     Failed building wheel for multidict
     Running setup.py clean for multidict
    Failed to build yarl multidict
    Could not build wheels for yarl, multidict which use PEP 517 and cannot be installed directly
    
    opened by ml-rakesh 19
  • How to get the best model?

    I have tried exactly the same as mentioned here for the NAS problem using regevo:

    deephyper start-project nas_problems
    cd nas_problems/nas_problems/
    deephyper new-problem nas polynome2
    cd nas_problems/nas_problems/polynome2
    python load_data.py
    python problem.py
    deephyper nas regevo --evaluator ray --problem nas_problems.polynome2.problem.Problem --max-evals 100
    

    After this I am getting a deephyper.log file

    Now how do I make predictions on data? Where is the best model located?

    opened by anuragverma77 15
  • Fix coupling to Balsam

    The calculation of the number of workers available to DeepHyper when --evaluator balsam is used was incorrect in many cases, and it was unclear how to modify DEEPHYPER_WORKERS_PER_NODE consistently with balsam job --node-packing-count=X when serial job mode is used in order to achieve the desired packing on each node.

    E.g. consider a 2 node, serial job mode Balsam case with a desired behavior of 5 jobs (1x search + 4x evaluators) operating on the second node. The first node is fully reserved by Balsam in this case. In the current master, the following line in DeepHyper https://github.com/deephyper/deephyper/blob/8293621e069b15176568c3491f069f3684b85f56/deephyper/evaluator/_balsam.py#L34

    seems to reserve 2 "workers" from the overall Balsam pool when launching evaluators: 1x for the searcher/learner ABMS process, and 1x presumably for the Balsam Master process for the MPI ensemble in the serial job mode.

    This implies that the user must execute balsam job --node-packing-count=3 --env DEEPHYPER_WORKERS_PER_NODE=6 ... in order to get full occupancy on the single node that actually runs Balsam jobs. This is because DeepHyper sets the spawned evaluation jobs with --node-packing-count=DEEPHYPER_WORKERS_PER_NODE, even though it only considers Total number of workers: 4. The resulting occupancy that Balsam computes for the node will be 4/6 + 1/3 = 1.0

    Furthermore, the current self.num_workers calculation is clearly incorrect for --job-mode=mpi, or --job-mode=serial with 1 node.

    With the changes in this PR, the correct user behavior in my above (serial job mode) example is balsam job --node-packing-count=5 --env DEEPHYPER_WORKERS_PER_NODE=5 ... (which makes sense to me), and it is fully documented in docs/run/theta.rst. This is also the correct usage in the 1 node --job-mode=serial edge case, when the Master is silently colocated on the node with the Balsam Worker rank.

    One thing that was confusing to me during this debugging was the differences between search/hps/ambs.py and search/nas/ambs.py. Only the latter has the following logic: https://github.com/deephyper/deephyper/blob/8293621e069b15176568c3491f069f3684b85f56/deephyper/search/nas/ambs.py#L44-L53 Was this an aborted attempt to incorporate the correct num_workers logic? Or something else? I have not run any NAS problems with DeepHyper.

    I am still not convinced that the terminology of DeepHyper + Balsam is fully consistent/ not misleading. Specifically, that "Worker" in DH includes both the AMBS "searcher"/"learner" + evaluators/runners? This impacts the naming of the environment variables, etc. Here is my current breakdown:

    • DeepHyper:
      • Worker:
        • Evaluator/Runner?
        • Learner/searcher?
    • Balsam:
      • Job/Task
      • Worker (serial job mode only, I believe)

    Other changes:

    • Avoided a Seaborn v0.9.0 incompatibility with NumPy >=v1.18.x
    • Fix Seaborn heatmap plot in HPS analytics notebook
    • Make default max_evals in Search class's __init__() consistent with the argparse default
    • Do not pin specific TF version, but prohibit TF >= 2.0.0
    opened by felker 12
  • [BUG] Error while Installing dev version on ThetaGPU

    Description:  
    
    Got an `error: [Errno 13] Permission denied: '/lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/easy-install.pth'` error while trying to install the development version of DeepHyper on ThetaGPU.
    
    Steps to Reproduce:
    
        module load conda/2021-09-22
        conda create -p dhgpu --clone base
        conda activate dhgpu/
        pip install pip --upgrade
        git clone https://github.com/deephyper/deephyper.git
        cd deephyper/ && git checkout develop
        pip install -e ".[dev,analytics]"
    
    System Information:
    
     - System: ThetaGPU
     - Python version: 3.8.10
     - DeepHyper Version: 0.3.1
    
    Command line output:
    
     ERROR: Command errored out with exit status 1:
      command: /lus/theta-fs0/projects/OptADDN/dhgpu/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/lus/theta-fs0/projects/OptADDN/deephyper/setup.py'"'"'; __file__='"'"'/lus/theta-fs0/projects/OptADDN/deephyper/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps
          cwd: /lus/theta-fs0/projects/OptADDN/deephyper/
     Complete output (15 lines):
     running develop
     running egg_info
     writing deephyper.egg-info/PKG-INFO
     writing dependency_links to deephyper.egg-info/dependency_links.txt
     writing entry points to deephyper.egg-info/entry_points.txt
     writing requirements to deephyper.egg-info/requires.txt
     writing top-level names to deephyper.egg-info/top_level.txt
     reading manifest file 'deephyper.egg-info/SOURCES.txt'
     reading manifest template 'MANIFEST.in'
     adding license file 'LICENSE.md'
     writing manifest file 'deephyper.egg-info/SOURCES.txt'
     running build_ext
     Creating /lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/deephyper.egg-link (link to .)
     Adding deephyper 0.3.1 to easy-install.pth file
     error: [Errno 13] Permission denied: '/lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/easy-install.pth'
     ----------------------------------------
    
    Rolling back uninstall of deephyper
    Moving to /lus/theta-fs0/projects/OptADDN/dhgpu/bin/deephyper
    from /tmp/pip-uninstall-uan2r4pw/deephyper
    Moving to /lus/theta-fs0/projects/OptADDN/dhgpu/bin/deephyper-analytics
    from /tmp/pip-uninstall-uan2r4pw/deephyper-analytics
    Moving to /lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/deephyper-0.3.0.dist-info/
    from /lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/~eephyper-0.3.0.dist-info
    Moving to /lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/deephyper/
    from /lus/theta-fs0/projects/OptADDN/dhgpu/lib/python3.8/site-packages/~eephyper
    ERROR: Command errored out with exit status 1: /lus/theta-fs0/projects/OptADDN/dhgpu/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/lus/theta-fs0/projects/OptADDN/deephyper/setup.py'"'"'; __file__='"'"'/lus/theta-fs0/projects/OptADDN/deephyper/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
    
    bug 
    opened by Sande33p 10
  • Feature/moo

    • [x] adapt fit_surrogate
    • [x] adapt fit_generative_model
    • [x] add unit test
    • [ ] add other MoScalarFunction classes
    • [x] set default moo_scalarization_strategy along with default weights
    • [ ] decide callback behavior for multiobjective
    • [x] add utilities for computing hypervolume and pareto fronts
    opened by Deathn0t 8
  • How to use AUC metric in DeepHyper

    Refers to this conversation: https://github.com/deephyper/deephyper/issues/62#issuecomment-818772921 with @anuragverma77

    To install the develop version of DeepHyper:

    git clone https://github.com/deephyper/deephyper.git
    cd deephyper/
    git checkout develop
    pip install -e .
    

    Then the string to use for the AUC of the ROC in Problem.metric(...) is "auroc"; for Precision-Recall, it is "aucpr".

    opened by Deathn0t 8
  • [BUG] Ray tasks are not Distributed on the different threads of a same node

    Describe the bug

    From @ekourlit,

    Could I use the Ray technology to parallelize the HPS on a single machine? For example, if I switch back to the CPU usage, can the different Ray tasks run on the different cores or threads of my CPU in parallel? At the moment, Ray is creating 24 tasks (as many as my CPU threads) but only one is actually running, the rest are IDLE.

    bug 
    opened by Deathn0t 8
  • Update license badge

    Related to #163

    Once PyPI is updated, we need to confirm that README.md has the new badge reflecting BSD 3-Clause:

    ![PyPI - License](https://img.shields.io/pypi/l/deephyper.svg)

    opened by wildsm 7
  • Trying to get deephyper to work with a script that distributes training using horovod

    I have been working on ThetaGPU using the 2021-11-30 version of Anaconda.

    My task is to find hyperparameters for a deep learning model built with TensorFlow Keras, which distributes training across multiple GPUs using Horovod. I have attached the training file which contains the run function that I have been using.

    I have been running into the following error message while trying to execute this script: INVALID_ARGUMENT: Requested to allreduce, allgather, or broadcast a tensor with the same name as another tensor that is currently being processed. If you want to request another tensor, use a different tensor name.

    I know that this isn't an issue with horovod or the model itself because I was able to run the script before trying to integrate hyperparameter optimization using deephyper. I was hoping you would have some insight as to why the problem is arising and how to go about fixing it.

    train.txt

    question 
    opened by athish-thiru 7
  • [BUG] Deephyper and Ray Cluster using GPUs on Cori

    Describe the bug

    I'm facing an issue when I try to start DeepHyper running on a Ray cluster and allocating GPUs on Cori (NERSC). I'm using Tuster to deploy everything, but when Tuster (most likely) attempts to submit the job with srun, I receive the following error about the --gres argument.

    srun: error: Unable to create step for job X: Invalid generic resource (gres) specification

    This is the sbatch script I'm submitting: Run_Batch_Ray_Bepop_GPU.zip

    To Reproduce

    Steps to reproduce the behavior:

        module load cgpu
        sbatch Run_Batch_Ray_Bepop_GPU.sh

    Desktop (please complete the following information):

    • System: Cori dev with GPUs
    • Python version: 3.7.4
    • DeepHyper Version: 0.2.1
    • Tuster version: 0.0.1
    • Ray version: 0.7.6

    Could you please point me to where I should look to debug and resolve this issue?

    bug 
    opened by papajim 7
  • [FEATURE] Spack install

    Is your feature request related to a problem? Please describe.

    DeepHyper should be installable from the Spack package manager.

    spack install py-deephyper
    spack load py-deephyper
    
    enhancement 
    opened by Deathn0t 6
  • [FEATURE] GP surrogate model not repeating the winner evaluation over and over

    Is your feature request related to a problem? Please describe.

    When using DeepHyper sequentially with a GP surrogate and the default acquisition function, once the search converges it keeps suggesting the same sample many times.

    Describe the solution you'd like

    IMHO, even if you allow the GP to re-evaluate the same optimum (because of the noise), I would stop the search after a number of re-evaluations. If the objective function is expensive, you will burn a lot of computing hours evaluating something whose result you (mostly) already know. Instead, you can explore other uncertain regions of the search space: you have likely found the optimum, but if you are going to burn computing hours, at least you increase your odds of finding a better one.

    Additional context

    In my use case, I ran the search with the GP surrogate in my application. The search converged and the winning candidate was evaluated many times. However, the RF-based search found 5 better optima than the GP-winning configuration, and when I merged the datasets, I realized that the GP did not evaluate those configurations. There was room to find a better optimum, but the GP (its acquisition function) suggested the same point over and over.

    enhancement 
    opened by aperezdieguez 0
  • [Doc] DeepHyper documentation on NERSC webpage

    Add documentation about how to use DeepHyper on Perlmutter.

    • [x] write draft documentation in markdown (send to @Deathn0t for review)
    • [ ] add site/NERSC/Perlmutter to https://github.com/deephyper/quickstart (creates PR and follow Polaris, ThetaGPU examples)
    • [x] add installation documentation Perlmutter (NERSC) to https://deephyper.readthedocs.io/en/latest/install/index.html (creates PR)
    • [x] add documentation to the https://docs.nersc.gov/machinelearning/hpo/ (contact support)
    docs 
    opened by Deathn0t 3
  • [FEATURE] Implement dashboard which uses the default database

    Experiment selection

    We could use Pyparsing: https://pyparsing-docs.readthedocs.io/en/latest/HowToUsePyparsing.html#using-the-pyparsing-module to have a search based on a query

    Single experiment analysis

    • [x] scatter plot
    • [x] display CSV file
    • [x] search trajectory
    • [x] submit/gather profile plot
    • [x] start/end profile plot
    • [ ] parallel coordinate plot
    opened by Deathn0t 0
  • [FEATURE] Implement DBManager

    Implement a lightweight DBManager based on TinyDB.

    • [x] hyperparameter search
    • [ ] neural architecture search
    • [ ] documentation API
    • [ ] example usage documentation
    enhancement 
    opened by Deathn0t 0
  • [BUG] Aging Evolution Search Crash for NAS when one of the Variable Nodes has only one possible operation

    Describe the bug

    The search process crashes:

    Uncaught Exception <class 'ValueError'>: 'a' cannot be empty unless no samples are taken
      12     Traceback (most recent call last):
      13       File "/home/rmaulik/.conda/envs/ae_search_env/lib/python3.7/runpy.py", line 193, in _run_module_as_main
      14         "__main__", mod_spec)
      15       File "/home/rmaulik/.conda/envs/ae_search_env/lib/python3.7/runpy.py", line 85, in _run_code
      16         exec(code, run_globals)
      17       File "/lus/theta-fs0/projects/datascience/rmaulik/AE_Search/deephyper_repo/deephyper/search/nas/regevo.py", line 160, in <module>
      18         search.main()
      19       File "/lus/theta-fs0/projects/datascience/rmaulik/AE_Search/deephyper_repo/deephyper/search/nas/regevo.py", line 105, in main
      20         child = self.copy_mutate_arch(parent)
      21       File "/lus/theta-fs0/projects/datascience/rmaulik/AE_Search/deephyper_repo/deephyper/search/nas/regevo.py", line 149, in copy_mutate_arch
      22         sample = np.random.choice(elements, 1)[0]
      23       File "mtrand.pyx", line 900, in numpy.random.mtrand.RandomState.choice
      24     ValueError: 'a' cannot be empty unless no samples are taken
      25     Application 22199611 exit codes: 1
      26     Application 22199611 resources: utime ~36s, stime ~18s, Rss ~296852, inblocks ~18490, outblocks ~1416
    

    Additional context

    The issue is coming from

        def copy_mutate_arch(self, parent_arch: list) -> dict:
            """
            # ! Time performance is critical because called sequentialy
            Args:
                parent_arch (list(int)): [description]
            Returns:
                dict: [description]
            """
            i = np.random.choice(len(parent_arch))
            child_arch = parent_arch[:]
            range_upper_bound = self.space_list[i][1]
            elements = [j for j in range(range_upper_bound + 1) if j != child_arch[i]]
            # The mutation has to create a different search_space!
            sample = np.random.choice(elements, 1)[0]
            child_arch[i] = sample
            cfg = self.pb_dict.copy()
            cfg["arch_seq"] = child_arch
            return cfg
    

    So the line sample = np.random.choice(elements, 1)[0] is failing because len(elements) == 0 (cf. the NumPy documentation), which means that one of the Variable Nodes of the search space has only one possible operation.

    bug 
    opened by Deathn0t 0
Releases (0.4.2)
  • 0.4.2(Jul 12, 2022)

    deephyper.evaluator

    • patched ThreadPoolEvaluator to remove extra overheads of pool initialisation

    deephyper.search

    • resolved a bug with constant hyperparameters when a hyperparameter is discrete with a log-uniform prior (df01040d44a8f3b80700f2f853a6b452680e1112)
    • patched id to job_id in the neural architecture search history saver
    • added multi-objective optimisation to CBO: a run-function can now return multiple objectives as a tuple to be maximised
    def run(config):
        ...
        return objective_0, objective_1
    

    deephyper.ensemble

    • the uncertainty quantification ensemble UQBaggingEnsembleRegressor is now compatible with predictions of arbitrary shapes

    deephyper dashboard

    • added a dashboard, launched with deephyper-analytics dashboard, paired with results stored in a local DeepHyper database managed through DBManager
    • added dataframe visualization
    • added scatter plot visualization
  • 0.4.0(Jun 9, 2022)

    global updates

    • contributors of DeepHyper now appear on a dedicated page, see DeepHyper Authors, submit a PR if we forgot you!
    • lighter installation via pip install deephyper packed with the minimum requirements for hyperparameter search.
    • update API documentation
    • removed deephyper.benchmark
    • make neural architecture search features optional with pip install deephyper[nas]
    • make auto-sklearn features optional with pip install deephyper[popt] (Pipeline OPTimization)
    • improve epistemic uncertainty quantification for Random Forest surrogate model in Bayesian Optimisation
    • moved deephyper/scikit-optimize as a sub package deephyper.skopt
    • new tutorials dedicated to ALCF systems, see Tutorials - Argonne Leadership Computing Facility

    deephyper.search

    • renamed AMBS to CBO (Centralised Bayesian Optimization) at deephyper.search.hps.CBO
    • added new scalable Distributed Bayesian Optimization algorithm at deephyper.search.hps.DBO (experimented with up to 4,096 parallel workers)
    • moved problem.add_starting_point of HpProblem to CBO(..., initial_points=[...])
    • added generative-model based transfer-learning for hyper-parameter optimisation (Example - Transfer Learning for Hyperparameter Search)
    • added filtration of duplicated configurations in CBO CBO(..., filter_duplicated=True)
    • notify failures to the optimiser so that it learns to avoid them (Example - Notify Failures in Hyperparameter optimization)
    • added a new multi-point acquisition strategy for better scalability in CBO: CBO(..., acq_func="UCB", multi_point_strategy="qUCB", ...)
    • added the possibility to switch between synchronous/asynchronous communication in CBO: CBO(..., sync_communication=True, ...); several of these options are combined in the sketch below
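
    A minimal sketch combining several of the options listed above, assuming the problem and evaluator objects from the Quickstart; the argument names follow the bullets and may need adjusting to the installed DeepHyper version:

    from deephyper.search.hps import CBO

    search = CBO(
        problem,
        evaluator,
        initial_points=[{"x": 0.0}],  # replaces problem.add_starting_point
        filter_duplicated=True,       # skip already-evaluated configurations
        acq_func="UCB",
        multi_point_strategy="qUCB",  # multi-point acquisition for parallel workers
        sync_communication=False,     # keep the default asynchronous communication
    )
    results = search.search(max_evals=100)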

    deephyper.evaluator

    • added MPI-based Evaluators (better scaling, lower initialisation overhead): MPICommEvaluator and MPIPoolEvaluator
    • added the @profile(run_function) decorator for the run-function to collect execution times/durations of the black-box; this allows profiling the worker utilisation (Example - Profile the Worker Utilisation; see also the sketch after this list)
    • added @queued(Evaluator) decorator for any evaluator class to manage a queue of resources
    • added SerialEvaluator to adapt to serial search (one evaluation at a time)
    • added deephyper.evaluator.callback.TqdmCallback to display progress bar when running a search
    • the run-function can now return other values than the objective to be logged in the results.csv for example {"objective": 0.9, "num_parameters": 20000, ...}
    • asyncio is patched automatically when using notebooks/ipython
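
    A minimal sketch of the @profile decorator and the TqdmCallback, reusing the Quickstart run-function; the import paths follow the bullets above, while the "thread" method and the exact callback signature are assumptions that may differ between versions:

    from deephyper.evaluator import Evaluator, profile
    from deephyper.evaluator.callback import TqdmCallback

    @profile
    def run(config: dict):
        return -config["x"] ** 2

    evaluator = Evaluator.create(
        run,
        method="thread",  # illustrative; MPICommEvaluator/MPIPoolEvaluator are the MPI options
        method_kwargs={"num_workers": 2, "callbacks": [TqdmCallback()]},
    )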

    deephyper.ensemble

  • 0.3.3(Oct 29, 2021)

  • 0.3.2(Oct 25, 2021)

    • All the search algorithms were tested to have a correct behaviour when random_state is set.
    • Callbacks (deephyper.evaluator.callback) can now be used to extend the behavior of the existing Evaluator. LoggerCallback, ProfilingCallback, and SearchEarlyStopping are already available (see the example below).
    • All search algorithms are now importable from their hps or nas package. For example, from deephyper.search.hps import AMBS and from deephyper.search.nas import AgEBO.
    • HpProblem and NaProblem do not have a seed parameter anymore. The random_state has to be set when instantiating a Search(random_state=...).

    Example: SearchEarlyStopping

    from deephyper.problem import HpProblem
    from deephyper.search.hps import AMBS
    from deephyper.evaluator import Evaluator
    from deephyper.evaluator.callback import LoggerCallback, SearchEarlyStopping
    
    problem = HpProblem()
    problem.add_hyperparameter((0.0, 10.0), "x")
    
    def f(config):
        return config["x"]
        
    evaluator = Evaluator.create(f, 
                                 method="ray",
                                 method_kwargs={
                                     "num_cpus": 1,
                                     "num_cpus_per_task": 0.25,
                                     "callbacks": [LoggerCallback(), SearchEarlyStopping(patience=10)]
                                 })
    print(f"Num. Workers {evaluator.num_workers}")
    
    search = AMBS(problem, evaluator, filter_duplicated=False)
    
    results = search.search(max_evals=500)
    

    Gives the following output:

    Num. Workers 4
    [00001] -- best objective: 3.74540 -- received objective: 3.74540
    [00002] -- best objective: 6.38145 -- received objective: 6.38145
    Objective has improved from 3.74540 -> 6.38145
    [00003] -- best objective: 6.38145 -- received objective: 3.73641
    [00004] -- best objective: 7.29998 -- received objective: 7.29998
    Objective has improved from 6.38145 -> 7.29998
    [00005] -- best objective: 7.29998 -- received objective: 2.98912
    [00006] -- best objective: 7.29998 -- received objective: 5.52077
    [00007] -- best objective: 7.29998 -- received objective: 4.59535
    [00008] -- best objective: 7.29998 -- received objective: 5.28775
    [00009] -- best objective: 7.29998 -- received objective: 5.52099
    [00010] -- best objective: 9.76781 -- received objective: 9.76781
    Objective has improved from 7.29998 -> 9.76781
    [00011] -- best objective: 9.76781 -- received objective: 7.48943
    [00012] -- best objective: 9.76781 -- received objective: 7.42981
    [00013] -- best objective: 9.76781 -- received objective: 9.30103
    [00014] -- best objective: 9.76781 -- received objective: 8.22588
    [00015] -- best objective: 9.76781 -- received objective: 8.96084
    [00016] -- best objective: 9.76781 -- received objective: 8.96303
    [00017] -- best objective: 9.96415 -- received objective: 9.96415
    Objective has improved from 9.76781 -> 9.96415
    [00018] -- best objective: 9.96415 -- received objective: 9.58723
    [00019] -- best objective: 9.96415 -- received objective: 9.93599
    [00020] -- best objective: 9.96415 -- received objective: 9.35591
    [00021] -- best objective: 9.96415 -- received objective: 9.90210
    [00022] -- best objective: 9.97627 -- received objective: 9.97627
    Objective has improved from 9.96415 -> 9.97627
    [00023] -- best objective: 9.98883 -- received objective: 9.98883
    Objective has improved from 9.97627 -> 9.98883
    [00024] -- best objective: 9.98883 -- received objective: 9.97969
    [00025] -- best objective: 9.98883 -- received objective: 9.96051
    [00026] -- best objective: 9.98883 -- received objective: 9.86835
    [00027] -- best objective: 9.98883 -- received objective: 9.80940
    [00028] -- best objective: 9.98883 -- received objective: 9.84498
    [00029] -- best objective: 9.98883 -- received objective: 9.86562
    [00030] -- best objective: 9.99664 -- received objective: 9.99664
    Objective has improved from 9.98883 -> 9.99664
    [00031] -- best objective: 9.99664 -- received objective: 9.99541
    [00032] -- best objective: 9.99790 -- received objective: 9.99790
    Objective has improved from 9.99664 -> 9.99790
    [00033] -- best objective: 9.99790 -- received objective: 9.99640
    [00034] -- best objective: 9.99790 -- received objective: 9.98190
    [00035] -- best objective: 9.99790 -- received objective: 9.98854
    [00036] -- best objective: 9.99790 -- received objective: 9.98335
    [00037] -- best objective: 9.99790 -- received objective: 9.99303
    [00038] -- best objective: 9.99790 -- received objective: 9.99271
    [00039] -- best objective: 9.99790 -- received objective: 9.99164
    [00040] -- best objective: 9.99790 -- received objective: 9.99313
    [00041] -- best objective: 9.99790 -- received objective: 9.99236
    [00042] -- best objective: 9.99875 -- received objective: 9.99875
    Objective has improved from 9.99790 -> 9.99875
    [00043] -- best objective: 9.99875 -- received objective: 9.99735
    [00044] -- best objective: 9.99969 -- received objective: 9.99969
    Objective has improved from 9.99875 -> 9.99969
    [00045] -- best objective: 9.99969 -- received objective: 9.99755
    [00046] -- best objective: 9.99969 -- received objective: 9.99742
    [00047] -- best objective: 9.99995 -- received objective: 9.99995
    Objective has improved from 9.99969 -> 9.99995
    [00048] -- best objective: 9.99995 -- received objective: 9.99725
    [00049] -- best objective: 9.99995 -- received objective: 9.99746
    [00050] -- best objective: 9.99995 -- received objective: 9.99990
    [00051] -- best objective: 9.99995 -- received objective: 9.99915
    [00052] -- best objective: 9.99995 -- received objective: 9.99962
    [00053] -- best objective: 9.99995 -- received objective: 9.99930
    [00054] -- best objective: 9.99995 -- received objective: 9.99982
    [00055] -- best objective: 9.99995 -- received objective: 9.99985
    [00056] -- best objective: 9.99995 -- received objective: 9.99851
    [00057] -- best objective: 9.99995 -- received objective: 9.99794
    Stopping the search because it did not improve for the last 10 evaluations!
    

    Tutorials

    • [NEW] Hyperparameter search for text classification (Pytorch)
    • [NEW] Neural Architecture Search with Multiple Input Tensors
    • [NEW] From Neural Architecture Search to Automated Deep Ensemble with Uncertainty Quantification
    • [UPDATED] Execution on the Theta supercomputer/N-evaluation per 1-node

    Hyperparameter search

    • [NEW] Filtering duplicated samples: New parameters filter_duplicated and n_points appeared for deephyper.search.hps.AMBS. By default filter_duplicated = True implies that the search space filters duplicated values until it cannot sample new unique values (and therefore will re-sample existing configurations of hyperparameters). This filtering behaviour and sampling speed are sensitive to the n_points parameter which corresponds to the number of samples drawn from the search space before being filtered by the surrogate model. By default n_points = 10000. If filter_duplicated = False then the filtering of duplicated points will be skipped but n_points will still impact sampling speed.
    • Arguments of AMBS were adapted to match the maximisation setting of DeepHyper: "LCB" -> "UCB", cl_min -> cl_max, "cl_max" -> "cl_min".

    Neural architecture search

    The package deephyper.nas was restructured. All neural architecture search spaces should now be subclasses of deephyper.nas.KSearchSpace:

    import tensorflow as tf
    
    from deephyper.nas import KSearchSpace
    from deephyper.nas.node import ConstantNode, VariableNode
    from deephyper.nas.operation import operation, Identity
    
    Dense = operation(tf.keras.layers.Dense)
    Dropout = operation(tf.keras.layers.Dropout)
    
    class ExampleSpace(KSearchSpace):
        
        def build(self):
            
            # input nodes are automatically built based on `input_shape`
            input_node = self.input_nodes[0] 
            
            # we want 4 layers maximum (Identity corresponds to not adding a layer)
            for i in range(4):
                node = VariableNode()
                self.connect(input_node, node) 
    
                # we add 3 possible operations for each node
                node.add_op(Identity())
                node.add_op(Dense(100, "relu"))
                node.add_op(Dropout(0.2))
                
                input_node = node
                
            output = ConstantNode(op=Dense(self.output_shape[0]))
            self.connect(input_node, output)
    
            return self
    
    space = ExampleSpace(input_shape=(1,), output_shape=(1,)).build()
    space.sample().summary()
    

    will output:

    Model: "model_1"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_0 (InputLayer)         [(None, 1)]               0         
    _________________________________________________________________
    dense_3 (Dense)              (None, 100)               200       
    _________________________________________________________________
    dense_4 (Dense)              (None, 100)               10100     
    _________________________________________________________________
    dropout_2 (Dropout)          (None, 100)               0         
    _________________________________________________________________
    dense_5 (Dense)              (None, 1)                 101       
    =================================================================
    Total params: 10,401
    Trainable params: 10,401
    Non-trainable params: 0
    _________________________________________________________________
    

    To have a complete example follow the Neural Architecture Search (Basic) tutorial.

    The main changes were the following:

    • AutoKSearchSpace, SpaceFactory, Dense, Dropout and others were removed. Operations like Dense can now be created directly using the operation(tf.keras.layers.Dense) to allow for lazy tensor allocation.
    • The search space class should now be passed directly to the NaProblem.search_space(KSearchSpaceSubClass).
    • deephyper.nas.space is now deephyper.nas
    • All operations are now under deephyper.nas.operation
    • Nodes are now under deephyper.nas.node

    Documentation

    • API Reference: A new section on the documentation website to give details about all usable functions/classes of DeepHyper.

    Suppressed

    • Notebooks generated with deephyper-analytics were removed.
    • deephyper ray-submit
    • deephyper ray-config
    • Some unused dependencies were removed: balsam-flow, deap.
  • 0.3.0(Oct 12, 2021)

    This new release helps us move toward a more stable version of DeepHyper.

    • Refactored the DeepHyper Documentation
    • Developed notebook tutorials
    • Decoupled the command line and Python interfaces
    • Refactored the Evaluator interface with evaluator.submit/gather
    • Added deephyper.ensemble for ensembles with uncertainty quantification
    • Removed deephyper.post
  • 0.2.5(Jun 10, 2021)

    General

    Full API documentation

    The DeepHyper API is now fully documented at DeepHyper API

    Tensorflow-Probability as a new dependency

    TensorFlow Probability is now part of DeepHyper's default set of dependencies.

    Automated submission with Ray at ALCF

    It is now possible to directly submit with deephyper ray-submit ... for DeepHyper at the ALCF. This feature is only available on ThetaGPU for now but can be extended to other systems by following this script.

    ThetaGPU at ALCF

    New documentation for auto-sklearn search with DeepHyper

    Access to the auto-sklearn features was moved to deephyper.sklearn, and new documentation is available for this feature at User guide: AutoSklearn.

    New command lines for DeepHyper Analytics

    The deephyper-analytics command was modified and enhanced with new features. To see the full updated documentation, follow DeepHyper Analytics Tools.

    The topk command is now available to have quick feedback from the results of an experiment:

    $ deephyper-analytics topk combo_8gpu_8_agebo/infos/results.csv -k 2
    '0':
    arch_seq: '[229, 0, 22, 1, 1, 53, 29, 1, 119, 1, 0, 116, 123, 1, 273, 0, 1, 388]'
    batch_size: 59
    elapsed_sec: 10259.2741303444
    learning_rate: 0.0001614947
    loss: log_cosh
    objective: 0.9236862659
    optimizer: adam
    patience_EarlyStopping: 22
    patience_ReduceLROnPlateau: 10
    '1':
    arch_seq: '[229, 0, 22, 0, 1, 235, 29, 1, 313, 1, 0, 116, 123, 1, 37, 0, 1, 388]'
    batch_size: 51
    elapsed_sec: 8818.2674164772
    learning_rate: 0.0001265946
    loss: mae
    objective: 0.9231553674
    optimizer: nadam
    patience_EarlyStopping: 23
    patience_ReduceLROnPlateau: 14
    

    Neural architecture search

    New documentation for the problem definition

    New documentation for the neural architecture search problem setup can be found here.

    It is now possible to define auto-tuned hyperparameters in addition to the architecture in a NAS Problem.

    New Algorithms for Joint Hyperparameter and Neural Architecture Search

    Three new algorithms are available to run a joint hyperparameter and neural architecture search. Hyperparameter optimisation is abbreviated HPO and neural architecture search NAS.

    • agebo (Aging Evolution for NAS with Bayesian Optimisation for HPO)
    • ambsmixed (an extension of Asynchronous Model-Based Search for HPO + NAS)
    • regevomixed (an extension of regularised evolution for HPO + NAS)

    A run function to use data-parallelism with Tensorflow

    A new run function to use data-parallelism during neural architecture search is available (link to code)

    To use this function pass it to the run argument of the command line such as:

    deephyper nas agebo ... --run deephyper.nas.run.tf_distributed.run ... --num-cpus-per-task 2 --num-gpus-per-task 2 --evaluator ray --address auto ...
    

    This function allows for new hyperparameters in the Problem.hyperparameters(...):

    ...
    Problem.hyperparameters(
        ...
        lsr_batch_size=True,
        lsr_learning_rate=True,
        warmup_lr=True,
        warmup_epochs=5,
        ...
    )
    ...
    

    Optimization of the input pipeline for the training

    The data-ingestion pipeline was better optimised to reduce the overheads on GPU instances:

    self.dataset_train = (
      self.dataset_train.cache()            
      .shuffle(self.train_size, reshuffle_each_iteration=True)            
      .batch(self.batch_size)
      .prefetch(tf.data.AUTOTUNE)
      .repeat(self.num_epochs)        
    )
    

    Easier model generation from Neural Architecture Search results

    A new method is now available from the Problem object, Problem.get_keras_model(arch_seq), to easily build a Keras model instance from an arch_seq (a list encoding a neural network).

  • 0.2.1(Nov 26, 2020)

  • 0.2.0(Nov 16, 2020)

    • Compatible with Tensorflow 2.
    • Horovod compatibility with Balsam evaluator for Theta.
    • Horovod and Balsam are now optional installations.
    • Update of the AMBS algorithm for Hyperparameter search for better scalability.
    • Removing the PPO search for Neural Architecture Search.
    • Creating the SpaceFactory interface for the deepspace package, which provides ready-to-go neural architecture search spaces.
    • Local distribution of jobs with Ray and multi-core CPUs.
  • 0.1.13(Nov 1, 2020)

    DeepHyper 0.1.13

    New NAS Algorithm

    • Aging Evolution with Bayesian Optimization (AgEBO)

    New AMBS implementation

    • Previous AMBS renamed to ambsv1
    • New implementation of AMBS for better scaling capabilities

    Data-Parallelism settings for Balsam and Horovod

    Graph convolution layers with message passing

  • 0.1.12(Oct 16, 2020)

  • 0.1.2(Dec 2, 2019)

    Changelog - DeepHyper 0.1.2

    DeepHyper 0.1.2 is now forward-compatible with Python 3.7+ and Balsam 0.3.8+ after removing the async reserved keyword.

  • 0.1.1(Sep 5, 2019)

    Changelog - DeepHyper 0.1.1

    This release is mostly introducing features for Neural Architecture Search with DeepHyper.

    DeepHyper command-line interface

    For hyperparameter search, use deephyper hps .... Here is an example for the hyperparameter polynome2 benchmark:

    deephyper hps ambs --problem deephyper.benchmark.hps.polynome2.Problem --run deephyper.benchmark.hps.polynome2.run
    

    For neural architecture search, use deephyper nas .... Here is an example for the neural architecture search linearReg benchmark:

    deephyper nas regevo --problem deephyper.benchmark.nas.linearReg.Problem
    

    Use commands such as deephyper --help, deephyper nas --help or deephyper nas regevo --help to find out more about the command-line interface.

    Create an Operation from a Keras Layer

    • Create a new Operation directly from tensorflow.keras.layers:
    >>> import tensorflow as tf
    >>> from deephyper.search.nas.model.space.node import VariableNode
    >>> from deephyper.search.nas.model.space.op import Operation
    >>> vnode = VariableNode()
    >>> vnode.add_op(Operation(layer=tf.keras.layers.Dense(10)))
    

    Trainer default CSVLogger callback

    • TrainerTrainValid now has a default callback: tf.keras.callbacks.CSVLogger(...)

    Ray evaluator

    The ray evaluator is now available through ... --evaluator ray... for both hyperparameter and neural architecture search.

    Seeds for reproducibility

    To use a seed for any run do Problem(seed=seed) while creating your problem object.

    AMBS learner distributed

    Use the --n-jobs option to define how to distribute the learner computation in AMBS.

    MimeNode to replicate actions

    The goal of MimeNode is to replicate the action applied to the targeted variable node.

    import tensorflow as tf
    
    from deephyper.search.nas.model.space.node import VariableNode, MimeNode
    from deephyper.search.nas.model.space.op.op1d import Dense
    
    vnode = VariableNode()
    dense_10_op = Dense(10)
    vnode.add_op(dense_10_op)
    vnode.add_op(Dense(20))
    
    mnode = MimeNode(vnode)
    dense_30_op = Dense(30)
    mnode.add_op(dense_30_op)
    mnode.add_op(Dense(40))
    
    # The first operation "Dense(10)" has been chosen
    # for the mimed node: vnode
    vnode.set_op(0)
    
    assert vnode.op == dense_10_op
    
    # mnode is miming the choice made for vnode; as you can see,
    # the first operation was chosen as well
    assert mnode.op == dense_30_op
    

    MirrorNode to reuse the same operation

    The goal of MirrorNode is to replicate the action applied to the targeted VariableNode, ConstantNode or MimeNode.

    import tensorflow as tf
    
    from deephyper.search.nas.model.space.node import VariableNode, MirrorNode
    from deephyper.search.nas.model.space.op.op1d import Dense
    
    vnode = VariableNode()
    dense_10_op = Dense(10)
    vnode.add_op(dense_10_op)
    vnode.add_op(Dense(20))
    
    mnode = MirrorNode(vnode)
    
    # The operation "Dense(10)" is being set for vnode.
    vnode.set_op(0)
    
    # The same operation (i.e. same instance) is now returned by both vnode and mnode.
    assert vnode.op == dense_10_op
    assert mnode.op == dense_10_op
    

    Tensorboard and Beholder callbacks available for post-training

    TensorBoard and Beholder callbacks can now be used during post-training. Beholder is a TensorBoard plugin that enables you to visualize the evolution of the trainable parameters of a model during training.

    Problem.post_training(
        ...
        callbacks=dict(
            TensorBoard={
                'log_dir':'tb_logs',
                'histogram_freq':1,
                'batch_size':64,
                'write_graph':True,
                'write_grads':True,
                'write_images':True,
                'update_freq':'epoch',
                'beholder': True
            })
    )
    