A full pipeline AutoML tool for tabular data

HyperGBM

Doc | 中文 (Chinese docs)

We Are Hiring!

Dear folks, we are offering challenging opportunities in Beijing for both professionals and students who are keen on AutoML/NAS. Come be a part of DataCanvas! Please send your CV to [email protected]. (Application deadline: TBD.)

What is HyperGBM

HyperGBM is a full pipeline automated machine learning (AutoML) toolkit designed for tabular data. It covers the complete end-to-end ML pipeline, including data cleaning, preprocessing, feature generation and selection, model selection, and hyperparameter optimization.

Overview

HyperGBM optimizes the end-to-end ML pipeline within a single search space, which differs from most existing AutoML approaches that tackle only some of the stages, for instance, hyperparameter optimization. This full pipeline optimization process closely resembles a sequential decision process (SDP). Therefore, HyperGBM utilizes reinforcement learning, Monte Carlo tree search, and evolutionary algorithms, combined with a meta-learner, to solve the pipeline optimization problem efficiently.

HyperGBM, as the name indicates, incorporates several gradient boosting machine (GBM) models, namely XGBoost, LightGBM and CatBoost. Moreover, it builds on Hypernets, a general automated machine learning framework, and inherits its advanced capabilities for data cleaning, feature engineering and model ensembling. The search space representation and search algorithms inside HyperGBM are also provided by Hypernets.
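
The search process can be configured through the arguments of make_experiment; for instance, the searcher argument selects the search algorithm and max_trials bounds the number of pipelines tried. A minimal sketch, using the built-in blood dataset referenced throughout this README:

from hypergbm import make_experiment
from hypernets.tabular.datasets import dsutils

# Search the full pipeline space with the evolutionary searcher,
# limiting the search to 10 trials
experiment = make_experiment(dsutils.load_blood(), target='Class',
                             searcher='evolution', max_trials=10)
estimator = experiment.run()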

Tutorial

Installation

Install HyperGBM with different options:

  • Typical installation:
pip install hypergbm
  • To run HyperGBM in JupyterLab/Jupyter notebook, install with command:
pip install hypergbm[notebook]
  • To support datasets containing simplified Chinese text in feature generation,
    • Install the jieba package before running HyperGBM.
    • OR install with command:
pip install hypergbm[zhcn]
  • Install all above with one command:
pip install hypergbm[all]
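
After installation, printing the command-line help is a quick way to verify the setup; the command-line tools are described in the Examples section below:

hypergbm -h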

Examples

  • Use HyperGBM with Python

Users can quickly create and run an experiment with make_experiment, which requires only one input parameter, train_data. The example shown below uses the blood dataset from hypernets.tabular as train_data. If the target column of the dataset is not named y, it must be set explicitly through the target argument.

Example code:

from hypergbm import make_experiment
from hypernets.tabular.datasets import dsutils

# Load the built-in blood dataset; its target column is 'Class'
train_data = dsutils.load_blood()

# Create the experiment and run it to get the trained pipeline
experiment = make_experiment(train_data, target='Class')
estimator = experiment.run()
print(estimator)

This training experiment returns a pipeline with two default steps, data_clean and estimator. In particular, the estimator step returns the final model, which is an ensemble of several models. The output:

Pipeline(steps=[('data_clean',
                 DataCleanStep(...)),
                ('estimator',
                 GreedyEnsemble(...))])
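
Since the returned estimator is a scikit-learn style pipeline, it can be used for prediction directly. A minimal sketch, where the hold-out split is added here only for illustration:

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from hypergbm import make_experiment
from hypernets.tabular.datasets import dsutils

# Hold out part of the blood dataset to evaluate the trained pipeline
df = dsutils.load_blood()
train_df, test_df = train_test_split(df, test_size=0.3, random_state=42)

experiment = make_experiment(train_df, target='Class')
estimator = experiment.run()

# The trained pipeline accepts raw feature frames for prediction
X_test, y_test = test_df.drop(columns=['Class']), test_df['Class']
print('accuracy:', accuracy_score(y_test, estimator.predict(X_test)))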

To see more examples, please read Quick Start and Examples.

  • Use HyperGBM with command-line tools

HyperGBM also provides command-line tools for model training, evaluation and prediction. The following command displays the command-line help:

hypergbm -h

usage: hypergbm [-h] [--log-level LOG_LEVEL] [-error] [-warn] [-info] [-debug]
                [--verbose VERBOSE] [-v] [--enable-dask ENABLE_DASK] [-dask]
                [--overload OVERLOAD]
                {train,evaluate,predict} ...

An example of training a model on the dataset blood.csv is shown below:

hypergbm train --train-file=blood.csv --target=Class --model-file=model.pkl
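
The evaluate and predict sub-commands listed in the usage line above take their own options, which can be inspected the same way:

hypergbm evaluate -h
hypergbm predict -h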

For more details, please read Quick Start.

Hypernets related projects

  • Hypernets: A general automated machine learning (AutoML) framework.
  • HyperGBM: A full pipeline AutoML tool integrating various GBM models.
  • HyperDT/DeepTables: An AutoDL tool for tabular data.
  • HyperKeras: An AutoDL tool for Neural Architecture Search and Hyperparameter Optimization on Tensorflow and Keras.
  • Cooka: Lightweight interactive AutoML system.

DataCanvas AutoML Toolkit

Documents

DataCanvas

HyperGBM is an open source project created by DataCanvas.

Comments
  • Error when running the simple example program

    Hello, I created a new virtual environment specifically to install HyperGBM, and all of the dependencies were installed successfully, but running the simplest example program fails with an error. Could you help me figure out what went wrong? Thanks! I used the following example program:

    from hypergbm import make_experiment
    from hypernets.tabular.datasets import dsutils

    train_data = dsutils.load_blood()
    experiment = make_experiment(train_data, target='Class')
    estimator = experiment.run()
    print(estimator)

    The error output is very long; here are the first few lines:

    Traceback (most recent call last):
      File "", line 1, in
      File "D:\Software\Anaconda3\envs\hp\lib\multiprocessing\spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "D:\Software\Anaconda3\envs\hp\lib\multiprocessing\spawn.py", line 125, in _main
        prepare(preparation_data)
      File "D:\Software\Anaconda3\envs\hp\lib\multiprocessing\spawn.py", line 236, in prepare
        _fixup_main_from_path(data['init_main_from_path'])
      File "D:\Software\Anaconda3\envs\hp\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
        main_content = runpy.run_path(main_path,

    It also includes multiple MemoryErrors, for example:

    import scipy.linalg._interpolative_backend as _backend
      File "", line 991, in _find_and_load
      File "", line 975, in _find_and_load_unlocked
      File "", line 671, in _load_unlocked
      File "", line 839, in exec_module
      File "", line 934, in get_code
      File "", line 1033, in get_data
    MemoryError

    opened by apooth 8
  • Could you please solve the problem with my installation?

    Hi, distinguished developers! I'm a freshman in machine learning and I want to use HyperGBM for a classification task. But after I ran pip install hypergbm, I hit some problems that made the package installation fail. Could you please take a look and help me with the problem? I'd greatly appreciate it!

    Downloading packaging-21.3-py3-none-any.whl (40 kB) |████████████████████████████████| 40 kB 657 kB/s Building wheels for collected packages: jieba, python-geohash Building wheel for jieba (setup.py) ... done Created wheel for jieba: filename=jieba-0.42.1-py3-none-any.whl size=19314477 sha256=659d5db71012e50daa40cc083515d9987ee8e99ea451a3ea44db3582773fc490 Stored in directory: c:\users\li-hy\appdata\local\pip\cache\wheels\24\aa\17\5bc7c72e9a37990a9620cc3aad0acad1564dcff6dbc2359de3 Building wheel for python-geohash (setup.py) ... error ERROR: Command errored out with exit status 1: command: 'C:\aanaconda\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Li-hy\AppData\Local\Temp\pip-install-zidv_k76\python-geohash_da12afeb154e40c388f7e2b935782a23\setup.py'"'"'; file='"'"'C:\Users\Li-hy\AppData\Local\Temp\pip-install-zidv_k76\python-geohash_da12afeb154e40c388f7e2b935782a23\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Li-hy\AppData\Local\Temp\pip-wheel-hx133l9c' cwd: C:\Users\Li-hy\AppData\Local\Temp\pip-install-zidv_k76\python-geohash_da12afeb154e40c388f7e2b935782a23
    Complete output (12 lines):
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-3.7
      copying geohash.py -> build\lib.win-amd64-3.7
      copying quadtree.py -> build\lib.win-amd64-3.7
      copying jpgrid.py -> build\lib.win-amd64-3.7
      copying jpiarea.py -> build\lib.win-amd64-3.7
      running build_ext
      building '_geohash' extension
      error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads/

    ERROR: Failed building wheel for python-geohash
    Running setup.py clean for python-geohash
    Successfully built jieba
    Failed to build python-geohash
    Installing collected packages: packaging, jedi, llvmlite, distributed, seaborn, python-geohash, pyarrow, plotly, numba, lightgbm, featuretools, dask-glm, slicer, hypernets, dask-ml, catboost, shap, jieba, hypernets-jupyter-widget, hypergbm
    Attempting uninstall: packaging
      Found existing installation: packaging 20.9
      Uninstalling packaging-20.9:
        Successfully uninstalled packaging-20.9
    Attempting uninstall: jedi
      Found existing installation: jedi 0.14.1
      Uninstalling jedi-0.14.1:
        Successfully uninstalled jedi-0.14.1
    Attempting uninstall: llvmlite
      Found existing installation: llvmlite 0.31.0
    ERROR: Cannot uninstall 'llvmlite'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

    Installation 
    opened by huayu95 8
  • AttributeError: 'CompeteExperiment' object has no attribute 'y_eval_pred'

    When the parameter webui=True is set, it raises AttributeError: 'CompeteExperiment' object has no attribute 'y_eval_pred'.

    Traceback (most recent call last):
      File "C:\ProgramData\Anaconda3\envs\py36\lib\site-packages\hypernets\experiment\_experiment.py", line 86, in run
        callback.experiment_start(self)
      File "C:\ProgramData\Anaconda3\envs\py36\lib\site-packages\hboard\callbacks.py", line 125, in experiment_start
        super(WebVisExperimentCallback, self).experiment_start(exp)
      File "C:\ProgramData\Anaconda3\envs\py36\lib\site-packages\hypernets\experiment\_callback.py", line 577, in experiment_start
        d = ExperimentExtractor(exp).extract()
      File "C:\ProgramData\Anaconda3\envs\py36\lib\site-packages\hypernets\experiment\_extractor.py", line 671, in extract
        if self.is_evaluated():
      File "C:\ProgramData\Anaconda3\envs\py36\lib\site-packages\hypernets\experiment\_extractor.py", line 657, in is_evaluated
        return self.exp.X_eval is not None and self.exp.y_eval is not None and self.exp.y_eval_pred is not None
    AttributeError: 'CompeteExperiment' object has no attribute 'y_eval_pred'

    opened by zhangyuqi-1 7
  • Unable to view Model Selected Features

    Hello. I trained an estimator with make_experiment, setting the parameter feature_selection=True; however, estimator.get_params() does not give me the selected features, and I am unable to extract feature importances either. HyperGBM version - 0.2.5.2

    experiment = make_experiment(
                                 train_data=train, target='Target', eval_data=test,
                                 cv=True, num_folds=5,
                                 reward_metric='auc', pos_label=1,
                                 max_trials=2,
                                 feature_selection=True,
                                 feature_selection_threshold=0.0001,
                                 drift_detection=True,
                                 class_balancing='RandomUnderSampler',
                                 njobs=-1)
    estimator = experiment.run()
    

    The estimator.get_params() gives the following ouput: {'memory': None, 'steps': [('data_adaption', DataAdaptionStep(name='data_adaption')), ('data_clean', DataCleanStep(cv=True, data_cleaner_args={'correct_object_dtype': True, 'drop_columns': None, 'drop_constant_columns': True, 'drop_duplicated_columns': False, 'drop_idness_columns': True, 'drop_label_nan_rows': True, 'int_convert_to': 'float', 'nan_chars': None, 'reduce_mem_usage': False, 'reserve_columns': None}, name='data_clean')), ('feature_selection', FeatureImportanceSelectionStep(name='feature_selection', number=15, quantile=None, strategy='number', threshold=None)), ('estimator', GreedyEnsemble(weight=[0.0, 1.0], scores=[0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832]))], 'verbose': False, 'data_adaption': DataAdaptionStep(name='data_adaption'), 'data_clean': DataCleanStep(cv=True, data_cleaner_args={'correct_object_dtype': True, 'drop_columns': None, 'drop_constant_columns': True, 'drop_duplicated_columns': False, 'drop_idness_columns': True, 'drop_label_nan_rows': True, 'int_convert_to': 'float', 'nan_chars': None, 'reduce_mem_usage': False, 'reserve_columns': None}, name='data_clean'), 'feature_selection': FeatureImportanceSelectionStep(name='feature_selection', number=15, quantile=None, strategy='number', threshold=None), 'estimator': GreedyEnsemble(weight=[0.0, 1.0], scores=[0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832, 0.7872902176075832]), 'data_adaption__memory_limit': 0.05, 'data_adaption__min_cols': 0.3, 'data_adaption__name': 'data_adaption', 'data_adaption__target': None, 'data_clean__cv': True, 'data_clean__data_cleaner_args': {'nan_chars': None, 'correct_object_dtype': True, 'drop_constant_columns': True, 'drop_label_nan_rows': True, 'drop_idness_columns': True, 'drop_columns': None, 'reserve_columns': None, 'drop_duplicated_columns': False, 'reduce_mem_usage': False, 'int_convert_to': 'float'}, 'data_clean__name': 'data_clean', 'data_clean__train_test_split_strategy': None, 'feature_selection__name': 'feature_selection', 'feature_selection__number': 15, 'feature_selection__quantile': None, 'feature_selection__strategy': 'number', 'feature_selection__threshold': None}

    opened by haykuhidavtyan 6
  • Still running error when setting eval_data!

    I tried eval_data with the code below:

    import pandas as pd
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from hypergbm import make_experiment
    X,y = datasets.load_breast_cancer(as_frame=True,return_X_y=True)
    X_train,X_test,y_train,y_test = train_test_split(X,y,train_size=0.7,random_state=335)
    dftrain = pd.concat([X_train,y_train],axis=1)
    dfval = pd.concat([X_test,y_test],axis=1)
    
    
    experiment = make_experiment(train_data=dftrain.copy(), 
                                 target='target', 
                                 reward_metric='precision',
                                 eval_data=dfval.copy(),
                                 max_trials=5)
    estimator = experiment.run()
    estimator
    

    it throws an error like this:

    03-21 22:28:22 E hypernets.e._experiment.py 100 - ExperimentID:[HyperGBM_51f6515f9945cce8cf9365412d041ebe] - None: 'CompeteExperiment' object has no attribute 'y_eval_pred'
    Traceback (most recent call last):
      File "/conda/envs/notebook/lib/python3.6/site-packages/hypernets/experiment/_experiment.py", line 86, in run
        callback.experiment_start(self)
      File "/conda/envs/notebook/lib/python3.6/site-packages/hypernets/experiment/_callback.py", line 577, in experiment_start
        d = ExperimentExtractor(exp).extract()
      File "/conda/envs/notebook/lib/python3.6/site-packages/hypernets/experiment/_extractor.py", line 668, in extract
        if self.is_evaluated():
      File "/conda/envs/notebook/lib/python3.6/site-packages/hypernets/experiment/_extractor.py", line 654, in is_evaluated
        return self.exp.X_eval is not None and self.exp.y_eval is not None and self.exp.y_eval_pred is not None
    AttributeError: 'CompeteExperiment' object has no attribute 'y_eval_pred'
    

    Could this bug be fixed in the next version?

    opened by lyhue1991 6
  • How does HyperGBM work?

    Does HyperGBM's make_experiment return the best model? How does it work for parameter tuning? That is, what is its search space (e.g. for XGBoost)?

    good first issue question 
    opened by mokeeqian 3
  • Sklearn version conflict when I use hypergbm on kaggle

    System information

    • OS Platform and Distribution (e.g., CentOS 7.6): Linux version 5.4.144+ ([email protected])
    • Python version:
    • HyperGBM version: 0.2.3
    • Other Python packages(run pip list):

    Describe the current behavior: I want to use HyperGBM on Kaggle, but I got an error that looks like a version conflict.

    Describe the expected behavior: Simply running !pip3 install -U hypergbm should let me use HyperGBM on Kaggle.

    Standalone code to reproduce the issue

    !pip3 install -U hypergbm
    import hypergbm
    

    Are you willing to submit PR?(Yes/No)

    I noticed this here:

    Attempting uninstall: scikit-learn
      Found existing installation: scikit-learn 0.23.2
      Uninstalling scikit-learn-0.23.2:
        Successfully uninstalled scikit-learn-0.23.2

    If I add !pip3 install -U scikit-learn==0.23.2, it works fine.

    Installing collected packages: scikit-learn
    Attempting uninstall: scikit-learn
      Found existing installation: scikit-learn 1.0
      Uninstalling scikit-learn-1.0:
        Successfully uninstalled scikit-learn-1.0

    Other info / logs

    Collecting hypergbm Downloading hypergbm-0.2.3-py3-none-any.whl (2.9 MB) |████████████████████████████████| 2.9 MB 809 kB/s eta 0:00:01 Requirement already satisfied: plotly in /opt/conda/lib/python3.7/site-packages (from hypergbm) (5.3.1) Requirement already satisfied: catboost>=0.26 in /opt/conda/lib/python3.7/site-packages (from hypergbm) (0.26.1) Requirement already satisfied: distributed in /opt/conda/lib/python3.7/site-packages (from hypergbm) (2021.9.0) Requirement already satisfied: dask in /opt/conda/lib/python3.7/site-packages (from hypergbm) (2021.9.0) Collecting hypernets==0.2.3 Downloading hypernets-0.2.3-py3-none-any.whl (3.2 MB) |████████████████████████████████| 3.2 MB 54.2 MB/s eta 0:00:01 Collecting dask-ml Downloading dask_ml-1.9.0-py3-none-any.whl (143 kB) |████████████████████████████████| 143 kB 64.3 MB/s eta 0:00:01 Requirement already satisfied: xgboost>=1.3.0 in /opt/conda/lib/python3.7/site-packages (from hypergbm) (1.4.2) Requirement already satisfied: imbalanced-learn>=0.7.0 in /opt/conda/lib/python3.7/site-packages (from hypergbm) (0.8.0) Requirement already satisfied: tqdm in /opt/conda/lib/python3.7/site-packages (from hypergbm) (4.62.1) Requirement already satisfied: fsspec>=0.8.0 in /opt/conda/lib/python3.7/site-packages (from hypergbm) (2021.8.1) Requirement already satisfied: lightgbm>=3.2.0 in /opt/conda/lib/python3.7/site-packages (from hypergbm) (3.2.1) Requirement already satisfied: scikit-learn>=0.22.1 in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (0.23.2) Requirement already satisfied: numpy>=1.16.5 in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (1.19.5) Requirement already satisfied: pandas>=0.25.3 in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (1.3.2) Collecting seaborn==0.11.0 Downloading seaborn-0.11.0-py3-none-any.whl (283 kB) |████████████████████████████████| 283 kB 64.3 MB/s eta 0:00:01 Requirement already satisfied: pyarrow in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (5.0.0) Requirement already satisfied: scipy in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (1.7.1) Requirement already satisfied: featuretools in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (0.27.1) Requirement already satisfied: ipython in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (7.26.0) Collecting python-geohash Downloading python-geohash-0.8.5.tar.gz (17 kB) Requirement already satisfied: traitlets in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (5.0.5) Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from hypernets==0.2.3->hypergbm) (1.15.0) Requirement already satisfied: matplotlib>=2.2 in /opt/conda/lib/python3.7/site-packages (from seaborn==0.11.0->hypernets==0.2.3->hypergbm) (3.4.3) Requirement already satisfied: graphviz in /opt/conda/lib/python3.7/site-packages (from catboost>=0.26->hypergbm) (0.8.4) Requirement already satisfied: joblib>=0.11 in /opt/conda/lib/python3.7/site-packages (from imbalanced-learn>=0.7.0->hypergbm) (1.0.1) Collecting scikit-learn>=0.22.1 Downloading scikit_learn-1.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (23.1 MB) |████████████████████████████████| 23.1 MB 52.6 MB/s eta 0:00:01 Requirement already satisfied: wheel in /opt/conda/lib/python3.7/site-packages (from lightgbm>=3.2.0->hypergbm) (0.37.0) Requirement already satisfied: python-dateutil>=2.7 in 
/opt/conda/lib/python3.7/site-packages (from matplotlib>=2.2->seaborn==0.11.0->hypernets==0.2.3->hypergbm) (2.8.0) Requirement already satisfied: pillow>=6.2.0 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.2->seaborn==0.11.0->hypernets==0.2.3->hypergbm) (8.2.0) Requirement already satisfied: pyparsing>=2.2.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.2->seaborn==0.11.0->hypernets==0.2.3->hypergbm) (2.4.7) Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.2->seaborn==0.11.0->hypernets==0.2.3->hypergbm) (0.10.0) Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=2.2->seaborn==0.11.0->hypernets==0.2.3->hypergbm) (1.3.1) Requirement already satisfied: pytz>=2017.3 in /opt/conda/lib/python3.7/site-packages (from pandas>=0.25.3->hypernets==0.2.3->hypergbm) (2021.1) Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from scikit-learn>=0.22.1->hypernets==0.2.3->hypergbm) (2.2.0) Requirement already satisfied: toolz>=0.8.2 in /opt/conda/lib/python3.7/site-packages (from dask->hypergbm) (0.11.1) Requirement already satisfied: cloudpickle>=1.1.1 in /opt/conda/lib/python3.7/site-packages (from dask->hypergbm) (1.6.0) Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.7/site-packages (from dask->hypergbm) (21.0) Requirement already satisfied: partd>=0.3.10 in /opt/conda/lib/python3.7/site-packages (from dask->hypergbm) (1.2.0) Requirement already satisfied: pyyaml in /opt/conda/lib/python3.7/site-packages (from dask->hypergbm) (5.4.1) Requirement already satisfied: locket in /opt/conda/lib/python3.7/site-packages (from partd>=0.3.10->dask->hypergbm) (0.2.1) Collecting dask-glm>=0.2.0 Downloading dask_glm-0.2.0-py2.py3-none-any.whl (12 kB) Requirement already satisfied: multipledispatch>=0.4.9 in /opt/conda/lib/python3.7/site-packages (from dask-ml->hypergbm) (0.6.0) Requirement already satisfied: numba>=0.51.0 in /opt/conda/lib/python3.7/site-packages (from dask-ml->hypergbm) (0.53.1) Requirement already satisfied: sortedcontainers!=2.0.0,!=2.0.1 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (2.4.0) Requirement already satisfied: psutil>=5.0 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (5.8.0) Requirement already satisfied: tornado>=5 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (6.1) Requirement already satisfied: setuptools in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (57.4.0) Requirement already satisfied: tblib>=1.6.0 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (1.7.0) Requirement already satisfied: msgpack>=0.6.0 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (1.0.2) Requirement already satisfied: jinja2 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (3.0.1) Requirement already satisfied: click>=6.6 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (8.0.1) Requirement already satisfied: zict>=0.1.3 in /opt/conda/lib/python3.7/site-packages (from distributed->hypergbm) (2.0.0) Requirement already satisfied: importlib-metadata in /opt/conda/lib/python3.7/site-packages (from click>=6.6->distributed->hypergbm) (3.4.0) Requirement already satisfied: llvmlite<0.37,>=0.36.0rc1 in /opt/conda/lib/python3.7/site-packages (from numba>=0.51.0->dask-ml->hypergbm) (0.36.0) Requirement already satisfied: 
heapdict in /opt/conda/lib/python3.7/site-packages (from zict>=0.1.3->distributed->hypergbm) (1.0.1) Requirement already satisfied: typing-extensions>=3.6.4 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->click>=6.6->distributed->hypergbm) (3.7.4.3) Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->click>=6.6->distributed->hypergbm) (3.5.0) Requirement already satisfied: backcall in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (0.2.0) Requirement already satisfied: decorator in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (5.0.9) Requirement already satisfied: pexpect>4.3 in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (4.8.0) Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (3.0.19) Requirement already satisfied: matplotlib-inline in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (0.1.2) Requirement already satisfied: pygments in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (2.10.0) Requirement already satisfied: pickleshare in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (0.7.5) Requirement already satisfied: jedi>=0.16 in /opt/conda/lib/python3.7/site-packages (from ipython->hypernets==0.2.3->hypergbm) (0.18.0) Requirement already satisfied: parso<0.9.0,>=0.8.0 in /opt/conda/lib/python3.7/site-packages (from jedi>=0.16->ipython->hypernets==0.2.3->hypergbm) (0.8.2) Requirement already satisfied: ptyprocess>=0.5 in /opt/conda/lib/python3.7/site-packages (from pexpect>4.3->ipython->hypernets==0.2.3->hypergbm) (0.7.0) Requirement already satisfied: wcwidth in /opt/conda/lib/python3.7/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython->hypernets==0.2.3->hypergbm) (0.2.5) Requirement already satisfied: ipython-genutils in /opt/conda/lib/python3.7/site-packages (from traitlets->hypernets==0.2.3->hypergbm) (0.2.0) Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.7/site-packages (from jinja2->distributed->hypergbm) (2.0.1) Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/lib/python3.7/site-packages (from plotly->hypergbm) (8.0.1) Building wheels for collected packages: python-geohash Building wheel for python-geohash (setup.py) ... done Created wheel for python-geohash: filename=python_geohash-0.8.5-cp37-cp37m-linux_x86_64.whl size=50692 sha256=0e0dc40b118fd1a5d26f093ec91dfc5cfbe771e34e3f469b9a788f8100ba7926 Stored in directory: /root/.cache/pip/wheels/ea/62/7a/e8b943f1d8025cd93a93928a162319e56843301c8c06610ffe Successfully built python-geohash Installing collected packages: scikit-learn, seaborn, python-geohash, dask-glm, hypernets, dask-ml, hypergbm Attempting uninstall: scikit-learn Found existing installation: scikit-learn 0.23.2 Uninstalling scikit-learn-0.23.2: Successfully uninstalled scikit-learn-0.23.2 Attempting uninstall: seaborn Found existing installation: seaborn 0.11.2 Uninstalling seaborn-0.11.2: Successfully uninstalled seaborn-0.11.2 ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. pdpbox 0.2.1 requires matplotlib==3.1.1, but you have matplotlib 3.4.3 which is incompatible. 
hypertools 0.7.0 requires scikit-learn!=0.22,<0.24,>=0.19.1, but you have scikit-learn 1.0 which is incompatible. Successfully installed dask-glm-0.2.0 dask-ml-1.9.0 hypergbm-0.2.3 hypernets-0.2.3 python-geohash-0.8.5 scikit-learn-1.0 seaborn-0.11.0 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

    AttributeError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in dep_map(self) 3015 try: -> 3016 return self.__dep_map 3017 except AttributeError: /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in getattr(self, attr) 2812 if attr.startswith(''): -> 2813 raise AttributeError(attr) 2814 return getattr(self._provider, attr) AttributeError: _DistInfoDistribution__dep_map During handling of the above exception, another exception occurred: AttributeError Traceback (most recent call last) /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in _parsed_pkg_info(self) 3006 try: -> 3007 return self.pkg_info 3008 except AttributeError: /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in getattr(self, attr) 2812 if attr.startswith(''): -> 2813 raise AttributeError(attr) 2814 return getattr(self._provider, attr) AttributeError: _pkg_info During handling of the above exception, another exception occurred: FileNotFoundError Traceback (most recent call last) /tmp/ipykernel_43/4208425762.py in ----> 1 import hypergbm /opt/conda/lib/python3.7/site-packages/hypergbm/init.py in 4 5 """ ----> 6 from hypernets.experiment import CompeteExperiment 7 from .hyper_gbm import HyperGBM, HyperGBMEstimator, HyperGBMExplainer, HyperEstimator, HyperModel 8 from .experiment import make_experiment /opt/conda/lib/python3.7/site-packages/hypernets/experiment/init.py in 7 from ._experiment import Experiment, ExperimentCallback 8 from .general import GeneralExperiment ----> 9 from .compete import CompeteExperiment, SteppedExperiment, StepNames 10 from ._callback import ConsoleCallback, SimpleNotebookCallback 11 from ._maker import make_experiment /opt/conda/lib/python3.7/site-packages/hypernets/experiment/compete.py in 19 from hypernets.core import set_random_state 20 from hypernets.experiment import Experiment ---> 21 from hypernets.tabular import dask_ex as dex, column_selector as cs 22 from hypernets.tabular import drift_detection as dd, feature_importance as fi, pseudo_labeling as pl 23 from hypernets.tabular.cache import cache /opt/conda/lib/python3.7/site-packages/hypernets/tabular/dask_ex/init.py in 18 19 try: ---> 20 import dask_ml.preprocessing as dm_pre 21 import dask_ml.model_selection as dm_sel 22 /opt/conda/lib/python3.7/site-packages/dask_ml/init.py in 7 8 try: ----> 9 version = get_distribution(name).version 10 all.append("version") 11 except DistributionNotFound: /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in get_distribution(dist) 464 dist = Requirement.parse(dist) 465 if isinstance(dist, Requirement): --> 466 dist = get_provider(dist) 467 if not isinstance(dist, Distribution): 468 raise TypeError("Expected string, Requirement, or Distribution", dist) /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in get_provider(moduleOrReq) 340 """Return an IResourceProvider for the named module or requirement""" 341 if isinstance(moduleOrReq, Requirement): --> 342 return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0] 343 try: 344 module = sys.modules[moduleOrReq] /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in require(self, *requirements) 884 included, even if they were already activated in this working set. 
885 """ --> 886 needed = self.resolve(parse_requirements(requirements)) 887 888 for dist in needed: /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in resolve(self, requirements, env, installer, replace_conflicting, extras) 778 779 # push the new requirements onto the stack --> 780 new_requirements = dist.requires(req.extras)[::-1] 781 requirements.extend(new_requirements) 782 /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in requires(self, extras) 2732 def requires(self, extras=()): 2733 """List of Requirements needed for this distro if extras are used""" -> 2734 dm = self._dep_map 2735 deps = [] 2736 deps.extend(dm.get(None, ())) /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in _dep_map(self) 3016 return self.__dep_map 3017 except AttributeError: -> 3018 self.__dep_map = self._compute_dependencies() 3019 return self.__dep_map 3020 /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in _compute_dependencies(self) 3025 reqs = [] 3026 # Including any condition expressions -> 3027 for req in self._parsed_pkg_info.get_all('Requires-Dist') or []: 3028 reqs.extend(parse_requirements(req)) 3029 /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in _parsed_pkg_info(self) 3007 return self._pkg_info 3008 except AttributeError: -> 3009 metadata = self.get_metadata(self.PKG_INFO) 3010 self._pkg_info = email.parser.Parser().parsestr(metadata) 3011 return self._pkg_info /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in get_metadata(self, name) 1405 return "" 1406 path = self._get_metadata_path(name) -> 1407 value = self._get(path) 1408 try: 1409 return value.decode('utf-8') /opt/conda/lib/python3.7/site-packages/pkg_resources/init.py in _get(self, path) 1609 1610 def _get(self, path): -> 1611 with open(path, 'rb') as stream: 1612 return stream.read() 1613 FileNotFoundError: [Errno 2] No such file or directory: '/opt/conda/lib/python3.7/site-packages/scikit_learn-0.23.2.dist-info/METADATA'

    Installation 
    opened by Enpen 3
  • Can't control the behavior of cache mechanism correctly.

    When I'm using a Dask dataframe as input for HyperGBM (v0.2.2) with the option
    use_cache=False and without specifying cache_dir,

    it gives the error below:

    [ERROR] E hypergbm.hyper_gbm.py 584 - FileNotFoundError: [Errno 2] No such file or directory: '/tmp/workdir/hypergbm_cache/22805_16_32f8e48ef673a57e6644a4b1b295fb66_4355a2f3c42cdb4a7b3c3f53ee8a26b5.parquet/part.0.parquet'
    [ERROR] Traceback (most recent call last):
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/hyper_gbm.py", line 582, in _save_df
    [ERROR]     to_parquet(df, filepath, fs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tabular_toolbox/persistence.py", line 93, in to_parquet
    [ERROR]     result = dask.compute(parts)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/dask/base.py", line 565, in compute
    [ERROR]     results = schedule(dsk, keys, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 2654, in get
    [ERROR]     results = self.gather(packed, asynchronous=asynchronous, direct=direct)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1969, in gather
    [ERROR]     asynchronous=asynchronous,
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 838, in sync
    [ERROR]     self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/utils.py", line 351, in sync
    [ERROR]     raise exc.with_traceback(tb)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/utils.py", line 334, in f
    [ERROR]     result[0] = yield future
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tornado/gen.py", line 762, in run
    [ERROR]     value = future.result()
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1828, in _gather
    [ERROR]     raise exception.with_traceback(traceback)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tabular_toolbox/persistence.py", line 54, in _arrow_write_parquet
    [ERROR]     pq.write_table(tbl, target_path, filesystem=filesystem, **pa_options)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/pyarrow/parquet.py", line 1797, in write_table
    [ERROR]     **kwargs) as writer:
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/pyarrow/parquet.py", line 609, in __init__
    [ERROR]     path, compression=None)
    [ERROR]   File "pyarrow/_fs.pyx", line 660, in pyarrow._fs.FileSystem.open_output_stream
    [ERROR]     out_handle = GetResultValue(self.fs.OpenOutputStream(pathstr))
    [ERROR]   File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
    [ERROR]     return check_status(status)
    [ERROR]   File "pyarrow/_fs.pyx", line 1072, in pyarrow._fs._cb_open_output_stream
    [ERROR]     stream = handler.open_output_stream(frombytes(path))
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/pyarrow/fs.py", line 314, in open_output_stream
    [ERROR]     return PythonFile(self.fs.open(path, mode="wb"), mode="w")
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypernets/utils/_fsutils.py", line 105, in execute
    [ERROR]     result = fn(self.to_rpath(rpath), *args, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/fsspec/spec.py", line 943, in open
    [ERROR]     **kwargs,
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/fsspec/implementations/local.py", line 118, in _open
    [ERROR]     return LocalFileOpener(path, mode, fs=self, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/fsspec/implementations/local.py", line 200, in __init__
    [ERROR]     self._open()
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/fsspec/implementations/local.py", line 205, in _open
    [ERROR]     self.f = open(self.path, mode=self.mode)
    [ERROR] 
    

    And it sometimes results in another error:

    [ERROR] 07-14 16:24:18 E hypernets.e._experiment.py 85 - ExperiementID:[None] - evaluate feature importance: 
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypernets/experiment/_experiment.py", line 75, in run
    [ERROR]     y_eval=self.y_eval, eval_size=self.eval_size, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/experiment.py", line 1116, in train
    [ERROR]     return super().train(hyper_model, X_train, y_train, X_test, X_eval, y_eval, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/experiment.py", line 839, in train
    [ERROR]     step.fit_transform(hyper_model, X_train, y_train, X_test=X_test, X_eval=X_eval, y_eval=y_eval, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/experiment.py", line 431, in fit_transform
    [ERROR]     importances = feature_importance_batch(estimators, X_eval, y_eval, self.scorer, n_repeats=5)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/feature_importance.py", line 73, in feature_importance_batch
    [ERROR]     random_state=random_state)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tabular_toolbox/dask_ex.py", line 356, in permutation_importance
    [ERROR]     col_scores.append(scorer(estimator, X_permuted, y))
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 170, in __call__
    [ERROR]     sample_weight=sample_weight)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 247, in _score
    [ERROR]     y_pred = method_caller(clf, "predict_proba", X)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 53, in _cached_call
    [ERROR]     return getattr(estimator, method)(*args, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tabular_toolbox/dask_ex.py", line 274, in call_and_compute
    [ERROR]     r = fn_call(*args, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/hyper_gbm.py", line 483, in predict_proba
    [ERROR]     proba = getattr(self.gbm_model, method)(X)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/hypergbm/estimators.py", line 382, in predict_proba
    [ERROR]     proba = dex.fix_binary_predict_proba_result(proba)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tabular_toolbox/dask_ex.py", line 261, in fix_binary_predict_proba_result
    [ERROR]     proba = make_chunk_size_known(proba)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/tabular_toolbox/dask_ex.py", line 142, in make_chunk_size_known
    [ERROR]     a = a.compute_chunk_sizes()
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/dask/array/core.py", line 1274, in compute_chunk_sizes
    [ERROR]     [tuple([int(chunk) for chunk in chunks]) for chunks in compute(tuple(c))[0]]
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/dask/base.py", line 565, in compute
    [ERROR]     results = schedule(dsk, keys, **kwargs)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 2654, in get
    [ERROR]     results = self.gather(packed, asynchronous=asynchronous, direct=direct)
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1969, in gather
    [ERROR]     asynchronous=asynchronous,
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 838, in sync
    [ERROR]     self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    [ERROR]   File "/usr/local/lib/python3.7/site-packages/distributed/utils.py", line 351, in sync
    [ERROR]     raise exc.with_traceback(tb)
    
    

    The first error is logged by this code in hyper_gbm.py:

        def _save_df(self, filepath, df):
            try:
                # with fs.open(filepath, 'wb') as f:
                #     df.to_parquet(f)
                if not isinstance(df, pd.DataFrame):
                    fs.mkdirs(filepath, exist_ok=True)
                to_parquet(df, filepath, fs)
            except Exception as e:
                logger.error(e)
                # traceback.print_exc()
                if fs.exists(filepath):
                    fs.rm(filepath, recursive=True)
    

    I can see that, before the invocation of to_parquet, the filepath has already been created by fs. Here I'm confused about:

    1. While the default cache_dir is hypergbm_cache, where does the prefix /tmp/workdir/ come from? The only location related to /tmp/workdir/ is in hypernets\utils\_fsutils.py:

        if type(fs).__name__.lower().find('local') >= 0:
           if fs_root is None or fs_root == '':
               fs_root = os.path.join(tempfile.gettempdir(), 'workdir')
      
    2. Is the path created by fs the same as the path used in to_parquet, where there are also some operations related to the file system?

    3. Why does the error disappear in Jupyter?


    Now, about the cache mechanism:

    The use_cache option cannot control the caching behavior of hypergbm.hyper_gbm.HyperGBMEstimator.predict and hypergbm.hyper_gbm.HyperGBMEstimator.predict_proba in steps such as hypergbm.experiment.PermutationImportanceSelectionStep or hypergbm.experiment.EnsembleStep, where the HyperGBMEstimator is loaded from the training trials and the predict method is invoked with use_cache=None. Then, in hypergbm.hyper_gbm.HyperGBMEstimator.transform_data:

      def transform_data(self, X, y=None, fit=False, use_cache=None, verbose=0):
          if use_cache is None:
              use_cache = True

    the None default falls back to True, which results in intermediate data being saved.


    To avoid this, I think the easiest way may be to change use_cache to False when it is None in HyperGBMEstimator's transform_data.

    Or, fix to_parquet.


    Looking forward to the fix.

    bug 
    opened by ArkiZh 3
  • Feature generation error

    Please make sure that this is a bug.

    System information

    • OS Platform and Distribution (e.g., CentOS 7.6): Docker Python:3.9 | Debian GNU/Linux 11
    • Python version: 3.9.13
    • HyperGBM version: 0.2.5.4
    • Other Python packages(run pip list):

    anyio 3.6.1 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 asttokens 2.0.5 attrs 22.1.0 Babel 2.10.3 backcall 0.2.0 bcrypt 3.2.2 beautifulsoup4 4.11.1 bleach 5.0.1 bokeh 2.4.3 catboost 1.0.6 certifi 2022.6.15 cffi 1.15.1 charset-normalizer 2.1.0 click 8.1.3 cloudpickle 2.1.0 convertdate 2.4.0 cryptography 37.0.4 cycler 0.11.0 dask 2022.8.0 dask-glm 0.2.0 dask-ml 2022.5.27 debugpy 1.6.2 decorator 5.1.1 defusedxml 0.7.1 distributed 2022.8.0 entrypoints 0.4 executing 0.9.1 fastjsonschema 2.16.1 featuretools 1.12.1 Flask 2.2.1 fonttools 4.34.4 fsspec 2022.7.1 graphviz 0.20.1 gunicorn 20.1.0 hboard 0.1.0 hboard-widget 0.1.0 HeapDict 1.0.1 hijri-converter 2.2.4 holidays 0.14.2 hypergbm 0.2.5.4 hypernets 0.2.5.4 idna 3.3 imbalanced-learn 0.9.1 importlib-metadata 4.12.0 iniconfig 1.1.1 ipykernel 6.15.1 ipython 8.4.0 ipython-genutils 0.2.0 ipywidgets 7.7.1 itsdangerous 2.1.2 jedi 0.18.1 jieba 0.42.1 Jinja2 3.1.2 joblib 1.1.0 json5 0.9.9 jsonschema 4.9.1 jupyter 1.0.0 jupyter-client 7.3.4 jupyter-console 6.4.4 jupyter-core 4.11.1 jupyter-server 1.18.1 jupyterlab 3.4.4 jupyterlab-pygments 0.2.2 jupyterlab-server 2.15.0 jupyterlab-widgets 1.1.1 kiwisolver 1.4.4 korean-lunar-calendar 0.2.1 lightgbm 3.3.2 llvmlite 0.39.0 locket 1.0.0 MarkupSafe 2.1.1 matplotlib 3.5.2 matplotlib-inline 0.1.3 mistune 0.8.4 msgpack 1.0.4 multipledispatch 0.6.0 nbclassic 0.4.3 nbclient 0.6.6 nbconvert 6.5.0 nbformat 5.4.0 nest-asyncio 1.5.5 notebook 6.4.12 notebook-shim 0.1.0 numba 0.56.0 numpy 1.22.4 packaging 21.3 pandas 1.4.3 pandocfilters 1.5.0 paramiko 2.11.0 parso 0.8.3 partd 1.2.0 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.2.0 pip 22.0.4 plotly 5.9.0 pluggy 1.0.0 prettytable 3.3.0 prometheus-client 0.14.1 prompt-toolkit 3.0.30 psutil 5.9.1 ptyprocess 0.7.0 pure-eval 0.2.2 py 1.11.0 py4j 0.10.9.5 pyarrow 9.0.0 pycparser 2.21 Pygments 2.12.0 PyMeeus 0.5.11 PyNaCl 1.5.0 pyparsing 3.0.9 pyrsistent 0.18.1 pyspark 3.3.0 pytest 7.1.2 python-dateutil 2.8.2 python-geohash 0.8.5 pytz 2022.1 PyYAML 6.0 pyzmq 23.2.0 qtconsole 5.3.2 QtPy 2.2.0 requests 2.28.1 scikit-learn 1.1.2 scikit-plot 0.3.7 scipy 1.9.0 seaborn 0.11.2 Send2Trash 1.8.0 setuptools 58.1.0 shap 0.41.0 six 1.16.0 slicer 0.0.7 sniffio 1.2.0 sortedcontainers 2.4.0 soupsieve 2.3.2.post1 stack-data 0.3.0 tblib 1.7.0 tenacity 8.0.1 terminado 0.15.0 threadpoolctl 3.1.0 tinycss2 1.1.1 tomli 2.0.1 toolz 0.12.0 tornado 6.1 tqdm 4.64.0 traitlets 5.3.0 typing_extensions 4.3.0 urllib3 1.26.11 wcwidth 0.2.5 webencodings 0.5.1 websocket-client 1.3.3 Werkzeug 2.2.1 wheel 0.37.1 widgetsnbextension 3.6.1 woodwork 0.17.2 xgboost 1.6.1 XlsxWriter 3.0.3 zict 2.2.0 zipp 3.8.1

    Describe the current behavior: After enabling feature generation and selecting categorical and continuous feature columns, training fails with an error. (screenshot: 企业微信截图_16653908817866)

    Describe the expected behavior: The training finishes successfully.

    Standalone code to reproduce the issue Provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to Jupyter notebook.

    test_1010.csv train_1010.csv

    from hypergbm import make_experiment
    import json
    
    test={"train_data":"./train_1010.csv",
    "test_data":"./test_1010.csv",
    "pseudo_labeling_sample_number":10,
    "pseudo_labeling":True,
    "max_trials":10,
    "eval_size":0.2,
    "feature_selection_quantile":0.3,
    "train_test_split_strategy":False,
    "ensemble_size":10,
    "random_state":8888,
    "feature_selection":True,
    "feature_reselection_estimator_size":10,
    "drift_detection_remove_size":0.1,
    "uuid":"train_2da2bceca48c4a568cd904ed8aa66db7",
    "test_ratio":0.3,
    "drift_detection_num_folds":5,
    "feature_selection_threshold":0.1,
    "pseudo_labeling_strategy":"threshold",
    "feature_reselection_strategy":"number",
    "down_sample_search":False,
    "feature_generation":True,
    "feature_generation_categories_cols":["join_chl_type","is_permonth_fee"],
    #"feature_generation_date_cols":["],
    "feature_generation_continuous_cols":["sph_num","total_tax_fee"],
    "pseudo_labeling_resplit":True,
    "down_sample_search_max_trials":10,
    "feature_selection_number":0.8,
    "collinearity_detection":True,
    "drift_detection_remove_shift_variable":True,
    "pseudo_labeling_proba_threshold":0.1,
    "early_stopping_time_limit":3600,
    "num_folds":5,
    "searcher":"evolution",
    "drift_detection_min_features":10,
    "down_sample_search_size":0.9,
    "drift_detection":True,
    "retrain_on_wholedata":True,
    "early_stopping_rounds":10,
    "log_level":"debug",
    "data_cleaner_args":{
    "drop_idness_columns":True,
    "drop_constant_columns":True,
    "drop_label_nan_rows":True,
    "drop_duplicated_columns":True,
    "nan_chars":"/t"
    },
    "feature_reselection":False,
    "drift_detection_variable_shift_threshold":0.7,
    "drift_detection_threshold":0.7,
    "target":"churn_target",
    "testRatio":0.3,
    "task":"binary",
    "cv":True,
    "feature_selection_strategy":"threshold",
    "reward_metric":"auc",
    "feature_generation":True,
    "early_stopping_reward":0.8,
    "feature_reselection_number":10,
    "down_sample_search_time_limit":300
    }
    
    ex = make_experiment(**test)
    model = ex.run()
    

    Are you willing to submit PR?(Yes/No) Yes

    Other info / logs Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

    bug 
    opened by lyheimat 2
  • Dependencies are broken. Release 0.2.5.4 (the stable)

    System information

    • OS Platform and Distribution (e.g., CentOS 7.6): Manjaro
    • Python version: 3.10.4
    • HyperGBM version: 0.2.5.4
    • Other Python packages(run pip list):
    Package               Version
    --------------------- ---------
    asttokens             2.0.8
    backcall              0.2.0
    bcrypt                4.0.0
    catboost              1.0.6
    certifi               2022.6.15
    cffi                  1.15.1
    charset-normalizer    2.1.1
    click                 8.1.3
    cloudpickle           2.1.0
    convertdate           2.4.0
    cryptography          37.0.4
    cycler                0.11.0
    dask                  2022.7.1
    decorator             5.1.1
    distributed           2022.7.1
    executing             1.0.0
    featuretools          1.13.0
    fonttools             4.37.1
    fsspec                2022.8.2
    graphviz              0.20.1
    HeapDict              1.0.1
    hijri-converter       2.2.4
    holidays              0.15
    hypergbm              0.2.5.4
    hypernets             0.2.5.4
    idna                  3.3
    imbalanced-learn      0.9.1
    ipython               8.4.0
    jedi                  0.18.1
    Jinja2                3.1.2
    joblib                1.1.0
    kiwisolver            1.4.4
    korean-lunar-calendar 0.2.1
    lightgbm              3.3.2
    locket                1.0.0
    MarkupSafe            2.1.1
    matplotlib            3.5.3
    matplotlib-inline     0.1.6
    msgpack               1.0.4
    numpy                 1.23.2
    packaging             21.3
    pandas                1.4.4
    paramiko              2.11.0
    parso                 0.8.3
    partd                 1.3.0
    pexpect               4.8.0
    pickleshare           0.7.5
    Pillow                9.2.0
    pip                   22.1.2
    plotly                5.10.0
    prettytable           3.4.0
    prompt-toolkit        3.0.30
    psutil                5.9.1
    ptyprocess            0.7.0
    pure-eval             0.2.2
    pyarrow               9.0.0
    pycparser             2.21
    Pygments              2.13.0
    PyMeeus               0.5.11
    PyNaCl                1.5.0
    pyparsing             3.0.9
    python-dateutil       2.8.2
    python-geohash        0.8.5
    pytz                  2022.2.1
    PyYAML                6.0
    requests              2.28.1
    scikit-learn          1.1.2
    scipy                 1.9.1
    seaborn               0.11.2
    setuptools            63.4.1
    six                   1.16.0
    sortedcontainers      2.4.0
    stack-data            0.5.0
    tblib                 1.7.0
    tenacity              8.0.1
    threadpoolctl         3.1.0
    toolz                 0.12.0
    tornado               6.1
    tqdm                  4.64.0
    traitlets             5.3.0
    urllib3               1.26.12
    wcwidth               0.2.5
    wheel                 0.37.1
    woodwork              0.18.0
    xgboost               1.6.2
    XlsxWriter            3.0.3
    zict                  2.2.0
    
    

    Describe the current behavior: Failed to run a simple demo.

    Describe the expected behavior

    Standalone code to reproduce the issue

    Environment:

    conda clean -a
    pip cache purge
    conda create -n gbm
    conda activate gbm
    conda install pip
    pip install hypergbm
    
    

    Log: Successfully installed MarkupSafe-2.1.1 XlsxWriter-3.0.3 asttokens-2.0.8 backcall-0.2.0 bcrypt-4.0.0 catboost-1.0.6 cffi-1.15.1 charset-normalizer-2.1.1 click-8.1.3 cloudpickle-2.1.0 convertdate-2.4.0 cryptography-37.0.4 cycler-0.11.0 dask-2022.7.1 decorator-5.1.1 distributed-2022.7.1 executing-1.0.0 featuretools-1.13.0 fonttools-4.37.1 fsspec-2022.8.2 graphviz-0.20.1 heapdict-1.0.1 hijri-converter-2.2.4 holidays-0.15 hypergbm-0.2.5.4 hypernets-0.2.5.4 idna-3.3 imbalanced-learn-0.9.1 ipython-8.4.0 jedi-0.18.1 jinja2-3.1.2 joblib-1.1.0 kiwisolver-1.4.4 korean-lunar-calendar-0.2.1 lightgbm-3.3.2 locket-1.0.0 matplotlib-3.5.3 matplotlib-inline-0.1.6 msgpack-1.0.4 numpy-1.23.2 packaging-21.3 pandas-1.4.4 paramiko-2.11.0 parso-0.8.3 partd-1.3.0 pexpect-4.8.0 pickleshare-0.7.5 pillow-9.2.0 plotly-5.10.0 prettytable-3.4.0 prompt-toolkit-3.0.30 psutil-5.9.1 ptyprocess-0.7.0 pure-eval-0.2.2 pyarrow-9.0.0 pycparser-2.21 pygments-2.13.0 pymeeus-0.5.11 pynacl-1.5.0 pyparsing-3.0.9 python-dateutil-2.8.2 python-geohash-0.8.5 pytz-2022.2.1 pyyaml-6.0 requests-2.28.1 scikit-learn-1.1.2 scipy-1.9.1 seaborn-0.11.2 six-1.16.0 sortedcontainers-2.4.0 stack-data-0.5.0 tblib-1.7.0 tenacity-8.0.1 threadpoolctl-3.1.0 toolz-0.12.0 tornado-6.1 tqdm-4.64.0 traitlets-5.3.0 urllib3-1.26.12 wcwidth-0.2.5 woodwork-0.18.0 xgboost-1.6.2 zict-2.2.0

    Code sample:

    import pandas as pd
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    
    X,y = datasets.load_breast_cancer(as_frame=True,return_X_y=True)
    X_train,X_test,y_train,y_test = train_test_split(X,y,train_size=0.7,random_state=335)
    train_data = pd.concat([X_train,y_train],axis=1)
    
    from hypergbm import make_experiment
    
    
    experiment = make_experiment(train_data, target='target', reward_metric='precision')
    estimator = experiment.run()
    

    The output:

    09-01 14:28:01 E hypernets.m.hyper_model.py 71 - run_trail failed! trail_no=7
    09-01 14:28:01 E hypernets.m.hyper_model.py 73 - Traceback (most recent call last):
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypernets/model/hyper_model.py", line 61, in _run_trial
        scores, oof, oof_scores = estimator.fit_cross_validation(X, y, stratified=True, num_folds=num_folds,
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypergbm/hyper_gbm.py", line 312, in fit_cross_validation
        fold_est.fit(x_train_fold, y_train_fold, **fit_kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypergbm/estimators.py", line 482, in fit
        return self.fit_with_encoder(super().fit, X, y, kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypergbm/estimators.py", line 181, in fit_with_encoder
        return fn_fit(X, y, **kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/core.py", line 575, in inner_f
        return f(**kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/sklearn.py", line 1400, in fit
        self._Booster = train(
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/core.py", line 575, in inner_f
        return f(**kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/training.py", line 159, in train
        _assert_new_callback(callbacks)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/training.py", line 25, in _assert_new_callback
        raise ValueError(
    ValueError: Old style callback was removed in version 1.6.  See: https://xgboost.readthedocs.io/en/latest/python/callbacks.html.
    
    09-01 14:28:01 E hypernets.m.hyper_model.py 71 - run_trail failed! trail_no=8
    09-01 14:28:01 E hypernets.m.hyper_model.py 73 - Traceback (most recent call last):
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypernets/model/hyper_model.py", line 61, in _run_trial
        scores, oof, oof_scores = estimator.fit_cross_validation(X, y, stratified=True, num_folds=num_folds,
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypergbm/hyper_gbm.py", line 312, in fit_cross_validation
        fold_est.fit(x_train_fold, y_train_fold, **fit_kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypergbm/estimators.py", line 482, in fit
        return self.fit_with_encoder(super().fit, X, y, kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/hypergbm/estimators.py", line 181, in fit_with_encoder
        return fn_fit(X, y, **kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/core.py", line 575, in inner_f
        return f(**kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/sklearn.py", line 1400, in fit
        self._Booster = train(
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/core.py", line 575, in inner_f
        return f(**kwargs)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/training.py", line 159, in train
        _assert_new_callback(callbacks)
      File "~/miniconda3/envs/gbm/lib/python3.10/site-packages/xgboost/training.py", line 25, in _assert_new_callback
        raise ValueError(
    ValueError: Old style callback was removed in version 1.6.  See: https://xgboost.readthedocs.io/en/latest/python/callbacks.html.
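
    A plausible workaround, added here as an editorial note rather than taken from the report: the ValueError comes from XGBoost 1.6 removing old-style callbacks, which the installed hypergbm/hypernets 0.2.5.4 still passes, so pinning XGBoost below 1.6 at install time should sidestep the failure:

    pip install hypergbm "xgboost<1.6"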
    
    opened by ArkiZh 2
  • collinearity_detection raises an error when training with Dask distributed

    collinearity_detection raises an error when training with Dask distributed

    Please make sure that this is a bug.

    System information

    • OS Platform and Distribution (e.g., CentOS 7.6): CentOS 7.9 Docker 20.10.14
    • Python version: 3.9
    • HyperGBM version: 0.2.5.4
    • Other Python packages(run pip list): anyio==3.6.1 argon2-cffi==21.3.0 argon2-cffi-bindings==21.2.0 asttokens==2.0.5 attrs==22.1.0 Babel==2.10.3 backcall==0.2.0 bcrypt==3.2.2 beautifulsoup4==4.11.1 bleach==5.0.1 bokeh==2.4.3 catboost==1.0.6 certifi==2022.6.15 cffi==1.15.1 charset-normalizer==2.1.0 click==8.1.3 cloudpickle==2.1.0 convertdate==2.4.0 cryptography==37.0.4 cycler==0.11.0 dask==2022.7.1 dask-glm==0.2.0 dask-ml==2022.5.27 debugpy==1.6.2 decorator==5.1.1 defusedxml==0.7.1 distributed==2022.7.1 entrypoints==0.4 executing==0.9.1 fastjsonschema==2.16.1 featuretools==1.12.0 fonttools==4.34.4 fsspec==2022.7.1 graphviz==0.20.1 hboard==0.1.0 hboard-widget==0.1.0 HeapDict==1.0.1 hijri-converter==2.2.4 holidays==0.14.2 hypergbm==0.2.5.4 hypernets==0.2.5.4 idna==3.3 imbalanced-learn==0.9.1 importlib-metadata==4.12.0 iniconfig==1.1.1 ipykernel==6.15.1 ipython==8.4.0 ipython-genutils==0.2.0 ipywidgets==7.7.1 jedi==0.18.1 jieba==0.42.1 Jinja2==3.1.2 joblib==1.1.0 json5==0.9.9 jsonschema==4.9.0 jupyter-client==7.3.4 jupyter-core==4.11.1 jupyter-server==1.18.1 jupyterlab==3.4.4 jupyterlab-pygments==0.2.2 jupyterlab-server==2.15.0 jupyterlab-widgets==1.1.1 kiwisolver==1.4.4 korean-lunar-calendar==0.2.1 lightgbm==3.3.2 llvmlite==0.39.0 locket==1.0.0 MarkupSafe==2.1.1 matplotlib==3.5.2 matplotlib-inline==0.1.3 mistune==0.8.4 msgpack==1.0.4 multipledispatch==0.6.0 nbclassic==0.4.3 nbclient==0.6.6 nbconvert==6.5.0 nbformat==5.4.0 nest-asyncio==1.5.5 notebook==6.4.12 notebook-shim==0.1.0 numba==0.56.0 numpy==1.22.4 packaging==21.3 pandas==1.4.3 pandocfilters==1.5.0 paramiko==2.11.0 parso==0.8.3 partd==1.2.0 pexpect==4.8.0 pickleshare==0.7.5 Pillow==9.2.0 plotly==5.9.0 pluggy==1.0.0 prettytable==3.3.0 prometheus-client==0.14.1 prompt-toolkit==3.0.30 psutil==5.9.1 ptyprocess==0.7.0 pure-eval==0.2.2 py==1.11.0 pyarrow==8.0.0 pycparser==2.21 Pygments==2.12.0 PyMeeus==0.5.11 PyNaCl==1.5.0 pyparsing==3.0.9 pyrsistent==0.18.1 pytest==7.1.2 python-dateutil==2.8.2 python-geohash==0.8.5 pytz==2022.1 PyYAML==6.0 pyzmq==23.2.0 requests==2.28.1 scikit-learn==1.1.1 scipy==1.9.0 seaborn==0.11.2 Send2Trash==1.8.0 shap==0.41.0 six==1.16.0 slicer==0.0.7 sniffio==1.2.0 sortedcontainers==2.4.0 soupsieve==2.3.2.post1 stack-data==0.3.0 tblib==1.7.0 tenacity==8.0.1 terminado==0.15.0 threadpoolctl==3.1.0 tinycss2==1.1.1 tomli==2.0.1 toolz==0.12.0 tornado==6.1 tqdm==4.64.0 traitlets==5.3.0 typing_extensions==4.3.0 urllib3==1.26.11 wcwidth==0.2.5 webencodings==0.5.1 websocket-client==1.3.3 widgetsnbextension==3.6.1 woodwork==0.17.1 xgboost==1.6.1 XlsxWriter==3.0.3 zict==2.2.0 zipp==3.8.1

    Describe the current behavior: We want to train a dataset via Dask distributed with 3 nodes.

    Describe the expected behavior: Training finishes successfully.

    Standalone code to reproduce the issue: Provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to a Jupyter notebook. dask.zip
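
    The attached dask.zip is not reproduced here. As a rough sketch of the kind of setup being described (the scheduler address, file path, and target column are illustrative assumptions; collinearity_detection follows the experiment option of the same name, which should be checked against the installed HyperGBM version):

    from dask.distributed import Client
    from dask import dataframe as dd
    from hypergbm import make_experiment

    # Connect to the existing 3-node cluster (address is a placeholder)
    client = Client('tcp://scheduler:8786')

    # Load the training data as a Dask DataFrame (path and target are placeholders)
    train_data = dd.read_csv('train.csv')

    experiment = make_experiment(train_data, target='y',
                                 collinearity_detection=True)
    estimator = experiment.run()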

    Are you willing to submit PR? (Yes/No) Yes

    Other info / logs: Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. [Three WeCom screenshots attached.]

    opened by SonixLegend 2
  • In the feature selection stage, no feature is selected.

    In the feature selection stage, no feature is selected.

    In the feature selection stage, no feature is selected, and an error was returned:

    07-11 13:49:59 I hypernets.e.compete.py 882 - feature_selection drop 10 columns, 0 kept

    raise ValueError("No data output, ??? ")
    ValueError: No data output, ???

    Could a protection mechanism be added so that at least a few features are retained when selected is None?
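
    As a sketch of such a guard from the caller's side (the feature_selection_* keyword names follow hypernets' make_experiment options and are assumptions to verify against the installed version): asking the selection step for a fixed number of top features, rather than a pure threshold, means it can never output zero columns.

    from hypergbm import make_experiment
    from hypernets.tabular.datasets import dsutils

    train_data = dsutils.load_blood()

    # Keep the top-N features by importance instead of dropping by threshold,
    # so at least N columns always survive selection (keyword names assumed).
    experiment = make_experiment(train_data, target='Class',
                                 feature_selection=True,
                                 feature_selection_strategy='number',
                                 feature_selection_number=3)
    estimator = experiment.run()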

    opened by zhangxjohn 0
Releases: 0.2.5.7
Owner: DataCanvas (We love data science!)