moDel Agnostic Language for Exploration and eXplanation

Overview

An unverified black box model is the path to failure. Opaqueness leads to distrust. Distrust leads to neglect. Neglect leads to rejection.

The DALEX package X-rays any model and helps to explore and explain its behaviour, and to understand how complex models work. The main function explain() creates a wrapper around a predictive model. Wrapped models may then be explored and compared with a collection of local and global explainers that implement recent developments from the area of Interpretable Machine Learning/eXplainable Artificial Intelligence.
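
For example, a minimal sketch of creating an explainer (here using the titanic_imputed data shipped with DALEX, where column 8 is the survived target):

library("DALEX")

# fit any predictive model, here a simple logistic regression
model <- glm(survived ~ gender + age + fare,
             data = titanic_imputed, family = "binomial")

# wrap it with an explainer: data holds the predictors, y the observed target
explainer <- explain(model,
                     data = titanic_imputed[, -8],
                     y = titanic_imputed$survived,
                     label = "titanic glm")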

The philosophy behind DALEX explanations is described in the Explanatory Model Analysis e-book. The DALEX package is a part of the DrWhy.AI universe.

If you work with scikit-learn, keras, H2O, tidymodels, xgboost, mlr or mlr3 in R, you may be interested in the DALEXtra package, an extension of DALEX with easy-to-use explain_*() functions for models created in these libraries.

An additional overview of the dalex Python package is available.

Installation

The DALEX R package can be installed from CRAN

install.packages("DALEX")

The dalex Python package is available on PyPI and conda-forge

pip install dalex -U

conda install -c conda-forge dalex

Learn more

Machine learning models are widely used in classification and regression tasks. Due to increasing computational power and the availability of new data sources and methods, ML models are becoming more and more complex. Models created with techniques like boosting, bagging, or neural networks are effectively black boxes: it is hard to trace the link between input variables and model outcomes. They are used because of their high performance, but their lack of interpretability is one of their biggest weaknesses.

In many applications we need to know, understand, or prove how input variables are used in a model and what impact they have on the final prediction. DALEX is a set of tools that help to understand how complex models work.
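
Continuing the sketch above, a wrapped model can be explored with dataset-level (global) explainers; the calls below are a sketch using the DALEX 2.x function names:

model_performance(explainer)   # residual-based performance measures

vip <- model_parts(explainer)  # permutation-based variable importance
plot(vip)

pdp <- model_profile(explainer, variables = "age")  # partial dependence profile
plot(pdp)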

Resources

R package

Python package

Talks about DALEX

Citation

If you use DALEX in R or dalex in Python, please cite our JMLR papers:

@article{JMLR:v19:18-416,
  author  = {Przemyslaw Biecek},
  title   = {DALEX: Explainers for Complex Predictive Models in R},
  journal = {Journal of Machine Learning Research},
  year    = {2018},
  volume  = {19},
  number  = {84},
  pages   = {1-5},
  url     = {http://jmlr.org/papers/v19/18-416.html}
}

@article{JMLR:v22:20-1473,
  author  = {Hubert Baniecki and
             Wojciech Kretowicz and
             Piotr Piatyszek and 
             Jakub Wisniewski and 
             Przemyslaw Biecek},
  title   = {dalex: Responsible Machine Learning 
             with Interactive Explainability and Fairness in Python},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {214},
  pages   = {1-7},
  url     = {http://jmlr.org/papers/v22/20-1473.html}
}

Why

Seventy-six years ago, Isaac Asimov devised the Three Laws of Robotics: 1) a robot may not injure a human being, 2) a robot must obey the orders given to it by human beings, and 3) a robot must protect its own existence. These laws shape the discussion around the ethics of AI. Today's robots, like cleaning robots, robotic pets, or autonomous cars, are far from being conscious enough to fall under Asimov's ethics.

Today we are surrounded by complex predictive algorithms used for decision making. Machine learning models are used in health care, politics, education, the judiciary, and many other areas. Black box predictive models have a far larger influence on our lives than physical robots. Yet applications of such models are left unregulated despite many examples of their potential harmfulness. See Weapons of Math Destruction by Cathy O'Neil for an excellent overview of potential problems.

It's clear that we need to control algorithms that may affect us. Such control is part of our civic rights. Here we propose three requirements that any predictive model should fulfill.

  • Prediction's justification. For every prediction of a model, one should be able to understand which variables affect the prediction and how strongly: variable attribution to the final prediction.
  • Prediction's speculation. For every prediction of a model, one should be able to understand how the prediction would change if the input variables were changed: hypothesizing about what-if scenarios.
  • Prediction's validation. For every prediction of a model, one should be able to verify how strong the evidence is that confirms this particular prediction.

There are two ways to comply with these requirements. One is to use only models that fulfill these conditions by design: white-box models like linear regression or decision trees. In many cases the price for transparency is lower performance. The other way is to use approximated explainers – techniques that find only approximate answers but work for any black box model. Here we present such techniques.
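
As a sketch, the instance-level functions of DALEX map roughly onto these three requirements (continuing the explainer from the Overview; exact arguments may differ between versions):

new_obs <- titanic_imputed[1, ]

# justification: additive variable attributions for a single prediction
predict_parts(explainer, new_observation = new_obs, type = "break_down")

# speculation: what-if (ceteris paribus) profiles for a single prediction
predict_profile(explainer, new_observation = new_obs)

# validation: diagnostics based on residuals for similar observations
predict_diagnostics(explainer, new_observation = new_obs)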

Acknowledgments

Work on this package was financially supported by the NCN Opus grant 2016/21/B/ST6/02176 and NCN Opus grant 2017/27/B/ST6/0130.

Comments
  • tensorflow.python.keras.engine.functional.Functional Exception: Data must be 1-dimensional

    Hi, I want to use dalex with a model created by AutoKeras; with a self-made Sequential model it worked. What is the problem?

    Model from Autokeras structured_data_classifier : <tensorflow.python.keras.engine.functional.Functional at 0x1bbab227340> https://autokeras.com/structured_data_classifier/

    explainer = dx.Explainer(
        model = model,
        data = X_test,
        y = y_test,
        model_type= "classification")
    
    #%%
    explainer.model_performance()
    
    explainer.model_parts().plot()
    
    Python 🐍 invalid ❕ 
    opened by blanpa 16
  • How to create an explainer on tuned models with mlr3

    I have the following code for parameter tuning with mlr3:

    df = readARFF("xerces.arff")
    task = TaskRegr$new("df", df, target = "price")
    learner = lrn("regr.rpart")
    resampling = rsmp("repeated_cv", folds = 3, repeats = 3)
    measure = msr("regr.mae")

    search_space = paradox::ParamSet$new(
      params = list(paradox::ParamDbl$new("cp", lower = 0.001, upper = 0.1)))
    terminator = trm("evals", n_evals = 15)

    tuner = tnr("grid_search")

    at = AutoTuner$new(learner = learner, resampling = resampling, measure = measure,
                       search_space = search_space, terminator = terminator, tuner = tuner,
                       store_tuning_instance = TRUE, store_benchmark_result = TRUE,
                       store_models = TRUE)

    bmr = benchmark(grid, store_models = TRUE)

    My question is: how can I create an explainer on this model?

    explainer = explain_mlr3( ??,...........)

    question ❔ R 🐳 
    opened by Nehagupta90 15
  • Error in `predict_surrogate` when `new_observation` has a target value

    I have used the break-down method for instance-level explanation and it works fine. I have never used the LIME method before, and now when I use it, it gives me the following error:

    Error in [.data.frame(explainer$data, , colnames(new_observation)) : undefined columns selected

    My code is:

    explainer5 = explain_mlr3(model5, data = test[,-21],
                              y = as.numeric(test$report)-1, label = "SVM")

    new_observation = test[6,]
    # This works fine
    plot(predict_parts(explainer5, new_observation = new_observation,
                       type = "break_down_interactions"))

    # Problem is in the following code

    model_type.dalex_explainer <- DALEXtra::model_type.dalex_explainer
    predict_model.dalex_explainer <- DALEXtra::predict_model.dalex_explainer

    lime_tool <- predict_surrogate(explainer = explainer5,
                                   new_observation = new_observation,
                                   n_features = 3, n_permutations = 1000,
                                   type = "lime")

    Error in [.data.frame(explainer$data, , colnames(new_observation)) : undefined columns selected

    What could be the problem? I am following the example at https://ema.drwhy.ai/LIME.html.

    bug 💣 R 🐳 
    opened by Nehagupta90 13
  • How to interpret an instance-level explanation

    Hi, I am a bit confused about a plot of an instance-level explanation. I have to predict something with a value from 0 to 10, where 10 means a product is very expensive and 0 means it is not expensive at all. I get a prediction of 0.7 (which of course means not expensive) and the majority of input features have a negative impact (red bars instead of green bars). So my question is how to interpret this explanation: if the value of one of these input features decreases, will the prediction decrease (go from 0.7 towards 0.0) or increase (go from 0.7 towards higher values such as 10)?

    Thank you, and I hope you have understood my point.

    question ❔ 
    opened by Nehagupta90 13
  • Discussion: default for loss_function in model_parts

    I've noticed different results of feature_importance in DALEX and ingredients. This is caused by different default values for loss_function. I was wondering whether they shouldn't be the same, for consistency.

    https://github.com/ModelOriented/DALEX/blob/8858926c4abb5b7cd7d1a16bfd9f6df2434003b0/R/model_parts.R#L43

    https://github.com/ModelOriented/ingredients/blob/9f8f82f05ddb87e4abf2fb20846f097113522a33/R/feature_importance.R#L152

    > library("DALEX")
    > model_titanic_glm <- glm(survived ~ gender + age + fare,
                             data = titanic_imputed, family = "binomial")
    > explain_titanic_glm <- explain(model_titanic_glm,
                                   data = titanic_imputed[,-8],
                                   y = titanic_imputed[,8])
    
    > DALEX::feature_importance(explain_titanic_glm, n_sample = NULL) 
          variable mean_dropout_loss label
    1 _full_model_          370.1496    lm
    2        class          370.1496    lm
    3     embarked          370.1496    lm
    4        sibsp          370.1496    lm
    5        parch          370.1496    lm
    6          age          371.6003    lm
    7         fare          384.4722    lm
    8       gender          548.2130    lm
    9   _baseline_          599.7132    lm
    
    > ingredients::feature_importance(explain_titanic_glm)
          variable mean_dropout_loss label
    1 _full_model_         0.4095316    lm
    2        class         0.4095316    lm
    3     embarked         0.4095316    lm
    4        sibsp         0.4095316    lm
    5        parch         0.4095316    lm
    6          age         0.4101522    lm
    7         fare         0.4173782    lm
    8       gender         0.4976695    lm
    9   _baseline_         0.5178448    lm
    
    > ingredients::feature_importance(explain_titanic_glm, loss_function = DALEX::loss_sum_of_squares)
          variable mean_dropout_loss label
    1 _full_model_          370.1496    lm
    2        class          370.1496    lm
    3     embarked          370.1496    lm
    4        sibsp          370.1496    lm
    5        parch          370.1496    lm
    6          age          371.0894    lm
    7         fare          384.0622    lm
    8       gender          546.5614    lm
    9   _baseline_          598.0323    lm
    

    (Because of https://github.com/ModelOriented/DALEX/issues/175, n_sample = NULL in DALEX::feature_importance)

    R 🐳 
    opened by kasiapekala 13
  • PDP with default and optimized settings

    Hello

    I want to evaluate the partial dependence profile of a variable with an RF model with default settings and with optimized settings. Can I do it?

    default_model = model_profile(explainer = my_explainer, variables = "var1")
    opt_model = model_profile(explainer = my_explainer, variables = "var1")

    question ❔ 
    opened by Nehagupta90 10
  • iBreakDown::local_attributions() throws Error: "undefined columns selected"

    I trained 5 models with caret::train. The following code works for svmRadial, gbm and rf, BUT NOT for lm, knn:

    DALEX.explainer <- model %>%
      DALEX::explain(
        data = features,
        y = model$trainingData$.outcome >= UPPER.MIN,
        label = paste(model$method, " model"),
        colorize = TRUE
      )
    
    DALEX.attribution <- DALEX.explainer %>%
      iBreakDown::local_attributions(
        local.obs,
        keep_distributions = TRUE
      )
    

    throws error:

    Error in `[.data.frame`(newdata, , object$method$center, drop = FALSE) : 
      undefined columns selected
    

    In the documentation, I couldn't find any hint that the function is model-dependent. Any clue why this doesn't work for lm and knn??

    opened by agilebean 10
  • plot predict_parts in R gives scatterplot instead of break-down-plot

    I'm using DALEX in R, trying to create a break_down plot from a new observation and an XGBoost model.

    I successfully created the explain object and the predict_parts break_down data frame, which BTW took ages (more than a day to calculate – does anyone have an intuitive idea why it takes so long or how to speed things up?).

    But plotting the predict_parts only gives a scatterplot; I was expecting a nice waterfall-looking plot.

    I'm sorry for not being able to post a minimal reproducible example or share the data or model, but here is the snippet:

    library(DALEX)
      
    explain <- DALEX::explain(trained_model, data = train, y = as.numeric(train$PRUEFERGEBNIS)-1, type = "classification", label = "xgboost", predict_function = 
                                  function(trained_model, test){
                                    previous_na_action <- options('na.action')
                                    options(na.action='na.pass')
                                    sparse_matrix_test <- sparse.model.matrix(PRUEFERGEBNIS ~.
                                                                              , data = test)
                                    options(na.action=previous_na_action$na.action)
                                    
                                    results_test <- predict(trained_model, sparse_matrix_test, type = "response")
                                    round(results_test,3)
                                  })
    
    parts_plot <- predict_parts(explain, new_observation = test[1, , drop = FALSE], type = "break_down")
    plot(parts_plot)
    

    R 🐳 invalid ❕ 
    opened by MJimitater 9
  • Working with case weights

    Wonderful package, thanks a lot, I am using it quite frequently.

    Here are some suggestions, especially having models with case weights in mind:

    1. The function model_performance allows passing a loss function of the residuals (as a single argument) only. This seems too restrictive, as many loss functions cannot be written as a function of the residuals but rather of the observed and the predicted values. Most DALEX functions allow a general loss function with two arguments, not just one. This inconsistency is a problem, e.g., when we want to plot deviance residuals of a logistic regression.

    2. Ability to enhance plots with ggplot2 functions, i.e. allowing to add e.g. + geom_point(...).

    3. variable_importance: In our case, the custom predict function depends on a weight vector w (basically the raw predictions are multiplied by w). Since the DALEX interface requires this vector to be a column in the data frame passed to explain, w currently appears in the variable_importance output as well. Would it be an option to add an optional "dropVars = NULL" argument to variable_importance (or even in explain)?

    4. Like issue 3, but for the single_prediction part. Here, it is not only an aesthetic problem if a "special" column like case weight w appears.

    feature 💡 long term 📆 R 🐳 
    opened by mayer79 9
  • devtools::install_github("pbiecek/DALEX") hangs and will not complete the DALEX install

    I did the prescribed devtools::install_github("pbiecek/DALEX"), but RStudio (latest version under Linux) will not complete the DALEX install.

    After the above command, R just hangs and nothing happens.

    I waited for 5 minutes, then suspended the installation.

    What am I missing in order to install DALEX? (All other R packages have installed with no problems.)

    Thanks!

    opened by sfd99 9
  • Difference between PD profile and CP profile when we manually specify the instances

    Hi

    If I have 100 observations/instances and want to create the PD profile of a few important features, what is the difference between the PD profile of these instances using

    variab <- c("var1", "var2", "var3")
    pdp <- model_profile(explainer = explainer, variables = variab)
    plot(pdp)

    and the CP profile when we manually specify the instances, like

    new_observation <- data[c(2,7,8,11,15,16,21,24,25,26,30,45,46), ]
    cp <- ceteris_paribus(explainer, new_observation)
    cp_agg <- aggregate_profiles(cp, variables = variab)
    plot(cp_agg)

    opened by Nehagupta90 8
  • `loss_accuracy` returns 0 for `mean_dropout_loss`

    I would like to use loss_accuracy as my loss function in model_parts(), but whenever I use it, the mean_dropout_loss is always 0. I have tried loss_accuracy for regression, classification, and multiclass classification (see reprex below). Am I using it correctly?

    library(DALEX)
    #> Welcome to DALEX (version: 2.4.2).
    #> Find examples and detailed introduction at: http://ema.drwhy.ai/
    library(ranger)
    df <- mtcars[, c('mpg', 'cyl', 'disp', 'hp', 'vs')]
    # Regression
    reg <- lm(mpg ~ ., data = df)
    explainer_reg <- explain(reg, data = df[,-1], y = df[,1])
    #> Preparation of a new explainer is initiated
    #>   -> model label       :  lm  (  default  )
    #>   -> data              :  32  rows  4  cols 
    #>   -> target variable   :  32  values 
    #>   -> predict function  :  yhat.lm  will be used (  default  )
    #>   -> predicted values  :  No value for predict function target column. (  default  )
    #>   -> model_info        :  package stats , ver. 4.2.2 , task regression (  default  ) 
    #>   -> predicted values  :  numerical, min =  12.56206 , mean =  20.09062 , max =  27.04625  
    #>   -> residual function :  difference between y and yhat (  default  )
    #>   -> residuals         :  numerical, min =  -4.019038 , mean =  1.010303e-14 , max =  6.976988  
    #>   A new explainer has been created!
    feature_importance(explainer_reg, loss_function = loss_accuracy)
    #>       variable mean_dropout_loss label
    #> 1 _full_model_                 0    lm
    #> 2          cyl                 0    lm
    #> 3         disp                 0    lm
    #> 4           hp                 0    lm
    #> 5           vs                 0    lm
    #> 6   _baseline_                 0    lm
    # Classification
    classif <- glm(vs ~ ., data = df, family = binomial)
    explainer_classif <- explain(classif, data = df[,-5], y = df[,5])
    #> Preparation of a new explainer is initiated
    #>   -> model label       :  lm  (  default  )
    #>   -> data              :  32  rows  4  cols 
    #>   -> target variable   :  32  values 
    #>   -> predict function  :  yhat.glm  will be used (  default  )
    #>   -> predicted values  :  No value for predict function target column. (  default  )
    #>   -> model_info        :  package stats , ver. 4.2.2 , task classification (  default  ) 
    #>   -> predicted values  :  numerical, min =  7.696047e-06 , mean =  0.4375 , max =  0.9920295  
    #>   -> residual function :  difference between y and yhat (  default  )
    #>   -> residuals         :  numerical, min =  -0.9474062 , mean =  -1.483608e-12 , max =  0.5318376  
    #>   A new explainer has been created!
    feature_importance(explainer_classif, loss_function = loss_accuracy)
    #>       variable mean_dropout_loss label
    #> 1 _full_model_                 0    lm
    #> 2          mpg                 0    lm
    #> 3          cyl                 0    lm
    #> 4         disp                 0    lm
    #> 5           hp                 0    lm
    #> 6   _baseline_                 0    lm
    # Multiclass classification
    multiclass <- ranger(cyl ~ ., data = df, probability = TRUE)
    explainer_multiclass <- explain(multiclass, data = df[,-2], y = df[,2])
    #> Preparation of a new explainer is initiated
    #>   -> model label       :  ranger  (  default  )
    #>   -> data              :  32  rows  4  cols 
    #>   -> target variable   :  32  values 
    #>   -> predict function  :  yhat.ranger  will be used (  default  )
    #>   -> predicted values  :  No value for predict function target column. (  default  )
    #>   -> model_info        :  package ranger , ver. 0.14.1 , task multiclass (  default  ) 
    #>   -> model_info        :  Model info detected multiclass task but 'y' is a numeric .  (  WARNING  )
    #>   -> model_info        :  By deafult multiclass tasks supports only factor 'y' parameter. 
    #>   -> model_info        :  Consider changing to a factor vector with true class names.
    #>   -> model_info        :  Otherwise I will not be able to calculate residuals or loss function.
    #>   -> predicted values  :  predict function returns multiple columns:  3  (  default  ) 
    #>   -> residual function :  difference between 1 and probability of true class (  default  )
    #>   -> residuals         :  the residual_function returns an error when executed (  WARNING  ) 
    #>   A new explainer has been created!
    feature_importance(explainer_multiclass, loss_function = loss_accuracy)
    #>       variable mean_dropout_loss  label
    #> 1 _full_model_                 0 ranger
    #> 2          mpg                 0 ranger
    #> 3         disp                 0 ranger
    #> 4           hp                 0 ranger
    #> 5           vs                 0 ranger
    #> 6   _baseline_                 0 ranger
    

    Created on 2022-12-29 with reprex v2.0.2

    When I try other loss functions (e.g., loss_root_mean_square for regression, loss_one_minus_auc for classification), they return non-zero values.

    feature_importance(explainer_reg, loss_function = loss_root_mean_square)
    #>       variable mean_dropout_loss label
    #> 1 _full_model_          2.844520    lm
    #> 2           vs          2.861546    lm
    #> 3           hp          3.328176    lm
    #> 4         disp          4.201312    lm
    #> 5          cyl          4.498485    lm
    #> 6   _baseline_          7.777811    lm
    feature_importance(explainer_classif, loss_function = loss_one_minus_auc)
    #>       variable mean_dropout_loss label
    #> 1 _full_model_        0.03571429    lm
    #> 2          mpg        0.04603175    lm
    #> 3         disp        0.04642857    lm
    #> 4           hp        0.31785714    lm
    #> 5          cyl        0.36031746    lm
    #> 6   _baseline_        0.51884921    lm
    

    Created on 2022-12-29 with reprex v2.0.2

    Is there something different about how loss_accuracy is used?

    I'm using DALEX v2.4.2, R v4.2.2, RStudio v2022.12.0+353, Ubuntu 22.04.1

    R 🐳 invalid ❕ 
    opened by JeffreyRStevens 8
  • Fairness module: why using those specific checks, while omitting others?

    Hi, I am using your fairness package for my project, and I wondered about your fairness checks:

    For the test "equal opportunity ratio" you use TPR, not FNR (the true ratio, not the complementary false ratio). But for the test "predictive equality ratio" you use FPR, not TNR (the false ratio, not the complementary true ratio).

    I wondered what is the justification for TPR and FPR specifically (and not TPR and TNR for example). Thanks!

    question ❔ 
    opened by ronnee19 1
  • collinearity issue in permutation based feature importance (technical question)

    I use DALEX in most of my ML research projects. I keep getting criticism from reviewers regarding a drawback of permutation-based methods in the presence of multiple correlated predictors. They argue that if there is a group of highly correlated but important predictors, they may not show up at the top of the feature importance. Can someone comment on this criticism? How is this addressed in DALEX?

    question ❔ 
    opened by asheetal 7
  • [fastai] 'DataFrame' object has no attribute 'to_frame' with fastai

    I'm trying to wrap a fastai tabular learner with DALEX. I got a 'DataFrame' object has no attribute 'to_frame' error with dx.Explainer(learn, xs, y, label = "Deep NN"). Are there any potential problems with this line of code? Thanks!

    feature 💡 question ❔ Python 🐍 
    opened by ming-cui 12
  • python model_parts: Is there show_boxplots parameter?

    In the R package, show_boxplots = TRUE plots the permutations in the feature importance plot. In the Python model_parts, this parameter does not seem to exist. Or is it called something else?

    feature 💡 Python 🐍 
    opened by asheetal 1