A Powerful Serverless Analysis Toolkit That Takes Trial And Error Out of Machine Learning Projects

Overview


KXY: A Seamless API to 10x The Productivity of Machine Learning Engineers


Documentation

https://www.kxy.ai/reference/

Installation

From PyPI:

pip install kxy

From GitHub:

git clone https://github.com/kxytechnologies/kxy-python.git && cd ./kxy-python && pip install .
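
After either installation route, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name installed_version is illustrative and not part of the kxy package.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("kxy")
    if version is None:
        print("kxy is not installed in this environment")
    else:
        print(f"kxy {version} is installed")
```

The same helper can be pointed at any other distribution name to check whether a dependency is present.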

Authentication

All heavy-duty computations are run on our serverless infrastructure and require an API key. To configure the package with your API key, run

kxy configure

and follow the instructions. To get an API key you need an account; you can sign up for a free trial here. You'll then be automatically given an API key which you can find here.

KXY is free for academic use.

Docker

The Docker image kxytechnologies/kxy has been built for your convenience, and comes with anaconda, auto-sklearn, and the kxy package.

To start a Jupyter Notebook server from a sandboxed Docker environment, run

docker run -i -t -p 5555:8888 kxytechnologies/kxy:latest /bin/bash -c "kxy configure <YOUR API KEY> && /opt/conda/bin/jupyter notebook --notebook-dir=/opt/notebooks --ip='*' --port=8888 --no-browser --allow-root --NotebookApp.token=''"

where you should replace <YOUR API KEY> with your API key and navigate to http://localhost:5555 in your browser. This Docker environment comes with all the examples available on the documentation website.

To start a Jupyter Notebook server from an existing directory of notebooks, run

docker run -i -t --mount src=</path/to/your/local/dir>,target=/opt/notebooks,type=bind -p 5555:8888 kxytechnologies/kxy:latest /bin/bash -c "kxy configure <YOUR API KEY> && /opt/conda/bin/jupyter notebook --notebook-dir=/opt/notebooks --ip='*' --port=8888 --no-browser --allow-root --NotebookApp.token=''"

where you should replace <YOUR API KEY> with your API key and </path/to/your/local/dir> with the path to your local notebook folder, and navigate to http://localhost:5555 in your browser.
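
The two Docker invocations differ only by the bind mount, so they can be assembled from one helper. The sketch below builds the argument list without running anything. It assumes that kxy configure accepts the API key as its argument, as the "replace with your API key" wording suggests. The function name docker_command and its parameters are illustrative.

```python
import shlex
from typing import List, Optional

IMAGE = "kxytechnologies/kxy:latest"
JUPYTER = (
    "/opt/conda/bin/jupyter notebook --notebook-dir=/opt/notebooks "
    "--ip='*' --port=8888 --no-browser --allow-root --NotebookApp.token=''"
)


def docker_command(
    api_key: str,
    notebook_dir: Optional[str] = None,
    host_port: int = 5555,
) -> List[str]:
    """Build the `docker run` argument list for the KXY notebook image."""
    # Assumption: `kxy configure` takes the API key as a positional argument.
    inner = f"kxy configure {shlex.quote(api_key)} && {JUPYTER}"
    cmd = ["docker", "run", "-i", "-t"]
    if notebook_dir is not None:
        # Bind-mount a local notebook folder over the image's notebook directory.
        cmd += ["--mount", f"src={notebook_dir},target=/opt/notebooks,type=bind"]
    cmd += ["-p", f"{host_port}:8888", IMAGE, "/bin/bash", "-c", inner]
    return cmd


if __name__ == "__main__":
    # Print a copy-pastable shell command; nothing is executed here.
    print(shlex.join(docker_command("<YOUR API KEY>")))
```

Passing the resulting list to subprocess.run would start the container; printing it first makes it easy to review before running.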

Other Programming Languages

We plan to release friendly API clients in more programming languages.

In the meantime, you can directly issue requests to our RESTful API using your favorite programming language.
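
As a rough illustration of what a direct request might look like, the sketch below builds (but does not send) an authenticated JSON POST using only the Python standard library. The base URL, the endpoint path, the payload fields, and the x-api-key header name are all placeholders chosen for illustration, not taken from the KXY documentation. Consult the API reference for the real values.

```python
import json
import urllib.request

# Hypothetical base URL, for illustration only.
BASE_URL = "https://api.example.com/v1"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a JSON POST request carrying the API key in a header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is passed in an `x-api-key` header.
            "x-api-key": api_key,
        },
    )


if __name__ == "__main__":
    request = build_request("some-endpoint", {"example": 1}, "<YOUR API KEY>")
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request)
```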

Comments
  • error in import kxy

    Hi, After installing the kxy package and configuring the API key, the import kxy shows the error below:

    .../python3.9/site-packages/kxy/pfs/pfs_selector.py in <module>
          6 import numpy as np
          7 
    ----> 8 import tensorflow as tf
          9 from tensorflow.keras.callbacks import EarlyStopping, TerminateOnNaN
         10 from tensorflow.keras.optimizers import Adam
    
    ModuleNotFoundError: No module named 'tensorflow'
    
    

    what version of tensorflow is needed for kxy to work?

    opened by zeydabadi 2
  • generate_features Documentation?

    Is there any documentation on how to use the generate_features function? It doesn't appear in the documentation and I can't find it on GitHub, e.g. how to use the entity column, how to format time-series data in advance for it, etc. Thanks!

    opened by ddofer 1
  • error kxy.data_valuation

    Hi, After running chievable_performance_df = X_train_reduced.kxy.data_valuation(target_column='state', problem_type='classification', include_mutual_information=True, anonymize=True) I get the following error and the function does not return anything:

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/lib/python3.9/asyncio/tasks.py", line 258, in __step
        result = coro.throw(exc)
      File "/home/lucy/Downloads/general/lib/python3.9/site-packages/tornado/websocket.py", line 1104, in wrapper
        raise WebSocketClosedError()
    tornado.websocket.WebSocketClosedError
    Task exception was never retrieved
    future: <Task finished name='Task-46004' coro=<WebSocketProtocol13.write_message..wrapper() done, defined at /home/lucy/Downloads/general/lib/python3.9/site-packages/tornado/websocket.py:1100> exception=WebSocketClosedError()>
    Traceback (most recent call last):
      File "/home/lucy/Downloads/general/lib/python3.9/site-packages/tornado/websocket.py", line 1102, in wrapper
        await fut
      File "/usr/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
        future.result()
    tornado.iostream.StreamClosedError: Stream is closed

    opened by zeydabadi 0
Releases(v1.4.10)
  • v1.4.10(Apr 25, 2022)

    Change Log

    v.1.4.10 Changes

    • Added a function to construct features derived from PFS mutual information estimation that should be expected to be linearly related to the target.
    • Fixed a global name conflict in kxy.learning.base_learners.

    v.1.4.9 Changes

    • Changed the activation function used by PFS from ReLU to swish/SiLU.
    • Left it to the user to set the logging level.
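
For readers unfamiliar with the two activation functions named in v1.4.9, here is a small pure-Python sketch of their standard definitions. It only illustrates the textbook formulas and is not taken from the kxy source code.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x: float) -> float:
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x), smooth and non-monotonic near 0."""
    return x * sigmoid(x)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  silu={silu(x):.4f}")
```

Unlike ReLU, SiLU is differentiable everywhere and lets small negative values through, which is one common motivation for swapping one for the other.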

    v.1.4.8 Changes

    • Froze the versions of all python packages in the docker file.

    v.1.4.7 Changes

    Changes related to optimizing Principal Feature Selection.

    • Made it easy to change PFS' default learning parameters.
    • Changed PFS' default learning parameters (learning rate is now 0.005 and epsilon 1e-04)
    • Added a seed parameter to PFS' fit for reproducibility.

    To globally change the learning rate to 0.003, change Adam's epsilon to 1e-5, and the number of epochs to 25, do

    from kxy.misc.tf import set_default_parameter
    set_default_parameter('lr', 0.003)
    set_default_parameter('epsilon', 1e-5)
    set_default_parameter('epochs', 25)
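
To make the roles of the learning rate and of Adam's epsilon concrete, the following sketch implements one step of the textbook Adam update rule in pure Python. The lr and epsilon defaults mirror the values quoted in these notes. The beta1 and beta2 values are the commonly used defaults and are an assumption here, as is the toy objective. This is not kxy's or TensorFlow's internal implementation.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.005, beta1=0.9, beta2=0.999, epsilon=1e-4):
    """Apply one textbook Adam update and return (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for step t >= 1
    v_hat = v / (1 - beta2 ** t)
    # lr scales the step; epsilon keeps the denominator away from zero.
    param = param - lr * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


if __name__ == "__main__":
    # Toy objective f(w) = (w - 3)^2, with gradient 2 * (w - 3).
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, 26):  # 25 steps, echoing the epochs value above
        w, m, v = adam_step(w, 2 * (w - 3), m, v, t)
    print(f"w after 25 steps: {w:.4f}")
```

A larger epsilon damps the step when the squared-gradient estimate is small, while a smaller learning rate shrinks every step uniformly.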
    

    To change the number of epochs for a single iteration of PFS, use the epochs argument of the fit method of your PFS object. The fit method now also has a seed parameter you may use to make the PFS implementation deterministic.

    Example:

    from kxy.pfs import PFS
    selector = PFS()
    selector.fit(x, y, epochs=25, seed=123)
    

    Alternatively, you may also use the kxy.misc.tf.set_seed method to make PFS deterministic.
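
Why a seed makes a stochastic fit reproducible can be seen on a toy model. The sketch below is a stand-in and not PFS: it fits a line by stochastic gradient descent with random initialization and random shuffling, both driven by a single seeded generator. The function fit_line and its parameters are illustrative.

```python
import random


def fit_line(xs, ys, epochs=25, seed=None, lr=0.05):
    """Fit y = w * x + b by SGD; all randomness comes from one seeded generator."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)  # random initialization
    b = rng.uniform(-1.0, 1.0)
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)    # random visiting order each epoch
        for i in indices:
            err = w * xs[i] + b - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
    print(fit_line(xs, ys, seed=123))
    print(fit_line(xs, ys, seed=123))  # identical to the line above
```

With the same seed, the initialization and every shuffle repeat exactly, so the fitted parameters are identical from run to run.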

    v.1.4.6 Changes

    Minor PFS improvements.

    • Adding more (robust) mutual information loss functions.
    • Exposing the learned total mutual information between principal features and target as an attribute of PFS.
    • Exposing the number of epochs as a parameter of PFS' fit.
  • v1.4.9(Apr 12, 2022)

  • v1.4.8(Apr 11, 2022)

  • v1.4.7(Apr 10, 2022)

  • v1.4.6(Apr 10, 2022)

  • v1.4.5(Apr 9, 2022)

  • v1.4.4(Apr 8, 2022)

  • v0.3.2(Aug 14, 2020)

  • v0.3.0(Aug 3, 2020)

    Adding a maximum-entropy based classifier (kxy.MaxEntClassifier) and regressor (kxy.MaxEntRegressor) following the scikit-learn signature for fitting and predicting.

    These models estimate the posterior mean E[u_y|x] and the posterior standard deviation sqrt(Var[u_y|x]) for any specific value of x, where the copula-uniform representations (u_y, u_x) follow the maximum-entropy distribution.

    Predictions in the primal are derived from E[u_y|x].
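
As background, a copula-uniform representation is usually obtained by mapping each variable through its own cumulative distribution function, and its empirical version uses normalized ranks. The sketch below shows that standard construction in pure Python. It is an assumption that this matches the representation meant here, and the maximum-entropy estimation itself is not shown.

```python
import math


def copula_uniform(values):
    """Map observations to (0, 1) through their normalized ranks, rank / (n + 1)."""
    n = len(values)
    # Indices sorted by value; ties keep their order of appearance.
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


if __name__ == "__main__":
    x = [10.0, -2.0, 3.5]
    print(copula_uniform(x))
    # A strictly increasing transformation leaves the representation unchanged.
    print(copula_uniform([math.exp(v) for v in x]))
```

Because only ranks matter, the representation is invariant to strictly increasing transformations of the data.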

  • v0.2.0(Jun 25, 2020)

    • Regression analyses now fully support categorical variables.
    • Foundations for multi-output regressions are laid.
    • Categorical variables are now systematically encoded and treated as continuous, consistent with what's done at the learning stage.
    • Regression and classification are further normalized, and most of the compute for classification problems now takes place on the API side, and should be considerably faster.
  • v0.0.18(May 26, 2020)

  • v0.0.16(May 18, 2020)

  • v0.0.15(May 18, 2020)

  • v0.0.14(May 18, 2020)

  • v0.0.13(May 16, 2020)

  • v0.0.11(May 13, 2020)

  • v0.0.10(May 11, 2020)

Owner
KXY Technologies, Inc.