Machine Learning automation and tracking

Overview



The Open-Source MLOps Orchestration Framework

MLRun is an open-source MLOps framework that offers an integrative approach to managing your machine-learning pipelines from early development through model development to full pipeline deployment in production. MLRun offers a convenient abstraction layer to a wide variety of technology stacks while empowering data engineers and data scientists to define features and models.

The MLRun Architecture


MLRun is composed of the following layers:

  • Feature and Artifact Store — handles the ingestion, processing, metadata, and storage of data and features across multiple repositories and technologies.
  • Elastic Serverless Runtimes — converts simple code to scalable and managed microservices with workload-specific runtime engines (such as Kubernetes jobs, Nuclio, Dask, Spark, and Horovod).
  • ML Pipeline Automation — automates data preparation, model training and testing, deployment of real-time production pipelines, and end-to-end monitoring.
  • Central Management — provides a unified portal for managing the entire MLOps workflow. The portal includes a UI, a CLI, and an SDK, which are accessible from anywhere.

Key Benefits

MLRun provides the following key benefits:

  • Rapid deployment of code to production pipelines
  • Elastic scaling of batch and real-time workloads
  • Feature management — ingestion, preparation, and monitoring
  • Works anywhere — your local IDE, multi-cloud, or on-prem

For more information, see the MLRun Python package documentation.


General Concept and Motivation

The Challenge

As an ML developer or data scientist, you typically want to write code in your preferred local development environment (IDE) or web notebook, and then run the same code on a larger cluster using scale-out containers or functions. When you determine that the code is ready, you or someone else needs to transfer the code to an automated ML workflow (for example, using Kubeflow Pipelines). This pipeline should be secure and include capabilities such as logging and monitoring, as well as allow adjustments to relevant components and easy redeployment.

However, the implementation is challenging: various environments ("runtimes") use different configurations, parameters, and data sources. In addition, multiple frameworks and platforms are used to focus on different stages of the development life cycle. This leads to constant development and DevOps/MLOps work.

Furthermore, as your project scales, you need greater computation power or GPUs, and you need to access large-scale data sets. This cannot work on laptops. You need a way to seamlessly run your code on a remote cluster and automatically scale it out.

Why MLRun?

When running ML experiments, you should ideally be able to record and version your code, configuration, outputs, and associated inputs (lineage), so you can easily reproduce and explain your results. The need to use different types of storage (such as files and AWS S3 buckets) and various databases further complicates the implementation.

Wouldn't it be great if you could write the code once, using your preferred development environment and simple "local" semantics, and then run it as-is on different platforms? Imagine a layer that automates the build process, execution, data movement, scaling, versioning, parameterization, outputs tracking, and more. A world of easily developed, published, or consumed data or ML "functions" that can be used to form complex and large-scale ML pipelines.

In addition, imagine a marketplace of ML functions that includes both open-source templates and your internally developed functions, to support code reuse across projects and companies and thus further accelerate your work.

This is the goal of MLRun.

Note: The code is in early development stages and is provided as a reference. The hope is to foster wide industry collaboration and make all the resources pluggable, so that developers can code to a single API and use various open-source projects or commercial products.


Installation

Run the following command from your Python development environment (such as Jupyter Notebook) to install the MLRun package (mlrun), which includes a Python API library and the mlrun command-line interface (CLI):

pip install mlrun

MLRun requires separate containers for the API and the dashboard (UI). You can also choose to use the pre-baked JupyterLab image.

To install and run MLRun locally using Docker or Kubernetes, see the instructions in the MLRun documentation.

Installation on the Iguazio Data Science Platform

MLRun runs as a service on the Iguazio Data Science Platform (version 2.8 and above). To access the MLRun UI, select it from the services screen. Consult Iguazio support for further details.


Examples and Tutorial Notebooks

MLRun has many code examples and tutorial Jupyter notebooks with embedded documentation, ranging from examples of basic tasks to full end-to-end use-case applications. Note that some of the examples are found in other mlrun GitHub repositories.


Quick-Start Tutorial — Architecture and Usage Guidelines

Basic Components

MLRun has the following main components:

  • Project — a container for organizing all of your work on a particular activity. Projects consist of metadata, source code, workflows, data and artifacts, models, triggers, and member management for user collaboration.

  • Function — a software package with one or more methods and runtime-specific attributes (such as image, command, arguments, and environment).

  • Run — an object that contains information about an executed function. The run object is created as a result of running a function, and contains the function attributes (such as arguments, inputs, and outputs), as well as the execution status and results (including links to output artifacts).

  • Artifact — versioned data artifacts (such as data sets, files and models) that are produced or consumed by functions, runs, and workflows.

  • Workflow — defines a functions pipeline or a directed acyclic graph (DAG) to execute using Kubeflow Pipelines.

  • UI — a graphical user interface (dashboard) for displaying and managing projects and their contained experiments, artifacts, and code.

Managed and Portable Execution

MLRun supports various types of "runtimes" — computation frameworks such as local, Kubernetes job, Dask, Nuclio, Spark, or MPI job (Horovod). Runtimes may support parallelism and clustering to distribute the work among multiple workers (processes/containers).

The following code example creates a task that defines a run specification — including the run parameters, inputs, and secrets. You run the task on a "job" function, and print the result output (in this case, the "model" artifact) or watch the run's progress. For more information and examples, see the examples/mlrun_basics.ipynb notebook.

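from mlrun import NewTask, new_function

# Create a task and set its attributes
# (`handler` is assumed to be a function that you defined earlier)
task = NewTask(handler=handler, name='demo', params={'p1': 5})
task.with_secrets('file', 'secrets.txt').set_label('type', 'demo')

# Run the task on a "job" function, watch the logs, and print the "model" artifact
run = new_function(command='myfile.py', kind='job').run(task)
run.logs(watch=True)
run.show()
print(run.artifact('model'))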

You can run the same task on different functions — enabling code portability, re-use, and AutoML. You can also use the same function to run different tasks or parameter combinations with minimal coding effort.

Moving from local notebook execution to remote execution — such as running a container job, a scaled-out framework, or an automated workflow engine like Kubeflow Pipelines — is seamless: just swap the runtime function or wire functions in a graph. Continuous build integration and deployment (CI/CD) steps can also be configured as part of the workflow, using the deploy_step function method.

Functions (function objects) can be created by using any of the following methods:

  • new_function — creates a function "from scratch" or from another function.
  • code_to_function — creates a function from local or remote source code or from a web notebook.
  • import_function — imports a function from a local or remote YAML function-configuration file or from a function object in the MLRun database (using a DB address of the format db://<project>/<name>[:<tag>]).
  • function_to_module — imports an MLRun function or code as a local Python module (can also be used inside another parent function).

You can use the save function method to save a function object in the MLRun database, or the export method to save a YAML function-configuration file to your preferred local or remote location. For function-method details and examples, see the embedded documentation/help text.
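
The db://<project>/<name>[:<tag>] address format accepted by import_function can be illustrated with a small parser. This is a minimal sketch of the format only, not MLRun's own parsing code; the names FunctionAddress and parse_db_address are hypothetical, and the example project and function names are borrowed from the function.yaml sample later in this document.

```python
import re
from typing import NamedTuple, Optional


class FunctionAddress(NamedTuple):
    """Hypothetical container for the parts of a db:// function address."""
    project: str
    name: str
    tag: Optional[str]


# db://<project>/<name>[:<tag>]; the ":<tag>" suffix is optional
_DB_ADDRESS = re.compile(
    r"^db://(?P<project>[^/:]+)/(?P<name>[^/:]+)(?::(?P<tag>[^/:]+))?$"
)


def parse_db_address(address: str) -> FunctionAddress:
    """Split a db:// address into project, name, and optional tag."""
    match = _DB_ADDRESS.match(address)
    if match is None:
        raise ValueError(f"not a db://<project>/<name>[:<tag>] address: {address!r}")
    # A missing tag is reported as None; no default tag is assumed here
    return FunctionAddress(match["project"], match["name"], match["tag"])


print(parse_db_address("db://default/remote-git-test:latest"))
print(parse_db_address("db://default/remote-git-test"))
```

The sketch leaves a missing tag as None rather than guessing which tag the database would resolve.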


Automated Parameterization, Artifact Tracking, and Logging

After running a job, you need to be able to track it, including viewing the run parameters, inputs, and outputs. To support this, MLRun introduces a concept of a runtime "context": the code can be set up to get parameters and inputs from the context, as well as log run outputs, artifacts, tags, and time-series metrics in the context.
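
To make the "context" idea concrete, the following is a toy stand-in written in plain Python. It is not the MLRun context class; ToyContext is a hypothetical name, and only the method names get_param, log_result, and set_label mirror calls that appear in the examples in this section. It shows the pattern of reading parameters from the context and logging results back into it.

```python
class ToyContext:
    """Illustrative stand-in for a runtime context; NOT the MLRun class."""

    def __init__(self, params=None):
        self._params = dict(params or {})
        self.results = {}
        self.labels = {}

    def get_param(self, key, default=None):
        # Return the value supplied for this run, or fall back to the default
        return self._params.get(key, default)

    def log_result(self, key, value):
        # Record a scalar output so that it can be inspected after the run
        self.results[key] = value

    def set_label(self, key, value):
        # Attach a key/value label to the run
        self.labels[key] = value


def my_job(context):
    # The job reads its inputs from the context and writes its outputs to it
    p1 = context.get_param("p1", 1)
    context.log_result("accuracy", p1 * 2)
    context.log_result("loss", p1 * 3)
    context.set_label("framework", "sklearn")


ctx = ToyContext(params={"p1": 5})
my_job(ctx)
print(ctx.results)  # {'accuracy': 10, 'loss': 15}
```

Because the job only talks to the context, the same function body can be driven by different parameter sources without changes.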

Example

The following code example from the train-xgboost.ipynb notebook of the MLRun XGBoost demo (demo-xgboost) defines two functions: the iris_generator function loads the Iris data set and saves it to the function's context object; the xgb_train function uses XGBoost to train an ML model on a data set and saves the log results in the function's context:

import xgboost as xgb
import os
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.metrics import accuracy_score
from mlrun.artifacts import PlotArtifact
import pandas as pd


def iris_generator(context):
    iris = load_iris()
    iris_dataset = pd.DataFrame(data=iris.data, columns=iris.feature_names)
    iris_labels = pd.DataFrame(data=iris.target, columns=['label'])
    iris_dataset = pd.concat([iris_dataset, iris_labels], axis=1)
    context.logger.info('Saving Iris data set to "{}"'.format(context.out_path))
    context.log_dataset('iris_dataset', df=iris_dataset)


def xgb_train(context,
              dataset='',
              model_name='model.bst',
              max_depth=6,
              num_class=10,
              eta=0.2,
              gamma=0.1,
              steps=20):

    df = pd.read_csv(dataset)
    X = df.drop(['label'], axis=1)
    y = df['label']

    X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2)
    dtrain = xgb.DMatrix(X_train, label=Y_train)
    dtest = xgb.DMatrix(X_test, label=Y_test)

    # Get parameters from event
    param = {"max_depth": max_depth,
             "eta": eta, "nthread": 4,
             "num_class": num_class,
             "gamma": gamma,
             "objective": "multi:softprob"}

    xgb_model = xgb.train(param, dtrain, steps)

    preds = xgb_model.predict(dtest)
    best_preds = np.asarray([np.argmax(line) for line in preds])

    context.log_result('accuracy', float(accuracy_score(Y_test, best_preds)))
    context.log_model('model', body=bytes(xgb_model.save_raw()), 
                      model_file='model.txt', 
                      metrics=context.results, parameters={'xx':'abc'},
                      labels={'framework': 'xgboost'},
                      artifact_path=context.artifact_subpath('models'))

The example training function can be executed locally with parameters, and the run results and artifacts can be logged automatically into a database by using a single command, as demonstrated in the following example; the example sets the function's eta parameter:

train_run = run_local(handler=xgb_train, params={'eta': 0.3})

Alternatively, you can replace the function with a serverless runtime to run the same code on a remote cluster, which could result in a ~10x performance boost. You can find examples for different runtimes — such as a Kubernetes job, Nuclio, Dask, Spark, or an MPI job — in the MLRun examples directory.

If you run your code from the main function, you can get the runtime context by calling the get_or_create_ctx method, as demonstrated in the following code from the MLRun training.py example application. The code also demonstrates how you can use the context object to read and write execution metadata, parameters, secrets, inputs, and outputs:

from mlrun import get_or_create_ctx
from mlrun.artifacts import ChartArtifact
import pandas as pd


def my_job(context, p1=1, p2='x'):
    # load MLRUN runtime context (will be set by the runtime framework e.g. KubeFlow)

    # get parameters from the runtime context (or use defaults)

    # access input metadata, values, files, and secrets (passwords)
    print(f'Run: {context.name} (uid={context.uid})')
    print(f'Params: p1={p1}, p2={p2}')
    print('accesskey = {}'.format(context.get_secret('ACCESS_KEY')))
    print('file\n{}\n'.format(context.get_input('infile.txt', 'infile.txt').get()))
    
    # Run some useful code e.g. ML training, data prep, etc.

    # log scalar result values (job result metrics)
    context.log_result('accuracy', p1 * 2)
    context.log_result('loss', p1 * 3)
    context.set_label('framework', 'sklearn')

    # log various types of artifacts (file, web page, table), will be versioned and visible in the UI
    context.log_artifact('model', body=b'abc is 123', local_path='model.txt', labels={'framework': 'xgboost'})
    context.log_artifact('html_result', body=b'<b> Some HTML <b>', local_path='result.html')

    # create a chart output (will show in the pipelines UI)
    chart = ChartArtifact('chart')
    chart.labels = {'type': 'roc'}
    chart.header = ['Epoch', 'Accuracy', 'Loss']
    for i in range(1, 8):
        chart.add_row([i, i/20+0.75, 0.30-i/20])
    context.log_artifact(chart)

    raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
                'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
                'age': [42, 52, 36, 24, 73],
                'testScore': [25, 94, 57, 62, 70]}
    df = pd.DataFrame(raw_data, columns=[
        'first_name', 'last_name', 'age', 'testScore'])
    context.log_dataset('mydf', df=df, stats=True)


if __name__ == "__main__":
    context = get_or_create_ctx('train')
    p1 = context.get_param('p1', 1)
    p2 = context.get_param('p2', 'a-string')
    my_job(context, p1, p2)

The example training.py application can be invoked as a local task, as demonstrated in the following code from the MLRun mlrun_basics.ipynb example notebook:

run = run_local(task, command='training.py')

Alternatively, you can invoke the application by using the mlrun CLI; edit the parameters, inputs, and/or secret information, as needed, and ensure that training.py is found in the execution path or edit the file path in the command:

mlrun run --name train -p p2=5 -i infile.txt=s3://my-bucket/infile.txt -s file=secrets.txt training.py


Using Hyperparameters for Job Scaling

Data science involves long computation times and data-intensive tasks. To ensure efficiency and scalability, you need to implement parallelism whenever possible. MLRun supports this by using two mechanisms:

  1. Clustering — run the code on a distributed processing engine (such as Dask, Spark, or Horovod).
  2. Load-balancing/partitioning — split (partition) the work across multiple workers.

MLRun functions and tasks can accept hyperparameters or parameter lists, deploy many parallel workers, and partition the work among the deployed workers. The parallelism implementation is left to the runtime. Each runtime may have its own method of concurrent tasks execution. For example, the Nuclio serverless engine manages many micro threads in the same process, which can run multiple tasks in parallel. In a containerized system like Kubernetes, you can launch multiple containers, each processing a different task.
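
The load-balancing/partitioning idea can be sketched with the Python standard library alone. The following is a generic illustration, not how any MLRun runtime is implemented: partition and process_partition are hypothetical helpers, and a thread pool stands in for the deployed workers.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, num_workers):
    """Split items into num_workers round-robin partitions."""
    return [items[i::num_workers] for i in range(num_workers)]


def process_partition(tasks):
    # Placeholder work: square each task value
    return [t * t for t in tasks]


tasks = list(range(10))
partitions = partition(tasks, num_workers=3)

# Each worker handles one partition; map() returns results in partition order
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_partition, partitions))

print(partitions)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
print(results)     # [[0, 9, 36, 81], [1, 16, 49], [4, 25, 64]]
```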

MLRun supports parallelism. For example, the following code demonstrates how to use hyperparameters to run the XGBoost model-training task from the example in the previous section (xgb_train) with different parameter combinations:

    parameters = {
         "eta":       [0.05, 0.10, 0.20, 0.30],
         "max_depth": [3, 4, 5, 6, 8, 10],
         "gamma":     [0.0, 0.1, 0.2, 0.3],
         }

    task = NewTask(handler=xgb_train, out_path='/User/mlrun/data').with_hyper_params(parameters, 'max.accuracy')
    run = run_local(task)

This code demonstrates how to instruct MLRun to run the same task while choosing the parameters from multiple lists (grid search). MLRun then records all the runs, but marks only the run that best satisfies the selector (here max.accuracy, the run with the highest accuracy) as the selected result. For parallel execution, it is better to use runtimes such as Dask, Nuclio, or jobs.
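
As a rough illustration of what a grid search over these lists involves, the sketch below expands the same parameter lists into every combination and picks one run with a selector string of the form used above. It is plain Python, not MLRun code: expand_grid, select, and fake_train are hypothetical helpers, the selector parsing is an assumption about the "<max|min>.<result>" form, and fake_train is a made-up scoring function rather than a real model.

```python
from itertools import product

parameters = {
    "eta": [0.05, 0.10, 0.20, 0.30],
    "max_depth": [3, 4, 5, 6, 8, 10],
    "gamma": [0.0, 0.1, 0.2, 0.3],
}


def expand_grid(param_lists):
    """Return one dict per combination of the listed parameter values."""
    keys = list(param_lists)
    return [dict(zip(keys, values)) for values in product(*param_lists.values())]


def select(runs, selector):
    """Pick one run by a '<max|min>.<result>' selector string."""
    op, key = selector.split(".", 1)
    chooser = max if op == "max" else min
    return chooser(runs, key=lambda run: run["results"][key])


def fake_train(params):
    # Stand-in scoring function (not a real model): best at eta=0.2, depth=6, gamma=0
    score = 1.0 - abs(params["eta"] - 0.2) - 0.01 * abs(params["max_depth"] - 6)
    return {"accuracy": score - 0.1 * params["gamma"]}


runs = [{"params": p, "results": fake_train(p)} for p in expand_grid(parameters)]
best = select(runs, "max.accuracy")

print(len(runs))       # 96 combinations (4 * 6 * 4)
print(best["params"])  # {'eta': 0.2, 'max_depth': 6, 'gamma': 0.0}
```

The number of runs grows multiplicatively with each added list, which is why distributing the combinations across workers matters.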

Alternatively, you can run a similar task (with hyperparameters) by using the MLRun CLI (mlrun); ensure that training.py is found in the execution path or edit the file path in the command:

mlrun run --name train_hyper -x p1="[3,7,5]" -x p2="[5,2,9]" --out-path '/User/mlrun/data' training.py

You can also use a parameters file if you want to control the parameter combinations or if the parameters are more complex. The following code from the example mlrun_basics.ipynb notebook demonstrates how to run a task that uses a CSV parameters file (params.csv in the current directory):

    task = NewTask(handler=xgb_train).with_param_file('params.csv', 'max.accuracy')
    run = run_local(task)

Note: Parameter lists can be used in various ways. For example, you can pass multiple parameter files and use multiple workers to process the files simultaneously instead of one at a time.
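
A parameters file lets you list exact combinations instead of a full grid. The sketch below reads such a file with the standard csv module. The file layout (a header row of parameter names followed by one row per combination) and the sample values are assumptions for illustration; check the embedded documentation for the layout that with_param_file expects.

```python
import csv
import io

# Hypothetical contents of a params.csv file: one row per parameter combination
PARAMS_CSV = """eta,max_depth,gamma
0.05,3,0.0
0.2,6,0.1
0.3,10,0.3
"""


def load_param_rows(text):
    """Parse CSV text into a list of typed parameter dictionaries."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "eta": float(row["eta"]),
            "max_depth": int(row["max_depth"]),
            "gamma": float(row["gamma"]),
        })
    return rows


combinations = load_param_rows(PARAMS_CSV)
for combo in combinations:
    print(combo)
```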


Automated Code Deployment and Containerization

MLRun adopts Nuclio serverless technologies for automatically packaging code and building containers. This enables you to provide code with some package requirements and let MLRun build and deploy your software.

To build or deploy a function, all you need is to call the function's deploy method, which initiates a build or deployment job. Deployment jobs can be incorporated in pipelines just like regular jobs (using the deploy_step method of the function or Kubernetes-job runtime), thus enabling full automation and CI/CD.

A function can be built from source code or from a function specification, web notebook, Git repo, or TAR archive.

A function can also be built by using the mlrun CLI and providing it with the path to a YAML function-configuration file. You can generate such a file by using the to_yaml or export function method. For example, the following CLI code builds a function from a function.yaml file in the current directory:

mlrun build function.yaml

Following is an example function.yaml configuration file:

kind: job
metadata:
  name: remote-git-test
  project: default
  tag: latest
spec:
  command: 'myfunc.py'
  args: []
  image_pull_policy: Always
  build:
    commands: ['pip install pandas']
    base_image: mlrun/mlrun:dev
    source: git://github.com/mlrun/ci-demo.git

For more examples of building and running functions remotely using the MLRun CLI, see the remote example.

You can also convert your web notebook to a containerized job, as demonstrated in the following sample code; for a similar example with more details, see the mlrun_jobs.ipynb example:

from mlrun import code_to_function, mount_v3io

# Create an ML function from the notebook code and annotations, and attach a
# v3io Iguazio Data Science Platform data volume to the function
fn = code_to_function(kind='job').apply(mount_v3io())

# Prepare an image from the dependencies to allow updating the code and
# parameters per run without the need to build a new image
fn.build(image='mlrun/nuctest:latest')


Running an ML Workflow with Kubeflow Pipelines

ML pipeline execution with MLRun is similar to CLI execution. A pipeline is created by running an MLRun workflow. MLRun automatically saves outputs and artifacts in a way that is visible to Kubeflow Pipelines, and allows interconnecting steps.

For an example of a full ML pipeline that's implemented in a web notebook, see the Sklearn MLRun demo (demos/scikit-learn). The sklearn-project.ipynb demo notebook includes the following code for implementing an ML-training pipeline:

from kfp import dsl
from mlrun import mount_v3io

funcs = {}
DATASET = 'iris_dataset'
LABELS  = "label"

def init_functions(functions: dict, project=None, secrets=None):
    for f in functions.values():
        f.apply(mount_v3io())
        f.spec.image_pull_policy = 'Always'

@dsl.pipeline(
    name="My XGBoost training pipeline",
    description="Shows how to use mlrun."
)
def kfpipeline():
    
    # build our ingestion function (container image)
    builder = funcs['gen-iris'].deploy_step(skip_deployed=True)
    
    # run the ingestion function with the new image and params
    ingest = funcs['gen-iris'].as_step(
        name="get-data",
        handler='iris_generator',
        image=builder.outputs['image'],
        params={'format': 'pq'},
        outputs=[DATASET])

    # analyze our dataset
    describe = funcs["describe"].as_step(
        name="summary",
        params={"label_column": LABELS},
        inputs={"table": ingest.outputs[DATASET]})
    
    # train with hyperparameters
    train = funcs["train"].as_step(
        name="train-skrf",
        params={"model_pkg_class" : "sklearn.ensemble.RandomForestClassifier",
                "sample"          : -1, 
                "label_column"    : LABELS,
                "test_size"       : 0.10},
        hyperparams={'CLASS_n_estimators': [100, 300, 500]},
        selector='max.accuracy',
        inputs={"dataset"         : ingest.outputs[DATASET]},
        outputs=['model', 'test_set'])

    # test and visualize our model
    test = funcs["test"].as_step(
        name="test",
        params={"label_column": LABELS},
        inputs={"models_path" : train.outputs['model'],
                "test_set"    : train.outputs['test_set']})

    # deploy our model as a serverless function
    deploy = funcs["serving"].deploy_step(models={f"{DATASET}_v1": train.outputs['model']})
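
The pipeline above forms a directed acyclic graph: each step can start only after the steps whose outputs it consumes. The following sketch spells out those dependencies and derives one valid execution order with the standard graphlib module (Python 3.9 or later). It only illustrates the DAG idea; it does not reflect how Kubeflow Pipelines schedules steps, and the step names are shorthand for the variables in the pipeline code.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose outputs it consumes
dependencies = {
    "builder": set(),
    "ingest": {"builder"},    # uses builder.outputs['image']
    "describe": {"ingest"},   # uses ingest.outputs[DATASET]
    "train": {"ingest"},      # uses ingest.outputs[DATASET]
    "test": {"train"},        # uses train.outputs['model'] and ['test_set']
    "deploy": {"train"},      # uses train.outputs['model']
}

# static_order() yields every step after all of its dependencies
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Several orders are valid for this graph (for example, describe and train are independent of each other), so the printed order is one possibility rather than the only one.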


Viewing Run Data and Performing Database Operations

When you configure an MLRun database, the results, parameters, and input and output artifacts of each run are recorded in the database. You can view the results and perform operations on the database by using either the MLRun dashboard or the MLRun database methods, which are described in the following sections.


The MLRun Dashboard

The MLRun dashboard is a graphical user interface (GUI) for working with MLRun and viewing run data.




MLRun Database Methods

You can use the get_run_db DB method to get an MLRun DB object for a configured MLRun database or API service. Then, use the DB object's connect method to connect to the database or API service, and use additional methods to perform different operations, such as listing run artifacts or deleting completed runs. For more information and examples, see the mlrun_db.ipynb example notebook, which includes the following sample DB method calls:

from mlrun import get_run_db

# Get an MLRun DB object and connect to an MLRun database/API service.
# Specify the DB path (for example, './' for the current directory) or
# the API URL ('http://mlrun-api:8080' for the default configuration).
db = get_run_db('./')

# List all runs
db.list_runs('').show()

# List all artifacts for version 'latest' (default)
db.list_artifacts('', tag='').show()

# Check different artifact versions
db.list_artifacts('ch', tag='*').show()

# Delete completed runs
db.del_runs(state='completed')


Additional Information and Examples

Replacing Runtime Context Parameters from the CLI

You can use the MLRun CLI (mlrun) to run MLRun functions or code and change the parameter values.

For example, the following CLI command runs the example XGBoost training code from the previous tutorial examples:

python -m mlrun run -p p1=5 -s file=secrets.txt -i infile.txt=s3://mybucket/infile.txt training.py

When running this sample command, the CLI executes the code in the training.py application using the provided run information:

  • The value of parameter p1 is set to 5, overwriting the current parameter value in the run context.
  • The file infile.txt is downloaded from a remote "mybucket" AWS S3 bucket.
  • The credentials for the S3 download are retrieved from a secrets.txt file in the current directory.
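
The effect of a -p key=value option on the run parameters can be sketched as follows. This is an illustration of the override behavior described above, not the mlrun CLI's own parsing code: parse_param is a hypothetical helper, and converting values with ast.literal_eval is an assumption about how a text value such as "5" becomes a number. The default values are the ones used in the training.py example.

```python
import ast


def parse_param(text):
    """Split 'key=value' and convert the value to a Python literal if possible."""
    key, _, raw = text.partition("=")
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Not a Python literal (for example, a bare word or a URL): keep the string
        value = raw
    return key, value


# Defaults used by the training.py example when no parameter is supplied
defaults = {"p1": 1, "p2": "a-string"}

# Values given on the command line, as in: -p p1=5
overrides = dict(parse_param(p) for p in ["p1=5"])

params = {**defaults, **overrides}
print(params)  # {'p1': 5, 'p2': 'a-string'}
```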

Remote Execution

You can also run the same MLRun code that you ran locally as a remote HTTP endpoint.

Nuclio Example

For example, you can wrap the XGBoost training code from the previous tutorial examples within a serverless Nuclio handler function, and execute the code remotely using a similar CLI command to the one that you used locally.

You can run the following code from a Jupyter Notebook to create a Nuclio function from the notebook code and annotations, and deploy the function to a remote cluster.

Note:

  • Before running the code, install the nuclio-jupyter package for using Nuclio from Jupyter Notebook.
  • The example uses apply(mount_v3io()) to attach a v3io Iguazio Data Science Platform data-store volume to the function. By default, the v3io mount mounts the home directory of the platform's running user into the /User function path.

# Create an `xgb_train` Nuclio function from the notebook code and annotations;
# add a v3io data volume and a multi-worker HTTP trigger for parallel execution
fn = code_to_function('xgb_train', kind='nuclio:mlrun')
fn.apply(mount_v3io()).with_http(workers=32)

# Deploy the function
run = fn.run(task, handler='xgb_train')

To execute the code remotely, run the same CLI command as in the previous tutorial examples and just substitute the code file name at the end with your function's URL. For example, run the following command and replace <function-endpoint> with your remote function endpoint:

mlrun run -p p1=5 -s file=secrets.txt -i infile.txt=s3://mybucket/infile.txt http://<function-endpoint>


Running an MLRun Service

An MLRun service is a web service that manages an MLRun database for tracking and logging MLRun run information, and exposes an HTTP API for working with the database and performing MLRun operations.

You can create and run an MLRun service by using either of the following methods:

Note: For both methods, you can optionally configure the service port and/or directory path by setting the MLRUN_httpdb__port and MLRUN_httpdb__dirpath environment variables instead of the respective run parameters or CLI options.
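
The variable names above suggest a naming scheme in which the MLRUN_ prefix marks a configuration setting and a double underscore separates nested keys (httpdb, then port or dirpath). The sketch below shows how such names could map to a nested configuration. This reading of the scheme is an inference from the variable names, config_from_env is a hypothetical helper, and the values are placeholders; it is not MLRun's configuration loader.

```python
def config_from_env(environ, prefix="MLRUN_"):
    """Build a nested dict from prefixed variables, splitting keys on '__'."""
    config = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated environment variables
        path = name[len(prefix):].split("__")
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value  # values stay as strings in this sketch
    return config


# Placeholder environment; in practice you would pass os.environ
env = {
    "MLRUN_httpdb__port": "8080",
    "MLRUN_httpdb__dirpath": "/tmp/mlrun-db",
    "HOME": "/root",
}

print(config_from_env(env))
# {'httpdb': {'port': '8080', 'dirpath': '/tmp/mlrun-db'}}
```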

Using the MLRun CLI to Run an MLRun Service

Use the db command of the MLRun CLI (mlrun) to create and run an instance of the MLRun service from the command line:

mlrun db [OPTIONS]

To see the supported options, run mlrun db --help:

Options:
  -p, --port INTEGER  HTTP port for serving the API
  -d, --dirpath TEXT  Path to the MLRun service directory
Comments
  • Getting error while running workflow on kubernetes


    Hi,

    I'm using minikube to run kubernetes in local system and trying to run workflow defined in demos/sklearn-pipe/sklearn-project.ipynb but getting the below error message.

    Jupyter Cell:

    artifact_path = path.abspath('./pipe/{{workflow.uid}}')
    
    run_id = skproj.run(
        'main',
        arguments={}, 
        artifact_path=artifact_path, 
        dirty=True)
    

    Error message: MaxRetryError: HTTPConnectionPool(host='ml-pipeline.default.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fea36705a90>: Failed to establish a new connection: [Errno -2] Name or service not known'))

    I have followed the instructions mentioned in below readme file https://github.com/mlrun/mlrun/blob/master/hack/local/README.md

    Can anyone help me in resolving the error?

    opened by narendra36 26
  • permission error when trying to run pipeline on kubeflow


    Hi, I am trying to run the demo notebook sklearn-project on a local kubernetes. I have installed kubeflow.

    I get this error when trying to send the pipeline to the api server: 400 Client Error: Bad Request for url: http://mlrun-api:8080/api/projects/sk-project/pipelines?namespace=mlrun&experiment=sk-project-main: details: {'reason': 'MLRunBadRequestError("Failed creating pipeline: HTTPConnectionPool(host='ml-pipeline.mlrun.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9f94278ed0>: Failed to establish a new connection: [Errno -2] Name or service not known'))")'}

    I noticed it tried to call the ml-pipeline on the wrong namespace (it uses the one mlrun is installed on). I changed the namespace to "kubeflow" and now I get this error: 400 Client Error: Bad Request for url: http://mlrun-api:8080/api/projects/sk-project/pipelines?namespace=kubeflow&experiment=sk-project-main: details: {'reason': 'MLRunBadRequestError('Failed creating pipeline: (400)\nReason: Bad Request\nHTTP response headers: HTTPHeaderDict({\'content-type\': \'application/json\', \'date\': \'Mon, 08 Nov 2021 13:40:04 GMT\', \'content-length\': \'708\', \'x-envoy-upstream-service-time\': \'1\', \'server\': \'istio-envoy\', \'x-envoy-decorator-operation\': \'ml-pipeline.kubeflow.svc.cluster.local:8888/*\'})\nHTTP response body: {"error":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []","code":3,"message":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []","details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []","error_details":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []"}]}\n')'}

    also keep in mind that the ml-pipeline service is installed on kubeflow, but I probably need to add the experiment on kubeflow-user-example-com namespace (the default example user namespace created when installing kubeflow).

    In any case - what am I doing wrong?

    opened by ran-haim 20
  • [Bug]: Max retries exceeded with url: /v2/models/cancer-classifier/infer


    MLRun Version checks

    • [X] I have checked that this issue has not already been reported.

    • [X] I have confirmed this bug exists on the latest version of the MLRun Kit.

    Reproducible Example

    original 01-mlrun-basics.ipynb, issues see attached jupyter notebook, you can see this error in case of call serving_fn.invoke("/v2/models/cancer-classifier/infer", body=my_data)
    

    Issue Description

    in case of call serving_fn.invoke("/v2/models/cancer-classifier/infer", body=my_data) I got

    OSError: error: cannot run function at url http://127.0.0.1:54652/v2/models/cancer-classifier/infer, HTTPConnectionPool(host='127.0.0.1', port=54652): Max retries exceeded with url: /v2/models/cancer-classifier/infer (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f175c75b070>: Failed to establish a new connection: [Errno 111] Connection refused'))

    see jupyter Uploading 01-mlrun-basics.ipynb.txt…

    Expected Behavior

    The invocation should succeed without this error. It may be related to https://github.com/mlrun/mlrun/issues/2102

    Python Version

    3.8.8

    MLRun Version

    1.2.0

    Additional Information

    No response

    opened by j0terry 10
  • [Installation]: Docker-compose with Jupyter - SQLite database error

    Installation check

    Installation OS

    Linux

    Installation Method

    Docker

    Kubernetes Cluster Type

    N/A - Docker

    MLRun Kit Helm Chart Version

    Issue Description

    SQLite error inside jupyter container.

    Installation Logs

    > 2022-08-16 15:35:54,802 [info] Initializing DB data
    jupyter_1   | > 2022-08-16 15:35:54,802 [debug] Waiting for database liveness
    jupyter_1   | > 2022-08-16 15:35:54,802 [debug] SQLite DB is used, liveness check not needed
    jupyter_1   | > 2022-08-16 15:35:54,849 [info] No projects in DB, assuming latest data version: {'exc': OperationalError('(sqlite3.OperationalError) unable to open database file'), 'latest_data_version': 2}
    jupyter_1   | > 2022-08-16 15:35:54,878 [info] No projects in DB, assuming latest data version: {'exc': OperationalError('(sqlite3.OperationalError) unable to open database file'), 'latest_data_version': 2}
    jupyter_1   | > 2022-08-16 15:35:54,878 [info] Checking if migration is needed: {'is_migration_from_scratch': True, 'is_schema_migration_needed': True, 'is_data_migration_needed': False, 'is_database_migration_needed': False, 'is_backup_needed': False, 'is_migration_needed': False}
    jupyter_1   | > 2022-08-16 15:35:54,879 [info] Creating initial data
    jupyter_1   | > 2022-08-16 15:35:54,879 [info] Performing schema migration
    jupyter_1   | > 2022-08-16 15:35:54,879 [debug] Performing alembic schema migrations
    jupyter_1   | > 2022-08-16 15:35:54,890 [warning] Migrations failed, changing API state: {'state': 'migrations_failed'}
    jupyter_1   | Traceback (most recent call last):
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3208, in _wrap_pool_connect
    jupyter_1   |     return fn()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 301, in connect
    jupyter_1   |     return _ConnectionFairy._checkout(self)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 761, in _checkout
    jupyter_1   |     fairy = _ConnectionRecord.checkout(pool)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 419, in checkout
    jupyter_1   |     rec = pool._do_get()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 259, in _do_get
    jupyter_1   |     return self._create_connection()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 247, in _create_connection
    jupyter_1   |     return _ConnectionRecord(self)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 362, in __init__
    jupyter_1   |     self.__connect()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 605, in __connect
    jupyter_1   |     pool.logger.debug("Error on connect(): %s", e)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    jupyter_1   |     compat.raise_(
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    jupyter_1   |     raise exception
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 599, in __connect
    jupyter_1   |     connection = pool._invoke_creator(self)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 578, in connect
    jupyter_1   |     return dialect.connect(*cargs, **cparams)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 584, in connect
    jupyter_1   |     return self.dbapi.connect(*cargs, **cparams)
    jupyter_1   | sqlite3.OperationalError: unable to open database file
    jupyter_1   | 
    jupyter_1   | The above exception was the direct cause of the following exception:
    jupyter_1   | 
    jupyter_1   | Traceback (most recent call last):
    jupyter_1   |   File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    jupyter_1   |     return _run_code(code, main_globals, None,
    jupyter_1   |   File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    jupyter_1   |     exec(code, run_globals)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/main.py", line 256, in <module>
    jupyter_1   |     main()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/main.py", line 240, in main
    jupyter_1   |     init_data()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/initial_data.py", line 65, in init_data
    jupyter_1   |     _perform_schema_migrations(alembic_util)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/initial_data.py", line 160, in _perform_schema_migrations
    jupyter_1   |     alembic_util.init_alembic()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/utils/db/alembic.py", line 24, in init_alembic
    jupyter_1   |     alembic.command.upgrade(self._alembic_config, "head")
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/alembic/command.py", line 294, in upgrade
    jupyter_1   |     script.run_env()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/alembic/script/base.py", line 490, in run_env
    jupyter_1   |     util.load_python_file(self.dir, "env.py")
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 97, in load_python_file
    jupyter_1   |     module = load_module_py(module_id, path)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/alembic/util/compat.py", line 182, in load_module_py
    jupyter_1   |     spec.loader.exec_module(module)
    jupyter_1   |   File "<frozen importlib._bootstrap_external>", line 783, in exec_module
    jupyter_1   |   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/migrations_sqlite/env.py", line 82, in <module>
    jupyter_1   |     run_migrations_online()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/mlrun/api/migrations_sqlite/env.py", line 72, in run_migrations_online
    jupyter_1   |     with connectable.connect() as connection:
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3162, in connect
    jupyter_1   |     return self._connection_cls(self, close_with_result=close_with_result)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 92, in __init__
    jupyter_1   |     else engine.raw_connection()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3241, in raw_connection
    jupyter_1   |     return self._wrap_pool_connect(self.pool.connect, _connection)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3211, in _wrap_pool_connect
    jupyter_1   |     Connection._handle_dbapi_exception_noconnection(
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2061, in _handle_dbapi_exception_noconnection
    jupyter_1   |     util.raise_(
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    jupyter_1   |     raise exception
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3208, in _wrap_pool_connect
    jupyter_1   |     return fn()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 301, in connect
    jupyter_1   |     return _ConnectionFairy._checkout(self)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 761, in _checkout
    jupyter_1   |     fairy = _ConnectionRecord.checkout(pool)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 419, in checkout
    jupyter_1   |     rec = pool._do_get()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 259, in _do_get
    jupyter_1   |     return self._create_connection()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 247, in _create_connection
    jupyter_1   |     return _ConnectionRecord(self)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 362, in __init__
    jupyter_1   |     self.__connect()
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 605, in __connect
    jupyter_1   |     pool.logger.debug("Error on connect(): %s", e)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    jupyter_1   |     compat.raise_(
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    jupyter_1   |     raise exception
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 599, in __connect
    jupyter_1   |     connection = pool._invoke_creator(self)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 578, in connect
    jupyter_1   |     return dialect.connect(*cargs, **cparams)
    jupyter_1   |   File "/opt/conda/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 584, in connect
    jupyter_1   |     return self.dbapi.connect(*cargs, **cparams)
    jupyter_1   | sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
    jupyter_1   | (Background on this error at: http://sqlalche.me/e/14/e3q8)
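
    SQLite reports "unable to open database file" when it cannot open or create the file at the configured path. One common cause (an assumption here, since the log does not show the path or its permissions) is a parent directory that does not exist. A minimal sketch that reproduces the error and the effect of creating the directory; the path and the helper name can_open_sqlite are hypothetical:

```python
import os
import sqlite3
import tempfile


def can_open_sqlite(db_path: str) -> bool:
    """Try to open (and create, if needed) a SQLite database file."""
    try:
        conn = sqlite3.connect(db_path)
    except sqlite3.OperationalError:
        return False
    conn.close()
    return True


with tempfile.TemporaryDirectory() as root:
    # Hypothetical location; the real path comes from the MLRun configuration.
    db_path = os.path.join(root, "missing-dir", "mlrun.db")
    before = can_open_sqlite(db_path)  # parent directory does not exist yet
    os.makedirs(os.path.dirname(db_path), exist_ok=True)
    after = can_open_sqlite(db_path)   # parent directory now exists
    print(before, after)
```

    If the directory already exists in the container, the same check can help tell a missing path apart from a permissions problem on a mounted volume.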
    

    Additional Information

    No response

    opened by lbonini94 9
  • [Azure DataStore] Handle upload strings vs bytes and filepath formation when using adlfs

    • When uploading strings as model artifact attributes to abfs using the put method and adlfs, operations were failing. Added the ability to alter the write method based on the incoming data.
    • get, listdir, stat filepath handling
    • Validated performance with private integration testing against adlfs
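
    The first item above, choosing the write method from the type of the incoming data, can be sketched as follows. This is an illustration on local files only, not the code from this pull request; write_mode_for and put are hypothetical names, and adlfs is not involved:

```python
import os
import tempfile


def write_mode_for(data) -> str:
    """Pick a file mode: binary for bytes-like data, text for str."""
    if isinstance(data, (bytes, bytearray, memoryview)):
        return "wb"
    if isinstance(data, str):
        return "w"
    raise TypeError(f"unsupported data type: {type(data).__name__}")


def put(path: str, data) -> None:
    """Write data to path using the mode that matches its type."""
    with open(path, write_mode_for(data)) as fp:
        fp.write(data)


with tempfile.TemporaryDirectory() as root:
    put(os.path.join(root, "label.txt"), "cancer-classifier")  # str -> text mode
    put(os.path.join(root, "model.bin"), b"\x00\x01\x02")      # bytes -> binary mode
    print(sorted(os.listdir(root)))
```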
    opened by hayesgb 8
  • [Datastore] Extend Azure blob to support other auth methods

    Currently, the only supported authentication method against AzureBlobStore is AZURE_STORAGE_CONNECTION_STRING, and possibly AZURE_STORAGE_KEY. This enables other authentication methods, including the use of service principals and SAS tokens.
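
    A minimal sketch of how a datastore might decide which authentication method to use from the available settings. Only AZURE_STORAGE_CONNECTION_STRING and AZURE_STORAGE_KEY are named in this pull request; the other variable names, the precedence order, and the function resolve_azure_auth are assumptions made for illustration and are not MLRun's implementation:

```python
from typing import Mapping, Optional

# Assumed names for the SAS token and service-principal settings.
SAS_TOKEN = "AZURE_STORAGE_SAS_TOKEN"
SERVICE_PRINCIPAL = (
    "AZURE_STORAGE_CLIENT_ID",
    "AZURE_STORAGE_CLIENT_SECRET",
    "AZURE_STORAGE_TENANT_ID",
)


def resolve_azure_auth(env: Mapping[str, str]) -> Optional[str]:
    """Return the name of the first usable auth method, or None."""
    if env.get("AZURE_STORAGE_CONNECTION_STRING"):
        return "connection_string"
    if env.get("AZURE_STORAGE_KEY"):
        return "account_key"
    if env.get(SAS_TOKEN):
        return "sas_token"
    # A service principal needs all three values to be present.
    if all(env.get(name) for name in SERVICE_PRINCIPAL):
        return "service_principal"
    return None


print(resolve_azure_auth({"AZURE_STORAGE_KEY": "<KEY>"}))
print(resolve_azure_auth({name: "<VALUE>" for name in SERVICE_PRINCIPAL}))
```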

    opened by hayesgb 8
  • Issue in installing MLRun locally on windows

    Hi,

    I am following the instructions to install MLRun locally on Windows 10, as described on the page /install/local-docker.html, with the Jupyter image.

    Running the following commands:

    set HOST_IP=localhost
    set SHARED_DIR=D:\MLRun
    mkdir %SHARED_DIR%
    docker-compose -f compose.with-jupyter.yaml up

    results in the error below:

    invalid interpolation format for services.jupyter.volumes.[]. You may need to escape any $ with another $. required variable SHARED_DIR is missing a value: err
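
    The message says that Compose found no value for SHARED_DIR when it interpolated the volume definition. A minimal sketch of a pre-flight check that reports which required variables are unset or empty in the environment the command will run in. The helper name missing_variables is hypothetical; the variable names come from the commands above:

```python
import os
from typing import List, Mapping, Sequence


def missing_variables(required: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(["HOST_IP", "SHARED_DIR"], os.environ)
    if missing:
        print("Set these variables before running docker-compose:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

    Run it from the same shell session that will start docker-compose, because variables set in one session are not visible in another.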

    opened by ganesh3 6
  • [Docs]: Add documentation for mlrun.feature_store.feature_set.FeatureSetSpec

    MLRun Kit version checks

    • [X] I have checked that the issue still exists on the latest versions of the docs here

    Location of the documentation

    https://docs.mlrun.org/en/latest/api/mlrun.feature_store.html#mlrun.feature_store.FeatureSet.spec

    Documentation problem

    The documentation for this class is missing. It is not generated, although parts of the source code contain relevant documentation.

    Suggested fix for documentation

    Please generate documentation for this class as well.

    opened by george0st 6
  • [Feature Store] Add `MinMaxLenValidator` and `RegexValidator`

    Add MinMaxLenValidator for Feature, including a system test, and add RegexValidator (validation based on a regular expression). These are very similar to the existing MinMaxValidator for Feature.
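
    To illustrate the two kinds of checks described here, a standalone sketch in plain Python. The class names MinMaxLenCheck and RegexCheck, their fields, and the (ok, message) return shape are hypothetical and do not reproduce the MLRun validator API; using re.fullmatch rather than a partial match is also an assumption:

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MinMaxLenCheck:
    """Check that the length of a value lies within optional bounds."""
    min_len: Optional[int] = None
    max_len: Optional[int] = None

    def check(self, value) -> Tuple[bool, str]:
        length = len(value)
        if self.min_len is not None and length < self.min_len:
            return False, f"length {length} is below the minimum {self.min_len}"
        if self.max_len is not None and length > self.max_len:
            return False, f"length {length} is above the maximum {self.max_len}"
        return True, ""


@dataclass
class RegexCheck:
    """Check that a string matches a regular expression in full."""
    pattern: str

    def check(self, value: str) -> Tuple[bool, str]:
        if re.fullmatch(self.pattern, value) is None:
            return False, f"value {value!r} does not match {self.pattern!r}"
        return True, ""


print(MinMaxLenCheck(min_len=2, max_len=5).check("abcdef"))
print(RegexCheck(r"[a-z]+-\d{3}").check("item-042"))
```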

    opened by george0st 6
  • Extend filter vector ability (focus on off-line featurestore)

    It would be very useful to support rich filtering in the feature vector, e.g.:

    High priority

    • support logical conditions 'fn2 > 500 and (fn3<=500 or fn4==500)'

    Medium priority

    • support like operator 'fn5 like %sdsd%'
    • support between and in

    Low priority

    • fuzzy match for string

    BTW: get_offline_features currently supports only exact match; see this part of the code:

    data = pd.DataFrame({"fn0": [39560793709,35392257080], "fn1": [27203050525,13749105613]})
    resp = fstore.get_offline_features(vector, entity_rows=data)
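
    For comparison, the requested filter types can be sketched on plain Python rows with the standard library. This only illustrates the semantics being asked for and is not an MLRun feature; the sample rows and the helpers like, between and fuzzy are hypothetical. On a pandas DataFrame, the logical condition could likely be expressed with DataFrame.query instead:

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hypothetical sample rows using the feature names from the request.
rows = [
    {"fn2": 600, "fn3": 400, "fn4": 100, "fn5": "xxsdsdyy"},
    {"fn2": 700, "fn3": 900, "fn4": 500, "fn5": "other"},
    {"fn2": 100, "fn3": 100, "fn4": 500, "fn5": "sdsd"},
]


def like(value: str, pattern: str) -> bool:
    """SQL-style LIKE where % matches any run of characters.

    Simplified: "_" is not handled and fnmatch wildcards are not escaped.
    """
    return fnmatchcase(value, pattern.replace("%", "*"))


def between(value, low, high) -> bool:
    """Inclusive range check, as in SQL BETWEEN."""
    return low <= value <= high


def fuzzy(a: str, b: str, threshold: float = 0.8) -> bool:
    """Approximate string match based on a similarity ratio."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


# 'fn2 > 500 and (fn3<=500 or fn4==500)'
logical = [r for r in rows if r["fn2"] > 500 and (r["fn3"] <= 500 or r["fn4"] == 500)]
# 'fn5 like %sdsd%'
liked = [r for r in rows if like(r["fn5"], "%sdsd%")]
# between and in
ranged = [r for r in rows if between(r["fn2"], 500, 650) and r["fn4"] in (100, 500)]

print(len(logical), len(liked), len(ranged))
```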
    
    opened by george0st 6
  • [Azure DataStore] Handle storage options as secrets

    This adds the ability to pass standard dictionary keys from fsspec's storage_options parameter into mlrun.run.get_dataitem() as secrets.

    This makes it easier for users doing exploratory analysis to use the MLRun API to fetch data items from Azure, by enabling the following:

    storage_options = {"account_name": "<NAME>", "credential": "<CREDENTIAL>"}
    df = mlrun.run.get_dataitem("az://CONTAINER/myfile.parquet", secrets=storage_options).as_df()
    
    opened by hayesgb 6
  • [Runtimes] Add container image to serving function status

    Add the built and pushed container image name, that is used to run the nuclio function container, to the function status.

    A followup to https://github.com/nuclio/nuclio/pull/2769 released in https://github.com/nuclio/nuclio/releases/tag/1.11.7 .

    Fixes https://jira.iguazeng.com/browse/IG-21462

    opened by TomerShor 0
  • [Data Store] My Sql - target, source and driver

    This PR focuses on implementing the source and target for SqlDB in MLRun. The SqlDB target can create a new SQL collection or read from an existing one. The collection is fixed, and its schema can't change during the flow.

    https://jira.iguazeng.com/browse/ML-2610

    opened by davesh0812 0
  • Mlrun Jupyter- Image pull back off error

    MLRun Version checks

    • [X] I have checked that this issue has not already been reported.

    • [X] I have confirmed this bug exists on the latest version of MLRun CE.

    Reproducible Example

    kubectl create namespace mlrun
    helm repo add mlrun-ce https://mlrun.github.io/ce
    helm repo update
    
    kubectl --namespace mlrun create secret docker-registry registry-credentials \
        --docker-server="https://index.docker.io/v1/" \
        --docker-username="xyz" \
        --docker-password="xyz" \
        --docker-email="xyz"
    
    helm --namespace mlrun \
        install mlrun-ce \
        --wait --timeout 960s \
        --set global.registry.url="index.docker.io/v1/xyz" \
        --set global.registry.secretName=registry-credentials \
        --set global.externalHostAddress=http://192.168.49.2 \
        mlrun-ce/mlrun-ce
    

    Issue Description

    I'm following the MLRun Kubernetes installation kit as documented, but I still get an "Image pull back off" error for MLRun Jupyter. Please guide me through this.

    Expected Behavior

    I'm using Kubectl, Minikube, helm, docker etc.

    Installation OS

    Windows

    Installation Method

    Kubernetes

    Python Version

    3.8

    MLRun Version

    1.2.0

    Additional Information

    No response

    opened by harishgawade1999 5
  • [Feature Request]: Add ability to delete data from Project

    Feature Type

    • [X] Adding new functionality to MLRun

    • [X] Changing existing functionality in MLRun

    • [x] Removing existing functionality in MLRun

    Problem Description

    When I delete the project from the GUI, everything is deleted (the project itself, jobs, artifacts, etc.) except the data.

    The information in the dialog is also not fully accurate (the deletion does not cover all resources under the project); see the message from the GUI:

    You try to delete project "jist-from-local". 'The project is not empty. Deleting it will also delete all of its resources, such as jobs, 'artifacts, and features.

    BTW: Data (e.g. parquet files) has to be deleted manually, e.g. via Linux commands in specific directories, and if this step is skipped, it can leave garbage in the file system.

    Feature Description

    It would be useful to be able to also delete data stored directly in the project directory (parquet files, kv files, etc.).
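
    Until such a feature exists, the manual cleanup described above could start from a dry-run listing of leftover data files. A minimal sketch, assuming the project data lives under a single directory; the directory layout, the suffix list, and the function find_project_data are hypothetical, and the sketch only lists files without deleting anything:

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_project_data(project_dir: str, suffixes: Iterable[str] = (".parquet",)) -> List[Path]:
    """List data files under project_dir with one of the given suffixes."""
    wanted = {suffix.lower() for suffix in suffixes}
    root = Path(project_dir)
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


with tempfile.TemporaryDirectory() as root:
    # Build a small fake project tree for the demo.
    (Path(root) / "sets").mkdir()
    (Path(root) / "sets" / "a.parquet").write_bytes(b"")
    (Path(root) / "b.parquet").write_bytes(b"")
    (Path(root) / "notes.txt").write_text("keep")
    leftovers = [path.relative_to(root).as_posix() for path in find_project_data(root)]
    print(leftovers)
```

    Reviewing the list before removing anything keeps the cleanup safe when a directory is shared with other projects.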

    Alternative Solutions

    Add a note to the delete dialog that project data (parquet files, ...) is outside the delete procedure and has to be deleted manually.

    Additional Context

    No response

    opened by george0st 2
Releases(v1.2.1-rc12)
  • v1.2.1-rc12(Jan 4, 2023)

  • v1.2.1-rc11(Jan 4, 2023)

    Features / Enhancements

    • Requirements: Freeze the version from ~=1.0 to ~=1.0.0 [1.2.x], #2872, @guy1992l
    • API: Include the whole CE section in frontend-spec and client-spec [1.2.x], #2847, @quaark
    • UI: Features & enhancement
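
    The Requirements item above narrows a compatible-release specifier. Under PEP 440, ~=1.0 means ">= 1.0 and == 1.*", while ~=1.0.0 means ">= 1.0.0 and == 1.0.*". A minimal sketch of that rule for plain numeric versions; the release note does not say which package was affected, and the helper is_compatible is hypothetical and ignores pre-releases, post-releases and epochs:

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a dotted numeric version such as '1.0.7'."""
    return tuple(int(part) for part in version.split("."))


def pad(parts: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Right-pad a version tuple with zeros so tuples compare fairly."""
    return parts + (0,) * (width - len(parts))


def is_compatible(candidate: str, spec: str) -> bool:
    """Evaluate `candidate ~= spec` for plain numeric versions."""
    spec_parts = parse(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two version segments")
    cand_parts = parse(candidate)
    width = max(len(spec_parts), len(cand_parts))
    at_least = pad(cand_parts, width) >= pad(spec_parts, width)
    prefix = spec_parts[:-1]  # every segment except the last must match
    same_prefix = pad(cand_parts, width)[: len(prefix)] == prefix
    return at_least and same_prefix


for candidate in ("1.0.7", "1.5.0", "2.0"):
    print(candidate, is_compatible(candidate, "1.0"), is_compatible(candidate, "1.0.0"))
```

    The tighter form accepts patch releases only, which is why it is the safer choice on a maintenance branch.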

    Bug fixes

    • Scheduler: Update next run time after skipping run [1.2.x], #2862, @AlonMaor14
    • Serving: Revert to db.create_or_patch_model_endpoint [1.2.x], #2850, @davesh0812
    • Unknown: Revert "[Project] Mask credentials when project.set_function [1.2.x]", #2859, @tankilevitch
    • FileDB: Add warning message when initializing FileRunDB [1.2.x], #2856, @tankilevitch
    • Project: Fix sync_functions to sync the function names from the project.spec._function_definitions map [1.2.x], #2857, @tankilevitch
    • UI: Bug fixes

    Pull requests:

    • 93226538 [Requirements] Freeze the version from ~=1.0 to ~=1.0.0 [1.2.x] (#2872)
    • ae8e7d7f [Scheduler] Update next run time after skipping run [1.2.x] (#2862)
    • b6cfe864 [Serving] Revert to db.create_or_patch_model_endpoint [1.2.x] (#2850)
    • 65dbff5d Revert "[Project] Mask credentials when project.set_function [1.2.x]" (#2859)
    • 4cc137a6 [FileDB] Add warning message when initializing FileRunDB [1.2.x] (#2856)
    • e21b4d79 [Project] Fix sync_functions to sync the function names from the project.spec._function_definitions map [1.2.x] (#2857)
    • f87a4e96 [API] Include the whole CE section in frontend-spec and client-spec [1.2.x] (#2847)

    Source code(tar.gz)
    Source code(zip)
  • v1.3.0-rc4(Jan 3, 2023)

  • v1.2.1-rc10(Dec 31, 2022)

    Features / Enhancements

    • Project: Mask credentials when project.set_function [1.2.x], #2844, @tankilevitch
    • CLI: [Projects] Add overwrite scheduled workflow - fixed [1.2.x], #2841, @yonishelach
    • UI: Features & enhancement

    Bug fixes

    • Runtimes: Fix submitting job with hyper-params config param to use correct credentials [1.2.x], #2835, @theSaarco
    • UI: Bug fixes

    Pull requests:

    • de93127d [Project] Mask credentials when project.set_function [1.2.x] (#2844)
    • 8b8fad88 [CLI][Projects] Add overwrite scheduled workflow - fixed [1.2.x] (#2841)
    • 1c61d8a5 [Runtimes] Fix submitting job with hyper-params config param to use correct credentials [1.2.x] (#2835)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.1-rc9(Dec 28, 2022)

    Features / Enhancements

    • Makefile: Isort ignore venvs [1.2.x], #2832, @AlonMaor14
    • API: Add Fields to Client Spec for Use in UI [1.2.x], #2828, @quaark
    • Serving: GET with router & inexplicit GET in test mode [1.2.x], #2815, @davesh0812
    • UI: Features & enhancement

    Bug fixes

    • Project: Fix load_project log message (#2825) [1.2.x], #2827, @AlonMaor14
    • SDK: Fix submit_job tries to update run state when run wasn't created [1.2.x], #2822, @quaark
    • Frameworks: Enable scikit-learn v1.2.0 to work with mlrun.frameworks [1.2.x], #2814, @guy1992l
    • UI: Bug fixes

    Pull requests:

    • 1ff96293 [Makefile] Isort ignore venvs [1.2.x] (#2832)
    • f9a0f5e3 [API] Add Fields to Client Spec for Use in UI [1.2.x] (#2828)
    • 14b59f56 [Project] Fix load_project log message (#2825) [1.2.x] (#2827)
    • 22cb4f8f [SDK] Fix submit_job tries to update run state when run wasn't created [1.2.x] (#2822)
    • 65c7b7d6 [Frameworks] Enable scikit-learn v1.2.0 to work with mlrun.frameworks [1.2.x] (#2814)
    • 585b0398 [Serving] GET with router & inexplicit GET in test mode [1.2.x] (#2815)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.3(Dec 28, 2022)

  • v1.1.3-rc4(Dec 27, 2022)

  • v1.3.0-rc3(Dec 26, 2022)

  • v1.2.1-rc8(Dec 25, 2022)

    Features / Enhancements

    • Projects: Slack notify remote workflow [1.2.x], #2805, @yonishelach

    • API: verify cookie session is iguazio-like sessions [1.2.x], #2797, @liranbg

    • Feature Store: Fix: set index before write to target in local merger [1.2.x], #2790, @gtopper

    • Run: Ignore bokeh installation with warning on import error [1.2.x], #2792, @AlonMaor14

    • UI: Features & enhancement

    Bug fixes

    Pull requests:

    • f09586f5 [Projects] Slack notify remote workflow [1.2.x] (#2805)
    • 721ed35a [API] verify cookie session is iguazio-like sessions [1.2.x] (#2797)
    • 5171dd9f [Feature Store] Fix: set index before write to target in local merger [1.2.x] (#2790)
    • 5d9ada27 [Run] Ignore bokeh installation with warning on import error [1.2.x] (#2792)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.3-rc3(Dec 22, 2022)

  • v1.3.0-rc2(Dec 20, 2022)

  • v1.2.1-rc7(Dec 21, 2022)

    Features / Enhancements

    • Datastore: Backport fix handling of DataItem path in windows [1.2.x], #2785, @yaronha

    • Pipelines: Support list_pipelines with pagination & predicates [1.2.x], #2786, @theSaarco

    • CLI: [Runtimes] URL placeholder for using run args without URL [1.2.x], #2782, @AlonMaor14

    • FeatureStore: Fixing graph plot with multiple targets of same kind + adding storage kind to plot (#2766) [1.2.x], #2779, @theSaarco

    • Artifacts: Serialize DirArtifact to dictionary using new format (#2778) [1.2.x], #2780, @theSaarco

    • UI: Features & enhancement

    Bug fixes

    Pull requests:

    • a8d20686 [Datastore] Backport fix handling of DataItem path in windows [1.2.x] (#2785)
    • 9bfadd02 [Pipelines] Support list_pipelines with pagination & predicates [1.2.x] (#2786)
    • 04136edc [CLI][Runtimes] URL placeholder for using run args without URL [1.2.x] (#2782)
    • a4587012 [FeatureStore] Fixing graph plot with multiple targets of same kind + adding storage kind to plot (#2766) [1.2.x] (#2779)
    • 38230699 [Artifacts] Serialize DirArtifact to dictionary using new format (#2778) [1.2.x] (#2780)

    Source code(tar.gz)
    Source code(zip)
  • v1.3.0-rc1(Dec 20, 2022)

    Features / Enhancements

    • API: Add project-scope files/filestat API that work with project secrets, #2714, @theSaarco

    • API: Fix run paramaters larger than int64 corrupting projects, #2671, @quaark

    • API: Remove print from api, #2734, @liranbg

    • API: verify cookie session is iguazio-like sessions, #2773, @liranbg

    • Artifacts: Don't resolve artifact target_path if explicitly request upload=False, #2732, @tankilevitch

    • Artifacts: Serialize DirArtifact to dictionary using new format, #2778, @theSaarco

    • Artifacts: Set dataset stats according to stats flag in log_dataset, #2710, @TomerShor

    • CI: Bump prefix version for build images, #2645, @tankilevitch

    • CI: Fix Open Source System Tests Fail Deploy, #2772, @quaark

    • CI: Run Open Source System Tests Against MLRun CE, #2667, @quaark

    • CI: Updated installation and bug report issue templates, #2193, @nschenone

    • CLI: Do not ignore unknown options, #2678, @AlonMaor14

    • CLI: Fix cli "get runtime" command, #2676, @yaronha

    • CLI: Fix watch when running function through CLI, #2730, @tankilevitch

    • CLI: Support default .env file location + CLI "config set" command, #2690, @yaronha

    • CLI: Validate base arguments, #2745, @AlonMaor14

    • CLI: Waiting for pod status with timeout fix when running project, #2635, @AlonMaor14

    • CLI: [Runtimes] URL placeholder for using run args without URL, #2765, @AlonMaor14

    • Config: Skip failures in first init of config (mlrun import), #2742, @yaronha

    • DataStore: Allow passing secrets to create datastore and don't cache datastores when running on API, #2633, @tankilevitch

    • DataStore: Fix how we resolve if running as API, #2680, @tankilevitch

    • DataStore: Fix makedirs not threadsafe, #2723, @liranbg

    • Datastore: Fix _write_dataframe to pass storage_options to pandas write operations, #2709, @gtopper

    • Datastore: Fix handling of DataItem path in windows + support path mappings, #2774, @yaronha

    • Docs: Add ecosystem, bug fixes, #2682, @jillnogold

    • Docs: Add some docs to artifact client code, #2775, @tankilevitch

    • Docs: Added MLRun Cheat Sheet, #2647, @nschenone

    • Docs: Better docstring for mount_s3, added warnings, #2737, @theSaarco

    • Docs: CLI project schedule, ML-3007, #2721, @jillnogold

    • Docs: Edit wording to make AWS Documentation more clear, #2697, @yevgenykhazan

    • Docs: Fix docstring errors, ML-2916, ML-2909, #2662, @jillnogold

    • Docs: Improve log_dataset docstring wrt local_path flag, #2753, @TomerShor

    • Docs: Typo mistakes, #2731, @george0st

    • Docs: Update MLRun CE Kubernetes Installation Docs for new "only full" Deployment, #2673, @quaark

    • Docs: Update CONTRIBUTING.md, #2754, @moranbental

    • Feature Store: Get time from time-column instead of event metadata, #2660, @gtopper

    • FeatureStore: Fix serving to support AVRO encoded kafka, #2658, @assaf758

    • FeatureStore: Fixing impute failures when using get_online_feature_service with a feature-vector uri, #2666, @theSaarco

    • FeatureStore: Fixing graph plot with multiple targets of same kind + adding storage kind to plot, #2766, @theSaarco

    • Frameworks: Adjusted the model servers to support step_to_dict, #2653, @guy1992l

    • Frameworks: Remove the handling of regression models, #2722, @guy1992l

    • MPI: Fix local variable resp referenced before assignment, #2639, @tankilevitch

    • Model Monitoring: Add abstraction for model endpoint store target, #2378, @Eyal-Danieli

    • Model Monitoring: Add model_monitoring package to setup.py, #2674, @Eyal-Danieli

    • Model Monitoring: Ensure auth info for model monitoring batch job function, #2688, @Eyal-Danieli

    • Project: Set default sync=True in project.get_function(), #2720, @yaronha

    • Projects: Raise error for workflow scheduling with non-remote project, #2711, @yonishelach

    • Projects: Set default context path to working directory, #2740, @liranbg

    • Run: Fix not updating the run state when running local fails on pre-loading of the function, #2762, @tankilevitch

    • Run: Fix outputs wait for completion, #2663, @tankilevitch

    • Run: Normalize function name in new_function, #2696, @TomerShor

    • Runtime: Fix resolving completion time, #2738, @liranbg

    • Runtime: Move some k8s logic to k8s helpers, #2733, @liranbg

    • Schedules: Fix label handling when reloading schedules (ML-3014), #2719, @theSaarco

    • Schedules: Scheduled tasks access-key usage refactor, #2695, @theSaarco

    • SecretStore: Fix overwriting SecretStore credentials on API, #2661, @tankilevitch

    • Secrets: Add a global get_secret_or_env function to retrieve secret values, #2659, @theSaarco

    • Secrets: Fix get_secret_or_env, #2675, @theSaarco

    • Serving: Fixing - OneHotEncoder with pandas engine - fails when the values has spaces or hyphens in them, #2713, @davesh0812

    • Serving: Improve mock server handling, #2726, @yaronha

    • Spark: Shut local spark context down when ingest completes, #2692, @gtopper

    • Test: fix mkdir race condition - round 2, #2750, @liranbg

    • Tests: Fix failing feature store system tests, #2643, @gtopper

    • Unknown: Raise error on ingest with KafkaSource, #2654, @gtopper

    • Unknown: Revert "[CLI] Do not ignore unknown options", #2703, @AlonMaor14

    • Unknown: Revert "[Spark] Shut spark context down when ingest completes", #2752, @gtopper

    • Utils: - Remove spammy logs, #2724, @liranbg

    • UI: Features & enhancement

    Bug fixes

    Pull requests:

    • 27e8a957 [Datastore] Fix handling of DataItem path in windows + support path mappings (#2774)
    • c08f2d10 [CLI][Runtimes] URL placeholder for using run args without URL (#2765)
    • b1d35de9 [Feature Store] Get time from time-column instead of event metadata (#2660)
    • fb2f4a70 [Artifacts] Serialize DirArtifact to dictionary using new format (#2778)
    • 598015a9 [Docs] Add some docs to artifact client code (#2775)
    • 183506b2 [FeatureStore] Fixing graph plot with multiple targets of same kind + adding storage kind to plot (#2766)
    • 83e52077 [API] verify cookie session is iguazio-like sessions (#2773)
    • 1613b0d8 [CI] Fix Open Source System Tests Fail Deploy (#2772)
    • 0124e17b [CI] Updated installation and bug report issue templates (#2193)
    • d7a7aa6b [Docs] Update MLRun CE Kubernetes Installation Docs for new "only full" Deployment (#2673)
    • eea96ddf [Run] Fix not updating the run state when running local fails on pre-loading of the function (#2762)
    • aa7e2d03 [API] Fix run paramaters larger than int64 corrupting projects (#2671)
    • 11f31303 [CLI] Validate base arguments (#2745)
    • 70b09e54 [Docs] Improve log_dataset docstring wrt local_path flag (#2753)
    • 915bcf13 [Config] Skip failures in first init of config (mlrun import) (#2742)
    • 49187555 [Test] fix mkdir race condition - round 2 (#2750)
    • dbeaadde [Serving] Improve mock server handling (#2726)
    • e8902c4e [Model Monitoring] Ensure auth info for model monitoring batch job function (#2688)
    • 7fdaaa75 [Docs] Update CONTRIBUTING.md (#2754)
    • cb89a7b3 Revert "[Spark] Shut spark context down when ingest completes" (#2752)
    • bdaf21fd [Docs] Better docstring for mount_s3, added warnings (#2737)
    • b71feaad [Projects] Set default context path to working directory (#2740)
    • e0719288 [Runtime] Move some k8s logic to k8s helpers (#2733)
    • e103e8ce [Runtime] Fix resolving completion time (#2738)
    • 1e627882 [CLI] Fix watch when running function through CLI (#2730)
    • 7f176494 [Docs] CLI project schedule, ML-3007 (#2721)
    • 4b4aba12 [API] Remove print from api (#2734)
    • b67e88c5 [Docs] Typo mistakes (#2731)
    • 50820728 [Artifacts] Don't resolve artifact target_path if explicitly request upload=False (#2732)
    • c9b6a91c [Frameworks] Remove the handling of regression models (#2722)
    • 23bbdf98 [Docs] Add ecosystem, bug fixes (#2682)
    • 59a9616b [Project] Set default sync=True in project.get_function() (#2720)
    • e77d7088 [DataStore] Fix makedirs not threadsafe (#2723)
    • ab2be0f4 [Utils] - Remove spammy logs (#2724)
    • 10f72b60 [API] Add project-scope files/filestat API that work with project secrets (#2714)
    • c0f866f4 [Schedules] Fix label handling when reloading schedules (ML-3014) (#2719)
    • 9ab0f43d [Docs] Edit wording to make AWS Documentation more clear (#2697)
    • a2e75a0a [Projects] Raise error for workflow scheduling with non-remote project (#2711)
    • 0918ae03 [CLI] Support default .env file location + CLI "config set" command (#2690)
    • daee4ccc [Serving] Fixing - OneHotEncoder with pandas engine - fails when the values has spaces or hyphens in them (#2713)
    • d85e7e2f [Artifacts] Set dataset stats according to stats flag in log_dataset (#2710)
    • 6edc7ba9 [Datastore] Fix _write_dataframe to pass storage_options to pandas write operations (#2709)
    • 2589e441 [Schedules] Scheduled tasks access-key usage refactor (#2695)
    • 5497c842 Revert "[CLI] Do not ignore unknown options" (#2703)
    • 47de2bba [Run] Normalize function name in new_function (#2696)
    • 3fc9b84d [CLI] Fix cli "get runtime" command (#2676)
    • 2a9316b7 [Spark] Shut local spark context down when ingest completes (#2692)
    • f93d8fee [CLI] Do not ignore unknown options (#2678)
    • 671247a7 [DataStore] Fix how we resolve if running as API (#2680)
    • e07e4c21 [Secrets] Fix get_secret_or_env (#2675)
    • 7e3f9888 [Frameworks] Adjusted the model servers to support step_to_dict (#2653)
    • c359b618 [Model Monitoring] Add model_monitoring package to setup.py (#2674)
    • d560118d [CI] Run Open Source System Tests Against MLRun CE (#2667)
    • 08dc2e9a [FeatureStore] Fix serving to support AVRO encoded kafka (#2658)
    • 3c12c5c4 [Docs] Fix docstring errors, ML-2916, ML-2909 (#2662)
    • 09c3e9ae [Run] Fix outputs wait for completion (#2663)
    • 47a0a621 [FeatureStore] Fixing impute failures when using get_online_feature_service with a feature-vector uri (#2666)
    • 773a623e [MPI] Fix local variable resp referenced before assignment (#2639)
    • 14cba17b [Docs] Added MLRun Cheat Sheet (#2647)
    • 1f07fcd0 [Secrets] Add a global get_secret_or_env function to retrieve secret values (#2659)
    • 2676f739 [SecretStore] Fix overwriting SecretStore credentials on API (#2661)
    • 264da255 [Model Monitoring] Add abstraction for model endpoint store target (#2378)
    • 79557146 [DataStore] Allow passing secrets to create datastore and don't cache datastores when running on API (#2633)
    • ca635b88 Raise error on ingest with KafkaSource (#2654)
    • c693606d [Tests] Fix failing feature store system tests (#2643)
    • b0be0c94 [CI] Bump prefix version for build images (#2645)
    • d57ea8d3 [CLI] Waiting for pod status with timeout fix when running project (#2635)

  • v1.2.1-rc6(Dec 19, 2022)

    Features / Enhancements

    • Spark: Use FeatureSet's timestamp_key as fallback for source time_field [1.2.x], #2771, @gtopper

    • Frameworks: Remove regression special handling [1.2.x], #2770, @guy1992l

    • UI: Features & enhancement

    Bug fixes

    Pull requests:

    c6ced7a4 [Spark] Use FeatureSet's timestamp_key as fallback for source time_field [1.2.x] (#2771) 8690cb62 [Frameworks] Remove regression special handling [1.2.x] (#2770)

  • v1.2.1-rc5(Dec 16, 2022)

    Features / Enhancements

    • Backports: Cherry pick latest updates to [1.2.x], #2768, @yaronha
    • Model Monitoring: Ensure auth info for model monitoring batch job function [1.2.x], #2756, @Eyal-Danieli
    • CLI: Validate base arguments (#2745) [1.2.x], #2761, @AlonMaor14
    • Requirements: Bump storey to 1.2.5 [1.2.x], #2758, @gtopper
    • Docs: Better docstring for mount_s3, added warnings (#2737) [1.2.x], #2746, @theSaarco
    • Projects: Set default context path to working directory [1.2.x], #2744, @liranbg
    • Docs: Improve log_dataset docstring wrt local_path flag [1.2.x], #2743, @TomerShor
    • UI: Features & enhancement

    Bug fixes

    • API: Fix run parameters larger than int64 corrupting projects [1.2.x], #2763, @quaark
    • Run: Fix not updating the run state when running local fails on pre-loading of the function [1.2.x], #2747, @tankilevitch
    • Unknown: Revert "[Spark] Shut local spark context down when ingest completes [1.2.x]", #2751, @gtopper
    • UI: Bug fixes

    Pull requests:

    90a86df6 [Backports] Cherry pick latest updates to [1.2.x] (#2768) 18e9c7d5 [Model Monitoring] Ensure auth info for model monitoring batch job function [1.2.x] (#2756) e047ea34 [CLI] Validate base arguments (#2745) [1.2.x] (#2761) 87920ef6 [API] Fix run paramaters larger than int64 corrupting projects [1.2.x] (#2763) 0e4c79c0 [Requirements] Bump storey to 1.2.5 [1.2.x] (#2758) ac41d7d2 [Run] Fix not updating the run state when running local fails on pre-loading of the function [1.2.x] (#2747) 7989cc99 Revert "[Spark] Shut local spark context down when ingest completes [1.2.x] (#2751) 53584bf1 [Docs] Better docstring for mount_s3, added warnings (#2737) [1.2.x] (#2746) 92e59aea [Projects] Set default context path to working directory [1.2.x] (#2744) 1286b087 [Docs] Improve log_dataset docstring wrt local_path flag [1.2.x] (#2743)

  • v1.1.3-rc2(Dec 16, 2022)

    Features / Enhancements

    • API: Add timeouts for requests which are getting rerouted to chief [1.1.x], #2764, @tankilevitch
    • API: Configure Uvicorn Keep Alive Timeout [1.1.x], #2760, @quaark
    • UI: Features & enhancement

    Bug fixes

    • Unknown: Revert "[Projects] Raise error for workflow scheduling with non-remote project [1.1.x]", #2767, @tankilevitch
    • Projects: Raise error for workflow scheduling with non-remote project [1.1.x], #2708, @yonishelach
    • UI: Bug fixes

    Pull requests:

    b4e78454 [API] Add timeouts for requests which are getting rerouted to chief [1.1.x] (#2764) 2c602f77 Revert "[Projects] Raise error for workflow scheduling with non-remote project [1.1.x]" (#2767) 6de4131f [API] Configure Uvicorn Keep Alive Timeout [1.1.x] (#2760) 13880f4e [Projects] Raise error for workflow scheduling with non-remote project [1.1.x] (#2708)

  • v1.2.1-rc4(Dec 13, 2022)

    Features / Enhancements

    • API: Add project-scope files/filestat API that work with project secrets [1.2.x], #2728, @theSaarco
    • UI: Features & enhancement

    Bug fixes

    • Runtime: Fix resolving completion time [1.2.x], #2741, @liranbg
    • CLI: Backport - Fix watch when running function through CLI [1.2.x], #2739, @tankilevitch
    • Unknown: Revert "[Projects] Raise error for workflow scheduling with non-remote project [1.2.x]", #2736, @tankilevitch
    • API: Remove print from api [1.2.x], #2717, @liranbg
    • Artifacts: Don't resolve artifact target_path if explicitly request upload=False [1.2.x], #2704, @tankilevitch
    • Schedules: Fix label handling when reloading schedules [1.2.x], #2725, @theSaarco
    • Utils: Remove spammy logs [1.2.x], #2729, @liranbg
    • CLI: Merge fixes for get runtime command and config set environment [1.2.x], #2715, @yaronha
    • Datastore: Fix _write_dataframe to pass storage_options to pandas write operations [1.2.x], #2707, @gtopper
    • UI: Bug fixes

    Pull requests:

    38bffe6d [Runtime] Fix resolving completion time [1.2.x] (#2741) 9b909a3f [CLI] Backport - Fix watch when running function through CLI [1.2.x] (#2739) dbc826a0 Revert "[Projects] Raise error for workflow scheduling with non-remote project [1.2.x]" (#2736) 947fb955 [API] Remove print from api [1.2.x] (#2717) 2dc67dbd [Artifacts] Don't resolve artifact target_path if explicitly request upload=False [1.2.x] (#2704) 7890fb51 [Schedules] Fix label handling when reloading schedules [1.2.x] (#2725) f8eb7b5b [API] Add project-scope files/filestat API that work with project secrets [1.2.x] (#2728) a03f2848 [Utils] - Remove spammy logs [1.2.x] (#2729) 28d39d6d [CLI] Merge fixes for get runtime command and config set environment [1.2.x] (#2715) 90dc095c [Datastore] Fix _write_dataframe to pass storage_options to pandas write operations [1.2.x] (#2707)

  • v1.2.1-rc3(Dec 9, 2022)

    Features / Enhancements

    • Artifacts: Set dataset stats according to stats flag in log_dataset [1.2.x], #2698, @TomerShor
    • Schedules: Scheduled tasks access-key usage refactor (#2695) [1.2.x], #2705, @theSaarco
    • Run: Normalize function name in new_function [1.2.x], #2701, @TomerShor
    • Spark: Shut local spark context down when ingest completes [1.2.x], #2694, @gtopper
    • Projects: Add missing overwrite field to WorkflowSpec [1.2.x], #2683, @yonishelach
    • UI: Features & enhancement

    Bug fixes

    Pull requests:

    cece4e9e [Artifacts] Set dataset stats according to stats flag in log_dataset [1.2.x] (#2698) e5983c41 [Schedules] Scheduled tasks access-key usage refactor (#2695) [1.2.x] (#2705) 4bd22b40 Revert "[CLI] Do not ignore unknown options [1.2.x]" (#2702) c04bcff3 [Run] Normalize function name in new_function [1.2.x] (#2701) 88e9a519 [Projects] Raise error for workflow scheduling with non-remote project [1.2.x] (#2689) 05eb0daf [Spark] Shut local spark context down when ingest completes [1.2.x] (#2694) 603ebeea [Projects] Add missing overwrite field to WorkflowSpec [1.2.x] (#2683)

  • v1.2.1-rc2(Dec 6, 2022)

    Features / Enhancements

    • CLI: Do not ignore unknown options [1.2.x], #2665, @AlonMaor14
    • Project: Add option to overwrite workflow schedule [1.2.x], #2657, @yonishelach
    • UI: Features & enhancement

    Bug fixes

    • DataStore: Fix how we resolve if running as API [1.2.x], #2681, @tankilevitch
    • Secrets: Fix get_secret_or_env [1.2.x], #2677, @theSaarco
    • CLI: Waiting for pod status with timeout fix when running project [1.2.x], #2652, @AlonMaor14
    • FeatureStore: Fix serving to support AVRO encoded kafka (#2658), #2672, @assaf758
    • Run: Fix outputs wait for completion [1.2.x], #2670, @tankilevitch
    • MPI: Fix local variable resp referenced before assignment [1.2.x], #2668, @tankilevitch
    • FeatureStore: Fixing impute failures when using get_online_feature_service with a feature-vector uri (#2666) [1.2.x], #2669, @theSaarco
    • UI: Bug fixes

    Pull requests:

    4a5a417a [DataStore] Fix how we resolve if running as API [1.2.x] (#2681) e5313dde [CLI] Do not ignore unknown options [1.2.x] (#2665) 80903494 [Secrets] Fix get_secret_or_env [1.2.x] (#2677) 753948cd [CLI] Waiting for pod status with timeout fix when running project [1.2.x] (#2652) 2e1eb3f2 [FeatureStore] Fix serving to support AVRO encoded kafka (#2658) (#2672) b991d310 [Run] Fix outputs wait for completion [1.2.x] (#2670) 1f083781 [MPI] Fix local variable resp referenced before assignment [1.2.x] (#2668) edfd00f7 [FeatureStore] Fixing impute failures when using get_online_feature_service with a feature-vector uri (#2666) [1.2.x] (#2669) c1a12fb2 [Project] Add option to overwrite workflow schedule [1.2.x] (#2657)

  • v1.2.1-rc1(Dec 5, 2022)

  • v1.1.3-rc1(Dec 5, 2022)

    Features / Enhancements

    Bug fixes

    • Project: Add option to overwrite workflow schedule [1.1.x], #2651, @yonishelach
    • CLI: Waiting for pod status with timeout fix when running project [1.1.x], #2638, @AlonMaor14
    • UI: Bug fixes

    Pull requests:

    a7928716 [CI] Bump prefix version to 1.1.3 (#2656) 0a4b1881 [Project] Add option to overwrite workflow schedule [1.1.x] (#2651) 071571f6 [CLI] Waiting for pod status with timeout fix when running project [1.1.x] (#2638)

  • v1.2.0(Dec 1, 2022)

    Artifacts

    • Support for artifact tagging:
      • SDK: Add tag_artifacts and delete_artifacts_tags, which can be used to modify the tags of existing artifacts and to keep more than one version of an artifact.
      • API: Introduce new endpoints in /projects/<project>/tags.

    Auth

    • Support S3 profile and assume-role when using fsspec.
    • Support GitHub fine-grained tokens.

    Functions

    • Add function.with_annotations({"framework":"tensorflow"}) to user-created functions.
    • Add overwrite_build_params to project.build_function() so the user can choose whether or not to keep the build params that were used in previous function builds.

    Feature Store

    • Support Redis as an online feature set, for storey engine only.
    • Support GCP objects as a data source for the feature store.
    • Fully support ingesting with pandas engine, now equivalent to ingestion with storey engine:
      • Support DataFrame with multi-index.
      • Support mlrun steps when using pandas engine: OneHotEncoder, DateExtractor, MapValue, Imputer and FeatureValidation.
    • Add new step: DropFeature for pandas and storey engines.
    • Add a query parameter to get_offline_features for filtering the output.

    Frameworks

    • Add HuggingFaceModelServer to mlrun.frameworks at mlrun.frameworks.huggingface to serve HuggingFace models.

    Installation

    • Add option to install google-cloud requirements using mlrun[google-cloud]: when installing MLRun for integration with GCP clients, only compatible packages are installed.

    Documentation

    • Restructured, with new content added.

    Third party integrations

    • Supports Confluent Kafka (Tech Preview)

    Internal

    • Refactor artifacts endpoints to follow the MLRun convention of /projects/<project>/artifacts/...
    • Add /api/_internal/memory-reports/ endpoints for memory related metrics to better understand the memory consumption of the API.
    • Improve the HTTP retry mechanism.
    • Support a new lightweight mechanism for KFP pods to pull the run state they triggered. Default behavior is legacy, which pulls the logs of the run to figure out the run state. The new behavior can be enabled using a feature flag configured in the API.

    Breaking changes

    • Feature store: Ingestion using pandas now takes the dataframe and creates an index out of the entity column (and removes it from being a column in this df). This could cause breakage for existing custom steps when using a pandas engine.
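    The effect on a custom step can be sketched with plain pandas. This is not MLRun code: set_index only approximates the behavior described above, and the column names and the two step functions are hypothetical.

```python
import pandas as pd

# Hypothetical input: "patient_id" plays the role of the entity column.
df = pd.DataFrame({"patient_id": [1, 2, 3], "heart_rate": [61.0, 75.0, 88.0]})

# Approximation of the new behavior: the entity column becomes the index
# and is no longer available as a regular column.
ingested = df.set_index("patient_id")


def legacy_step(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical step written for the old layout (entity as a column)."""
    return frame[frame["patient_id"] > 1]


def updated_step(frame: pd.DataFrame) -> pd.DataFrame:
    """Same filter, reading the entity from the index instead."""
    return frame[frame.index > 1]


try:
    legacy_step(ingested)
    legacy_broke = False
except KeyError:
    # The column lookup fails because the entity now lives in the index.
    legacy_broke = True

filtered = updated_step(ingested)
```

    A step that must keep working with both layouts could call reset_index() first and restore the index afterwards, at the cost of an extra copy.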

    Bug fixes:

    • Support logging artifacts larger than 5GB to V3IO. #2455
    • Limit KFP to kfp~=1.8.0, <1.8.14 because 1.8.14 introduced non-backward-compatible changes to ParallelFor that aren’t compatible with the MLRun-managed KFP server (1.8.1). #2516
    • Add artifact_path enrichment from project artifact_path. Previously, the parameter wasn't applied to project runs when defining project.artifact_path. #2507
    • Align timeouts for requests that are getting re-routed from worker to chief (for projects/background related endpoints). #2565
    • Fix legacy artifacts load when loading a project. Fixed corner cases when legacy artifacts were saved to yaml and loaded back into the system using load_project(). #2584
    • Fix artifact latest-tag enrichment so that it also happens when the user defines a specific tag. #2572
    • Fix zip source extraction during function build. #2588
    • Fix Docker compose deployment so Nuclio is configured properly with a platformConfig file that sets proper mounts and network configuration for Nuclio functions, meaning that they run in the same network as MLRun. #2601
    • Work around background tasks getting cancelled prematurely because of a bug in the starlette package used by the current FastAPI version. The bug caused the task to get cancelled if the client’s HTTP connection was closed before the task was done. #2618
    • Fix runs failing after deploying a function without a defined image. #2530
    • Fix scheduled jobs failing on GKE with a resource quota error. #2520
    • Fix deleting a model via tag. #2433
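
    To make the KFP pin in the list above concrete, here is a minimal sketch of which versions the constraint kfp~=1.8.0, <1.8.14 admits. It follows the standard meaning of the compatible-release operator (~=1.8.0 is equivalent to >=1.8.0, ==1.8.*) and handles only plain numeric versions; the helper names are hypothetical and not part of MLRun or KFP.

```python
def parse(version: str) -> tuple:
    """Turn a plain numeric version such as '1.8.13' into (1, 8, 13).

    Pads to three components so that '1.8' compares equal to '1.8.0'.
    Pre-release or local version suffixes are not supported in this sketch.
    """
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def kfp_version_allowed(version: str) -> bool:
    """Check a version against the constraint kfp~=1.8.0, <1.8.14."""
    v = parse(version)
    # ~=1.8.0 means: at least 1.8.0 and still within the 1.8 series.
    compatible_release = v >= (1, 8, 0) and v[:2] == (1, 8)
    # The explicit upper bound excludes 1.8.14 and later.
    below_cap = v < (1, 8, 14)
    return compatible_release and below_cap


for candidate in ["1.7.9", "1.8.1", "1.8.13", "1.8.14", "1.9.0"]:
    print(candidate, kfp_version_allowed(candidate))
```

    In a real project the packaging library's SpecifierSet is the usual way to evaluate such constraints; the hand-rolled comparison here only keeps the sketch dependency-free.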
  • v1.2.0-rc22(Nov 29, 2022)

    Features / Enhancements

    Bug fixes

    • Run: Fix handler param in run_function was overwritten by default handler, #2631, @yaronha
    • UI: Bug fixes

    Pull requests:

    e6c7e521 [Requirements] Bump storey version to 1.2.4 (#2634) f691e7f9 [Run] Fix handler param in run_function was overwritten by default handler (#2631)

  • v1.2.0-rc21(Nov 28, 2022)

    Features / Enhancements

    Bug fixes

    • Artifacts: Fix querying artifacts with tag name while link artifact is not tagged, #2627, @theSaarco
    • Artifacts: Fix list artifacts when having multiple hyper params runs, #2625, @tankilevitch
    • UI: Bug fixes

    Pull requests:

    d01483fb [Requirements] Bump v3io version to 0.5.20 (#2628) 22852ad5 [Artifacts] Fix querying artifacts with tag name while link artifact is not tagged (#2627) 6581610d [Artifacts] Fix list artifacts when having multiple hyper params runs (#2625)

  • v1.2.0-rc20(Nov 24, 2022)

  • v1.2.0-rc19(Nov 24, 2022)

    Features / Enhancements

    Bug fixes

    • Projects: Fix project doesn't persist changes on a function when using project.build/run_function, #2624, @tankilevitch
    • Docs: Fix docs generating of frameworks, #2623, @yonishelach
    • API: Fix check for k8s in get logs, #2622, @yaronha
    • UI: Bug fixes

    Pull requests:

    06e353a8 [Projects] Fix project doesn't persist changes on a function when using project.build/run_function (#2624) dcc030bc [Docs] Fix docs generating of frameworks (#2623) 16e3a481 [API] fix check for k8s in get logs (#2622)

  • v1.2.0-rc18(Nov 22, 2022)

    Features / Enhancements

    Bug fixes

    • Pipelines: Fix how we pull logs when httpdb.logs.pipeline.pull_state.mode=enabled, #2620, @tankilevitch
    • API: Fix Background Tasks Being Cancelled Prematurely, #2618, @quaark
    • System Tests: Add restart to datanode docker registry before cleanup, #2619, @tankilevitch
    • Builder: Build function from source fixes, #2617, @AlonMaor14
    • UI: Bug fixes

    Pull requests:

    3d4802d1 [Pipelines] Fix how we pull logs when httpdb.logs.pipeline.pull_state.mode=enabled (#2620) edc00dfa [SDK] Support string values in min/max replicas (#2606) 9c1b8c52 [API] Fix Background Tasks Being Cancelled Prematurely (#2618) 01730067 [System Tests] Add restart to datanode docker registry before cleanup (#2619) da9da17e [Builder] Build function from source fixes (#2617)

  • v1.2.0-rc17(Nov 21, 2022)

    Features / Enhancements

    • Build: Add overwrite_build_params for build_function, #2604, @tankilevitch
    • Job: Load source at runtime or build time fix, #2588, @AlonMaor14
    • Pipelines: Change the way the kfp pod pulls the run state, #2548, @tankilevitch
    • Frameworks: Add alias for the SciKit-Learn model server, #2615, @guy1992l
    • API: Adding cloud storage to default allowed file paths, #2614, @theSaarco
    • Projects: Support GH fine grained tokens, #2611, @theSaarco
    • Tests: Adopt test_sync_pipeline_chunks to entities set to df index, #2599, @assaf758
    • Docs: figure updated with Function Hub, #2602, @jillnogold
    • UI: Features & enhancement

    Bug fixes

    • SDK: Fix console notification being printed when also ipython notification is displayed, #2610, @quaark
    • Feature Store: Fix target path in spark merger, #2605, @gtopper
    • Tests: Fix test_schedule_on_filtered_by_time, #2598, @gtopper
    • Unknown: Fix Docker Compose Deployment, #2601, @quaark
    • Frameworks: Fixed bug of no validation set in training, #2608, @guy1992l
    • System Tests: Add checkout before copying cleanup.py, #2612, @tankilevitch
    • API: Configure Uvicorn Keep Alive Timeout, #2613, @quaark
    • Client-Spec: Pass logs config through the client spec, #2616, @tankilevitch
    • Run: Ignore returned None values for logging, #2603, @guy1992l
    • UI: Bug fixes

    Pull requests:

    85d5016e [Client-Spec] Pass logs config through the client spec (#2616) 15e20ef6 [Build] Add overwrite_build_params for build_function (#2604) 0ad3530e [SDK] Fix console notification being printed when also ipython notification is displayed (#2610) a57b4b92 [Job] Load source at runtime or build time fix (#2588) edb1b723 [Pipelines] Change the way the kfp pod pulls the run state (#2548) 04a96f2c [Frameworks] Add alias for the SciKit-Learn model server (#2615) 926732ce Fix Docker Compose Deployment (#2601) 0744d011 [Frameworks] Fixed bug of no validation set in training (#2608) efec5fac [API] Configure Uvicorn Keep Alive Timeout (#2613) 1e4db64d [API] Adding cloud storage to default allowed file paths (#2614) b062157f [Projects] Support GH fine grained tokens (#2611) 024aa277 [System Tests] Add checkout before copying cleanup.py (#2612) 2b391945 [Feature Store] Fix target path in spark merger (#2605) 6efd5574 [Tests] Fix test_schedule_on_filtered_by_time (#2598) db4a9be8 [Tests] Adopt test_sync_pipeline_chunks to entities set to df index (#2599) 63f3db6a [Run] Ignore returned None values for logging (#2603) ec1a12bf [Docs] figure updated with Function Hub (#2602)

  • v1.2.0-rc16(Nov 17, 2022)

    Features / Enhancements

    • System Tests: Change order of system tests cleanup and add datanode docker registry restart, #2600, @tankilevitch
    • Docs: Change marketplace to Function Hub, co-located data ingestion topics, #2590, @jillnogold
    • Docs: AWS install with policy, #2591, @gilad-shaham
    • Artifacts: List Artifacts enrich with tag for all tags, #2589, @tankilevitch
    • Frameworks: HuggingFace model server, #2594, @guy1992l
    • DataStore: Fix v3io listdir: fix logic error and move to httpclient transport, #2592, @assaf758
    • UI: Features & enhancement

    Bug fixes

    • Tags: Fix warning Too much data for declared Content-Length in Delete Tags endpoint, #2597, @tankilevitch
    • Tests: Fix test_pandas_write_parquet, #2596, @gtopper
    • Datastore: Fix double passing of named parameter, #2578, @gtopper
    • UI: Bug fixes

    Pull requests:

    f45da1ff [System Tests] Change order of system tests cleanup and add datanode docker registry restart (#2600) c3a97564 [Docs] Change marketplace to Function Hub, co-located data ingestion topics (#2590) d5ad0bce [Tags] Fix warning Too much data for declared Content-Length in Delete Tags endpoint (#2597) 47c6616e [Docs] AWS install with policy (#2591) f4c45fc0 [Artifacts] List Artifacts enrich with tag for all tags (#2589) 747f45ef [Tests] Fix test_pandas_write_parquet (#2596) ff4d0a64 [Frameworks] HuggingFace model server (#2594) 15d7693d [DataStore] Fix v3io listdir: fix logic error and move to httpclient transport (#2592) 62e2e7d6 [Datastore] Fix double passing of named parameter (#2578)

  • v1.1.2-rc3(Nov 18, 2022)

Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch

CoCa - Pytorch Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch. They were able to elegantly fit in contras

Phil Wang 565 Dec 30, 2022
End-to-End Speech Processing Toolkit

ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 1.9.0 ubuntu20/python3.9/pip ubuntu20/python3.8/p

ESPnet 5.9k Jan 04, 2023
Exploit ILP to learn symmetry breaking constraints of ASP programs.

ILP Symmetry Breaking Overview This project aims to exploit inductive logic programming to lift symmetry breaking constraints of ASP programs. Given a

Research Group Production Systems 1 Apr 13, 2022
Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust.

Subspace Adversarial Training Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust. However,

15 Sep 02, 2022
Code accompanying the paper "Knowledge Base Completion Meets Transfer Learning"

Knowledge Base Completion Meets Transfer Learning This code accompanies the paper Knowledge Base Completion Meets Transfer Learning published at EMNLP

14 Nov 27, 2022
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

This repository is a toolkit to do machine learning for programming languages. It implements tokenization, dataset preprocessing, model training and m

Facebook Research 408 Jan 01, 2023
Code to reproduce the results for Compositional Attention

Compositional-Attention This repository contains the official implementation for the paper Compositional Attention: Disentangling Search and Retrieval

Sarthak Mittal 58 Nov 30, 2022
Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation.

Distant Supervision for Scene Graph Generation Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation. Introduction The pape

THUNLP 23 Dec 31, 2022
Repository for the Public 4th / Private 5th solution of atmaCup #11.

#11 atmaCup Repository for #11 [Beginners welcome! / Image edition] atmaCup, held 2021-07-09 ~ 2020-07-21. The result was Public 4th / Private 5th. The framework is PyTorch, and the implementation is pytorch-image-m

Tawara 12 Apr 07, 2022
Official Repository for the paper "Improving Baselines in the Wild".

iWildCam and FMoW baselines (WILDS) This repository was originally forked from the official repository of WILDS datasets (commit 7e103ed) For general

Kazuki Irie 3 Nov 24, 2022
PyTorch implementation of SimSiam: Exploring Simple Siamese Representation Learning

SimSiam: Exploring Simple Siamese Representation Learning This is a PyTorch implementation of the SimSiam paper: @Article{chen2020simsiam, author =

Facebook Research 834 Dec 30, 2022
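2020 CCF Big Data & Computing Intelligence Contest: Privacy Information Recognition in Unstructured Business Text, 7th-place solution

2020CCF-NER 2020 CCF Big Data & Computing Intelligence Contest: Privacy Information Recognition in Unstructured Business Text, 7th-place solution. bert base + flat + crf + fgm + swa + pu learning strategy + clue dataset = test1 single model 0.906. Word vectors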

67 Oct 19, 2022
Visual dialog agents with pre-trained vision-and-language encoders.

Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation Or READ-UP: Referring Expression Agent Dialog with Unified Pretr

7 Oct 08, 2022
Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral).

Zero-shot Natural Language Video Localization (ZSNLVL) by Pseudo-Supervised Video Localization (PSVL) This repository is for Zero-shot Natural Languag

Computer Vision Lab. @ GIST 37 Dec 27, 2022
The repository for the paper "When Do You Need Billions of Words of Pretraining Data?"

pretraining-learning-curves This is the repository for the paper When Do You Need Billions of Words of Pretraining Data? Edge Probing We use jiant1 fo

ML² AT CILVR 19 Nov 25, 2022
MCMC samplers for Bayesian estimation in Python, including Metropolis-Hastings, NUTS, and Slice

Sampyl May 29, 2018: version 0.3 Sampyl is a package for sampling from probability distributions using MCMC methods. Similar to PyMC3 using theano to

Mat Leonard 304 Dec 25, 2022
gACSON software for visualization, processing and analysis of three-dimensional electron microscopy images

gACSON gACSON software is to visualize, segment, and analyze the morphology of neurons in three-dimensional electron microscopy images. If you use any

Andrea Behanova 2 May 31, 2022
Code for project: "Learning to Minimize Remainder in Supervised Learning".

Learning to Minimize Remainder in Supervised Learning Code for project: "Learning to Minimize Remainder in Supervised Learning". Requirements and Envi

Yan Luo 0 Jul 18, 2021
This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).

Core-tuning This repository is the official implementation of ``Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regular

vanint 18 Dec 17, 2022
Codes and scripts for "Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning"

Visually Grounded Bert Language Model This repository is the official implementation of Explainable Semantic Space by Grounding Language to Vision wit

17 Dec 17, 2022