Azure plugins for Feast (FEAture STore)

Overview

Feast on Azure

This project provides resources for running a Feast feature store on Azure.

Feast Azure Provider

The Feast Azure provider acts like a plugin that allows Feast users to connect to:

  • Azure SQL DB and/or Synapse SQL as the offline store
  • Azure Cache for Redis as the online store
  • Azure Blob Storage for the Feast registry store

📐 Architecture

The interoperable design of Feast means that many Azure services can produce and/or consume features (for example, Azure ML, Synapse, Azure Databricks, and Azure Functions).

[Figure: Azure provider architecture]

For more details, including setup, see the provider directory in this repo.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Comments
  • SQL error on creating table - notebook part1-load-data.ipynb

    I have followed all the steps in https://github.com/Azure/feast-azure/tree/main/provider/tutorial and even changed the roleDefinitionID parameter as per the previous defect, but I'm still getting the same error.

    opened by delta0123 6
  • SQL error on creating table - notebook part1-load-data.ipynb

    I've followed the steps at https://github.com/Azure/feast-azure/tree/main/provider/tutorial but couldn't get the table for customer data to load. I did a printf and made sure the Key Vault secret has a value. I am using the one-button deploy method (not the k8s cluster).

    Screenshot: Error: OperationalError: (pyodbc.OperationalError) ('HYT00', '[HYT00] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0) (SQLDriverConnect)') (Background on this error at: https://sqlalche.me/e/14/e3q8)

    documentation provider 
    opened by cbtham 6
  • Sample notebooks

    Hey all, Glad to see this initiative. Earlier this year, I tried Feast 0.9 on GCP/GKE : https://paravatha.medium.com/feast-setup-your-own-ml-feature-store-on-kubernetes-5b3193c2b62c

    Unfortunately, I don't have access to create a Redis instance. So, if you have any sample notebooks for just the offline store part, I'd like to give them a try.

    documentation 
    opened by paravatha 6
  • Adding local helm chart and changing install script

    Adds a helm chart and changes the install script so that Feast can be installed from a local helm chart containing the updated RBAC APIs that are compatible with the latest versions of Kubernetes.

    This PR solves the issue where the older rbac v1beta APIs are not compatible with the latest Kubernetes version. The chart must be installed from the local feast-0.9.5-helmchart folder.

    opened by jainr 5
  • Error - Cannot update to latest feast 0.17

    Updating to 0.17 with pip install --upgrade feast fails with error message:

    ERROR: feast-azure-provider 0.2.2 has requirement feast[redis]==0.15.1, but you'll have feast 0.17.0 which is incompatible.

    I tried modifying setup.py in my own fork, with the deployment template pointing to that fork and ==0.17 specified, but it still errors out. Own fork: https://github.com/cbtham/feast-azure/blob/main/provider/sdk/setup.py

    It seems the source of feast-azure-provider (URL below) needs to be updated to support 0.17, because the current version 0.2.2 hardcodes ==0.15.1:

    setup install_requires=[ "feast[redis]==0.15.1"

    https://pypi.org/project/feast-azure-provider/
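The conflict reported above can be reproduced without installing anything by comparing the pin against the target version. This is a minimal sketch using only the standard library. It handles exact `==` pins on plain numeric versions only, and the function name `exact_pin_conflict` is hypothetical. It is not how pip resolves dependencies, which relies on full PEP 440 parsing.

```python
import re

# Matches "name==X.Y.Z" with an optional extras block such as "[redis]".
_EXACT_PIN = re.compile(
    r"\s*[A-Za-z0-9._-]+(?:\[[^\]]*\])?\s*==\s*([0-9]+(?:\.[0-9]+)*)\s*"
)


def exact_pin_conflict(requirement, candidate_version):
    """Return True if `requirement` is an exact pin that excludes the candidate.

    Anything that is not a simple exact pin returns False, because this
    sketch cannot decide a conflict for other specifier forms.
    """
    match = _EXACT_PIN.fullmatch(requirement)
    if match is None:
        return False
    pinned = tuple(int(part) for part in match.group(1).split("."))
    wanted = tuple(int(part) for part in candidate_version.split("."))
    return pinned != wanted


# The pin and versions below are the ones quoted in the error message.
print(exact_pin_conflict("feast[redis]==0.15.1", "0.17.0"))
print(exact_pin_conflict("feast[redis]==0.15.1", "0.15.1"))
```

A range specifier in `install_requires` would let later Feast releases install. Which range is safe is a decision for the maintainers and is not assumed here.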

    priority/p0 provider 
    opened by cbtham 4
  • feast cluster installation fails with rbac.authorization.k8s.io/v1beta API version (ClusterRole)

    Following the instructions listed in cluster installation on AKS cluster v1.22.4: https://github.com/Azure/feast-azure/tree/main/cluster/setup

    The initial installation steps pass, but the API versions are not recognized in newer versions of AKS (the stable API version rbac.authorization.k8s.io/v1 should work instead of rbac.authorization.k8s.io/v1beta1).

    Any chance of specifying a newer version of the helm chart in the installation shell script?

    Here is the issue which I encounter:

    sudo ./installfeast.sh (parameters passed here)
    INFO: Install feast
    error: failed to create secret secrets "feast-postgresql" already exists
    Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "ClusterRole" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "Role" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"]
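Rendered chart manifests can be checked for the removed API version before installing. This is a minimal sketch, assuming the manifest text is available as a string (for example, the output of `helm template`). The helper names `find_deprecated_rbac` and `upgrade_rbac_api_version` are hypothetical and not part of this repo, which addresses the problem by shipping an updated local chart instead.

```python
import re

DEPRECATED = "rbac.authorization.k8s.io/v1beta1"
STABLE = "rbac.authorization.k8s.io/v1"

# Match only whole apiVersion lines so other text in the manifest is untouched.
_API_VERSION_LINE = re.compile(
    r"^([ \t]*apiVersion:[ \t]*)" + re.escape(DEPRECATED) + r"[ \t]*$",
    re.MULTILINE,
)


def find_deprecated_rbac(manifest):
    """Return the 1-based line numbers that use the v1beta1 RBAC API."""
    return [
        number
        for number, line in enumerate(manifest.splitlines(), start=1)
        if _API_VERSION_LINE.match(line)
    ]


def upgrade_rbac_api_version(manifest):
    """Rewrite v1beta1 RBAC apiVersion lines to the stable v1 API."""
    return _API_VERSION_LINE.sub(lambda m: m.group(1) + STABLE, manifest)


# Illustrative manifest fragment; the role name is a placeholder.
sample = """apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: example-role
"""
print(find_deprecated_rbac(sample))
print(upgrade_rbac_api_version(sample))
```

The rewrite only touches the `apiVersion` line. Whether the rest of a given chart is valid under the v1 API still needs to be verified against the cluster.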

    cluster 
    opened by andrijaperovic 4
  • Issue on loading feast - missing module

    In the tutorial, step 2 in notebook 2 "Registre..." fails with a missing module. The error is triggered by the command "fs = FeatureStore("./feature_repo")". The Feast version is 0.15.1 and the Azure Feast provider is 0.2.1.

    Error "snippet" (first lines and last line):

    ModuleNotFoundError Traceback (most recent call last)
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/feast/infra/online_stores/redis.py in
    43 from redis import Redis
    ---> 44 from rediscluster import RedisCluster
    .........
    FeastModuleImportError: Could not import RedisOnlineStoreConfig module 'feast.infra.online_stores.redis'

    bug 
    opened by jcordtz 2
  • Race Condition with Blobfuse Mount and CI SetupScript

    When trying to clone the feast-azure repo, I'm running into a race condition - where the Users directory hasn't yet been mounted to the CI when the cd /home/azureuser/cloudfiles/code/Users portion of the script is run.

    Here's the error in the logs...

    Successfully installed pymssql-2.2.2
    /mnt/batch/tasks/startup/wd/scripts/creation.sh: line 1: cd: /home/azureuser/cloudfiles/code/Users: No such file or directory
    Cloning into 'feast-azure'...
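One way to avoid this kind of race is to poll for the mount point before changing into it. This is a minimal sketch of that idea in Python. The function name `wait_for_directory` and the timeout values are assumptions, and the actual setup script (creation.sh) is a shell script, so the same loop would need to be expressed there.

```python
import os
import tempfile
import time


def wait_for_directory(path, timeout_s=300.0, poll_s=2.0):
    """Poll until `path` exists as a directory or the timeout elapses.

    Returns True if the directory appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.isdir(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Demonstration with a directory that already exists.
with tempfile.TemporaryDirectory() as mounted:
    print(wait_for_directory(mounted, timeout_s=1.0, poll_s=0.1))
```

In the scenario above, the path to wait for would be /home/azureuser/cloudfiles/code/Users, and the clone would run only if the function returns True.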
    
    bug tutorial 
    opened by ezwiefel 2
  • Conda Environment Creation Fails for Deployment

    Summary: When attempting to create the conda environment "feast-env" as found in the training and deployment notebook, the environment creation stalls out after an hour during "installing pip dependencies" and never completes.

    Details: First occurrence of this issue was during attempts to deploy a scoring container, which timed out after an hour. Looking at the logs, it simply stops at the line "Installing pip dependencies: ... working ...", with no further detail.

    Attempting to build the conda environment from the "model_service_env.yml" in a local terminal fails in the same spot. Manually removing the pip packages from the yml file and manually installing those packages in the conda environment does not fail. Also, removing just the "feast-azure-provider" package causes the build to succeed, and it can then be manually installed. However, since Azure ML does not allow the pip installs to be done in separate steps, this is not a viable workaround for the container deployment.

    bug 
    opened by tarockey 2
  • Incorrect Redis Cache Connection string translation

    Hello, I am using the online_store feature with the sample code from your provider example here.

    With the current Redis client version for Python 3.8 (using the code version as of this date, 0.3.0): https://pypi.org/project/feast-azure-provider/?msclkid=6debb6a2b9b511ec876e161bfda67531

    It looks like the default elements of the Redis Cache connection string are not excluded. Sample: myfsonlinesotre.redis.cache.windows.net:6380,password=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz=,ssl=True,abortConnect=False

    If you provide the connection string as is, the Python Redis client throws an exception about the abortConnect element.

    The only way the connection string works is by removing the ,abortConnect=False portion.

    Working Connection String myfsonlinesotre.redis.cache.windows.net:6380,password=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz=,ssl=True
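The manual workaround described here can be applied in code before the string reaches the client. This is a minimal sketch, assuming the connection string has the comma-separated form shown in the sample. The function name `strip_unsupported_options` and its default option list are hypothetical and not part of feast-azure-provider.

```python
def strip_unsupported_options(connection_string, unsupported=("abortconnect",)):
    """Return the connection string without the named key=value options.

    Option names are matched case-insensitively. The host:port element and
    all other options are kept in their original order.
    """
    kept = []
    for part in connection_string.split(","):
        key = part.split("=", 1)[0].strip().lower()
        if key in unsupported:
            continue  # drop e.g. "abortConnect=False"
        kept.append(part)
    return ",".join(kept)


# Placeholder host and password, for illustration only.
raw = "example.redis.cache.windows.net:6380,password=secret=,ssl=True,abortConnect=False"
cleaned = strip_unsupported_options(raw)
print(cleaned)
```

Splitting each element on the first `=` only keeps passwords that end in `=` intact, as in the sample string.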

    -George Gergues. Best of luck

    opened by Gergues 1
  • Error when deploying service  - ImportError: cannot import name 'json' from itsdangerous

    I got following error when deploying service based on tutorial in:

    Error: ... ImportError: cannot import name 'json' from itsdangerous ...

    I have fixed that by pinning version of itsdangerous library in inference.dockerfile to 2.0.1 (newer version doesn't work):

    RUN pip install 'azureml-defaults==1.35.0' \
                    'feast-azure-provider==0.2.2' \
                    'scikit-learn==0.22.2.post1' \
                    'joblib==1.1.0' \
                    'itsdangerous==2.0.1'
    
    

    I followed this SO: https://stackoverflow.com/questions/71189819/python-docker-importerror-cannot-import-name-json-from-itsdangerous

    opened by michalmar 1
  • Error occurred while loading customers, drivers, orders data in /provider/tutorial/notebooks/part1-load-data.ipynb

    Hi, I'm new to feast and was trying to reproduce the tutorial in azure portal. Faced the below error.

    ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Not able to validate external location because The remote server returned an error: (409) Conflict. (105215) (SQLExecDirectW)')
    [SQL: COPY INTO dbo.driver_hourly
    FROM 'https://feastonazuredatasamples.blob.core.windows.net/feastdatasamples/driver_hourly.csv'
    WITH
    (
    	FILE_TYPE = 'CSV'
    	,FIRSTROW = 2
    	,MAXERRORS = 0
    )
    ]
    

    Thanks in advance!

    opened by likhith00 0
  • Bump pyspark from 3.1.3 to 3.2.2 in /cluster/sdk/python

    Bumps pyspark from 3.1.3 to 3.2.2.

    Commits
    • 78a5825 Preparing Spark release v3.2.2-rc1
    • ba978b3 [SPARK-39099][BUILD] Add dependencies to Dockerfile for building Spark releases
    • 001d8b0 [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image d...
    • 9dd4c07 [SPARK-37730][PYTHON][FOLLOWUP] Split comments to comply pycodestyle check
    • bc54a3f [SPARK-37730][PYTHON] Replace use of MPLPlot._add_legend_handle with MPLPlot....
    • c5983c1 [SPARK-38018][SQL][3.2] Fix ColumnVectorUtils.populate to handle CalendarInte...
    • 32aff86 [SPARK-39447][SQL][3.2] Avoid AssertionError in AdaptiveSparkPlanExec.doExecu...
    • be891ad [SPARK-39551][SQL][3.2] Add AQE invalid plan check
    • 1c0bd4c [SPARK-39656][SQL][3.2] Fix wrong namespace in DescribeNamespaceExec
    • 3d084fe [SPARK-39677][SQL][DOCS][3.2] Fix args formatting of the regexp and like func...
    • Additional commits viewable in compare view

    dependencies python 
    opened by dependabot[bot] 0
  • How to load historical features directly into spark dataframe

    We have been using Feast with a SQL db as an offline store and used JDBC to append features from a Spark dataframe directly to a table in SQL. Now, for a recommender, we'd like to build a historical dataset to train models on, which will use a couple hundred million rows. Each row is a customer with a timestamp. Feast's get_historical_features only takes a pandas dataframe or a SQL query as the entity input, so a workaround has been to store the entity df in the SQL db and use a query to fetch the features like so:

    sql_job = fs.get_historical_features(
        entity_df="SELECT * FROM test_entitity_df",
        features=[
            'feature_view1:feature1',
            'feature_view1:feature2',
        ]
    )
    

    However, the sql_job only has to_df, to_arrow, or persist functionality. My question is: how can I load features efficiently into a Spark DF for training? One solution would be to store the result of the Feast query in a SQL table and use JDBC again to load that into Spark. However, I cannot seem to get the persist functionality to work, as the docs on SavedDatasetStorage are very limited. Please advise.

    Resources: https://docs.feast.dev/reference/offline-stores/overview#functionality https://docs.feast.dev/getting-started/concepts/dataset#creating-a-saved-dataset-from-historical-retrieval

    opened by VincentPe 0
  • Adding Terraform as Infa as a Code

    Dear All,

    I am quite used to working with Terraform and Azure, and I am happy to help by adding Terraform scripts for creating the two current infra setups.

    Any suggestions or things to add?

    Thanks Rob

    opened by kamakay 0
  • Example deployment fails

    The "train and deploy with Feast" notebook (Part 3) fails when the following block is executed:

    import uuid
    from azureml.core.model import InferenceConfig
    from azureml.core.environment import Environment
    from azureml.core.model import Model
    
    # get the registered model
    model = Model(ws, "order_model")
    
    # create an inference config i.e. the scoring script and environment
    inference_config = InferenceConfig(
        entry_script="score.py", 
        environment=env, 
        source_directory="src"
    )
    
    # deploy the service
    service_name = "orders-service" + str(uuid.uuid4())[:4]
    service = Model.deploy(
        workspace=ws,
        name=service_name,
        models=[model],
        inference_config=inference_config,
        deployment_config=aciconfig,
    )
    
    service.wait_for_deployment(show_output=True)
    

    The error is: ImportError: cannot import name 'case_insensitive_dict' from 'azure.core.utils'

    Longer trace: Error:

    {
      "code": "AciDeploymentFailed",
      "statusCode": 400,
      "message": "Aci Deployment failed with exception: Error in entry script, ImportError: cannot import name 'case_insensitive_dict' from 'azure.core.utils' (/azureml-envs/feast/lib/python3.8/site-packages/azure/core/utils/__init__.py), please run print(service.get_logs()) to get details.",
      "details": [
        {
          "code": "CrashLoopBackOff",
          "message": "Error in entry script, ImportError: cannot import name 'case_insensitive_dict' from 'azure.core.utils' (/azureml-envs/feast/lib/python3.8/site-packages/azure/core/utils/__init__.py), please run print(service.get_logs()) to get details."
        }
      ]
    }
    
    opened by Ritaja 3
  • Feast azure to support Feast 0.19.x+ version

    Hi, thank you for creating this extension for Feast. I am curious whether there will be an update to the Feast version used with feast-azure-provider. Right now the default Feast version it downloads is 0.18.x, but we are using version 0.19+. Is there any way I can use feast-azure-provider with a newer version of Feast?

    opened by mdrijwan123 0
Releases(v0.3.0)
  • v0.3.0(Mar 15, 2022)

    What's Changed

    • Fixed tutorial issues by @samuel100 in https://github.com/Azure/feast-azure/pull/25
    • Bump commons-io from 2.5 to 2.7 in /cluster/sdk/spark/ingestion by @dependabot in https://github.com/Azure/feast-azure/pull/13
    • Bump pyyaml from 5.3.1 to 5.4 in /cluster/sdk/python by @dependabot in https://github.com/Azure/feast-azure/pull/14
    • Bump cryptography from 3.1 to 3.3.2 in /cluster/sdk/python by @dependabot in https://github.com/Azure/feast-azure/pull/15
    • Adding missing db-dtypes by @jainr in https://github.com/Azure/feast-azure/pull/44
    • Adding spark.hadoop.fs.azure properties needed for NativeAzureFileSy… by @andrijaperovic in https://github.com/Azure/feast-azure/pull/43
    • Adding local helm chart and changing install script by @jainr in https://github.com/Azure/feast-azure/pull/45
    • Jainr patch 1 by @jainr in https://github.com/Azure/feast-azure/pull/50
    • Support pandas df for point-in-time join and account key auth for blob by @bastrik in https://github.com/Azure/feast-azure/pull/53
    • Update Feast to support 0.18.x and later by @cbtham in https://github.com/Azure/feast-azure/pull/51

    New Contributors

    • @dependabot made their first contribution in https://github.com/Azure/feast-azure/pull/13
    • @jainr made their first contribution in https://github.com/Azure/feast-azure/pull/44
    • @andrijaperovic made their first contribution in https://github.com/Azure/feast-azure/pull/43
    • @cbtham made their first contribution in https://github.com/Azure/feast-azure/pull/51

    Full Changelog: https://github.com/Azure/feast-azure/compare/v0.2.2...v0.3.0

  • v0.2.2(Nov 16, 2021)

  • v0.2.1(Nov 14, 2021)

  • v0.2.0(Nov 14, 2021)

  • v0.1.0(Oct 14, 2021)

    What's Changed

    • mssql server offline store by @DvirDukhan in https://github.com/Azure/feast-azure/pull/1
    • Add CI/CD (build/publish, test) pipelines by @bastrik in https://github.com/Azure/feast-azure/pull/2

    New Contributors

    • @DvirDukhan made their first contribution in https://github.com/Azure/feast-azure/pull/1
    • @bastrik made their first contribution in https://github.com/Azure/feast-azure/pull/2

    Full Changelog: https://github.com/Azure/feast-azure/commits/v0.1.0

Owner
Microsoft Azure
APIs, SDKs and open source projects from Microsoft Azure