A lightweight tool to get an AI Infrastructure Stack up in minutes, not days.

Overview


Welcome to K3ai Project

K3ai is a lightweight tool to get an AI Infrastructure Stack up in minutes, not days.


NOTE on the K3ai origins

The original K3ai project was developed in two weeks at the end of October 2020.

K3ai v1.0 was entirely rewritten by Alessandro Festa in October 2021 to offer a better user experience.

Thanks to the amazing and incredible people and projects that have been instrumental in creating the K3ai project repositories, website, and more.

⚡️ Quick start

Let's discover K3ai in three simple steps.

🌘 Getting Started

Get started by downloading k3ai from the release page.

Or try the K3ai companion script with this command:

curl -sfL https://get.k3ai.in | sh -

🌗 Load K3ai configuration

Let's start by loading the configuration:

k3ai up

The first time it runs, k3ai asks for a GitHub PAT (Personal Access Token), which is used to avoid API call limits. Check the GitHub documentation to learn how to create one. Your personal PAT only needs read permission on repositories.


🌖 Configure the base infrastructure

Choose your favourite Kubernetes flavor and run it.

To see which K8s flavors are available:

k3ai cluster list --all

It should print something like:

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ INFRASTRUCTURE                                                                                          │
├───────┬─────────────────────────────────────────────────────┬───────┬────────┬─────────┬────────────────┤
│ TYPE  │ DESCRIPTION                                         │ KIND  │ TAG    │ VERSION │ STATUS         │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ CIVO  │ The First Cloud Native Service Provider Power...    │ infra │ cloud  │ latest  │ Available      │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ EKS-A │ Amazon Eks Anywhere Is A New Deployment Option...   │ infra │ hybrid │ v0.5.0  │ Available      │
│       │ ate And Operate Kubernetes Clusters On Custome...   │       │        │         │                │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ K3S   │ K3s Is A Highly Available, Certified Kubernetes...  │ infra │ local  │ latest  │ Available      │
│       │ oads In Unattended, Resource-Constrained...         │       │        │         │                │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ KIND  │ Kind Is A Tool For Running Local Kubernetes...      │ infra │ local  │ v0.11.2 │ Available      │
│       │ as Primarily Designed For Testing Kubernetes...     │       │        │         │                │
│       │  Or Ci.                                             │       │        │         │                │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ TANZU │ Tanzu Community Edition Is A Fully-Featured...      │ infra │ hybrid │ latest  │ In Development │
│       │ ers And Users. It Is A Freely Available...          │       │        │         │                │
│       │  Of Vmware Tanzu.                                   │       │        │         │                │
└───────┴─────────────────────────────────────────────────────┴───────┴────────┴─────────┴────────────────┘

Now let's start with something super fast and super simple:

k3ai cluster deploy --type k3s -n mycluster

🌝 Install a plugin for your AI experiments

Now that the server is up and running, let's type:

k3ai plugin deploy -n mlflow -t mycluster

At the end of the installation, K3ai prints the URL where you can access the MLFlow tracking server. That's all; now just start having fun with K3ai!

🌈 Push a piece of code to the AI tools and focus on your goals

Let's push some code to the AI tool (e.g. MLFlow):

k3ai run --source https://github.com/k3ai/quickstart --target mycluster --backend mlflow

Wait for the run to complete, then log in to the backend AI tool (e.g. the MLFlow UI at http://<IP>:30500).
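
The quick-start commands can be chained in a small script. The following is only a sketch, assuming k3ai is on the PATH and `k3ai up` has already been run once. The `run` and `quickstart` helpers and the `DRY_RUN` variable are hypothetical names introduced here for illustration and are not part of K3ai; the k3ai commands and flags are the ones used in the steps above.

```shell
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper, not part of k3ai.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# quickstart CLUSTER: deploy a k3s cluster, install the MLFlow plugin,
# then push the sample repository to it. Stops at the first failing step.
quickstart() {
    cluster="${1:-mycluster}"
    run k3ai cluster deploy --type k3s -n "$cluster" || return 1
    run k3ai plugin deploy -n mlflow -t "$cluster" || return 1
    run k3ai run --source https://github.com/k3ai/quickstart --target "$cluster" --backend mlflow || return 1
}

# Preview the commands without executing them:
#   DRY_RUN=1 quickstart mycluster
```

Running with `DRY_RUN=1` first is a cheap way to confirm the cluster name is substituted where expected before anything is deployed.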

Current implementation support

Operating Systems

Operating System              K3ai v1.0.0
Linux                         Yes
Windows                       In Progress
macOS                         In Progress
Arm                           In Progress

Clusters

K8s Clusters                  K3ai v1.0.0
Rancher K3s                   Yes
Vmware Tanzu Community Ed.    Yes
Amazon EKS Anywhere           Yes
KinD                          Yes

Plugins

Plugins                       K3ai v1.0.0
Kubeflow Components           Yes
MLFlow                        Yes
Apache Airflow                Yes
Argo Workflows                Yes

⭐️ Project assistance

If you want to say thank you and/or support active development of the K3ai project:

Together, we can make this project better every day! 😘

⚠️ License

K3ai is free and open-source software licensed under the BSD 3-Clause license. The official logo was created by Alessandro Festa.

Comments
  • [Core] - Initial work for v3 version

    This PR is the initial work to re-write k3ai into a more flexible tool. This PR implements:

    • [x] #3
    • [x] #4
    • [x] #5
    • [ ] #6
    • [x] #7
    • [x] #8
    • [ ] #9

    This PR also includes the issues in the plugins repo:

    • [x] https://github.com/k3ai/plugins/issues/1
    • [x] https://github.com/k3ai/plugins/issues/2
    • [x] https://github.com/k3ai/plugins/issues/3
    done 
    opened by alefesta 11
  • [BUG] k3ai up yields version `GLIBC_2.28' not found error

    Describe the bug After following the installation instructions, the following error is reported:

    k3ai: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by k3ai)

    To Reproduce Steps to reproduce the behavior:

    1. curl -LO https://get.k3ai.in | sh -
    2. k3ai up

    Expected behavior It should have spun up the cluster!

    OS: Ubuntu 18.04

    done bug 
    opened by htahir1 7
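
For the GLIBC report above, a host can be checked before installing. This is a minimal sketch, assuming a glibc-based Linux with GNU coreutils (`sort -V`); `version_ge` and `glibc_version` are hypothetical helper names, and 2.28 is the version named in the error message. Ubuntu 18.04 ships glibc 2.27, which is consistent with the error.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on GNU `sort -V` (version sort); the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# glibc_version: print the glibc version of this host.
# `getconf GNU_LIBC_VERSION` prints a line such as "glibc 2.27".
glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Example check against the version from the error message:
#   if version_ge "$(glibc_version)" 2.28; then
#       echo "glibc is new enough for this k3ai binary"
#   else
#       echo "glibc is older than 2.28"
#   fi
```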
  • [Feature] - Running Kubeflow and MLFLow code through "One-Click" approach

    This PR addresses:

    • #10
    • #14

    It also introduces -x as Extra and -e as Entrypoint.

    Examples:

    k3ai run -s https://github.com/alefesta/sample/mlflow -b mlflow -t <clustername>
    k3ai run -s https://github.com/alefesta/sample/kfp -b kfp -e condition.py -t <clustername>
    

    For MLFlow, we still need to handle boto3 in the conda.yaml file, which is a requirement to run on K8s. This needs to be addressed before merging this PR. We may:

    • try to inject boto3 into the conda.yaml file at runtime
    • force the user to fork the example first, change conda.yaml, and run k3ai

    The first option seems more in line with k3ai's goal of making the user's life easier.
    done 
    opened by alefesta 6
  • [BUG] - Kubeflow Pipelines Quickstart Repository Missing

    Describe the bug I was trying to follow the kubeflow pipelines tutorial as described in the k3ai website. It seems the final step of running the pipeline fails because the quickstart repository for kubeflow pipelines does not exist.

    To Reproduce Steps to reproduce the behavior:

    1. k3ai up
    2. k3ai cluster deploy -t k3s -n myk3scluster
    3. k3ai plugin deploy -n kf-pa -t myk3scluster
    4. k3ai run -s https://github.com/k3ai/quickstart/kfp -b kfp -e condition.py -t mycluster

    Expected behavior Pipeline to run successfully.

    Actual behavior Pipeline run fails.

    done 
    opened by harshitmahapatra 5
  • [Feature] - Add support for k3d

    🚀 Is your feature request related to a problem? Please describe. Currently, k3ai doesn't have support for k3d.

    k3s is known to have issues with WSL2 deployment (systemd requirement, etc.), so it would be better to have k3d support.

    💡 Describe the solution you'd like We can add k3d support to k3ai in a subsequent release. (would require some work on pkg/io/execution).

    epic done 
    opened by burntcarrot 5
  • Fix lint issues

    Fixed 100+ issues related to ineffectual assignments and added error checks. Added golangci-lint workflow to check linting issues while pushing code.

    The log level used while error checking is Fatal (log.Fatal(err)).

    opened by burntcarrot 4
  • [BUG] - MLFlow endpoint doesn't work in WSL2

    Describe the bug

    While running the MLFlow plugin, the endpoint URI displayed by k3ai is not accessible.

    The following endpoints are not accessible:

    • http://172.29.170.187:30500/ (displayed by k3ai)
    • http://172.29.170.187:5000/
    • http://10.96.150.194:30500/
    • http://10.244.0.7:30500/

    The IP address for the WSL2 machine is (through wsl hostname -I): 172.29.170.187

    WSL2 uses dynamic IP allocation.

    To Reproduce Steps to reproduce the behavior:

    k3ai run -s https://github.com/k3ai/quickstart -b mlflow
    

    Expected behavior The MLFlow endpoint exposed through k3ai should have worked.

    bug 
    opened by burntcarrot 4
  • [Feature] - Implement a system domain to automatically bind the plugins

    K3ai should implement an automatic system domain (e.g. sslip.io or nip.io) so that any installed plugin can be exposed with the standard form <plugin-name>.<clusterIP>.nip.io. This way we can use the same IP in cases like:

    • WSL
    • Laptops
    epic 
    opened by alefesta 4
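
The naming scheme proposed in the request above, <plugin-name>.<clusterIP>.nip.io, can be sketched in a few lines. This is only an illustration: `plugin_host` is a hypothetical helper, not something K3ai provides, and the IP and port in the usage comment are the ones quoted in the WSL2 report earlier on this page. nip.io resolves such a name back to the IP address embedded in it.

```shell
# plugin_host PLUGIN IP: print the host name "<plugin-name>.<clusterIP>.nip.io".
# (Hypothetical helper illustrating the proposed naming scheme.)
plugin_host() {
    printf '%s.%s.nip.io\n' "$1" "$2"
}

# Example: build the MLFlow URL for a cluster IP and the NodePort 30500.
#   echo "http://$(plugin_host mlflow 172.29.170.187):30500"
```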
  • [BUG] - runtime error with index out of range when running quickstart

    Describe the bug A clear and concise description of what the bug is.

    To Reproduce Steps to reproduce the behavior:

    1. Follow quickstart steps:
    k3ai up
    k3ai cluster deploy -t k3s -n mycluster
    k3ai plugin deploy -n mlflow -t mycluster

    2. Try running quickstart: $ k3ai run -s https://github.com/k3ai/quickstart -b mlflow
    3. Receive error:
    🧪	Initializing code...
    panic: runtime error: index out of range [0] with length 0
    
    goroutine 1 [running]:
    github.com/k3ai/pkg/runner.Loader({0x7fff7fe7bb4e, 0x22}, {0x0, 0x0}, {0x7fff7fe7bb74, 0x6}, {0x0, 0x0}, {0x0, 0x0})
    	/home/joshec/git/k3ai/pkg/runner/run.go:78 +0x10f6
    github.com/k3ai/cmd.runCommand.func1(0xc000403680, {0xc0003d57c0, 0x0, 0x4})
    	/home/joshec/git/k3ai/cmd/run.go:71 +0x58b
    github.com/spf13/cobra.(*Command).execute(0xc000403680, {0xc0003d5780, 0x4, 0x4})
    	/home/joshec/go/pkg/mod/github.com/spf13/[email protected]/command.go:860 +0x5f8
    github.com/spf13/cobra.(*Command).ExecuteC(0x2254960)
    	/home/joshec/go/pkg/mod/github.com/spf13/[email protected]/command.go:974 +0x3bc
    github.com/spf13/cobra.(*Command).Execute(...)
    	/home/joshec/go/pkg/mod/github.com/spf13/[email protected]/command.go:902
    github.com/k3ai/cmd.Execute(...)
    	/home/joshec/git/k3ai/cmd/root.go:34
    main.main()
    	/home/joshec/git/k3ai/main.go:10 +0x25
    

    Expected behavior A clear and concise description of what you expected to happen.

    A successful run with proper artifact storage and tracking URI settings

    Screenshots If applicable, add screenshots to help explain your problem.

    in progress bug 
    opened by jeinstei 3
  • [CI/CD] - Add Lint support

    On running golangci-lint on my local machine, I was able to find 40+ linting issues.

    10 of them were deadcode issues, so they can be ignored, as they're part of code added for future releases.

    The rest are ineffectual assignments and skipped error checks. We can log the error message for the skipped error checks; it would help us more in debugging.

    I know this sounds like a minor issue, but with more code coming in the subsequent releases, addressing this earlier can help us save a lot of time maintaining good quality code.

    Suggested Fix: Add golangci-lint action as workflow to check linting issues. We can add a rule for excluding deadcode issues for now.

    done 
    opened by burntcarrot 3
  • 'invalid argument'

    Hello - I am trying out the Mlflow deployment as in the tutorials and I get a stream of logs that say "invalid argument" and after a while I get "We tried to publish MLFLow at:http://172.17.0.2:30500" .. but when I go to this page there is no Mlflow server.

    Would appreciate the help. Thanks.

    Great work btw! this library is amazing!

    opened by jsnanavati 2
  • [BUG] - Kubeflow Pipelines not starting

    Describe the bug I am trying to run the kubeflow plugin on a single node 8vcpu / 16gb ram.

    To Reproduce

    curl -sfL https://get.k3ai.in | sh -
    k3ai up
    k3ai cluster deploy --type k3s -n mycluster
    k3ai plugin deploy -n kf-pa -t mycluster

    Issue Installation never ends, seems the pods are not being started correctly

    ubuntu:~$ k3s kubectl get pods -n kubeflow
    NAME                                              READY   STATUS                   RESTARTS        AGE
    workflow-controller-b7f95d6c6-q2wkf               1/1     Running                  0               4m22s
    ml-pipeline-scheduledworkflow-5c549bc5f5-drkmn    1/1     Running                  0               4m23s
    ml-pipeline-viewer-crd-7555c4d55f-fpd2m           1/1     Running                  0               4m23s
    metadata-envoy-deployment-7654b98955-rkt2g        1/1     Running                  0               4m24s
    ml-pipeline-ui-656466fdc9-qg9xv                   1/1     Running                  0               4m23s
    mysql-55778745b6-g4vbd                            1/1     Running                  0               4m22s
    minio-6d6d45469f-xgmz2                            1/1     Running                  0               4m24s
    cache-deployer-deployment-6f8ff5b986-tvwn4        1/1     Running                  0               4m24s
    metadata-grpc-deployment-5c8599b99c-b45jf         1/1     Running                  1 (3m17s ago)   4m24s
    ml-pipeline-8995b746f-dhznz                       1/1     Running                  1 (2m31s ago)   4m23s
    cache-server-74494cbf5-k956w                      0/1     Pending                  0               2m20s
    cache-server-74494cbf5-6v5lj                      0/1     ContainerStatusUnknown   0               4m24s
    ml-pipeline-persistenceagent-59689585f6-s8dhd     1/1     Running                  1 (2m5s ago)    4m23s
    ml-pipeline-visualizationserver-6b8fb8c44-mmrk8   0/1     ContainerStatusUnknown   0               4m22s
    ml-pipeline-visualizationserver-6b8fb8c44-svm25   0/1     Pending                  0               113s
    metadata-writer-fd965db48-9lw22                   0/1     Error                    0               4m24s
    metadata-writer-fd965db48-rqt7d                   0/1     Pending                  0               82s
    

    Pod metadata-writer-fd965db48-9lw22 error : message: 'The node was low on resource: ephemeral-storage. Container main was using 392Ki, which exceeds its request of 0. '

    Any ideas? Thanks!

    needs-triage bug 
    opened by tonxxd 1
  • [BUG] - postgress crashes when deploying mlflow on k3s / intel

    Describe the bug In the postgres pod: Bus error (core dumped)

    Running on:

    (base) [email protected]:~$ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 20.04.4 LTS
    Release: 20.04
    Codename: focal

    To Reproduce

    k3ai cluster deploy --type k3s --name arrakis
    k3ai plugin deploy -n mlflow -t arrakis
    k3s kubectl logs postgres-0

    Expected behavior successful mlflow startup

    Screenshots

    (base) [email protected]:~$ kubectl get all -A
    NAMESPACE     NAME                                          READY   STATUS             RESTARTS      AGE
    kube-system   pod/local-path-provisioner-6c79684f77-996dg   1/1     Running            0             7m28s
    kube-system   pod/coredns-d76bd69b-8zqzz                    1/1     Running            0             7m28s
    kube-system   pod/metrics-server-7cd5fcb6b7-6zwwl           1/1     Running            0             7m28s
    default       pod/minio-0                                   1/1     Running            0             6m40s
    default       pod/mlflow-7c6768c4c-m6j6d                    1/1     Running            0             6m23s
    default       pod/postgres-0                                0/1     CrashLoopBackOff   6 (20s ago)   6m31s

    NAMESPACE     NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
    default       service/kubernetes         ClusterIP   10.43.0.1       <none>        443/TCP                  7m43s
    kube-system   service/kube-dns           ClusterIP   10.43.0.10      <none>        53/UDP,53/TCP,9153/TCP   7m40s
    kube-system   service/metrics-server     ClusterIP   10.43.39.68     <none>        443/TCP                  7m39s
    default       service/minio-service      ClusterIP   10.43.144.140   <none>        9000/TCP                 6m40s
    default       service/postgres-service   ClusterIP   10.43.236.158   <none>        5432/TCP                 6m23s
    default       service/mlflow-service     NodePort    10.43.192.251   <none>        5000:30500/TCP           6m8s

    NAMESPACE     NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
    kube-system   deployment.apps/local-path-provisioner   1/1     1            1           7m40s
    kube-system   deployment.apps/coredns                  1/1     1            1           7m40s
    kube-system   deployment.apps/metrics-server           1/1     1            1           7m39s
    default       deployment.apps/mlflow                   1/1     1            1           6m23s

    NAMESPACE     NAME                                                DESIRED   CURRENT   READY   AGE
    kube-system   replicaset.apps/local-path-provisioner-6c79684f77   1         1         1       7m29s
    kube-system   replicaset.apps/coredns-d76bd69b                    1         1         1       7m29s
    kube-system   replicaset.apps/metrics-server-7cd5fcb6b7           1         1         1       7m29s
    default       replicaset.apps/mlflow-7c6768c4c                    1         1         1       6m23s

    NAMESPACE   NAME                        READY   AGE
    default     statefulset.apps/minio      1/1     6m40s
    default     statefulset.apps/postgres   0/1     6m31s

    (base) [email protected]:~$ kubectl logs postgres-0
    The files belonging to this database system will be owned by user "postgres".
    This user must also own the server process.

    The database cluster will be initialized with locale "en_US.utf8".
    The default database encoding has accordingly been set to "UTF8".
    The default text search configuration will be set to "english".

    Data page checksums are disabled.

    fixing permissions on existing directory /var/lib/postgresql/mlflow/data ... ok
    creating subdirectories ... ok
    selecting default max_connections ... 20
    selecting default shared_buffers ... 400kB
    selecting default timezone ... Etc/UTC
    selecting dynamic shared memory implementation ... posix
    creating configuration files ... ok

    bug todo :spiral_notepad: 
    opened by paxinos 1
  • [BUG] - Incompatible k3s version for kubeflow

    Describe the bug Kubeflow does not currently support k8s 1.22. Kubeflow Pipelines seems to work with k3s, but kf-dashboard, for example, fails, which might be related to unsupported k8s API versions.

    ...
     ⏳     Working...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-crds/base": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    ...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    ...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "Gateway" in version "networking.istio.io/v1alpha3"
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "AuthorizationPolicy" in version "security.istio.io/v1beta1"
    ...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "MutatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "ValidatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
    ...
    

    To Reproduce

    k3ai plugin deploy -n kf-dashboard -t myk3scluster

    Expected behavior

    Successful deployment of all kubeflow components on k3s.

    in progress docs 
    opened by Adiqq 1
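
The "no matches for kind ... in version" errors in the report above can be cross-checked against what the cluster actually serves. Kubernetes 1.22 removed apiextensions.k8s.io/v1beta1 and admissionregistration.k8s.io/v1beta1, which matches the first and last errors in the log. The sketch below assumes `kubectl api-versions` output (one group/version per line) is piped in; `missing_apis` is a hypothetical helper name.

```shell
# missing_apis GROUP/VERSION...: read `kubectl api-versions` output on stdin
# and print each requested group/version that the cluster does not serve.
missing_apis() {
    served="$(cat)"
    for api in "$@"; do
        # -x: whole-line match, -F: fixed string (dots are not wildcards)
        printf '%s\n' "$served" | grep -qxF -- "$api" || printf '%s\n' "$api"
    done
}

# Example (requires access to the cluster; on k3s use `k3s kubectl`):
#   kubectl api-versions | missing_apis \
#       apiextensions.k8s.io/v1beta1 \
#       admissionregistration.k8s.io/v1beta1
```

Any group/version printed by the helper is one the manifests request but the cluster no longer offers.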
  • [Feature] - Clean up CLI error messages

    🚀 Is your feature request related to a problem? Please describe. A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

    See Issue https://github.com/k3ai/k3ai/issues/53 regarding k3ai run not returning a useful error when a required argument was missing

    💡 Describe the solution you'd like A clear and concise description of what you want to happen.

    a better CLI handler with more descriptive errors, or at least a fix for this bug on this command's handling

    🤩 Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

    None yet

    help-wanted epic todo :spiral_notepad: docs 
    opened by jeinstei 0
  • Implement Plugin remove

    Hey, great work with K3ai. It works pretty smoothly for most operations.

    After experimenting a bit with Kubeflow I wanted to remove a plugin, but it seems that the command is not implemented:

    ➜ k3ai  plugin remove --name kf-pa
    Remove a given plugin based on NAME
    
    Usage:
      k3ai[options] plugin remove [-n NAME] [other flags]
    
    Flags:
      -n, --name string     NAME of plugin to be created/deleted
      -t, --target string   Target from where to remove plugin.
      -q, --quiet           Suppress output messages. Useful when k3ai is used within scripts.
      -c, --config string   Configure K3ai using a custom config file.[-c /path/tofile] [-c https://urlToFile]
    

    See here: https://github.com/k3ai/k3ai/blob/main/cmd/plugin.go#L138

    I am not sure whether I am missing something, but I couldn't find anything related in the issues or the roadmap.


    • Your operating system name and version: Ubuntu 18.04
    • Detailed steps to reproduce the bug: Follow exact steps from documentation or README to deploy a plugin
    help-wanted epic todo :spiral_notepad: 
    opened by daniel-vera-g 2
  • [Feature] - Use Github Actions to create issues for exported reports

    🚀 Is your feature request related to a problem? Please describe. Related:

    • #37

    Using the metrics report exported through the executor, we can use Github Actions workflows to create automated issues containing the reports.

    💡 Describe the solution you'd like Create issues with the exported report as the content using Github Actions.

    epic todo :spiral_notepad: 
    opened by burntcarrot 1
Releases(1.0.1)
  • 1.0.1(Dec 7, 2021)

    Full Changelog: https://github.com/k3ai/k3ai/compare/1.0...1.0.1

    What's Changed

    K3ai Features:

    • ARM support #13 @alefesta
    • Kubeflow one-click pipeline #14 @alefesta
    • Implementing GH actions (GHA) as a method to run K3ai from within the repo #19 Minimal documentation to run K3ai as GH @burntcarrot
    • Implementing a config file to mimic an e2e workflow #18 @burntcarrot
    • Add support for k3d #25 @burntcarrot

    K3ai Plugins:

    • https://github.com/k3ai/plugins/issues/6 @alefesta
    • https://github.com/k3ai/plugins/issues/7 @alefesta

    Bugs

    • [BUG] - certain plugins fail to install by @alefesta in https://github.com/k3ai/k3ai/pull/23
    • [BUG] - Fixes right tools download for Architecture by @alefesta in https://github.com/k3ai/k3ai/pull/43
    • [BUG] - minor fixes on download tools by @alefesta in https://github.com/k3ai/k3ai/pull/44
    • [BUG] - Fixes on Civo CLI for ARM by @alefesta in https://github.com/k3ai/k3ai/pull/45

    New Contributors

    • @burntcarrot made their first contribution in https://github.com/k3ai/k3ai/pull/27
    Source code(tar.gz)
    Source code(zip)
    k3ai(43.51 MB)
    k3ai.arm64(40.75 MB)
    k3ai.darwin.amd64(32.96 MB)
  • 1.0(Nov 1, 2021)

    Full Changelog: https://github.com/k3ai/k3ai/commits/1.0

    What's Changed

    • [Core] - Initial work for v1.0.0 version by @alefesta in https://github.com/k3ai/k3ai/pull/2

    New Contributors

    • @alefesta made their first contribution in https://github.com/k3ai/k3ai/pull/2

    Full Changelog:

    • Introducing K3ai DB to manage clusters and plugins dynamically
    • Introducing new CLI logic : K3ai [COMMAND] [ACTION] [OPTIONS]
    • Introducing the One Click experience to run training over deployed plugins.

    Current Operating Systems supported

    • Linux x64
    • macOS (Not Tested)

    Have fun with K3ai
    Source code(tar.gz)
    Source code(zip)
    k3ai(30.78 MB)