Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.

For more information, see Metaflow's website and documentation.

Getting Started

Getting up and running with Metaflow is easy.


For Python, install Metaflow from PyPI:

pip install metaflow

and access tutorials by typing:

metaflow tutorials pull


For R, install Metaflow from GitHub:

devtools::install_github("Netflix/metaflow", subdir="R")

and access tutorials by typing:


Get in Touch

There are several ways to get in touch with us:


We welcome contributions to Metaflow. Please see our contribution guide for more details.

Code style

We use black as a code formatter. The easiest way to ensure your commits are always formatted with the correct version of black is to use pre-commit: install it, then run pre-commit install once in your local copy of the repo.

  • ERROR: Encountered corrupt package tarball


    My step works flawlessly in local mode, but when I ran it in batch mode it failed with the message:

    /bin/sh: 1: metaflow_CleanFlow_linux-64_54816c55859cfd0f8c3c9b2e51678ce87bc33a38/bin/python: not found

    To understand the cause, I ran a Docker image (python3:6) locally and executed all the commands that run on the Batch side.

    I noticed that when conda was creating the environment, some of the packages inside the pkgs folder failed to install. I dug deeper and found that a few tarballs had been only partially copied (~80%) to the S3 bucket in the first place and were therefore incomplete. I manually downloaded those tarballs and everything started to work fine. What might cause these incomplete tarball uploads to the S3 bucket?

    • Computer: Mac OSX 10.15.1
    • Conda: Anaconda 4.7.12
    • Metaflow: 2.0.1
    • Tarballs that failed: chardet-3.0.4-py36_1003.tar.bz2, six-1.14.0-py36_0.tar.bz2, setuptools-45.1.0-py36_0.tar.bz2, pip-20.0.2-py36_1.tar.bz2

    Example error message: ERROR: Encountered corrupt package tarball at /root/.aws/metaflow/conda/pkgs/setuptools-45.1.0-py36_0.tar.bz2. Conda has left it in place. Please report this to the maintainers of your package. For the defaults channel, please report to
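
One way to spot a truncated package like the ones described above is to read the archive from start to finish. The following is a minimal diagnostic sketch using only Python's standard `tarfile` module; `is_tarball_intact` is a hypothetical helper name and is not part of Metaflow or conda.

```python
import tarfile


def is_tarball_intact(path):
    """Return True if every member of the archive can be read to the end."""
    try:
        # "r:*" lets tarfile detect the compression (bz2, gz, xz, or none).
        with tarfile.open(path, "r:*") as archive:
            for member in archive:
                extracted = archive.extractfile(member)
                if extracted is None:
                    continue  # directories and special entries carry no data
                # Read in 1 MiB chunks so a cut-off compressed stream is noticed.
                while extracted.read(1 << 20):
                    pass
        return True
    except (tarfile.TarError, EOFError, OSError):
        # A truncated or corrupt archive typically surfaces as one of these.
        return False
```

Running this over the local package cache before and after an upload would show whether a file was already damaged locally or was cut short in transit.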

    bug enhancement 
    opened by abaspinar 30
  • Support for another public cloud - Microsoft Azure

    Currently, Metaflow is set up to work with AWS as the default public cloud. The architecture of Metaflow allows for additional public clouds to be supported.

    Adding support for Microsoft Azure might broaden the potential user base, which could increase the adoption rate. This, in turn, could lead to increased community attention.

    opened by leifericf 21
  • Support for AWS Step functions

    Metaflow on AWS currently requires a human in the loop to execute flows and cannot be scheduled automatically. Metaflow could be made to work with AWS Step Functions so that the orchestration of Metaflow steps is handled by AWS.

    opened by romain-intel 20
  • Typo repair and PEP8 cleanup

    I've made a number of changes to address misspellings, grammar issues, and other bits of text that needed clarifying. I was pretty aggressive, so please feel free to reject anything that you don't agree with!

    Some of the changes were made to function names with clear misspellings (e.g. "kubernetes" spelled as "kuberentes").

    opened by jimbudarz 18
  • Support for Kubernetes (with Argo)

    Another implementation of #16

    The idea is to provide Metaflow with a native Kubernetes implementation, using Argo for the workflow part.

    opened by nlaille 17
  • Adding support for Azure Blob Storage as a datastore

    The primary change is implementing AzureStorage (analogous to existing S3Storage, LocalStorage). We are consciously deferring the decision of having first class "data tools" support for Azure.

    There are some necessary changes to ensure full Azure support on all Metaflow surfaces:

    • includefile
    • conda
    • cards
    • mflog
    • kubernetes
    • argo

    We take care to ensure there is no cross disruption to users not using Azure. More specifically:

    • Users need to set up AWS dependencies (boto3, config params) iff they are using AWS.
    • Users need to set up Azure dependencies (Azure SDK libs, config params) iff they are using Azure.

    We aggressively use local imports to achieve this.
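
The local-import pattern mentioned above can be sketched roughly as follows. This is an illustration of the general technique, not the code in this PR: `load_storage_sdk` and the backend names are hypothetical, and the only real names used are the `boto3` and `azure.storage.blob` packages.

```python
def load_storage_sdk(backend):
    """Import and return the SDK for one backend, and only that backend."""
    if backend == "local":
        import shutil  # standard library, always available
        return shutil
    if backend == "s3":
        try:
            import boto3  # imported only when an AWS user asks for S3
        except ImportError as exc:
            raise RuntimeError("S3 storage requires boto3") from exc
        return boto3
    if backend == "azure":
        try:
            from azure.storage import blob  # imported only for Azure users
        except ImportError as exc:
            raise RuntimeError("Azure storage requires azure-storage-blob") from exc
        return blob
    raise ValueError(f"unknown storage backend: {backend!r}")
```

Because each import statement sits inside the branch that needs it, a user who never selects a given backend never triggers that import and does not need the corresponding package installed.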

    Much effort was also spent to ensure good performance of Metaflow's usage of Azure Blob Storage. See context docs for more details.

    Some docs for context:

    ok-to-test mergeable 
    opened by jackie-ob 15
  • Parse environment variables passed from CLI

    Right now, if one attempts to pass environment variables to the @environment decorator using the following syntax:

    python workflow/ --with environment:vars=FOO:$(FOO),BAR:$(BAR)

    vars is incorrectly registered as a string rather than having its contents parsed as a dict.

    This PR enables parsing of environment variables passed to the environment decorator via --with.
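
To illustrate the kind of parsing involved, here is a minimal sketch that turns a `NAME:VALUE,NAME:VALUE` string into a dict. It is not the implementation in this PR; `parse_env_vars` is a hypothetical helper, and it assumes values contain no commas.

```python
def parse_env_vars(spec):
    """Parse 'FOO:1,BAR:two' into {'FOO': '1', 'BAR': 'two'}."""
    env = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        # Split on the first colon only, so values may themselves contain colons.
        name, sep, value = pair.partition(":")
        if not sep or not name:
            raise ValueError(f"expected NAME:VALUE, got {pair!r}")
        env[name] = value
    return env


print(parse_env_vars("FOO:1,BAR:two"))  # {'FOO': '1', 'BAR': 'two'}
```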

    opened by LarsDu 14
  • Documentation / Explanation on how to use GPU


    It is not clear how GPUs are used: whether Metaflow takes care of installing NVIDIA drivers or uses NVIDIA toolkit Docker images, and whether using G/P/Inf instances automatically enables GPU use.

    In the documentation, the only place I can find GPUs referenced is Using AWS Batch:

    Note that in this case the resources decorator is used as a prescription for the size of the box that Batch should run the job on; please be sure that this resource requirement can be met. In addition to cpu and memory you can specify gpu=N to request N GPUs for the instance.

    Does this mean that, regardless of the EC2 instance type, a GPU will be allocated in some way, e.g. via AWS Elastic Inference?

    A Review of Netflix’s Metaflow states:

    You cannot specify the type of GPU requested with the @resources context manager. AWS offers various types of GPUs, and they can be added to your AWS Batch cluster, but selecting the GPU you would like to use for your machine learning model is not possible in Metaflow. In Metaflow you can use the @resources decorator to define the required resources of a step. These resources will then be provided by AWS Batch if they are available in the cluster. The decorator also allows you to request GPUs

    In the Gitter Metaflow/Community channel, GPUs are mentioned several times, but it is not clear what the right steps are to use them.

    Savin @savingoyal Jan 20 16:58 @amanbedi23 Looks like you need to use a custom GPU enabled AMI when setting up your batch compute environment with G2 instances. For P2 instanced, amazon will automatically provision GPU enabled AMIs.

    Savin @savingoyal Jan 21 15:27 Are you using a custom AMI for your instanced. AWS Batch for p instances launches with the appropriate accelerator enabled AMIs out of the box. Yes, we have run GPU workloads on AWS Batch using metaflow.

    russellbrooks @russellbrooks Jan 23 17:29 @amanbedi23 Your best bet is probably to build your own docker image with most/all the dependencies you expect to encounter, upload that to ECR, and then reference the image in the @batch decorator or when invoking the flow from the CLI (--with batch:image=YourCustomImageName:latest). Another benefit of this approach is better performance at runtime and less copy/pasted conda decorators for commonly shared dependencies.


    Please update the documentation to detail the requirements and steps for using GPUs.

    • Do we need to set up a custom AMI or use a specific ML AMI from AWS unless the EC2 instance type is P?
    • Is @resources enough to use a GPU regardless of the EC2 instance type?
    • Does Metaflow take care of the NVIDIA driver installation, or does it use the NVIDIA toolkit Docker image by default? How are GPU drivers handled in the Docker instances managed by Metaflow?
    opened by oonisim 14
  • Pytorch Parallel Decorator and test

    Follow-up to @parallel: adds @pytorch_parallel, which sets up the environment for PyTorch's DDP (distributed data parallel).

    Making this work requires a small change to the parallel decorator so that the initialization code is run at the right time.

    Added a test flow:

    python  test/parallel/ --no-pylint  run

    (--no-pylint is needed for torch)
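
For context, PyTorch's `env://` initialization reads the rendezvous settings from environment variables. The sketch below shows the kind of environment such a decorator would need to prepare. It is not the decorator's code: `ddp_environment` and its parameters are hypothetical, the port 29500 is an assumed default, and only the four variable names come from PyTorch.

```python
def ddp_environment(node_index, num_nodes, main_address, main_port=29500):
    """Build the environment variables read by PyTorch's env:// rendezvous."""
    if not 0 <= node_index < num_nodes:
        raise ValueError("node_index must be in the range [0, num_nodes)")
    return {
        "MASTER_ADDR": main_address,    # host that coordinates the process group
        "MASTER_PORT": str(main_port),  # port on that host
        "RANK": str(node_index),        # this process's global rank
        "WORLD_SIZE": str(num_nodes),   # total number of processes
    }
```

A process would apply the result with `os.environ.update(...)` before calling `torch.distributed.init_process_group`, which is why the timing of the initialization code matters.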

    opened by akyrola 13
  • Task crashed due to CannotInspectContainerError: Could not transition to inspecting; timed out after waiting 30s

    We have been seeing the following issue in our flows.

    Hypotheses I have explored:

    • It doesn't seem to be a permission or setup problem, since dozens of identical branches triggered through foreach process just fine. They have exactly the same S3 roles and permissions, so I'm not sure what the "Unable to locate credentials" message indicates.
    • The task executes every line of code in the step as expected, but then hits this timeout error during container exit, which makes the entire flow fail.
    • I thought it might have something to do with snapshotting large dataframes, but the error seems to appear randomly regardless of the amount of data being processed in the task.

    Any insights?

    timestamp | message
    -- | --
    1.6107E+12 | 2021-01-15 14:47:29.067 [40293/process_a_provider/211754 (pid 17969)]   Task is starting.
    1.6107E+12 | 2021-01-15 14:47:35.168 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status SUBMITTED)...
    1.6107E+12 | 2021-01-15 14:48:01.009 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:48:01.009 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:48:31.192 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:49:01.335 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:49:31.467 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:50:01.628 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:50:33.037 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNABLE)...
    1.6107E+12 | 2021-01-15 14:50:35.746 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status STARTING)...
    1.6107E+12 | 2021-01-15 14:51:05.856 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status STARTING)...
    1.6107E+12 | 2021-01-15 14:51:29.848 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting (status RUNNING)...
    1.6107E+12 | 2021-01-15 14:51:39.622 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Setting up task environment.
    1.6107E+12 | 2021-01-15 14:59:17.185 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] WARNING: Retrying (Retry(total=4,   connect=None, read=None, redirect=None, status=None)) after connection broken   by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED]   certificate verify failed: unable to get local issuer certificate   (_ssl.c:1123)'))': /simple/awscli/
    1.6107E+12 | 2021-01-15 15:03:31.869 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Downloading code package.
    1.6107E+12 | 2021-01-15 15:03:41.894 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Code package downloaded.
    1.6107E+12 | 2021-01-15 15:03:47.287 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Bootstrapping environment.
    1.6107E+12 | 2021-01-15 15:08:27.341 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Environment bootstrapped.
    1.6107E+12 | 2021-01-15 15:09:32.012 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Task is starting.
    1.6107E+12 | 2021-01-15 15:10:31.520 [40293/process_a_provider/211754 (pid 17969)]   [c94edb3a-091c-4e35-a885-d9bc00ba2bad] Provider processed successfully -   0655011
    1.6107E+12 | 2021-01-15 15:11:05.938 [40293/process_a_provider/211754 (pid 17969)]   Batch error:
    1.6107E+12 | 2021-01-15 15:11:05.938 [40293/process_a_provider/211754 (pid 17969)]   Task crashed due to CannotInspectContainerError: Could not transition to   inspecting; timed out after waiting 30s .This could be a transient error. Use   @retry to retry.
    1.6107E+12 | 2021-01-15 15:11:05.938 [40293/process_a_provider/211754 (pid   17969)]
    1.6107E+12 | 2021-01-15 15:11:06.448 [40293/process_a_provider/211754 (pid 17969)]   Task failed.
    1.6107E+12 | 2021-01-15 15:11:07.060 [40293/process_a_provider/211754 (pid 18823)]   Task fallback is starting to handle the failure.
    1.6107E+12 | 2021-01-15 15:11:07.775 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 7 more times..
    1.6107E+12 | 2021-01-15 15:11:09.790 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 6 more times..
    1.6107E+12 | 2021-01-15 15:11:16.810 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 5 more times..
    1.6107E+12 | 2021-01-15 15:11:20.825 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 4 more times..
    1.6107E+12 | 2021-01-15 15:11:32.850 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 3 more times..
    1.6107E+12 | 2021-01-15 15:11:50.880 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 2 more times..
    1.6107E+12 | 2021-01-15 15:12:27.929 [40293/process_a_provider/211754 (pid 18823)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 1 more times..
    1.6107E+12 | 2021-01-15 15:13:31.995 [40293/process_a_provider/211754 (pid 18823)]   Internal error
    1.6107E+12 | 2021-01-15 15:13:31.999 [40293/process_a_provider/211754 (pid 18823)]   Traceback (most recent call last):
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/", line 853, in main
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   start(auto_envvar_prefix='METAFLOW', obj=state)
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/click/",   line 764, in __call__
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   return self.main(args, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/click/",   line 717, in main
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)] rv   = self.invoke(ctx)
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/click/",   line 1137, in invoke
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   return _process_result(sub_ctx.command.invoke(sub_ctx))
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/click/",   line 956, in invoke
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   return ctx.invoke(self.callback, ctx.params)
    1.6107E+12 | 2021-01-15 15:13:32.000 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/click/",   line 555, in invoke
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   return callback(args, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/click/",   line 27, in new_func
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   return f(get_current_context().obj, args, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/", line 430, in step
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   max_user_code_retries)
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/", line 278, in run_step
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   monitor=self.monitor)
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/", line 34, in   __init__
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   super(S3DataStore, self).__init__(args, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.001 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/", line 362,   in __init__
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   self.save_metadata('attempt', {'time': time.time()})
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/", line 49,   in method
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   return f(self, args, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/", line 168, in   save_metadata
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   self._put_s3_object(path, json.dumps(data).encode('utf-8'))
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/util/", line 37,   in retry_wrapper
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   raise last_exc
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/util/", line 21,   in retry_wrapper
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   return f(self, args, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   File "/tmp/tmpolxsu4uf/metaflow/datastore/", line 65, in   _put_s3_object
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   self.s3.upload_fileobj(buf, url.netloc, url.path.lstrip('/'))
    1.6107E+12 | 2021-01-15 15:13:32.002 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/boto3/s3/",   line 539, in upload_fileobj
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   return future.result()
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/s3transfer/",   line 106, in result
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   return self._coordinator.result()
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/s3transfer/",   line 265, in result
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   raise self._exception
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/s3transfer/",   line 126, in __call__
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   return self._execute_main(kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/s3transfer/",   line 150, in _execute_main
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   return_value = self._main(kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/s3transfer/",   line 692, in _main
    1.6107E+12 | 2021-01-15 15:13:32.003 [40293/process_a_provider/211754 (pid 18823)]   client.put_object(Bucket=bucket, Key=key, Body=body, extra_args)
    1.6107E+12 | 2021-01-15 15:13:32.004 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 357, in _api_call
    1.6107E+12 | 2021-01-15 15:13:32.004 [40293/process_a_provider/211754 (pid 18823)]   return self._make_api_call(operation_name, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.004 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 648, in _make_api_call
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   operation_model, request_dict, request_context)
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 667, in _make_request
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   return self._endpoint.make_request(operation_model, request_dict)
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 102, in make_request
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   return self._send_request(request_dict, operation_model)
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 132, in _send_request
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   request = self.create_request(request_dict, operation_model)
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 116, in create_request
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 356, in emit
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   return self._emitter.emit(aliased_event_name, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.111 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 228, in emit
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   return self._emit(event_name, kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 211, in _emit
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   response = handler(kwargs)
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 90, in handler
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   return self.sign(operation_name, request)
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 157, in sign
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   auth.add_auth(request)
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 425, in add_auth
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   super(S3SigV4Auth, self).add_auth(request)
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   File   "/home/ec2-user/setup-workspace/miniconda/envs/metaflow_AnalysisFlow_linux-64_4b7b525aec038862f563ddf6c526fef7d88b1900/lib/python3.6/site-packages/botocore/",   line 357, in add_auth
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   raise NoCredentialsError
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid 18823)]   botocore.exceptions.NoCredentialsError: Unable to locate credentials
    1.6107E+12 | 2021-01-15 15:13:32.112 [40293/process_a_provider/211754 (pid   18823)]
    1.6107E+12 | 2021-01-15 15:13:32.477 [40293/process_a_provider/211754 (pid 18823)]   Task failed.
    1.6107E+12 | 2021-01-15 15:13:33.071 [40293/process_a_provider/211754 (pid 18937)]   Task fallback is starting to handle the failure.
    1.6107E+12 | 2021-01-15 15:13:33.763 [40293/process_a_provider/211754 (pid 18937)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 7 more times..
    1.6107E+12 | 2021-01-15 15:13:39.179 [40293/process_a_provider/211754 (pid 18937)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 6 more times..
    1.6107E+12 | 2021-01-15 15:13:42.798 [40293/process_a_provider/211754 (pid 18937)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 5 more times..
    1.6107E+12 | 2021-01-15 15:13:49.822 [40293/process_a_provider/211754 (pid 18937)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 4 more times..
    1.6107E+12 | 2021-01-15 15:14:00.841 [40293/process_a_provider/211754 (pid 18937)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 3 more times..
    1.6107E+12 | 2021-01-15 15:14:18.866 [40293/process_a_provider/211754 (pid 18937)] S3   datastore operation _put_s3_object failed (Unable to locate credentials).   Retrying 2 more times..
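
The log above shows the S3 operation being retried a fixed number of times with roughly growing pauses before the last exception is re-raised. A generic retry wrapper with exponential backoff can be sketched as below. This is an illustration only: `retry_with_backoff` is a hypothetical name, and the actual behaviour of Metaflow's @retry decorator and of its datastore retry wrapper may differ.

```python
import functools
import time


def retry_with_backoff(times=3, base_delay=1.0, sleep=time.sleep):
    """Retry a function up to `times` extra attempts, doubling the delay each time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(times + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
                    if attempt < times:
                        # Wait 1x, 2x, 4x, ... the base delay between attempts.
                        sleep(base_delay * 2 ** attempt)
            # Every attempt failed: surface the last error to the caller.
            raise last_exc
        return wrapper
    return decorator
```

Note that retrying only helps with transient failures; a persistent condition such as missing credentials will exhaust all attempts, as it does in the log.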
    opened by Viveckh 13
  • Conditional branch documentation + example usage

    From the flowspec docstring:

    - Conditional branch: self.next(self.if_true, self.if_false, condition='boolean_variable')
      In this situation, both `if_true` and `if_false` are methods in the current class
      decorated with the `@step` decorator and `boolean_variable` is a variable name
      in the current class that evaluates to True or False. The `if_true` step will be
      executed if the condition variable evaluates to True and the `if_false` step will
      be executed otherwise

    It'd be great to have this mentioned on Metaflow's website with an example.

    On a related note, this capability makes it easier to introduce cycles into the DAG. The documentation mentions:

    Metaflow infers a directed (typically acyclic) graph based on the transitions between step functions.

    Acyclicity seems to be checked by the linter; however, that piece of validation is not disabled by the --no-pylint flag. The documentation alludes to the possibility of a graph with cycles, but that doesn't seem possible for now.


    from metaflow import FlowSpec, step


    class TestFlowConditional(FlowSpec):
        """
        A toy flow to mimic a hyperparameter tuning strategy.
        The flow performs the following steps:
        1) Load data.
        2) Generate hyperparameter candidates.
        3) Fan-out training over hyperparameter candidates to evaluate using foreach.
        4) Join results.
        5) Conditionally stop at max iterations or keep evaluating.
        """

        @step
        def start(self):
            # ... load data ...
            self._iteration = 0
            self._max_iteration = 15
            self._num_candidates = 3
            self.results = []
            self.next(self.generate_candidates)

        @step
        def generate_candidates(self):
            candidates = []
            for _ in range(self._num_candidates):
                candidate = {
                    "hyperparameters": {
                        # ...
                    }
                }
                candidates.append(candidate)
            self.candidates = candidates
            self._iteration += len(candidates)
            self.next(self.train, foreach='candidates')

        @step
        def train(self):
            hyperparams = self.input['hyperparameters']
            # ...
            self.next(self.join)

        @step
        def join(self, inputs):
            """Combine results for hyperparameter candidates."""
            # ...
            self.next(self.should_stop)

        @step
        def should_stop(self):
            """Conditional branch to end when max iterations is reached, otherwise evaluate more candidates."""
            self._done = self._iteration < self._max_iteration
            self.next(self.end, self.generate_candidates, condition='_done')

        @step
        def end(self):
            pass


    if __name__ == '__main__':
        TestFlowConditional()

    python --no-pylint run

    Results in:

    Metaflow 2.0.1 executing TestFlowConditional for user:russell
    Validating your flow...
        Validity checker found an issue on line 62:
        There is a loop in your flow: generate_candidates->train->join->should_stop->generate_candidates. Break the loop by fixing transitions.
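The kind of check that produces this message can be illustrated with a depth-first search over the step transitions. This is only a sketch of one way to find such a loop, not Metaflow's actual validator; the `transitions` mapping and the `find_cycle` helper are hypothetical names introduced for illustration.

```python
def find_cycle(transitions, start="start"):
    """Return the first loop reachable from `start` as a list of step names, or None."""
    visiting = []   # steps on the current DFS path, in visit order
    done = set()    # steps whose outgoing transitions were fully explored

    def visit(step):
        if step in visiting:
            # Back edge: the loop is the path from the first occurrence to here.
            return visiting[visiting.index(step):] + [step]
        if step in done:
            return None
        visiting.append(step)
        for nxt in transitions.get(step, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(step)
        return None

    return visit(start)


# Transitions of the toy flow above, written out by hand.
transitions = {
    "start": ["generate_candidates"],
    "generate_candidates": ["train"],
    "train": ["join"],
    "join": ["should_stop"],
    "should_stop": ["end", "generate_candidates"],
    "end": [],
}
print("->".join(find_cycle(transitions)))
```

For the toy flow this reports the same path as the validity checker, because `should_stop` transitions back to `generate_candidates`.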

    Keep up the great work and I've been enjoying Metaflow so far!

    opened by russellbrooks 13
  • Add support for custom tags at run time

    Add support for custom tags at run time

    It would be nice if one could add custom tags via a flow/step decorator or environment variables. Currently the tagging system seems optimized for post-run tagging, but one often wants to tag a specific run at launch in order to find it later. I would propose either a decorator on the flow or environment variables.

    # within flow decorator or on a step
    @project(name="my_project", tags=("tag1", "tag2:tag2value", "tag3"))
    class MyFlow(FlowSpec):
      @tags(("start_tag1", "start_tag2:start_tag2_value"))
      def start(self):
        ...

    # with an env var
    export METAFLOW_USER_TAGS="tag1|tag2:tag2value|tag3"
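To make the environment-variable form concrete, here is a minimal sketch of how such a value could be split into tags. `METAFLOW_USER_TAGS` is the name proposed in this issue, not an existing Metaflow setting, and `parse_user_tags` is a hypothetical helper.

```python
import os


def parse_user_tags(raw):
    """Split a pipe-separated tag string into an ordered list without duplicates."""
    tags = []
    for part in raw.split("|"):
        tag = part.strip()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


# Fall back to the example value from the proposal when the variable is unset.
raw = os.environ.get("METAFLOW_USER_TAGS", "tag1|tag2:tag2value|tag3")
print(parse_user_tags(raw))
```

Tags of the form `key:value` are kept as a single string here, since the proposal treats them as ordinary tags.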
    opened by dhpollack 0
  • Introduce support for micromamba for @conda

    Introduce support for micromamba for @conda

    Includes -

    1. Integrating with micromamba server for env resolution and set up
    2. Pipelined caching of conda packages
    3. Support for virtual packages
    4. Utilizing micromamba for cached env setup on remote environments
    opened by savingoyal 0
  • Invalidate .metaflow folder if suitable changes to metaflow config are detected

    Invalidate .metaflow folder if suitable changes to metaflow config are detected

    If certain changes are made to the Metaflow config (or AWS config), then the state in the .metaflow folder may no longer reflect the new reality (latest run ids, conda lock files, local metadata etc). In those scenarios, we can invalidate the folder rather than asking the user to nuke it manually.
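One way to detect such changes is to store a fingerprint of the config next to the local state and compare it on later runs. The sketch below assumes the config is available as a JSON-serializable dict; `config_fingerprint`, `is_stale`, and the `config.sha256` marker file are hypothetical and not part of Metaflow.

```python
import hashlib
import json
import os
import tempfile


def config_fingerprint(config):
    """Hash the config deterministically so any changed value changes the digest."""
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def is_stale(state_dir, config, marker="config.sha256"):
    """Return True if `state_dir` was created under a different config.

    The first call records the fingerprint and reports the state as fresh.
    """
    path = os.path.join(state_dir, marker)
    current = config_fingerprint(config)
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip() != current
    os.makedirs(state_dir, exist_ok=True)
    with open(path, "w") as f:
        f.write(current)
    return False


# Demo in a throwaway directory standing in for the .metaflow folder.
with tempfile.TemporaryDirectory() as state_dir:
    print(is_stale(state_dir, {"datastore": "local"}))  # first run records the fingerprint
    print(is_stale(state_dir, {"datastore": "s3"}))     # config changed since it was recorded
```

A real implementation would also need to decide which config keys count as "suitable changes"; this sketch treats every key as significant.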

    opened by savingoyal 0
  • Record GIT information in metadata

    Record GIT information in metadata

    It may be interesting to record the commit and possibly the current diff file (similar to what Comet does) when running a flow to allow the user to recreate the development environment (and not just the code package).
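As a rough illustration of what could be captured, the sketch below shells out to `git` for the current commit and the uncommitted diff. The function names and the returned dictionary layout are assumptions made for this example; how the result would be attached to Metaflow metadata is left open.

```python
import subprocess


def git_output(args, cwd="."):
    """Run a git command and return its stdout, or None if git fails or is missing."""
    try:
        result = subprocess.run(
            ["git"] + args, cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout


def collect_git_info(cwd=".", run_git=git_output):
    """Return the commit hash and working-tree diff, or None outside a git repository."""
    commit = run_git(["rev-parse", "HEAD"], cwd)
    if commit is None:
        return None
    diff = run_git(["diff", "HEAD"], cwd) or ""
    return {"commit": commit.strip(), "dirty": bool(diff.strip()), "diff": diff}


print(collect_git_info())
```

The `run_git` parameter exists only so the logic can be exercised without a repository.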

    opened by romain-intel 0
  • Support AWS inferentia instances (e.g. `inf1.xlarge`, `trn1.2xlarge`)

    Support AWS inferentia instances (e.g. `inf1.xlarge`, `trn1.2xlarge`)

    AWS Inferentia (and Trainium) instances use custom ASICs designed specifically for running ML models.

    We'd like to run Inferentia-based workflows on AWS Batch using Metaflow. This would support workflows with Metaflow such as:

    • Using neuron for training
    • Compiling trained models with neuron
    • Benchmarking neuron models
    • Running neuron inference

    The necessary changes are (using the API definitions here and docs for Inferentia on AWS Batch here):

    • Support adding AWS::Batch::JobDefinition.ContainerProperties.LinuxParameters.Devices in the batch job definition to mount the neuron devices.

      For example:

      	"devices": [
      	    {
      	        "containerPath": "/dev/neuron0",
      	        "hostPath": "/dev/neuron0",
      	        "permissions": [...]
      	    }
      	]
    • Support setting AWS::Batch::JobDefinition.Privileged to true. This is in lieu of support for setting capabilities on AWS Batch Job Definitions (which would be more specific and preferable).

      For example:

      	"privileged": true
    opened by Limess 0
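The device mapping requested in the Inferentia issue above could be assembled along these lines. This is a sketch that only builds a dictionary shaped like the `containerProperties` section of an AWS Batch job definition; it does not call AWS. The helper name, the image name, and the `READ`/`WRITE` permissions are assumptions, since the original example elides the permission values.

```python
def neuron_container_properties(image, num_devices=1, privileged=False):
    """Build container properties that expose /dev/neuron0..N-1 to the container."""
    devices = [
        {
            "hostPath": f"/dev/neuron{i}",
            "containerPath": f"/dev/neuron{i}",
            # Assumed permissions; the issue's example leaves this list elided.
            "permissions": ["READ", "WRITE"],
        }
        for i in range(num_devices)
    ]
    return {
        "image": image,
        "privileged": privileged,
        "linuxParameters": {"devices": devices},
    }


# "my-neuron-image" is a placeholder image name.
props = neuron_container_properties("my-neuron-image", num_devices=2)
print([d["hostPath"] for d in props["linuxParameters"]["devices"]])
```

Mounting explicit devices keeps `privileged` off by default, which matches the issue's preference for the more specific option.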
  • 2.7.18(Dec 8, 2022)

    What's Changed

    • Adds check for tutorials dir and flattens if necessary by @ashrielbrian
    • Fix bug with datastore backend instantiation by @savingoyal

    New Contributors

    • @ashrielbrian made their first contribution

  • 2.7.17(Dec 7, 2022)

    What's Changed

    • Fix regression causing CL tool to not work. by @romain-intel
    • Bump qs from 6.5.2 to 6.5.3 in /metaflow/plugins/cards/ui by @dependabot

  • 2.7.16(Dec 6, 2022)

    What's Changed

    • Deal with transient errors (like SlowDowns) more effectively for S3 by @romain-intel
    • Fix/move data files by @romain-intel

  • 2.7.15(Dec 2, 2022)

    What's Changed

    • Handle aborted Kubernetes workloads. by @shrinandj
    • Bump loader-utils from 3.2.0 to 3.2.1 in /metaflow/plugins/cards/ui by @dependabot
    • Fix ._orig access for submodules for MF extensions by @romain-intel
    • Update black to latest version by @savingoyal
    • allow equal sign in decorator spec values by @amerberg
    • Typo repair and PEP8 cleanup by @jimbudarz
    • Pin GH tests to Ubuntu 20.04 by @savingoyal
    • Set gpu resources correctly "--with kubernetes" by @shrinandj
    • Clean up configuration variables by @romain-intel
    • GCP datastore implementation by @jackie-ob
    • Bump version; remove R tests by @romain-intel

    New Contributors

    • @shrinandj made their first contribution
    • @amerberg made their first contribution
    • @jimbudarz made their first contribution

  • 2.7.14(Nov 3, 2022)

    What's Changed

    • fix pandas call bug by @mbalajew
    • Metaflow pathspec in Airflow UI by @valayDave
    • Allow the input paths to be passed via a file by @romain-intel
    • Check compatibility for R 4.2 by @savingoyal
    • issue 1040 fix: apply _sanitize to template names in Argo workflows by @johnaparker

    New Contributors

    • @mbalajew made their first contribution
    • @johnaparker made their first contribution

  • 2.7.13(Oct 18, 2022)

    What's Changed

    • Add cmd extension point to allow MF extensions to extend it by @romain-intel
    • Fix periodic messages printed at runtime by @romain-intel and @jackie-ob
    • Pass datastore_type to validate_environment by @romain-intel
    • Support kubernetes_conn_id in Airflow integration by @valayDave
    • Use json to dump/load decorator specs by @romain-intel
    • argo use kubernetes client class by @oavdeev
    • Rewrite IncludeFile implementation by @romain-intel
    • Add options to make card generation faster in some cases by @romain-intel
    • Env escape improvements and bug fixes by @romain-intel
    • Allow figures in Image.from_matplotlib by @valayDave
    • Bump for release by @romain-intel

  • 2.7.12(Sep 26, 2022)

    The Metaflow 2.7.12 release is a minor release

    What's Changed

    • Make a well-formed module by @savingoyal

  • 2.7.11(Sep 16, 2022)

    The Metaflow 2.7.11 release is a minor release


    • Fix DeprecationWarning on invalid escape sequence by @tommybrecher
    • fix cpu value formatting for aws batch/sfn by @oavdeev

    New Contributors

    • @tommybrecher made their first contribution

  • 2.7.10(Sep 8, 2022)

    What's Changed

    • Card bug fix when task-ids are non-unique by @valayDave
    • Bump version to 2.7.10 to prepare for release by @romain-intel

  • 2.7.9(Sep 5, 2022)

    What's Changed

    • Fix issue with S3 URLs for packages by @savingoyal

  • 2.7.8(Sep 3, 2022)

    What's Changed

    • Support airflow with metaflow on azure by @valayDave
    • Fix issue with S3 invocation for conda bootstrap by @savingoyal

  • 2.7.7(Aug 25, 2022)

    Metaflow 2.7.7 Release Notes

    The Metaflow 2.7.7 release is a minor release


    • Fix an issue with get_cards not respecting a Task's ds-root
    • more robust resource type conversions for aws batch/sfn

  • 2.7.6(Aug 9, 2022)

    What's Changed

    • Fix another issue with the escape hatch and paths by @romain-intel
    • Bump to 2.7.6 by @romain-intel

  • 2.7.5(Aug 5, 2022)

    What's Changed

    • Fix for env_escape bug when importing local packages by @hunsdiecker
    • Bump to 2.7.5 by @romain-intel

    New Contributors

    • @hunsdiecker made their first contribution

  • 2.7.4(Aug 3, 2022)

    What's Changed

    • Fix docstrings for the upcoming API reference (no functional changes!) by @tuulos
    • Move a sys.path modification in s3op to main by @romain-intel
    • Airflow Support by @valayDave
    • Move sys.path insert earlier in by @romain-intel
    • bump version to 2.7.4 for release by @savingoyal

  • 2.7.3(Jul 29, 2022)

    Metaflow 2.7.3 Release Notes

    The Metaflow 2.7.3 release is a minor release


    • Fix fractional resource handling for batch

    • Metadata version check flag by @mrfalconer

  • 2.7.2(Jul 14, 2022)

    Metaflow 2.7.2 Release Notes

    Metaflow 2.7.2 is a minor release


    • Support M1 Macs for @conda

  • 2.7.1(Jun 17, 2022)

    This is a patch release addressing a behavior of the environment escape mechanism.

    Bug Fixes

    • Previously, if the environment escape mechanism provided a package, a failure would occur if that package was also present in the inner environment. This is now changed and, in that case, the package present is used and the environment escape mechanism is not used.

  • 2.7.0(Jun 16, 2022)

    This is a minor release which primarily adds the ability to do runtime tagging.


    • Adds the ability to mutate a run's tags during or after a run. A CLI tool is provided (tag) as well as methods in the client add_tags, replace_tags and remove_tags. If using the Metaflow Metadata service, a version greater than 2.3.0 is required to use this feature.
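A short sketch of how the client methods named above might be combined. It assumes `run` is a `metaflow.Run` object (for example `Run("MyFlow/1234")`, a made-up pathspec), that `replace_tags` takes the tags to remove followed by the tags to add, and that `add_tags` takes a list of tags; check the client documentation before relying on these shapes. The `promote` helper and the tag names are hypothetical.

```python
def promote(run, old_stage="candidate", new_stage="production"):
    """Swap a stage tag on a run and mark it as reviewed.

    `run` is expected to behave like a metaflow.Run object.
    """
    # Remove the old stage tag and add the new one in a single call.
    run.replace_tags([old_stage], [new_stage])
    run.add_tags(["reviewed"])
    return run
```

With Metaflow installed and a metadata service that supports tag mutation, this would be called as `promote(Run("MyFlow/1234"))`.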

  • 2.6.3(May 26, 2022)

    Metaflow 2.6.3 Release Notes

    The Metaflow 2.6.3 release is a minor release

    Bug Fixes

    • Fix instance metadata calls for IMDSV2

  • 2.6.2(May 26, 2022)

    Metaflow 2.6.2 Release Notes

    The Metaflow 2.6.2 release is a minor release


    • Support setting default secrets for @kubernetes (#1048). Metaflow allows you to mount secrets in Kubernetes containers created by tasks. Now you can specify a set of secrets to be mounted by default via the METAFLOW_KUBERNETES_SECRETS configuration option, in addition to the existing @kubernetes(secrets="...") API.


    • When using --run-id-file, the file is now written prior to execution when resuming a flow (#1051). That matches how run command behaves already.

  • 2.6.1(May 13, 2022)

    Metaflow 2.6.1 Release Notes

    The Metaflow 2.6.1 release is a minor release.


    • Proper support for custom S3 endpoints. This enables using S3-compatible object storages like MinIO or Dell EMC-ECS as data stores for Metaflow

    Bug fixes

    • Fixed card rendering for tables with some NaN values
    • current.pathspec to return None when used outside Flow
    • Fixed bug in the card list command
    • Fixed issues with S3 get and ranges
    • Fix _new_task calling bug in LocalMetadataProvider

  • 2.6.0(Apr 25, 2022)

    Metaflow 2.6.0 Release Notes

    The Metaflow 2.6.0 release is a minor release and introduces Metaflow's integration with Kubernetes and Argo Workflows

    • Features
      • Add capability to launch Metaflow tasks on Kubernetes and schedule Metaflow flows with Argo Workflows.
      • Expose tags in current object.


    Add capability to launch Metaflow tasks on Kubernetes and schedule Metaflow flows with Argo Workflows.

    This release enables brand new capabilities for Metaflow on top of Kubernetes. You can now run --with kubernetes all or parts of any Metaflow flow on top of any Kubernetes cluster from your workstation. To execute your flow asynchronously, you can deploy the flow to Argo Workflows (a Kubernetes-native workflow scheduler) with a single command - argo-workflows create.

    To get started, take a look at the deployment guide for Kubernetes. Your feedback and feature requests are highly appreciated! - please reach out to us at

    PR #992 addressed issue #50.

    Expose tags in current object.

    Metaflow tags are now available as part of the current singleton object.

    @step
    def my_step(self):
        from metaflow import current
        tags = current.tags

    PR #1019 fixed issue #1007.

  • 2.5.4(Mar 25, 2022)

    Metaflow 2.5.4 Release Notes

    The Metaflow 2.5.4 release is a minor release.

    Bug Fixes

    • Card bug fixes
    • importlib_metadata fixes for Python 3.5
    • Configurable temp root when pulling artifacts from s3 by @kgullikson88

  • 2.5.3(Mar 8, 2022)

    Metaflow 2.5.3 Release Notes

    The Metaflow 2.5.3 release is a minor release.


    • Fix "Too many symbolic links" error when using Conda + Batch on MacOS by @bishax
    • Emit app tag for AWS Batch jobs

  • 2.5.2(Feb 17, 2022)

    Metaflow 2.5.2 Release Notes

    The Metaflow 2.5.2 release is a minor release.


    • follow symlinks when creating code packages

  • 2.5.1(Feb 15, 2022)

    Metaflow 2.5.1 Release Notes

    The Metaflow 2.5.1 release is a minor release.

    New Features

    • Introduce Mamba as a dependency solver for @conda. Mamba promises faster package dependency resolution times, which should result in an appreciable speedup in flow environment initialization. It is not yet enabled by default; to use it you need to set METAFLOW_CONDA_DEPENDENCY_RESOLVER to mamba in Metaflow config.


    • Vendor in click to reduce chances of dependency conflicts with user code

  • 2.5.0(Jan 25, 2022)

    Metaflow 2.5.0 Release Notes

    The Metaflow 2.5.0 release is a minor release.

    New Features

    Metaflow cards are now publicly available! For details, see a new section in the documentation, Visualizing Results, and a release blog post.

    Bug Fixes

    • Fix issue in Step Functions integration with CLI defined decorators
    • Fix compute_resources to take into account string values

  • 2.4.9(Jan 19, 2022)

    Metaflow 2.4.9 Release Notes

    The Metaflow 2.4.9 release is a patch release.


    • Store information about the DAG being executed in an artifact. This will allow rendering the execution DAG in a @card

    Bug Fixes

    • Fixed cli command when task_id provided (by @zhugejun)
    • Fix metadata syncing on AWS Batch when running without remote metadata service
    • Fix default resource math. Previously we sometimes computed vCPU and memory settings incorrectly, in cases when they were set to something less than the default value

  • 2.4.8(Jan 11, 2022)

    Metaflow 2.4.8 Release Notes

    The Metaflow 2.4.8 release is a patch release.

    Bug fixes

    • aws_retry's S3_RETRY_COUNT now has to be >=1
    • fix argument type handling for host_volumes when used with --with and Step Functions
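One plausible reason for requiring a retry count of at least 1 is that a loop over zero attempts would never call the wrapped operation at all. The sketch below shows a generic retry decorator with that validation; it is not Metaflow's `aws_retry` implementation, and `retry`, `count`, and `base_delay` are names chosen for this example.

```python
import functools
import time


def retry(count, base_delay=0.0):
    """Call the wrapped function up to `count` times, doubling the delay between attempts."""
    if count < 1:
        raise ValueError("retry count must be >= 1, otherwise the function would never run")

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(count):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == count - 1:
                        raise  # out of attempts: surface the last error
                    time.sleep(base_delay * 2 ** attempt)

        return wrapper

    return decorator
```

Validating the count when the decorator is created means a misconfiguration fails immediately rather than silently skipping the operation.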


    • Improved validation logic to capture reserved keywords (fixes #589)
    • Remove default use of ( )

Netflix, Inc.
Netflix Open Source Platform