What is pymc-learn?

pymc-learn is a library for practical probabilistic machine learning in Python.

It provides a variety of state-of-the-art probabilistic models for supervised and unsupervised machine learning. It is inspired by scikit-learn, mimics its syntax, and focuses on bringing probabilistic machine learning to non-specialists. Emphasis is put on ease of use, productivity, flexibility, performance, documentation, and an API consistent with scikit-learn. It depends on scikit-learn and PyMC3 and is distributed under the new BSD-3 license, encouraging its use in both academia and industry.

Users can now obtain calibrated uncertainty estimates for their models using powerful inference algorithms, such as MCMC or variational inference, provided by PyMC3. See :doc:`why` for a more detailed description of why pymc-learn was created.


pymc-learn leverages and extends the Base template provided by the PyMC3 Models project:

Transitioning from PyMC3 to PyMC4

.@pymc_learn has been following closely the development of #PyMC4 with the aim of switching its backend from #PyMC3 to PyMC4 as the latter grows to maturity. Core devs are invited. Here's the tentative roadmap for PyMC4: cc @pymc_devs

— pymc-learn (@pymc_learn) November 5, 2018

Familiar user interface

pymc-learn mimics scikit-learn. You don't have to completely rewrite your scikit-learn ML code.

from sklearn.linear_model \                         from pmlearn.linear_model \
  import LinearRegression                             import LinearRegression
lr = LinearRegression()                             lr = LinearRegression()
lr.fit(X, y)                                        lr.fit(X, y)

The difference between the two models is that pymc-learn estimates model parameters using Bayesian inference algorithms such as MCMC or variational inference. This produces calibrated quantities of uncertainty for model parameters and predictions.
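
To make the contrast concrete, here is a minimal standard-library sketch of what "parameters with uncertainty" means. It is not pymc-learn's implementation: it assumes a one-parameter model ``y = w * x + noise`` with a known noise standard deviation and a Normal prior on ``w``, for which the posterior has a closed form. The function names, data, and default values are illustrative assumptions.

```python
import math

def posterior_slope(xs, ys, noise_sd=1.0, prior_sd=10.0):
    """Posterior mean and sd of w in y = w * x + noise, with prior w ~ Normal(0, prior_sd^2)."""
    precision = 1.0 / prior_sd ** 2 + sum(x * x for x in xs) / noise_sd ** 2
    mean = (sum(x * y for x, y in zip(xs, ys)) / noise_sd ** 2) / precision
    return mean, math.sqrt(1.0 / precision)

def predict(x_new, w_mean, w_sd, noise_sd=1.0):
    """Predictive mean and sd at x_new (parameter uncertainty plus observation noise)."""
    return w_mean * x_new, math.sqrt((x_new * w_sd) ** 2 + noise_sd ** 2)

# Toy data (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Ordinary least squares returns a single number for the slope.
ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# The Bayesian estimate returns a distribution: a mean and a standard deviation.
w_mean, w_sd = posterior_slope(xs, ys)
y_mean, y_sd = predict(5.0, w_mean, w_sd)

print(f"OLS slope:        {ols_slope:.3f}")
print(f"Posterior slope:  {w_mean:.3f} +/- {w_sd:.3f}")
print(f"Prediction at 5:  {y_mean:.3f} +/- {y_sd:.3f}")
```

pymc-learn's models are richer than this and are fitted with MCMC or variational inference rather than a closed-form formula, but the output has the same shape: an estimate together with a measure of its uncertainty.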

Quick Install

pymc-learn requires a working Python interpreter (2.7 or 3.5+). We recommend installing Python and key numerical libraries using the Anaconda Distribution, which has one-click installers available on all major platforms.

Assuming a standard Python environment (including pip) is installed on your machine, pymc-learn can be installed from PyPI in one line using pip:

pip install pymc-learn

Or from source as follows:

pip install git+


pymc-learn is under heavy development.

We recommend installing pymc-learn in a Conda environment because it provides Math Kernel Library (MKL) routines to accelerate math functions. If you are having trouble, try using a distribution of Python that includes these packages, such as Anaconda.


pymc-learn is tested on Python 2.7, 3.5 & 3.6 and depends on Theano, PyMC3, Scikit-learn, NumPy, SciPy, and Matplotlib (see requirements.txt for version information).
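
If an installation misbehaves, a quick check of which dependencies are visible to the interpreter can help narrow things down. The sketch below uses only the standard library; the list of import names is an assumption derived from the packages named above (scikit-learn is imported as ``sklearn``), and the helper name is hypothetical.

```python
import importlib.util

# Import names for the dependencies listed above (scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["theano", "pymc3", "sklearn", "numpy", "scipy", "matplotlib"]

def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All listed dependencies were found.")
```

This only confirms that the modules can be located; it does not check versions, so requirements.txt remains the reference for version information.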

Quick Start

# For regression using Bayesian Nonparametrics
>>> from sklearn.datasets import make_friedman2
>>> from pmlearn.gaussian_process import GaussianProcessRegressor
>>> from pmlearn.gaussian_process.kernels import DotProduct, WhiteKernel
>>> X, y = make_friedman2(n_samples=500, noise=0, random_state=0)
>>> kernel = DotProduct() + WhiteKernel()
>>> gpr = GaussianProcessRegressor(kernel=kernel).fit(X, y)
>>> gpr.score(X, y)
>>> gpr.predict(X[:2,:], return_std=True)
(array([653.0..., 592.1...]), array([316.6..., 316.6...]))

Scales to Big Data & Complex Models

Recent research has led to the development of variational inference algorithms that are fast and almost as flexible as MCMC. For instance, Automatic Differentiation Variational Inference (ADVI) is illustrated in the code below.

from pmlearn.neural_network import MLPClassifier
model = MLPClassifier()
model.fit(X_train, y_train, inference_type="advi")

Instead of drawing samples from the posterior, these algorithms fit a distribution (e.g. normal) to the posterior, turning a sampling problem into an optimization problem. ADVI is provided by PyMC3.
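
The idea of "fitting a distribution by optimization" can be sketched in a few lines of plain Python. This is a toy illustration, not PyMC3's ADVI: it assumes a one-dimensional target posterior that is itself Normal and known only through the gradient of its log density, and it fits ``q(w) = Normal(m, s^2)`` by stochastic gradient ascent on the evidence lower bound using the reparameterization trick. All names, the target, and the step settings are assumptions chosen for the example.

```python
import math
import random

def fit_gaussian_vi(grad_log_p, n_steps=2000, n_samples=64, lr=0.05, seed=0):
    """Fit q(w) = Normal(m, s^2) to a target density by stochastic gradient
    ascent on the ELBO, using reparameterized samples w = m + s * eps."""
    rng = random.Random(seed)
    m, log_s = 0.0, 0.0  # optimize log(s) so that s = exp(log_s) stays positive
    for _step in range(n_steps):
        s = math.exp(log_s)
        grad_m, grad_log_s = 0.0, 0.0
        for _draw in range(n_samples):
            eps = rng.gauss(0.0, 1.0)
            g = grad_log_p(m + s * eps)  # gradient of log p at the sampled w
            grad_m += g
            grad_log_s += g * eps * s
        grad_m /= n_samples
        grad_log_s = grad_log_s / n_samples + 1.0  # +1 is the entropy term of q
        m += lr * grad_m
        log_s += lr * grad_log_s
    return m, math.exp(log_s)

# Toy "posterior": Normal(mean=2.0, sd=0.5), accessed only through its gradient.
TARGET_MEAN, TARGET_SD = 2.0, 0.5

def grad_log_posterior(w):
    return -(w - TARGET_MEAN) / TARGET_SD ** 2

m, s = fit_gaussian_vi(grad_log_posterior)
print(f"fitted mean = {m:.2f}, fitted sd = {s:.2f}")
```

No posterior samples are stored or diagnosed for convergence here; the loop only adjusts two numbers, which is why this family of methods tends to scale to larger data sets and models.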

Citing pymc-learn

To cite pymc-learn in publications, please use the following:

Emaasit, Daniel (2018). Pymc-learn: Practical probabilistic machine
learning in Python. arXiv preprint arXiv:1811.00542.

Or using BibTeX as follows:

  title={Pymc-learn: Practical probabilistic machine learning in {P}ython},
  author={Emaasit, Daniel and others},
  journal={arXiv preprint arXiv:1811.00542},

If you want to cite pymc-learn for its API, you may also want to consider this reference:

Carlson, Nicole (2018). Custom PyMC3 models built on top of the scikit-learn API.

Or using BibTeX as follows:

  title={pymc3_models: Custom PyMC3 models built on top of the scikit-learn API},
  author={Carlson, Nicole},


New BSD-3 license


Getting Started

.. toctree::
   :maxdepth: 1
   :caption: Getting Started


User Guide

The main documentation. This contains an in-depth description of all models and how to apply them.

.. toctree::
   :maxdepth: 1
   :caption: User Guide



pymc-learn provides probabilistic models for machine learning, in a familiar scikit-learn syntax.

.. toctree::
   :maxdepth: 1
   :caption: Examples


API Reference

pymc-learn leverages and extends the Base template provided by the PyMC3 Models project:

.. toctree::
   :maxdepth: 1
   :caption: API Reference


Help & reference

.. toctree::
   :maxdepth: 1
   :caption: Help & reference


  Dependent packages are pinned too specifically

    An unfortunate consequence of the manner in which the package dependencies are specified in the requirements.txt file is that it forces obsolete versions of important packages like numpy, pandas and scikit-learn on anyone who installs via pip install pymc-learn. I think it would be best to change the requirements.txt file to use >= instead of == wherever possible, unless it has been demonstrated that newer versions of specific dependencies genuinely exhibit incompatibilities. Otherwise, it might be better to warn prospective users that they ought to create a brand new virtual environment to use this package.

    opened by alapite 3
  PyMCon


    Hi everyone!

    As you may have already seen on Twitter or on PyMC Discourse, we are planning a virtual conference for the PyMC community. All the information is available in the Discourse post.

    We are currently looking for conference chairs and volunteers and would be very grateful if you could share the word! We also want to encourage any of you to, if you are interested and available, apply to be a conference chair.

    opened by OriolAbril 1
  No requirements.txt while in the PyPi release

    When I download the pymc-learn whl from PyPi, it looks like it is missing the requirements.txt file. You may want to make sure the build and packaging are correct.

    opened by Emaasit 1
  pymc-learn fails with a theano error

    First time trying to install and use pymc-learn on a Windows10, Anaconda, pymc (latest stable) environment

    Here are the commands I ran and the error I get from the pymc-learn quick start guide:

    (pymc_learn) C:\Users\sreedatta>python
    Python 3.10.2 | packaged by conda-forge | (main, Jan 14 2022, 07:58:58) [MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from sklearn.datasets import make_friedman2
    >>> from pmlearn.gaussian_process import GaussianProcessRegressor
    The imported Theano(-PyMC) module is broken.
    It was imported from _NamespacePath(['C:\\Users\\sreedatta\\Anaconda3\\envs\\pymc_learn\\lib\\site-packages\\theano'])
    Try to uninstall/reinstall it after closing all active sessions/notebooks.
    Also see for installation instructions.
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Users\sreedatta\Anaconda3\envs\pymc_learn\lib\site-packages\pmlearn\gaussian_process\", line 12, in <module>
        from .gpr import GaussianProcessRegressor
      File "C:\Users\sreedatta\Anaconda3\envs\pymc_learn\lib\site-packages\pmlearn\gaussian_process\", line 8, in <module>
        import pymc3 as pm
      File "C:\Users\sreedatta\Anaconda3\envs\pymc_learn\lib\site-packages\pymc3\", line 79, in <module>
      File "C:\Users\sreedatta\Anaconda3\envs\pymc_learn\lib\site-packages\pymc3\", line 61, in __set_compiler_flags
        current = theano.config.gcc__cxxflags
    AttributeError: module 'theano' has no attribute 'config'

    Here is my environment setup (conda list output) for pymc-learn:

    opened by sreedta8 2
  PyMC3 has been renamed PyMC

    Hi, PyMC3 has been renamed PyMC. If this affects you and you have questions, or you want someone to direct your rage at, I'm available! Do let me know how I, or any of the PyMC devs, can help.


    opened by canyon289 0
  Optimizing test

    I was able to reduce the running time of TestGaussianProcessRegressorPredict::test_predict_returns_predictions test from 236 seconds to about 52 seconds on my local machine by changing n to 200.

    I also ran the test several times to ensure that the test is not flaky.


    python 3.8.5

    Is this something that you guys will be interested in? If yes, then I can help you optimize some other tests in this repo as well. If you have any other suggestions/edits I will be happy to integrate them as well. Please let me know.


    opened by loopylangur 0
  Changed sample_ppc to the newer PyMC3 method

    Just one tiny change. Changed pm.sample_ppc to the pm.sample_posterior_predictive method introduced in newer PyMC3 v3.6+.

    Full Disclosure : I'm very new to Bayesian statistics so it might be possible I don't know what I'm talking about.

    Anyway, I was trying to work through the tutorials from the site and specifically the one on bayesian linear regression. I got stuck at model.predict as I kept getting this error

    AttributeError                            Traceback (most recent call last)
    <ipython-input-15-f3f0823146e6> in <module>()
    ----> 1 y_predict = model.predict(X_test)
    /usr/local/lib/python3.6/dist-packages/pmlearn/ in predict(self, X, return_std)
        277                                'model_output': np.zeros(num_samples)})
    --> 279         ppc = pm.sample_ppc(self.trace, model=self.cached_model, samples=2000)
        281         if return_std:
    AttributeError: module 'pymc3' has no attribute 'sample_ppc'

    Upon Googling, I found this issue and, as far as I understand, sample_ppc is now sample_posterior_predictive. So that's all I changed.
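
For code that has to run against both older and newer PyMC3 releases, one option is to look the sampler up by name instead of hard-coding either attribute. The sketch below is not part of this change or of pymc-learn: the helper name is hypothetical, and the two ``SimpleNamespace`` objects are stand-ins for an older and a newer ``pymc3`` module so that the example runs without PyMC3 installed.

```python
from types import SimpleNamespace

def resolve_posterior_predictive(pm_module):
    """Return the posterior predictive sampler under whichever name the module exposes."""
    for name in ("sample_posterior_predictive", "sample_ppc"):
        sampler = getattr(pm_module, name, None)
        if sampler is not None:
            return sampler
    raise AttributeError("no posterior predictive sampler found on this module")

# Stand-ins for an older and a newer PyMC3 module (illustration only).
old_pm = SimpleNamespace(sample_ppc=lambda *args, **kwargs: "old sampler called")
new_pm = SimpleNamespace(
    sample_posterior_predictive=lambda *args, **kwargs: "new sampler called"
)

print(resolve_posterior_predictive(old_pm)())
print(resolve_posterior_predictive(new_pm)())
```

With a real installation, the helper would be called as ``resolve_posterior_predictive(pm)`` after ``import pymc3 as pm``, and the returned function used in place of the direct ``pm.sample_ppc`` call.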

    Platforms and Versions

    Tried this only on the Google Colab VM.

    import pmlearn
    import pymc3 as pm
    opened by sharmaabhishekk 0
  pymc-learn: RuntimeError in JupyterLab 1.2.6

    I am learning to use pymc-learn. However, whenever I try to run an MCMC simulation with pymc-learn in JupyterLab 1.2.6, I get the following error: RuntimeError: The communication pipe between the main process and its spawned children is broken.

    In Windows OS, this usually means that the child process raised an exception while it was being spawned, before it was setup to communicate to the main process. The exceptions raised by the child process while spawning cannot be caught or handled from the main process, and when running from an IPython or jupyter notebook interactive kernel, the child's exception and traceback appears to be lost. A known way to see the child's error, and try to fix or handle it, is to run the problematic code as a batch script from a system's Command Prompt. The child's exception will be printed to the Command Promt's stderr, and it should be visible above this error and traceback. Note that if running a jupyter notebook that was invoked from a Command Prompt, the child's exception should have been printed to the Command Prompt on which the notebook is running.

    As an example, I am using a copy-paste from the 'LinearRegression' example of pymc-learn: model2 = LinearRegression() followed by model2.fit(X_train, y_train, inference_type='nuts')

    opened by ZhiqiangZhangCUGB 0
