Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

Last update: Dec 30, 2022

Overview

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.

Check out the getting started guide, or interact with live examples using Binder! For questions on PyMC3, head on over to our PyMC Discourse forum.

The future of PyMC3 & Theano

There have been many questions and much uncertainty around the future of PyMC3 since Theano stopped being developed by its original authors and we started experiments with PyMC4.

We are happy to announce that PyMC3 on Theano (which we are developing further) with a new JAX backend is the future. PyMC4 will not be developed further.

See the full announcement for more details.

Features

Intuitive model specification syntax, for example, x ~ N(0,1) translates to x = Normal('x',0,1)
Powerful sampling algorithms, such as the No U-Turn Sampler, allow fitting complex models with thousands of parameters with little specialized knowledge of fitting algorithms.
Variational inference: ADVI for fast approximate posterior estimation as well as mini-batch ADVI for large data sets.
Relies on Theano-PyMC which provides:
- Computation optimization and dynamic C or JAX compilation
- Numpy broadcasting and advanced indexing
- Linear algebra operators
- Simple extensibility
Transparent support for missing value imputation
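
As a minimal sketch of the model specification syntax from the first feature above, the statement x ~ N(0,1) could be written as follows. It assumes PyMC3 3.x is installed; the function name build_and_sample and the draw counts are illustrative choices, not something taken from the PyMC3 documentation. The import is deferred into the function so the snippet can be loaded even before PyMC3 is installed.

```python
def build_and_sample(draws=500, tune=500, seed=0):
    """Declare x ~ N(0, 1) and draw samples with the default sampler."""
    import pymc3 as pm  # deferred import: needs a working PyMC3 3.x install

    with pm.Model() as model:
        # x ~ N(0, 1): mean 0, standard deviation 1
        x = pm.Normal("x", 0, 1)
        # With no step method given, PyMC3 chooses one itself
        # (NUTS for continuous variables such as x).
        trace = pm.sample(draws, tune=tune, random_seed=seed)
    return model, trace
```

Calling model, trace = build_and_sample() should then return the model together with the sampled values of x.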

Getting started

If you already know about Bayesian statistics:

Learn Bayesian statistics with a book together with PyMC3:

Probabilistic Programming and Bayesian Methods for Hackers: Fantastic book with many applied code examples.
PyMC3 port of the book "Doing Bayesian Data Analysis" by John Kruschke as well as the second edition: Principled introduction to Bayesian data analysis.
PyMC3 port of the book "Statistical Rethinking A Bayesian Course with Examples in R and Stan" by Richard McElreath
PyMC3 port of the book "Bayesian Cognitive Modeling" by Michael Lee and EJ Wagenmakers: Focused on using Bayesian statistics in cognitive modeling.
Bayesian Analysis with Python (second edition) by Osvaldo Martin: Great introductory book. (code and errata).

PyMC3 talks

There are also several talks on PyMC3, which are gathered in this YouTube playlist and as part of PyMCon 2020.

Installation

To install PyMC3 on your system, follow the instructions on the appropriate installation guide:

Citing PyMC3

Salvatier J., Wiecki T.V., Fonnesbeck C. (2016) Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2:e55 DOI: 10.7717/peerj-cs.55.

Contact

We are using discourse.pymc.io as our main communication channel. You can also follow us on Twitter @pymc_devs for updates and other announcements.

To ask a question regarding modeling or usage of PyMC3 we encourage posting to our Discourse forum under the “Questions” Category. You can also suggest features in the “Development” Category.

To report an issue with PyMC3 please use the issue tracker.

Finally, if you need to get in touch for non-technical information about the project, send us an e-mail.

License

Apache License, Version 2.0

Software using PyMC3

Exoplanet: a toolkit for modeling of transit and/or radial velocity observations of exoplanets and other astronomical time series.
Bambi: BAyesian Model-Building Interface (BAMBI) in Python.
pymc3_models: Custom PyMC3 models built on top of the scikit-learn API.
PMProphet: PyMC3 port of Facebook's Prophet model for timeseries modeling
webmc3: A web interface for exploring PyMC3 traces
sampled: Decorator for PyMC3 models.
NiPyMC: Bayesian mixed-effects modeling of fMRI data in Python.
beat: Bayesian Earthquake Analysis Tool.
pymc-learn: Custom PyMC models built on top of pymc3_models/scikit-learn API
fenics-pymc3: Differentiable interface to FEniCS, a library for solving partial differential equations.
cell2location: Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics.

Please contact us if your software is not listed here.

Papers citing PyMC3

See Google Scholar for a continuously updated list.

Contributors

See the GitHub contributor page. Also read our Code of Conduct guidelines for a better contributing experience.

Support

PyMC3 is a non-profit project under the NumFOCUS umbrella. If you want to support PyMC3 financially, you can donate here.

PyMC for enterprise

PyMC is now available as part of the Tidelift Subscription!

Tidelift is working with PyMC and the maintainers of thousands of other open source projects to deliver commercial support and maintenance for the open source dependencies you use to build your applications. Save time, reduce risk, and improve code health, while contributing financially to PyMC -- making it even more robust, reliable and, let's face it, amazing!

Sponsors

Comments

Rebuild conda packages

The conda packages for pymc 2.3 are not compatible with current anaconda versions. We thus need to rebuild and reupload. I can do linux 64bit, @fonnesbeck can you do osx 64bit?
release

opened by twiecki 188
Installation Instructions for Pymc3 on Windows 10 using Anaconda 3
Hi @michaelosthege - here is the first version of the Installation Reference. Let me know what changes may be needed.

For Pymc3 Windows users who may not have a programming background or be comfortable with tool chains and such, the installation instructions on this page (https://github.com/pymc-devs/pymc3/wiki/Installation-Guide-(Windows)) may not be sufficient. The instructions posted below go beyond the basic installation process posted at the above link.

In addition, there is a large community of users who use both R and Python (Anaconda). The RTools mingw tool chain has to be first on the System Environment variables so that R packages that need compilation, such as rstan and brms, will run correctly. In this situation, the compilation of pymc3 models will break, and additional post-install User Environment Variables have to be set so that pymc3 works correctly.

Versions and main components

PyMC3 Version: 3.11.2

Aesara/Theano Version: 1.1.2 ( using theano-pymc)

Python Version: 3.7 / 3.8 using Anaconda3 64-bit

Operating system: Windows 10 64-bit, with 1904 Update

How did you install PyMC3: pip

C & C++ compilers: Installed m2w64 tool chain from conda-forge

Microsoft VS C++ compiler(s) present? No

Any Competing C++ compiler(s) present? Yes - RTools mingw tool chain on the System PATH

The essence of a solid PyMC3 installation on Windows is to install most of the dependencies through conda. The reason installation via PyPI is difficult is that Theano/Aesara require compilation against MKL, which is difficult to set up, while Conda comes with its own compilers and MKL installation.

⚠ Do not pip install without first installing dependencies with conda. ⚠

Method 1: Run conda env create -f environment.yml to create a fresh environment in one step. Create the environment.yml file with a plain-text editor such as Notepad++, if possible.

environment.yml (copy everything from name: pm3env through pymc3 and save it in C:\Users\Your_User_Name)

name: pm3env
channels:
  - conda-forge
  - defaults
dependencies:
  - libpython
  - blas
  - mkl-service
  - m2w64-toolchain
  - numba
  - pip
  - python=3.8
  - python-graphviz
  - scipy
  - pip:
    - pymc3

You can change name to something meaningful for you such as pym3 or env_pym3. You do not have to use pm3env. Keep it simple and meaningful so you can use the environment with ease when using the command line.

Method 2: You can create pymc3 specific environment also directly from the Anaconda3 Command Prompt using the following command: conda create -n pm3env -c conda-forge "python=3.8" libpython mkl-service m2w64-toolchain numba python-graphviz scipy

After you have created the environment, you can test that it has been created successfully by typing: conda info --envs. The output of the command will show all the environments that exist in the current Anaconda3 install. Hopefully you will see your environment for pymc3.

Activate your environment by typing the command: conda activate pm3env (or whatever name you chose) Your command prompt will look like: (pm3env) C:\Users\Your_User_Name

Check the installed packages using the command conda list; all packages installed in pm3env will be shown. Check whether either theano or theano-pymc has been installed and make a note of it (at this stage neither should be).

Install Pymc3: Now you can install Pymc3 using the command: pip install pymc3. If all requirements are met, all packages from pip and their dependencies will be installed. It is critical to check whether theano-pymc has been installed along with pymc3, because pymc3 will not run without it; at this point theano-pymc should be installed. In addition, check in the pip output that dependencies such as arviz (for working with pymc3 objects) and matplotlib (for general purpose data graphing) have also been installed.

Sometimes these packages may not be installed in your new environment but will be installed likely in the location below on Windows 10: c:\users\your_user_name\appdata\roaming\python\python38\site-packages - note the number 38 next to python38 in the folder name. This means that versions relevant for python3.8 have been installed here. These packages will not appear in the output of conda list. Ensure that the python you specified (python=3.8) matches what you see in the folder name here (python38)
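
One way to see where a package actually ended up is to ask Python itself. The following is a small sketch that uses only the standard library; the helper name locate_package is made up for illustration. It reports the environment prefix, the per-user site-packages folder, and the file each package would be imported from.

```python
import importlib.util
import site
import sys


def locate_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("Environment prefix :", sys.prefix)
    print("User site-packages :", site.getusersitepackages())
    for pkg in ("pymc3", "theano", "arviz", "matplotlib"):
        print(f"{pkg:<12}", locate_package(pkg) or "not importable")
```

If a reported path starts with the user site-packages folder rather than the environment prefix, the package was installed outside pm3env, which is the situation described above.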

[@michaelosthege The original instructions ask to install theano-pymc using conda-forge. However, I found that in the new dependency installation of pymc3, theano-pymc is automatically installed and works correctly. This is one where I need your input on whether we remove the note to update theano-pymc. I think this is outdated]

Now there are additional ways to install pymc3 and additional variants of pymc3. Refer to the next section, which may be more appropriate for advanced users.

Developer Installation

If you want to tinker with PyMC3 itself, first clone the repository and then make an "editable" installation: (You need to have already installed git for this to work, if not install git first)

cd pymc3
pip install --editable .

Upgrading from Theano to Theano-PyMC - Just in case when you run conda list at the pm3env prompt and find that you have theano instead of theano-pymc (stranger things have and will happen!)

Make a note of the channel where theano was installed from: it will show pypi or conda-forge.

If you see pypi then use the command, pip uninstall theano - you should see a message stating theano has been uninstalled

If you see conda-forge use the command, conda remove theano - you should answer with a y if prompted for the removal of theano

Install theano-pymc using the command, conda install -c conda-forge theano-pymc.

Once the installation is complete, run the command conda list and verify pymc3 & theano-pymc are installed.
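
The same check can be done from Python. This is a sketch using importlib.metadata, which is in the standard library from Python 3.8 onward; the distribution names "theano" and "theano-pymc" are assumed to match what conda list and pip show, and the helper names are illustrative.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def theano_status():
    """Summarise which Theano flavour this environment has."""
    old = installed_version("theano")
    new = installed_version("theano-pymc")
    if new is not None and old is None:
        return f"ok: theano-pymc {new}"
    if old is not None and new is None:
        return f"replace: theano {old} found, theano-pymc missing"
    if old is not None and new is not None:
        return f"conflict: both theano {old} and theano-pymc {new} are installed"
    return "missing: neither theano nor theano-pymc is installed"


if __name__ == "__main__":
    print(theano_status())
    print("pymc3:", installed_version("pymc3"))
```

A result starting with "replace" or "conflict" corresponds to the case handled by the uninstall steps above.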

Optional Dependencies (before you install any packages, first check that they have not already been installed using conda list)

The GLM submodule relies on Patsy. Patsy brings the convenience of "R-style formulas" to Python.

pm.model_to_graphviz depends on Graphviz and pydot:

Use the command conda install -c conda-forge python-graphviz and pip install pydot-ng

In the package installations done so far, Jupyter Notebook and Jupyter Lab are not installed. If you are working with Anaconda3, you can install these two Jupyter tools from Anaconda3 Navigator or from the pm3env command prompt using the command: conda install -c conda-forge notebook

Do Not Close the Command Prompt window of the pm3env - we have to use it later on.

Post Installation Checks - Do Not Skip This Step on Windows

A. Assumes you have installed Jupyter Notebook.
B. Make note of whether you have R, and in particular RTools, installed on your laptop and the location of its install.
C. Assumes you have either Admin or Power-User rights on your laptop so that you can make changes to the environment variables at the User level.

Windows does not come pre-installed with C and C++ compilers (as Mac and Linux Distros do) so it is important to ensure your Anaconda3 environments are pointing to the correct internal compilers.

Going back to the original creation / installation of pm3env environment, one of the packages installed is m2w64-toolchain. Here things can get complex in terms of having compiler tools specific to the version of Python and to that of the environment. So generally speaking, try to keep your Anaconda3 environments to a minimum, when you are starting out with tools such as pymc3. Also make sure you install a m2w64-toolchain in each environment you create. Conda will ensure that you have the most appropriate version of the compilers for the version of Python in that environment. There are exceptions, but it gets beyond the scope of this document.

On Windows, search for "Edit System Environment Variables". You will be taken to the screen below

Click on the "Environment Variables" on the bottom right-hand corner and you will see a new window pop-up with the top Window for User variables and the bottom window for the System variables. Click on the entry labeled Path under User Variables for Your_User_Name and click edit. Here you should add the following Anaconda paths specific to your environment:

Note the location of where Anaconda3 is installed by default (as shown below). If you have changed the location during the installation of Anaconda3, please make the changes accordingly.

First: C:\ProgramData\Anaconda3\Library\mingw-w64\bin
Second: C:\ProgramData\Anaconda3\Library\bin
Third: C:\ProgramData\Anaconda3\Scripts

Move each of the entries so that they appear in the exact order as shown above at the top of the User Variables Path using the Move Up and Move Down buttons

Then click "OK" to accept the changes all the way.
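
To confirm that the ordering took effect, a new command prompt can run a short check like the sketch below. It uses only the standard library; the helper name in_required_order is made up for illustration, and the folder list assumes the default Anaconda3 location given above.

```python
import os
import shutil


def in_required_order(path_value, required, sep=os.pathsep):
    """True if every required folder is on the path, in the given relative order."""
    def norm(p):
        # Windows paths are case-insensitive; ignore trailing separators too.
        return p.strip().rstrip("\\/").lower()

    entries = [norm(p) for p in path_value.split(sep) if p.strip()]
    positions = []
    for folder in required:
        key = norm(folder)
        if key not in entries:
            return False
        positions.append(entries.index(key))
    return positions == sorted(positions)


# Assumed default install location; adjust if Anaconda3 lives elsewhere.
ANACONDA_DIRS = [
    r"C:\ProgramData\Anaconda3\Library\mingw-w64\bin",
    r"C:\ProgramData\Anaconda3\Library\bin",
    r"C:\ProgramData\Anaconda3\Scripts",
]

if __name__ == "__main__":
    print("Order ok:", in_required_order(os.environ.get("PATH", ""), ANACONDA_DIRS))
    # The first gcc found on PATH is typically the one a compile step picks up.
    print("gcc resolves to:", shutil.which("gcc"))
```

If gcc resolves to a folder outside Anaconda3 (for example an RTools folder), another tool chain is still ahead of the Anaconda one on the effective path.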

Go back to the pm3env Command Prompt window that is already open that displays (pm3env) C:\Users\Your_User_Name>

Type jupyter-notebook at the prompt and (hopefully) a page should be opened in your Default Browser.

On the top right corner, click on the drop-down under New

Select Python 3 (ipykernel)

You should see a new Jupyter Notebook open that looks as follows:

Make sure that at the top right corner, the button says Trusted. If this is the first time you are using a Jupyter Notebook, it might show up as Not Trusted. Click on it if it says "Not Trusted" and select the option to make it "Trusted".

In the first Notebook cell, type import theano as tp and click on the Run button.

You may see a warning as shown in the picture below (WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.)

In the next cell type the following commands and click Run:

import pymc3 as pm
print(f"Running on PyMC3 v{pm.__version__}")

You should see an output as follows: Running on PyMC3 v3.11.2

You can now begin testing the full capabilities of Pymc3 by starting with the three examples from the Getting Started with Pymc3 - linked here - http://docs.pymc.io/notebooks/getting_started

Good luck with using Pymc3!
installation
opened by sreedat 134
Vectors of multivariate variables
It would be useful if we could model multiple independent multivariate variables in the same statement. For example, if I wanted four multivariate normal vectors with the same prior, I should be able to specify:

f = pm.MvNormal('f', np.zeros(3), np.eye(3), shape=(4,3))

but it currently returns a ValueError complaining of non-aligned matrices.
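
To make the requested semantics concrete, here is a standard-library sketch of what shape=(4,3) is meant to produce: four independent 3-vectors. It relies on the fact that, for a zero mean and identity covariance as in the example above, the components are independent standard normals. The function name is illustrative, and this is not how PyMC3 implements MvNormal.

```python
import random


def draw_independent_mvnormals(n_vectors, dim, seed=None):
    """Draw n_vectors samples from N(0, I_dim), returned as a list of rows.

    Valid only for zero mean and identity covariance, where each component
    is an independent standard normal.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_vectors)]


if __name__ == "__main__":
    f = draw_independent_mvnormals(4, 3, seed=1)
    print(len(f), "vectors of length", len(f[0]))
```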
enhancements
opened by fonnesbeck 88
Add MLDA stepper
The MLDA stepper

The MLDA (Multi-Level Delayed Acceptance) [1] method employs a hierarchy of chains to sample from a posterior. The top-level chain samples from the fine model posterior (i.e. the most accurate one / the one we are interested in) and the lower level chains sample from coarse model posteriors (i.e. approximations of declining accuracy).

MLDA uses samples generated in one level as proposals for the level above. A chain runs for a fixed number of iterations and then the last sample is used as the proposal for the higher-level chain. The bottom level is a Metropolis sampler, although an algorithm like pCN or adaptive MH is expected to be used in the future.

The stepper is suitable for situations where a model is expensive to evaluate in high spatial or temporal resolution (e.g. realistic subsurface flow models where a PDE needs to be solved in each iteration). In those cases we can use cheaper, lower resolutions as coarse models. If the approximations are sufficiently good, this leads to good quality proposals, high acceptance rates and high ESS/sec compared to other methods, because only a few expensive fine model solves are needed to achieve the same ESS.
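
The idea can be sketched for two levels in plain Python. This is an illustration under stated assumptions, not the code in this PR: the fine and coarse targets are toy one-dimensional Gaussians chosen here, and all function names are made up. A short Metropolis subchain on the coarse target produces a proposal, which the fine level accepts with probability min(1, [π_F(θ') π_C(θ)] / [π_F(θ) π_C(θ')]).

```python
import math
import random


def log_fine(theta):
    """Log-density (up to a constant) of the toy 'fine' target: N(0, 1)."""
    return -0.5 * theta ** 2


def log_coarse(theta):
    """Log-density of a cheaper, slightly wrong 'coarse' target: N(0.2, 1.2^2)."""
    return -0.5 * ((theta - 0.2) / 1.2) ** 2


def metropolis_subchain(theta, log_target, n_steps, step_size, rng):
    """Run n_steps of random-walk Metropolis on log_target; return the last state."""
    logp = log_target(theta)
    for _ in range(n_steps):
        candidate = theta + rng.gauss(0.0, step_size)
        logp_candidate = log_target(candidate)
        if rng.random() < math.exp(min(0.0, logp_candidate - logp)):
            theta, logp = candidate, logp_candidate
    return theta


def two_level_da(n_samples, subchain_length=5, step_size=1.0, seed=0):
    """Two-level delayed acceptance: coarse subchain proposes, fine level decides."""
    rng = random.Random(seed)
    theta = 0.0
    samples, accepted = [], 0
    for _ in range(n_samples):
        # The coarse chain starts from the current fine state.
        proposal = metropolis_subchain(theta, log_coarse, subchain_length, step_size, rng)
        # The coarse densities correct for the proposal having targeted log_coarse.
        log_ratio = (log_fine(proposal) - log_fine(theta)) + (
            log_coarse(theta) - log_coarse(proposal)
        )
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
            accepted += 1
        samples.append(theta)
    return samples, accepted / n_samples


if __name__ == "__main__":
    draws, rate = two_level_da(5000)
    print("mean:", sum(draws) / len(draws), "acceptance rate:", rate)
```

Each fine-level sample costs one fine evaluation plus subchain_length coarse evaluations, which is where the savings come from when the coarse model is much cheaper.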

Changes

MLDA is a new ArrayStepShared step method in metropolis.py. Its constructor instantiates step methods for all the levels in the hierarchy.

RecursiveDA is a new Proposal method in metropolis.py. It recursively calls sample() to run the hierarchy of chains in multiple levels, using the step methods instantiated by MLDA.__init__. Note that logging is switched off here to avoid unwanted console output. Each time, sample() generates a fixed number of samples; we keep only the last one and return control to the level above. We rerun sample() from that point each time control goes back to that level.

A new internal variable called is_mlda_base is added to Metropolis. This is detected within sample() to avoid running reset_tuning(). This is done because we want the Metropolis step in the bottom level to maintain the tuning information across subsequent sample() calls. Note that resetting was introduced by #3733. MLDA is not affected by the issue there, as it always uses cores=1 and chains=1 in each level of the hierarchy. A more elegant solution might be the detection of cores and chains and passing the info to _iter_sample to decide if reset_tuning runs.

Tests for MLDA and RecursiveDA and two new models for testing in tests/models.py

A notebook comparing the performance of MLDA with Metropolis for a groundwater flow model (a Bayesian inverse problem setting). The forward model requires a PDE solve which is done using the C-based FEniCS library (see notebook for details). Thus it is also a demonstration of using black box external code in the likelihood. The PR adds all the necessary code to solve the model. The example is also provided in the form of a python script.

Performance

Running the notebook on a MacBook Pro (see specs within), with 3 unknown parameters and model resolution (30, 30) and (120, 120), the resulting performance comparison is the following:

Work in progress

I am trying to find ways to accelerate the method. It seems that there is a lot of overhead from creating/destroying traces and Theano parsing the likelihood when calling sample() multiple times. Stopping and restarting is necessary as I need to switch between chains and continue a chain from where it last left off. Are there faster ways to pause and resume a chain than saving the trace/tuning and calling a new sample() afterwards? Would iter_sample be more efficient in this case?

I am working on adding an adaptive bottom level sampler and also on applying an adaptive likelihood correction technique found in [2]. Is there any work going on adaptive samplers like adaptive MH or pCN within the pymc3 community?

I plan to add one more feature contained in [1] (the variance reduction technique to reduce variance of integral estimates).

Usage

Can be used as any other method but has one positional argument that needs to be provided (coarse_models). See notebook and tests for examples.

References

[1] Dodwell, Tim & Ketelsen, Chris & Scheichl, Robert & Teckentrup, Aretha. (2019). Multilevel Markov Chain Monte Carlo. SIAM Review. 61. 509-545. https://doi.org/10.1137/19M126966X

[2] Cui, Tiangang & Fox, Colin & O'Sullivan, Michael. (2012). Adaptive Error Modelling in MCMC Sampling for Large Scale Inverse Problems.
opened by gmingas 82

Running multiple chains causes RecursionError

Setting the njobs parameter to run multiple chains results in an error:

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-59-548e16bedce3> in <module>()
      6 
      7 
----> 8     trace = sample(5000, njobs=2)

/Users/fonnescj/Github/pymc3/pymc3/sampling.py in sample(draws, step, start, trace, chain, njobs, tune, progressbar, model, random_seed)
    153         sample_args = [draws, step, start, trace, chain,
    154                        tune, progressbar, model, random_seed]
--> 155     return sample_func(*sample_args)
    156 
    157 

/Users/fonnescj/Github/pymc3/pymc3/sampling.py in _mp_sample(njobs, args)
    274 def _mp_sample(njobs, args):
    275     p = mp.Pool(njobs)
--> 276     traces = p.map(argsample, args)
    277     p.close()
    278     return merge_traces(traces)

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    258         in a list that is returned.
    259         '''
--> 260         return self._map_async(func, iterable, mapstar, chunksize).get()
    261 
    262     def starmap(self, func, iterable, chunksize=None):

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/pool.py in get(self, timeout)
    606             return self._value
    607         else:
--> 608             raise self._value
    609 
    610     def _set(self, i, obj):

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
    383                         break
    384                     try:
--> 385                         put(task)
    386                     except Exception as e:
    387                         job, ind = task[:2]

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/connection.py in send(self, obj)
    204         self._check_closed()
    205         self._check_writable()
--> 206         self._send_bytes(ForkingPickler.dumps(obj))
    207 
    208     def recv_bytes(self, maxlength=None):

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/reduction.py in dumps(cls, obj, protocol)
     48     def dumps(cls, obj, protocol=None):
     49         buf = io.BytesIO()
---> 50         cls(buf, protocol).dump(obj)
     51         return buf.getbuffer()
     52 

RecursionError: maximum recursion depth exceeded
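
The traceback ends inside ForkingPickler.dumps, that is, while the arguments for a worker process were being pickled. The cause in this report is not established here. As a generic illustration of how pickling can hit the recursion limit, the sketch below (standard library only, with made-up helper names) pickles a deeply nested object.

```python
import pickle


def nested_list(depth):
    """Build [[[...[]...]]] nested `depth` levels deep without recursing."""
    obj = []
    for _ in range(depth):
        obj = [obj]
    return obj


def pickles_cleanly(obj):
    """Return True if obj can be pickled, False if pickling recurses too deeply."""
    try:
        pickle.dumps(obj)
        return True
    except RecursionError:
        return False


if __name__ == "__main__":
    print("flat list  :", pickles_cleanly([1, 2, 3]))
    print("deep nested:", pickles_cleanly(nested_list(100_000)))
```

Running a check like pickles_cleanly on the objects passed to the workers is one way to narrow down which argument triggers the error.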

opened by fonnesbeck 78

Optimization

Bayesian optimization over posterior latent space is an interesting sort of problem that becomes real with this PR. A notebook with a toy example is provided.

@fonnesbeck you can find Histogram application for SVGD there

opened by ferrine 73
Bring back distribution moments
With #4983 and #5087 finished, it's a good time to bring back moments for our distributions for more stable starting points.

With this we can then do the switch in #5009

We should also update the distribution developer guide to mention the implementation of moments: https://github.com/pymc-devs/pymc/blob/main/docs/source/developer_guide_implementing_distribution.md

How to help?

1. This PR should give a template on how to implement and test new moments for distributions: https://github.com/pymc-devs/pymc/pull/5087/files

2. In most cases we should be able to copy the moments we were using in the V3 branch. For example, here is what we were doing for the Beta: https://github.com/pymc-devs/pymc/blob/efbaccee94159de0a15b04baf84b1250047f132a/pymc3/distributions/continuous.py#L1235
   2.1 We used to have multiple moments for some distributions such as mean, median, mode. We only support one moment now, and probably the "higher-order" one is the most useful (that is mean > median > mode)... You might need to truncate the moment if you are dealing with a discrete distribution.
   2.2 We left some of these moments commented out inside the distribution dist classmethod. Make sure they are removed when you implement them! https://github.com/pymc-devs/pymc/blob/8f3636daf7d9946f6eca4717f3bb0c6d77d9c6e9/pymc/distributions/continuous.py#L538-L539

3. We have to be careful with size != None and with broadcasting properly when some parameters that are not used in the moment may nevertheless inform about the shape of the distribution. E.g. pm.Normal.dist(mu=0, sigma=np.arange(1, 6)) returns a moment of [mu, mu, mu, mu, mu]. Again #5087 should give some template to think about this: https://github.com/pymc-devs/pymc/blob/8f3636daf7d9946f6eca4717f3bb0c6d77d9c6e9/pymc/distributions/continuous.py#L546-L550
   3.1 In the case where you have to manually broadcast the parameters with each other, it's important to add test conditions that would fail if you were not to do that. A straightforward way to do this is to make the used parameter a scalar, the unused one(s) a vector (one at a time) and size None.

4. Just to keep things uniformish, please add the get_moment immediately below the dist classmethod and before logp or logcdf methods.

5. New tests have to be added in test_distributions_moments.py. Make sure to test different combinations of size and broadcasting to cover the cases mentioned in point 3.

6. Don't hesitate to ask any questions. You can grab as many distributions to implement moments as you want. Just make sure to write in this issue so that we can keep track of it.

7. Profit with your new open source KARMA!
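
The broadcasting caveat above can be made concrete with a small shape calculation. This is a plain-Python sketch that mirrors NumPy's broadcasting rule; the helper names are made up, and it assumes that size, when given, is the full shape of the draw. It is not the get_moment code from #5087.

```python
def broadcast_shape(*shapes):
    """Return the NumPy-style broadcast of several shapes, or raise ValueError."""
    ndim = max((len(s) for s in shapes), default=0)
    result = []
    for axis in range(ndim):
        dim = 1
        for shape in shapes:
            # Align shapes on the right; missing leading axes count as length 1.
            offset = axis - (ndim - len(shape))
            length = shape[offset] if offset >= 0 else 1
            if length != 1:
                if dim != 1 and dim != length:
                    raise ValueError(f"shapes {shapes} are not broadcastable")
                dim = length
        result.append(dim)
    return tuple(result)


def moment_shape(param_shapes, size=None):
    """Shape a moment should have: size if given, else the broadcast of all parameters."""
    if size is not None:
        return tuple(size)
    return broadcast_shape(*param_shapes)
```

For the Normal example, mu is a scalar with shape () and sigma has shape (5,), so moment_shape([(), (5,)]) gives (5,) even though only mu enters the moment. A test that passes a scalar for the used parameter and a vector for the unused one, with size None, fails exactly when this broadcast is forgotten.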

The following distributions don't have a moment method implemented:

[x] pymc.distributions.continuous.Beta #5145

[x] pymc.distributions.continuous.Kumaraswamy #5147

[x] pymc.distributions.continuous.Exponential #5147

[x] pymc.distributions.continuous.Laplace #5147

[x] pymc.distributions.continuous.StudentT #5147

[x] pymc.distributions.continuous.Cauchy #5147

[x] pymc.distributions.continuous.HalfCauchy https://github.com/pymc-devs/pymc/pull/5148

[x] pymc.distributions.continuous.Gamma https://github.com/pymc-devs/pymc/pull/5148

[x] pymc.distributions.continuous.Weibull https://github.com/pymc-devs/pymc/pull/5148

[x] pymc.distributions.continuous.LogNormal https://github.com/pymc-devs/pymc/pull/5148

[x] pymc.distributions.continuous.HalfStudentT https://github.com/pymc-devs/pymc/pull/5152

[x] pymc.distributions.continuous.ChiSquared https://github.com/pymc-devs/pymc/pull/5154

[x] pymc.distributions.continuous.Wald #5161

[x] pymc.distributions.continuous.Pareto #5161

[x] pymc.distributions.continuous.InverseGamma #5199

[x] pymc.distributions.continuous.ExGaussian #5165

[x] pymc.distributions.continuous.VonMises #5232

[x] pymc.distributions.discrete.Binomial https://github.com/pymc-devs/pymc/pull/5150

[x] pymc.distributions.discrete.BetaBinomial #5175

[x] pymc.distributions.discrete.Poisson https://github.com/pymc-devs/pymc/pull/5150

[x] pymc.distributions.discrete.NegativeBinomial #5163

[x] pymc.distributions.discrete.Constant #5156

[x] pymc.distributions.discrete.ZeroInflatedPoisson #5163

[x] pymc.distributions.discrete.ZeroInflatedNegativeBinomial #5206

[x] pymc.distributions.discrete.ZeroInflatedBinomial #5163

[x] pymc.distributions.discrete.DiscreteUniform #5167

[x] pymc.distributions.discrete.Geometric #5158

[x] pymc.distributions.discrete.HyperGeometric #5167

[x] pymc.distributions.discrete.Categorical #5176

[x] pymc.distributions.distribution.DensityDist #5159

[x] pymc.distributions.multivariate.MvNormal #5171

[x] pymc.distributions.multivariate.MatrixNormal #5173

[x] pymc.distributions.multivariate.KroneckerNormal #5235

[x] pymc.distributions.multivariate.MvStudentT #5173

[x] pymc.distributions.multivariate.Dirichlet #5174

[x] pymc.distributions.multivariate.Multinomial #5201

Note: This distribution had tests for the mode in test_distributions.py which should be moved/refactored to test_distributions_moments.py

https://github.com/pymc-devs/pymc/blob/bdd4d1992f11cab9202774b14b8044ddf0cb7674/pymc/tests/test_distributions.py#L2151

https://github.com/pymc-devs/pymc/blob/bdd4d1992f11cab9202774b14b8044ddf0cb7674/pymc/tests/test_distributions.py#L2189

[x] pymc.distributions.multivariate.DirichletMultinomial #5225

[x] pymc.distributions.continuous.AsymmetricLaplace #5188

[x] pymc.distributions.continuous.SkewNormal #5188

[x] pymc.distributions.continuous.Triangular #5180

[x] pymc.distributions.discrete.DiscreteWeibull #5407

[x] pymc.distributions.continuous.Gumbel #5180

[x] pymc.distributions.continuous.Logistic #5157

[x] pymc.distributions.continuous.LogitNormal #5180

[x] pymc.distributions.continuous.Interpolated #5222

[x] pymc.distributions.continuous.Rice #5190

[x] pymc.distributions.continuous.Moyal #5179

[x] pymc.distributions.simulator.Simulator #5208

[x] pymc.distributions.multivariate.CAR #5220

[x] pymc.distributions.continuous.PolyaGamma #5193

[x] pymc.bart.BART #5211

beginner friendly help wanted v4
opened by ricardoV94 65
Variational inference with AD?

Can the theano infrastructure handle this?

https://github.com/stan-dev/stan/pull/1421
http://andrewgelman.com/2015/02/18/vb-stan-black-box-black-box-variational-bayes/

opened by datnamer 64
rewrite radon notebook using ArviZ and xarray
I am in the process of rewriting the notebook to exploit all ArviZ and xarray goodness. Comments are very welcome and I hope it is useful as a way to showcase xarray mostly and ArviZ capabilities inherited from xarray. Part of #3959

Depending on what your PR does, here are a few things you might want to address in the description:

[x] important background, or details about the implementation

~Doing this I have encountered a couple bugs in ArviZ so this notebook won't even run on ArviZ master, only on PR branch. I am not sure about the timeline of the doc sprint but it will end up either running on ArviZ master or on next release (there may be a release coming in the coming weeks :thinking:). Depends on https://github.com/arviz-devs/arviz/pull/1240 and https://github.com/arviz-devs/arviz/pull/1241~ All PRs have been merged and ArviZ 0.9.0 has been released!

I have checked doc generation locally to make sure xarray html repr is rendered properly

[x] right before it's ready to merge, mention the PR in the RELEASE-NOTES.md
opened by OriolAbril 63
Leveraging SymPy for PyMC3
SymPy (http://sympy.org/en/index.html) is a Python library for symbolic mathematics.

My initial motivation for looking at SymPy resulted from #172 and #173. Instead of recoding all probability distributions, samplers etc in Theano, maybe we could just use the ones provided by sympy.stats (http://docs.sympy.org/dev/modules/stats.html).

For this to work we would need to convert the SymPy computation graph to a Theano one. Some existing work suggests that this is possible (https://github.com/nouiz/theano_sympy).

Looking at sympy (and sympy.stats) more closely it seems that there are potentially more areas where integrating this could help. Maybe this would give the best of both worlds: "Theano focuses more on tensor expressions than Sympy, and has more machinery for compilation. Sympy has more sophisticated algebra rules and can handle a wider variety of mathematical operations (such as series, limits, and integrals)."

There is additional discussion here: https://github.com/nouiz/theano_sympy/issues/1.

Copy-pasting some chunks from @mrocklin's response to move the discussion over here:

Overlap

There are some obvious points of overlap between the various projects

PyMC has distributions and SymPy has distributions (https://github.com/sympy/sympy/blob/master/sympy/stats/crv_types.py). SymPy doesn't currently have infinite discrete random variables like Poisson though. This could be fixed but is a current failing. SymPy's support for analytic solution of infinite sums is poor so this was a low priority. It seems like you're not really looking for that though.

PyMC has implemented some special functions that could be in Theano. I would encourage you to push these upstream. They'll probably get some useful attention from the Theano crowd.

Looking at the pymc2 readme it appears that you have created some sort of symbolic algebra class structure (you add two pymc.Normal objects). Presumably SymPy.core might be of use here.

What is the relationship with statsmodels? They also have a home-grown internal algebraic system. My guess is that if everyone were to unite under one algebraic system there would be some pleasant efficiencies. I obviously have a bias about what that algebraic system should be :)

Derivatives

Both Theano and SymPy provide derivatives which, apparently, you need. SymPy provides analytic ones, Theano provides automatic ones. My suggestion would be to use SymPy if it works and fall back on Theano if it doesn't work. You don't need SymPy.stats for this (in case you didn't want to offload your distributions work.) SymPy.core would be just fine.
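To make the analytic-versus-automatic distinction concrete, here is a minimal sketch in plain Python. It uses neither SymPy nor Theano: the `Dual` class and the helper names are hypothetical, introduced only for illustration. The analytic derivative is written by hand (the kind of closed form a symbolic system could produce), while the automatic one is obtained by forward-mode differentiation with dual numbers.

```python
import math


class Dual:
    """Number a + b*eps with eps**2 == 0; the b part carries the derivative."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        # Promote plain numbers to Dual with zero derivative.
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(
            self.value * other.value,
            self.value * other.deriv + self.deriv * other.value,
        )

    __rmul__ = __mul__


def normal_logp(x, mu, sigma):
    """Log-density of Normal(mu, sigma) at x; mu may be a float or a Dual."""
    z = (x - mu) * (1.0 / sigma)
    return -0.5 * (z * z) - math.log(sigma * math.sqrt(2.0 * math.pi))


def dlogp_dmu_analytic(x, mu, sigma):
    # Closed form: d/dmu log N(x | mu, sigma) = (x - mu) / sigma**2
    return (x - mu) / sigma**2


def dlogp_dmu_automatic(x, mu, sigma):
    # Seed mu with derivative 1 and read the derivative off the result.
    return normal_logp(x, Dual(mu, 1.0), sigma).deriv


x, mu, sigma = 1.5, 0.5, 2.0
analytic = dlogp_dmu_analytic(x, mu, sigma)
automatic = dlogp_dmu_automatic(x, mu, sigma)
print(analytic, automatic)
```

Both routes give the same number here. The practical difference is that the analytic form has to exist and be derived, whereas the automatic form only needs the function to be built from supported operations.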

Other benefits

In general the benefits to using symbolic systems tend to be unexpected. SymPy can provide lots of general aesthetic fluff like awesome pretty printing, symbolic simplification, C/Fortran code snippet generation, etc....
opened by twiecki 60
Update robust glm notebook

The first purpose of this PR is to update the GLM robust regression notebook already in the examples here: https://docs.pymc.io/notebooks/GLM-robust-with-outlier-detection.html

Those updates are everything from v2.1 onwards:

Version history:

version | date | author | changes
:--- | :--- | :--- | :---
1.0 | 2015-12-21 | jonsedar | Create and publish
2.0 | 2018-07-24 | twiecki | Restate outlier model using pm.Normal.dist().logp() and pm.Potential()
2.1 | 2019-11-16 | jonsedar | Restate nu in StudentT model to be more efficient, drop explicit use of theano shared vars, generally improve plotting / explanations / layout
2.2 | 2020-05-21 | jonsedar | Minor tidyup for plots and warnings and rerun with pymc3.8

The second purpose of this PR is to clarify the docstring within sampling.py, specifically the kwargs for step_kwargs when you have multiple steppers. I found the need for better clarity in the docstring during my rework of the notebook, so I hope it's valid to include a change in this single PR. I think this is also a fix for https://github.com/pymc-devs/pymc3/issues/3197

opened by jonsedar 59
add ref to template notebook in Jupyter style guide
What is this PR about? Closes #6420

Checklist

[ ] Explain important implementation details 👆

[ ] Make sure that the pre-commit linting/style checks pass.

[x] Link relevant issues (preferably in nice commit messages)

[ ] Are the changes covered by tests and docstrings?

[ ] Fill out the short summary sections 👇

Documentation

Update Jupyter Style Guide: include template notebook for pymc examples

cc: @drbenvincent
opened by reshamas 0
Bump PyTensor to 2.8.12
What is this PR about? Updating to the latest PyTensor version, because I'm looking forward to a bugfix in the release after, and I want things to go smoothly =)

Checklist

[ ] Explain important implementation details 👆

[ ] Make sure that the pre-commit linting/style checks pass.

[ ] Link relevant issues (preferably in nice commit messages)

[ ] Are the changes covered by tests and docstrings?

[ ] Fill out the short summary sections 👇

Major / Breaking Changes

Updated PyTensor 2.8.11→2.8.12

pytensor-related major
opened by michaelosthege 3
Add dependabot config
What is this PR about? Ideally I'd like to have dependabot bump our PyTensor version pin, but as of 2023-01 it does not yet support conda environments, and the fact that we have versions pinned in environment.yml and requirements.txt at the same time will further complicate things.

However, we can already use it to keep our GitHub Actions updated.

Checklist

[x] Explain important implementation details 👆

[x] Make sure that the pre-commit linting/style checks pass.

[x] ~Link relevant issues (preferably in nice commit messages)~

[x] Are the changes covered by tests and docstrings?

[x] Fill out the short summary sections 👇

Maintenance

Added dependabot to keep GitHub Actions updated.

maintenance Github CI/CD no releasenotes
opened by michaelosthege 0

BUG: Class probabilities in `pm.Categorical` are split into observed and missing when `observed` has missing values

Describe the issue:

Originally raised on discourse here.

It appears that when pm.Categorical has observed data with missing values, the variable is re-instantiated inside model.make_obs_var with the wrong number of categories. This example shows that the new number of categories is indeed controlled by the shape of the data:

import numpy as np
import pymc as pm

# Three data points, two of them missing: no error
data = np.ma.masked_equal([1, -1, -1], -1)
with pm.Model():
    idx = pm.Categorical("hi_idx", p=[0.1, 0.2, 0.7], observed=data)
pm.draw(idx, 100).max()
# Out: 2.0

# Two data points, one of them missing: no error, but the largest draw drops to 1
data = np.ma.masked_equal([1, -1], -1)
with pm.Model():
    idx = pm.Categorical("hi_idx", p=[0.1, 0.2, 0.7], observed=data)
pm.draw(idx, 100).max()
# Out: 1.0

If the data are longer than the number of classes, the code will error out, as shown below:

Reproduceable code example:

import pymc as pm
import numpy as np

data = np.ma.masked_equal([1, 1, 0, 0, 2, -1, -1], -1)
with pm.Model():
    idx = pm.Categorical(f"hi_idx", p=[0.1, 0.2, 0.7], observed=data)
pm.draw(idx, 100).max()

Error message:


IndexError: index 5 is out of bounds for axis 0 with size 3
Apply node that caused the error: AdvancedSubtensor1(TensorConstant{[0.1 0.2 0.7]}, TensorConstant{[5 6]})
Toposort index: 1
Inputs types: [TensorType(float64, (3,)), TensorType(uint8, (2,))]
Inputs shapes: [(3,), (2,)]
Inputs strides: [(8,), (1,)]
Inputs values: [array([0.1, 0.2, 0.7]), array([5, 6], dtype=uint8)]
Outputs clients: [[categorical_rv{0, (1,), int64, True}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x12AD72820>), TensorConstant{(1,) of 2}, TensorConstant{4}, AdvancedSubtensor1.0)]]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3194, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3373, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/var/folders/7b/rzxy96cj0w751_6td3g2yss00000gn/T/ipykernel_96907/3148429013.py", line 3, in <module>
    idx = pm.Categorical(f"hi_idx", p=[0.1, 0.2, 0.7], observed=data)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pymc/distributions/distribution.py", line 457, in __new__
    return super().__new__(cls, name, *args, **kwargs)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pymc/distributions/distribution.py", line 310, in __new__
    rv_out = model.register_rv(
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pymc/model.py", line 1348, in register_rv
    rv_var = self.make_obs_var(rv_var, observed, dims, transform)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pymc/model.py", line 1425, in make_obs_var
    (missing_rv_var,) = local_subtensor_rv_lift.transform(fgraph, fgraph.outputs[0].owner)

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
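The Apply node in the error message indexes the three-element p vector with [5 6], which are exactly the positions of the masked entries in data. The following is a minimal NumPy sketch of that one step only. It is an analogy, not PyMC's or PyTensor's actual code path, and the variable names are assumptions made for illustration.

```python
import numpy as np

p = np.array([0.1, 0.2, 0.7])  # three class probabilities
data = np.ma.masked_equal([1, 1, 0, 0, 2, -1, -1], -1)

# Positions of the missing observations along the data axis.
missing_positions = np.flatnonzero(np.ma.getmaskarray(data))
print(missing_positions)

# Using data positions to index the class axis of p fails as soon as
# a position is not smaller than the number of classes.
try:
    p[missing_positions]
except IndexError as err:
    message = str(err)
    print(message)
```

Seen this way, the failure appears when an index meant for the data axis is applied to the class axis of p.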

PyMC version information:

PyMC version: 5.0.1 Pytensor version: 2.8.11

Context for the issue:

No response

bug

opened by jessegrabowski 1

Fix bug that does not correctly set the dtype of deterministic variab…
What is this PR about?

Closes #6424 by setting the dtype of the combined observed/interpolated deterministic generated by model.make_obs_var to be the same as that of the underlying RV being interpolated.

Checklist

[x] Explain important implementation details 👆

[ ] Make sure that the pre-commit linting/style checks pass.

[ ] Link relevant issues (preferably in nice commit messages)

[x] Are the changes covered by tests and docstrings?

[x] Fill out the short summary sections 👇

Major / Breaking Changes

None

New features

None

Bugfixes

Allow indexing by discrete RVs with missing data

Documentation

...

Maintenance

...

bug
opened by jessegrabowski 2

BUG: Datatype of Discrete RVs is changed to `float64` when `observed` data has missing values

Describe the issue:

Issue first reported here. When a categorical likelihood has missing values in the observed data vector, the resulting variable cannot be used as an index, because the combined missing+observed data vector created in model.make_obs_var does not inherit the dtype of the underlying RV.

This will cause unexpected behavior if the user wants to index with the variable elsewhere in the model.
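As a rough NumPy analogy of the problem (not the PyMC internals), the sketch below combines integer labels with float-typed stand-ins for the missing entries. The `imputed` values are made up for illustration. NumPy promotes the combined vector to float64, which can no longer be used as an index until it is cast back to the integer dtype.

```python
import numpy as np

something_to_index = np.random.default_rng(0).normal(size=(10, 3))

observed = np.array([1, 1, 0, 0, 2])  # integer class labels
imputed = np.array([2.0, 0.0])  # hypothetical float-typed values for the missing entries
combined = np.concatenate([observed, imputed])  # promoted to float64
print(combined.dtype)

# A float-typed vector is rejected as an index.
try:
    something_to_index[:, combined]
except IndexError as err:
    float_index_error = str(err)
    print(float_index_error)

# Casting back to the integer dtype of the labels makes indexing work again.
stuff = something_to_index[:, combined.astype(observed.dtype)]
print(stuff.shape)
```

The cast in the last step plays the role of making the combined vector inherit the dtype of the underlying discrete variable.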

Reproduceable code example:

import pymc as pm
import numpy as np
import pytensor.tensor as pt

data = np.ma.masked_equal([1, 1, 0, 0, 2, -1, -1], -1)
something_to_index = pt.as_tensor_variable(np.random.normal(size=(10, 3)))

with pm.Model():
    idx = pm.Categorical(f"idx", p=[0.1, 0.2, 0.7], observed=data)
    stuff = something_to_index[:, idx]

Error message:

<details>
Traceback (most recent call last):
  File "/Users/jessegrabowski/Documents/Python/pymc/test.py", line 10, in <module>
    stuff = something_to_index[:, idx]
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/var.py", line 551, in __getitem__
    return at.subtensor.advanced_subtensor(self, *args)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/graph/op.py", line 296, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/subtensor.py", line 2556, in make_node
    index = tuple(map(as_index_variable, index))
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/subtensor.py", line 2518, in as_index_variable
    raise TypeError("index must be integers or a boolean mask")
TypeError: index must be integers or a boolean mask
</details>

PyMC version information:

pymc: 0+untagged.9319.g78a3582.dirty pytensor: 2.8.11

Context for the issue:

No response

bug

opened by jessegrabowski 0

Releases(v5.0.1)

v5.0.1(Dec 21, 2022)
What's Changed

New Features 🎉

Implement logp derivation for division, subtraction and negation by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6371

Extend logprob inference to power transforms by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6400

Bugfixes 🐛

Update PyTensor dependency and fix bugs in inferred mixture logprob by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6397

Maintenance 🔧

Update Release Notes template by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6392

Remove global RandomStream by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6396

Fix error in docstring of Truncated by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6395

added postprocessing_chunks option to sample_blackjax_nuts and sample… by @wnorcbrown in https://github.com/pymc-devs/pymc/pull/6388

replaces numpy sqrt method with pytensor equivalent by @morganstrom in https://github.com/pymc-devs/pymc/pull/6405

New Contributors

@wnorcbrown made their first contribution in https://github.com/pymc-devs/pymc/pull/6388

Full Changelog: https://github.com/pymc-devs/pymc/compare/v5.0.0...v5.0.1
Source code(tar.gz)
Source code(zip)
v5.0.0(Dec 12, 2022)
What's Changed

In this major release we are switching our graph computation backend from Aesara to PyTensor, which is a fork of Aesara under PyMC governance. Read the full announcement here: PyMC is Forking Aesara to PyTensor.

The switch itself should be rather seamless and you can probably just update your imports:

import aesara.tensor as at  # old (pymc >=4,<5)
import pytensor.tensor as pt  # new (pymc >=5)

If you encounter problems updating please check the latest Discussions and don't hesitate to get in touch.

Major Changes 🛠

⚠ Switched the graph backend from Aesara to PyTensor

Merged AePPL into a new logprob submodule. Dispatch methods can be found in logprob.abstract

⚠ The log_likelihood, needed for arviz.compare is no longer computed by default. It can be added with idata = pm.compute_log_likelihood(idata) or using pm.sample(idata_kwargs=dict(log_likelihood=True)) by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6374

Changed Minibatch API by @ferrine in https://github.com/pymc-devs/pymc/pull/6304

Fix ordering transformation for batched dimensions, and deprecate in favor of univariate_ordered and multivariate_ordered by @TimOliverMaier in #6255 and @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6375

New Features & Bugfixes 🎉

Support logp derivation in DensityDist when random function returns a PyTensor variable by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6361

Added alternative parametrization for AsymmetricLaplace by @aloctavodia in https://github.com/pymc-devs/pymc/pull/6337

Docs & Maintenance 🔧

Bugfixes to increase robustness against unnamed dims by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6339

Updated GOVERNANCE.md by @canyon289 in https://github.com/pymc-devs/pymc/pull/6358

Fixed overriding user provided mp_ctx strings to pm.sample() on M1 MacOS by @digicosmos86 in https://github.com/pymc-devs/pymc/pull/6363

Simplify measurable transform rewrites by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6370

Fix measurable stack and join with interdependent inputs by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6342

Allow transforms to work with multiple-valued nodes by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6341

Fix transformed Scan values by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6343

Add issue templates by @ferrine in https://github.com/pymc-devs/pymc/pull/6327

Fail docs build on errors in core notebooks by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6324

Curated ecosystem references by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6383

Switched run_mypy.py from pass-listing to fail-listing by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6381

Running pydocstyle in pre-commit by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6382

Removed NoDistribution from docs by @stestoll in https://github.com/pymc-devs/pymc/pull/6316

Fix transforms example by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6333

New Contributors

@digicosmos86 made their first contribution in https://github.com/pymc-devs/pymc/pull/6363

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.4.0...v5.0.0
v4.4.0(Nov 19, 2022)
What's Changed

Major Changes 🛠

Removed support for selectively tracking variables via pm.sample(trace=[...]). by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6269

Do not rely on tag information for rv and logp conversions by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6281. This includes:

Deprecated accessing any of [value_variable|observations|transform|total_size] via var.tag in favor of model.rvs_to_[values|transforms|total_sizes]

Deprecated joint_logp in favor of model.logp

Deprecated aesaraf.rvs_to_value_vars in favor of model.replace_rvs_by_values

Using keyword seed for initial point no longer supported by @wd60622 in https://github.com/pymc-devs/pymc/pull/6291

Sampling of transformed variables from prior_predictive is no longer allowed by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6309

Require all step methods to return stats by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6313. This includes

Require all step methods to return stats from their step/astep method.

The BlockedStep.generates_stats attribute was removed.

New Features & Bugfixes 🎉

Fix shared variable latex by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6260

Fix bug when replacing random variables with nested value transforms by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6281

Do not infer graph_model node types based on variable Op class by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6259

Do not propagate dims to observed component of imputed variable by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6263

Fix Categorical and Multinomial bugs by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6265

Sample stats for blackjax nuts by @TimOliverMaier in https://github.com/pymc-devs/pymc/pull/6264

Add warning if observed in DensityDist is dict by @symeneses in https://github.com/pymc-devs/pymc/pull/6292

Fix versioneer config to match version tags by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6314

Docs & Maintenance 🔧

Split sampling.py into sampling.py and sampling_forward.py by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6257

Improve join_nonshared_inputs documentation by @wd60622 in https://github.com/pymc-devs/pymc/pull/6216

Move sampling code to a submodule by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6268

Fixed a typo in the overview notebook by @grtyvr in https://github.com/pymc-devs/pymc/pull/6274

Check that sampler vars correspond to value variables in the model by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6235

Updated pymc.DiscreteWeibull docstring by @hyosubkim in https://github.com/pymc-devs/pymc/pull/6283

Improve random seed processing for SMC sampling by @juanitorduz in https://github.com/pymc-devs/pymc/pull/6298

Remove wrong type-hints and stale docstrings from distributions by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6280

update theme by @OriolAbril in https://github.com/pymc-devs/pymc/pull/6296

Fix some typos and lints by @Armavica in https://github.com/pymc-devs/pymc/pull/6300

Update docstrings for sample_smc and smc.py by @rowangayleschaefer in https://github.com/pymc-devs/pymc/pull/6114

Fix Flaky Euler-Maruyama Tests by @wd60622 in https://github.com/pymc-devs/pymc/pull/6287

New Contributors

@grtyvr made their first contribution in https://github.com/pymc-devs/pymc/pull/6274

@hyosubkim made their first contribution in https://github.com/pymc-devs/pymc/pull/6283

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.3.0...v4.4.0
v4.3.0(Oct 31, 2022)
What's Changed

Major Changes 🛠

Remove samples and keep_size from sample_posterior_predictive by @pibieta in https://github.com/pymc-devs/pymc/pull/6029

Deprecate old or unused Model methods by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6237

Rename SMC files by @IMvision12 in https://github.com/pymc-devs/pymc/pull/6174

Require backends to record sample stats by @wd60622 in https://github.com/pymc-devs/pymc/pull/6205

Collect sampler warnings only through stats by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6192

New Features & Bugfixes 🎉

Refactor EulerMaruyama to work in v4 by @junpenglao in https://github.com/pymc-devs/pymc/pull/6227

Fix bug in get_vars_in_point_list when model does not have variables that exist in the trace by @lucianopaz in https://github.com/pymc-devs/pymc/pull/6203

Docs & Maintenance 🔧

Speed up posterior predictive sampling by @OriolAbril in https://github.com/pymc-devs/pymc/pull/6208

Add option to include transformed variables in InferenceData by @dfm in https://github.com/pymc-devs/pymc/pull/6232

Set start method to "fork" for MacOs ARM devices by @bchen93 in https://github.com/pymc-devs/pymc/pull/6218

Deprecate sample_posterior_predictive_w by @zaxtax in https://github.com/pymc-devs/pymc/pull/6254

Fix latex repr of symbolic distributions by @mattiadg in https://github.com/pymc-devs/pymc/pull/6231

Some doc fixes by @OriolAbril in https://github.com/pymc-devs/pymc/pull/6200

Modify logo_link to work with new sphinx schema by @hdnl in https://github.com/pymc-devs/pymc/pull/6209

Fix docstring of the ZeroInflatedPoisson distribution by @cscheffler in https://github.com/pymc-devs/pymc/pull/6213

Fix debug_print of wrong variable in notebook by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6225

Fix flaky TestMixture.test_component_choice_random by @bherwerth in https://github.com/pymc-devs/pymc/pull/6222

Seed flaky test TestSamplePPC.test_normal_scalar by @mattiadg in https://github.com/pymc-devs/pymc/pull/6220

Fix flaky TestTruncation.truncation_discrete_random by @mattiadg in https://github.com/pymc-devs/pymc/pull/6229

Seed pm.sample in BaseSampler(SeededTest) to make deriving test classes deterministic by @mattiadg in https://github.com/pymc-devs/pymc/pull/6251

New Contributors

@hdnl made their first contribution in https://github.com/pymc-devs/pymc/pull/6209

@mattiadg made their first contribution in https://github.com/pymc-devs/pymc/pull/6220

@bchen93 made their first contribution in https://github.com/pymc-devs/pymc/pull/6218

@IMvision12 made their first contribution in https://github.com/pymc-devs/pymc/pull/6174

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.2.2...v4.3.0
v4.2.2(Oct 10, 2022)
What's Changed

New Features & Bugfixes 🎉

Add ZeroSumNormal distribution by @AlexAndorra in https://github.com/pymc-devs/pymc/pull/6121

Refactor Multivariate RandomWalk distributions by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6131

Docs & Maintenance 🔧

Finish restructuring the tests to follow the structure of the code by @Armavica in https://github.com/pymc-devs/pymc/pull/6125

Small typo corrections in Markdown for overview notebook by @willettk in https://github.com/pymc-devs/pymc/pull/6183

Run mypy outside of pre-commit in its own job by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6186

Update docstrings of set_data and Data by @bwengals in https://github.com/pymc-devs/pymc/pull/6087

Remove unused trace features by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6188

New Contributors

@willettk made their first contribution in https://github.com/pymc-devs/pymc/pull/6183

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.2.1...v4.2.2
v4.2.1(Sep 30, 2022)
What's Changed

New Features & Bugfixes 🎉

Check shared variable values to determine volatility in posterior predictive sampling by @lucianopaz in https://github.com/pymc-devs/pymc/pull/6147

Log name of variables that are sampled in predictive sampling functions by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6142

Fix DiscreteUniformRV dropping degenerate dimension by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6151

Fix shape bug when creating a truncated normal via Truncated by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6165

Docs & Maintenance 🔧

Repair the plot of Interpolated and add an example for Deterministic by @Armavica in https://github.com/pymc-devs/pymc/pull/6126

Add constant_fold helper by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6160

Use sigma instead of noise in GP functions 6094 by @wd60622 in https://github.com/pymc-devs/pymc/pull/6145

Replace multinomial sampling with systematic sampling in sample_smc by @aloctavodia in https://github.com/pymc-devs/pymc/pull/6162

Assume default_output is the only measurable output in SymbolicRandomVariables by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6161

New Contributors

@wd60622 made their first contribution in https://github.com/pymc-devs/pymc/pull/6145

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.2.0...v4.2.1
v4.2.0(Sep 19, 2022)
What's Changed

Major Changes 🛠

Allow broadcasting via observed and dims by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6063

Remove support for specifying "dims on the fly" from the shapes of variables by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6112

Automatic versioning with versioneer by @cfonnesbeck in https://github.com/pymc-devs/pymc/pull/6078

New Features & Bugfixes 🎉

Implement Truncated distributions by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6113

Port GARCH11 to v4 by @junpenglao in https://github.com/pymc-devs/pymc/pull/6119

Implement Symbolic RVs and enable nested distribution factories (such as Mixtures of Mixtures) by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6072

Allow for batched alpha in StickBreakingWeights by @purna135 in https://github.com/pymc-devs/pymc/pull/6042

Remove NoDistribution and enable .dist API for Simulator and DensityDist by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6110

Add start_sigma to ADVI 2 by @markusschmaus in https://github.com/pymc-devs/pymc/pull/6132

Create .gitpod.yml by @ferrine in https://github.com/pymc-devs/pymc/pull/6070 and https://github.com/pymc-devs/pymc/pull/6109

Docs & Maintenance 🔧

Make rvs_to_values work with non-RandomVariables by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6101

Fix bug in Marginalapprox by @bwengals in https://github.com/pymc-devs/pymc/pull/6076

Fix bug in which TruncatedNormal returns -inf for all values if any value is out of bounds by @adrn in https://github.com/pymc-devs/pymc/pull/6128

Rename cov_func/cov to scale_func/scale for TP/MvStudentT by @fonnesbeck in https://github.com/pymc-devs/pymc/pull/6068

Ignore SpecifyShape when converting to JAX by @martiningram in https://github.com/pymc-devs/pymc/pull/6062

Remove reshape_t by @tjburch in https://github.com/pymc-devs/pymc/pull/6118

Fix Model docstring by @alekracicot in https://github.com/pymc-devs/pymc/pull/6048

Update opvi docs by @ferrine in https://github.com/pymc-devs/pymc/pull/6093

Fix formatting in documentation of AR distribution parameters by @daniel-saunders-phil in https://github.com/pymc-devs/pymc/pull/6080

Fix incorrect formula in NormalMixture docstring by @MatthewQuenneville in https://github.com/pymc-devs/pymc/pull/6073

Fix last remaining PyMC3 occurrences & broken link by @Armavica in https://github.com/pymc-devs/pymc/pull/6133

Update GOVERNANCE.md for PyMCon_2022 planning repo by @canyon289 in https://github.com/pymc-devs/pymc/pull/6088

Add new core contributors by @OriolAbril in https://github.com/pymc-devs/pymc/pull/6117

Pin pydata-sphinx-theme by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/6120

Mirror codebase structure in tests by @Armavica in https://github.com/pymc-devs/pymc/pull/6084

Clean up some warnings from the test suite by @Armavica in https://github.com/pymc-devs/pymc/pull/6067 and https://github.com/pymc-devs/pymc/pull/6074

Restructure the test suite to follow the code by @Armavica in https://github.com/pymc-devs/pymc/pull/6111

New Contributors

@alekracicot made their first contribution in https://github.com/pymc-devs/pymc/pull/6048

@MatthewQuenneville made their first contribution in https://github.com/pymc-devs/pymc/pull/6073

@tjburch made their first contribution in https://github.com/pymc-devs/pymc/pull/6118

@markusschmaus made their first contribution in https://github.com/pymc-devs/pymc/pull/6096

@cfonnesbeck made their first contribution in https://github.com/pymc-devs/pymc/pull/6078

@adrn made their first contribution in https://github.com/pymc-devs/pymc/pull/6128

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.7...v4.2.0
v4.1.7(Aug 26, 2022)
What's Changed

Docs & Maintenance 🔧

Remove note that probs are automatically rescaled by @Armavica in https://github.com/pymc-devs/pymc/pull/6066

update the default value of jitter to JITTER_DEFAULT by @danhphan in https://github.com/pymc-devs/pymc/pull/6055

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.6...v4.1.7
v4.1.6(Aug 25, 2022)
What's Changed

Docs & Maintenance 🔧

adding markdown cell for Watermark by @reshamas in https://github.com/pymc-devs/pymc/pull/6051

DOC Adding "Git Bash command" to install virtual environment by @vitaliset in https://github.com/pymc-devs/pymc/pull/6056

Fix JAX sampling funcs overwriting existing var's dims and coords by @jhrcook in https://github.com/pymc-devs/pymc/pull/6041

Remove unused IS_FLOAT32 and IS_WINDOWS from test_ode by @maresb in https://github.com/pymc-devs/pymc/pull/6057

Add missing file test_printing.py to github runner by @Armavica in https://github.com/pymc-devs/pymc/pull/6058

Convert pip-installed dev dependencies to Conda by @maresb in https://github.com/pymc-devs/pymc/pull/6060

Upgrade to aesara=2.8.2 and aeppl=0.0.35 by @Armavica in https://github.com/pymc-devs/pymc/pull/6059

New Contributors

@Armavica made their first contribution in https://github.com/pymc-devs/pymc/pull/6058

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.5...v4.1.6
v4.1.5(Aug 17, 2022)
What's Changed

New Features & Bugfixes 🎉

Constrain priors with symmetric mass distribution by @lucianopaz in https://github.com/pymc-devs/pymc/pull/5981

Fix AttributeError in HMC bad initial energy warning by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6037

Docs & Maintenance 🔧

Fix problems with specifying target_accept and nuts kwargs by @mschmidt87 in https://github.com/pymc-devs/pymc/pull/6018

Typehints and updated docstring for Blackjax NUTS sampling function by @jhrcook in https://github.com/pymc-devs/pymc/pull/6022

Revert numpy warnings workaround by @maresb in https://github.com/pymc-devs/pymc/pull/6025

Revert "Proposal: Readd 3.7" by @twiecki in https://github.com/pymc-devs/pymc/pull/6014

fixed some docstring spacing around colons by @daniel-saunders-phil in https://github.com/pymc-devs/pymc/pull/6027

issue6004 fixed example in docstring for set_data by @rowangayleschaefer in https://github.com/pymc-devs/pymc/pull/6028

Updating docstrings of distributions by @vitaliset in https://github.com/pymc-devs/pymc/pull/5998

Pass user-provided NUTS kwargs to Numpyro by @jhrcook in https://github.com/pymc-devs/pymc/pull/6021

⬆️ UPGRADE: Autoupdate pre-commit config by @twiecki in https://github.com/pymc-devs/pymc/pull/6008

[DOCS] Fix aesara core notebook dprint error by @juanitorduz in https://github.com/pymc-devs/pymc/pull/6030

Removed assert_negative_support deprecated function call #5997 by @dihanster in https://github.com/pymc-devs/pymc/pull/6034

Update aeppl dependency to 0.0.34 by @cluhmann in https://github.com/pymc-devs/pymc/pull/6049

Updated pymc.simulator docstring (typos, defaults, type description) by @daniel-saunders-phil in https://github.com/pymc-devs/pymc/pull/6035

Added networkx export functionality by @jonititan in https://github.com/pymc-devs/pymc/pull/6046

New Contributors

@mschmidt87 made their first contribution in https://github.com/pymc-devs/pymc/pull/6018

@daniel-saunders-phil made their first contribution in https://github.com/pymc-devs/pymc/pull/6027

@rowangayleschaefer made their first contribution in https://github.com/pymc-devs/pymc/pull/6028

@dihanster made their first contribution in https://github.com/pymc-devs/pymc/pull/6034

@jonititan made their first contribution in https://github.com/pymc-devs/pymc/pull/6046

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.4...v4.1.5
v4.1.4(Jul 26, 2022)
What's Changed

Docs & Maintenance 🔧

Updated docstrings of some distribution classes inside multivariate.py by @pibieta in https://github.com/pymc-devs/pymc/pull/5982

Fix error when passing coords and dims in sampling_jax by @bherwerth in https://github.com/pymc-devs/pymc/pull/5983

⬆️ UPGRADE: Autoupdate pre-commit config by @twiecki in https://github.com/pymc-devs/pymc/pull/5984

Fix docker image build by @symeneses in https://github.com/pymc-devs/pymc/pull/5977

docs: Fix a few typos by @timgates42 in https://github.com/pymc-devs/pymc/pull/5988

contributing, jupyter style; author section more explicit by @reshamas in https://github.com/pymc-devs/pymc/pull/6000

Move MLDA to pymc-experimental by @michaelosthege in https://github.com/pymc-devs/pymc/pull/6007

Bump aesara to 2.7.8. by @twiecki in https://github.com/pymc-devs/pymc/pull/5995

Proposal: Readd 3.7 by @canyon289 in https://github.com/pymc-devs/pymc/pull/6010

Fix pm.Interpolated moment by @larryshamalama in https://github.com/pymc-devs/pymc/pull/5986

Bump aesara to 2.7.9 and aeppl to 0.0.33 by @twiecki in https://github.com/pymc-devs/pymc/pull/6012

Create arrow to observation nodes subject to arbitrary dtype casting in pm.model_to_graphviz by @larryshamalama in https://github.com/pymc-devs/pymc/pull/6011

New Contributors

@pibieta made their first contribution in https://github.com/pymc-devs/pymc/pull/5982

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.3...v4.1.4
v4.1.3(Jul 15, 2022)
What's Changed

Docs & Maintenance 🔧

update docstrings in BetaBinomial class by @saurbhc in https://github.com/pymc-devs/pymc/pull/5960

Deprecate assert_negative_support by @vitaliset in https://github.com/pymc-devs/pymc/pull/5963

Updated docstrings to inform users that ODE solution may be slow. by @dmburt in https://github.com/pymc-devs/pymc/pull/5965

Add docker-image workflow by @symeneses in https://github.com/pymc-devs/pymc/pull/5966

⬆️ UPGRADE: Autoupdate pre-commit config by @twiecki in https://github.com/pymc-devs/pymc/pull/5967

Provide a fix for sample_blackjax_nuts failing with chains=1 with prior parameters of different shapes by @bherwerth in https://github.com/pymc-devs/pymc/pull/5969

correct docstring in BetaBinomial Class by @SangamSwadiK in https://github.com/pymc-devs/pymc/pull/5957

Correct docs for Bernoulli, Poisson, Negative Binomial, Geometric and HyperGeometric by @SangamSwadiK in https://github.com/pymc-devs/pymc/pull/5958

update docstrings in ZeroInflatedPoisson, DiracDelta and OrderedLogistic classes by @saurbhc in https://github.com/pymc-devs/pymc/pull/5962

Bernoulli, OrderedProbit, ZeroInflatedBinomial, ZeroInflatedNegativeBinomial docstring update by @mariyayb in https://github.com/pymc-devs/pymc/pull/5961

Updated docstring for find_constrained_prior by @jlindbloom in https://github.com/pymc-devs/pymc/pull/5964

Point installation links to new installation guide in docs by @fonnesbeck in https://github.com/pymc-devs/pymc/pull/5873

Bump aesara dependency by @keesterbrugge in https://github.com/pymc-devs/pymc/pull/5970

New Contributors

@saurbhc made their first contribution in https://github.com/pymc-devs/pymc/pull/5960

@vitaliset made their first contribution in https://github.com/pymc-devs/pymc/pull/5963

@dmburt made their first contribution in https://github.com/pymc-devs/pymc/pull/5965

@bherwerth made their first contribution in https://github.com/pymc-devs/pymc/pull/5969

@mariyayb made their first contribution in https://github.com/pymc-devs/pymc/pull/5961

@jlindbloom made their first contribution in https://github.com/pymc-devs/pymc/pull/5964

@keesterbrugge made their first contribution in https://github.com/pymc-devs/pymc/pull/5970

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.2...v4.1.3
v4.1.2(Jul 8, 2022)
What's Changed

New Features & Bugfixes 🎉

Fix model graph node name to remove RV from end only and not the start by @cscheffler in https://github.com/pymc-devs/pymc/pull/5953

Workaround to suppress (some) import warnings from NumPy by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5956

Docs & Maintenance 🔧

include :: in name prefix check by @moshelooks in https://github.com/pymc-devs/pymc/pull/5951

correct docstrings in Binomial Class by @SangamSwadiK in https://github.com/pymc-devs/pymc/pull/5952

Bump Aesara to 2.7.5, aeppl to 0.0.32, update tests for aeppl by @maresb in https://github.com/pymc-devs/pymc/pull/5955

New Contributors

@moshelooks made their first contribution in https://github.com/pymc-devs/pymc/pull/5951

@cscheffler made their first contribution in https://github.com/pymc-devs/pymc/pull/5953

@SangamSwadiK made their first contribution in https://github.com/pymc-devs/pymc/pull/5952

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.1...v4.1.2
v4.1.1(Jul 4, 2022)
What's Changed

Docs & Maintenance 🔧

Bump aesara to 2.7.4. by @twiecki in https://github.com/pymc-devs/pymc/pull/5947

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.1.0...v4.1.1
v4.1.0(Jul 3, 2022)
What's Changed

Major Changes 🛠

Dropped support for Python 3.7 and added support for Python 3.10 by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5917

Default to pm.Data(mutable=False) by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5944

Deprecating MLDA in anticipation of migrating it to pymc-experimental by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5944

New Features & Bugfixes 🎉

Small improvements to early NUTS behaviour by @aseyboldt in https://github.com/pymc-devs/pymc/pull/5824

Correct the order of rvs sent to compile_dlogp in find_MAP by @quantheory in https://github.com/pymc-devs/pymc/pull/5928

Remove nan_is_num and nan_is_high limiters from find_MAP. by @quantheory in https://github.com/pymc-devs/pymc/pull/5929

Registering _as_tensor_variable converter for pandas objects by @juanitorduz in https://github.com/pymc-devs/pymc/pull/5920

Fix model and aesara_config kwargs for pm.Model by @ferrine in https://github.com/pymc-devs/pymc/pull/5915

Docs & Maintenance 🔧

Remove reference to old parameters in SMC docstring by @aloctavodia in https://github.com/pymc-devs/pymc/pull/5914

Get rid of python-version specific conda environments by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5911

Further fixes to VI docs by @ferrine in https://github.com/pymc-devs/pymc/pull/5916

Expand dimensionality notebook by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5746

Review docstrings checkmarked as best practice by @OriolAbril in https://github.com/pymc-devs/pymc/pull/5919

Update conda environment name when running docker with jupyter notebook by @danhphan in https://github.com/pymc-devs/pymc/pull/5933

Update docs build and contributing instructions by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5938

Add numpyro install to building docs instructions by @isms in https://github.com/pymc-devs/pymc/pull/5936

Add version string to conda install command. by @twiecki in https://github.com/pymc-devs/pymc/pull/5946

New Contributors

@quantheory made their first contribution in https://github.com/pymc-devs/pymc/pull/5928

@isms made their first contribution in https://github.com/pymc-devs/pymc/pull/5936

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.0.1...v4.1.0
v4.0.1(Jun 20, 2022)
What's Changed

Docs

PyMC, Aesara and Aeppl intro notebook by @juanitorduz in https://github.com/pymc-devs/pymc/pull/5721

Moved wiki install guides to the docs by @fonnesbeck in https://github.com/pymc-devs/pymc/pull/5869

Fix Examples link in README by @ryanrussell in https://github.com/pymc-devs/pymc/pull/5860

Update dev guide by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5810

Run black on core notebooks by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5901

Convert rng_seeder to random_seed in 'Prior and Posterior Predictive Checks' notebook by @hectormz in https://github.com/pymc-devs/pymc/pull/5896

Disable dark mode in docs by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5904

Fixed Student-t process docstring by @kunalghosh in https://github.com/pymc-devs/pymc/pull/5853

Bugfixes & Maintenance

Align advertised Metropolis.stats_dtypes with changes from 1e7d91f by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5882

Added a check in Empirical approximation which does not yet support InferenceData inputs (see #5884) by @ferrine in https://github.com/pymc-devs/pymc/pull/5874

Compute some basic Slice sample stats by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5889

Fixed bug when sampling discrete variables with SMC by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5887

Removed t suffix from functions, Model methods and properties by @cuchoi in https://github.com/pymc-devs/pymc/pull/5863

Model.logpt → Model.logp

Model.dlogpt → Model.dlogp

Model.d2logpt → Model.d2logp

Model.datalogpt → Model.datalogp

Model.varlogpt → Model.varlogp

Model.observedlogpt → Model.observedlogp

Model.potentiallogpt → Model.potentiallogp

Model.varlogp_nojact → Model.varlogp_nojac

logprob.joint_logpt → logprob.joint_logp
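The renames above are mechanical, so existing model code can be updated with a small search-and-replace pass. The following is a minimal sketch only: `RENAMES` and `rename_logp_attributes` are hypothetical names introduced here for illustration and are not part of PyMC. It assumes the old names appear as attribute accesses (after a dot), so a bare `from ... import joint_logpt` line would still need a manual edit.

```python
import re

# Old -> new attribute names, copied from the rename list above.
RENAMES = {
    "logpt": "logp",
    "dlogpt": "dlogp",
    "d2logpt": "d2logp",
    "datalogpt": "datalogp",
    "varlogpt": "varlogp",
    "observedlogpt": "observedlogp",
    "potentiallogpt": "potentiallogp",
    "varlogp_nojact": "varlogp_nojac",
    "joint_logpt": "joint_logp",
}

# Only match a full attribute name that directly follows a dot.
# The trailing \b keeps unrelated names such as "logpt_sum" untouched.
_PATTERN = re.compile(
    r"(?<=\.)(" + "|".join(sorted(RENAMES, key=len, reverse=True)) + r")\b"
)


def rename_logp_attributes(source: str) -> str:
    """Return `source` with the old *t-suffixed attribute names replaced."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


print(rename_logp_attributes("total = model.logpt() + model.varlogp_nojact"))
```

Reviewing the resulting diff by hand is still advisable, since a regular expression cannot tell a PyMC model attribute from an unrelated object that happens to use the same name.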

Remove self-directing arrow in observed nodes by @larryshamalama in https://github.com/pymc-devs/pymc/pull/5893

Update clone_replace strict keyword name by @brandonwillard in https://github.com/pymc-devs/pymc/pull/5849

Renamed pm.Constant to pm.DiracDelta by @cluhmann in https://github.com/pymc-devs/pymc/pull/5903

Update Dockerfile to PyMC v4 by @danhphan in https://github.com/pymc-devs/pymc/pull/5881

Refactor sampling_jax postprocessing to avoid jit by @ferrine in https://github.com/pymc-devs/pymc/pull/5908

Fix compile_fn bug and reduce return type confusion by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5909

Align conda envs and add Windows 3.9 env by @hectormz in https://github.com/pymc-devs/pymc/pull/5895

Include ConstantData in InferenceData returned by JAX samplers by @danhphan in https://github.com/pymc-devs/pymc/pull/5807

Updated Aesara dependency to 2.7.3 by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5910

New Contributors

@kunalghosh made their first contribution in https://github.com/pymc-devs/pymc/pull/5853

@ryanrussell made their first contribution in https://github.com/pymc-devs/pymc/pull/5860

@hectormz made their first contribution in https://github.com/pymc-devs/pymc/pull/5896

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.0.0...v4.0.1
v4.0.0(Jun 3, 2022)
If you want a description of the highlights of this release, check out the release announcement on our new website. Feel free to read it, print it out, and give it to people on the street -- because everybody has to know PyMC 4.0 is officially out 🍾

Do not miss 🚨

⚠️ The project was renamed to "PyMC". The library is now installed with "pip install pymc" and imported with import pymc as pm. See this migration guide for more details.

⚠️ Theano-PyMC has been replaced with Aesara, so all external references to theano and tt need to be replaced with aesara and at, respectively (see 4471).

⚠️ Support for JAX and JAX samplers was added, which also allows sampling on GPUs. This benchmark shows speed-ups of up to 11x.

⚠️ The GLM submodule was removed, please use Bambi instead.

⚠️ PyMC now requires SciPy version >= 1.4.1 (see #4857).

v3 features not yet working in v4 ⏳

⚠️ We plan to get these working again, but at this point their inner workings have not been refactored.

MvNormalRandomWalk, MvStudentTRandomWalk, GARCH11 and EulerMaruyama distributions (see #4642)

Nested Mixture distributions (see #5533)

pm.sample_posterior_predictive_w (see #4807)

Partially observed Multivariate distributions (see #5260)

New features 🥳

Distributions:

Univariate censored distributions are now available via pm.Censored. #5169

The CAR distribution has been added to allow for use of conditional autoregressions, which are often used in spatial and network models.

Added a logcdf implementation for the Kumaraswamy distribution (see #4706).

The OrderedMultinomial distribution has been added for use on ordinal data which are aggregated by trial, like multinomial observations, whereas OrderedLogistic only accepts ordinal data in a disaggregated format, like categorical observations (see #4773).

The Polya-Gamma distribution has been added (see #4531). To make use of this distribution, the polyagamma>=1.3.1 library must be installed and available in the user's environment.

pm.DensityDist can now accept an optional logcdf keyword argument to pass in a function to compute the log of the cumulative distribution function of the distribution (see 5026).

pm.DensityDist can now accept an optional moment keyword argument to pass in a function to compute the moment of the distribution (see 5026).

Added an alternative parametrization, logit_p to pm.Binomial and pm.Categorical distributions (see 5637).

Model dimensions:

The dimensionality of model variables can now be parametrized through either shape or dims (see #4696):

With shape the length of dimensions must be given numerically or as scalar Aesara Variables. Numeric entries in shape restrict the model variable to the exact length and re-sizing is no longer possible.

dims keeps model variables re-sizeable (for example through pm.Data) and leads to well defined coordinates in InferenceData objects.

An Ellipsis (...) in the last position of shape or dims can be used as short-hand notation for implied dimensions.

New features for pm.Data containers:

With pm.Data(..., mutable=False), or by using pm.ConstantData() one can now create TensorConstant data variables. These can be more performant and compatible in situations where a variable doesn't need to be changed via pm.set_data(). See #5295. If you do need to change the variable, use pm.Data(..., mutable=True), or pm.MutableData().

New named dimensions can be introduced to the model via pm.Data(..., dims=...). For mutable data variables (see above) the lengths of these dimensions are symbolic, so they can be re-sized via pm.set_data().

pm.Data now passes additional kwargs to aesara.shared/at.as_tensor. #5098.

The length of dims in the model is now tracked symbolically through Model.dim_lengths (see #4625).

Sampling:

⚠️ Random seeding behavior changed (see #5787)!

Sampling results will differ from those of v3 when passing the same random_seed as before. They will be consistent across subsequent v4 releases unless mentioned otherwise.

Sampling functions no longer respect user-specified global seeding! Always pass random_seed to ensure reproducible behavior.

random_seed now accepts RandomState and Generators besides integers.

A small change to the mass matrix tuning methods jitter+adapt_diag (the default) and adapt_diag improves performance early on during tuning for some models. #5004

New experimental mass matrix tuning method jitter+adapt_diag_grad. #5004

Support for samplers written in JAX:

Adding support for numpyro's NUTS sampler via pymc.sampling_jax.sample_numpyro_nuts()

Adding support for blackjax's NUTS sampler via pymc.sampling_jax.sample_blackjax_nuts() (see #5477)

pymc.sampling_jax samplers support log_likelihood, observed_data, and sample_stats in returned InferenceData object (see #5189)

Adding support for pm.Deterministic in pymc.sampling_jax (see #5182)

Miscellaneous:

The new pm.find_constrained_prior function can be used to find optimized prior parameters of a distribution under some constraints (e.g. lower and upper bounds). See #5231.

Nested models now inherit the parent model's coordinates. #5344

softmax and log_softmax functions added to math module (see #5279).

Added the low level compile_forward_sampling_function method to compile the aesara function responsible for generating forward samples (see #5759).

Expected breaking changes 💔

pm.sample(return_inferencedata=True) is now the default (see #4744).

ArviZ plots and stats wrappers were removed. The functions are now just available by their original names (see #4549 and 3.11.2 release notes).

pm.sample_posterior_predictive(vars=...) kwarg was removed in favor of var_names (see #4343).

ElemwiseCategorical step method was removed (see #4701)

LKJCholeskyCov's compute_corr keyword argument is now set to True by default (see #5382).

Alternative sd keyword argument has been removed from all distributions. sigma should be used instead (see #5583).

Read on if you're a developer. Or curious. Or both.

Unexpected breaking changes (action needed) 😲

Very important ⚠️

pm.Bound interface no longer accepts a callable class as argument, instead it requires an instantiated distribution (created via the .dist() API) to be passed as an argument. In addition, Bound no longer returns a class instance but works as a normal PyMC distribution. Finally, it is no longer possible to do predictive random sampling from Bounded variables. Please, consult the new documentation for details on how to use Bounded variables (see 4815).

BART has received various updates (5091, 5177, 5229, 4914) but was removed from the main package in #5566. It is now available from pymc-experimental.

Removed AR1. AR of order 1 should be used instead (see 5734).

The pm.EllipticalSlice sampler was removed (see #5756).

BaseStochasticGradient was removed (see #5630)

pm.Distribution(...).logp(x) is now pm.logp(pm.Distribution(...), x).

pm.Distribution(...).logcdf(x) is now pm.logcdf(pm.Distribution(...), x).

pm.Distribution(...).random(size=x) is now pm.draw(pm.Distribution(...), draws=x).

pm.draw_values(...) and pm.generate_samples(...) were removed.

pm.fast_sample_posterior_predictive was removed.

pm.sample_prior_predictive, pm.sample_posterior_predictive and pm.sample_posterior_predictive_w now return an InferenceData object by default, instead of a dictionary (see #5073).

pm.sample_prior_predictive no longer returns transformed variable values by default. Pass them by name in var_names if you want to obtain these draws (see 4769).

pm.sample(trace=...) no longer accepts MultiTrace or len(.) > 0 traces (see #5019).

Setting of initial values:

Setting initial values through pm.Distribution(testval=...) is now pm.Distribution(initval=...).

Model.update_start_values(...) was removed. Initial values can be set in the Model.initial_values dictionary directly.

Test values can no longer be set through pm.Distribution(testval=...) and must be assigned manually.

transforms module is no longer accessible at the root level. It is accessible at pymc.distributions.transforms (see #5347).

logp, dlogp, d2logp and their nojac variations were removed. Use Model.compile_logp, compile_dlogp and compile_d2logp with the jacobian keyword instead.

pm.DensityDist no longer accepts the logp as its first positional argument. It is now an optional keyword argument. If you pass a callable as the first positional argument, a TypeError will be raised (see 5026).

pm.DensityDist now accepts distribution parameters as positional arguments. Passing them as a dictionary in the observed keyword argument is no longer supported and will raise an error (see 5026).

The signature of the logp and random functions that can be passed into a pm.DensityDist has been changed (see 5026).

Important:

Signature and default parameters changed for several distributions:

pm.StudentT now requires either sigma or lam as kwarg (see #5628)

pm.StudentT now requires nu to be specified (no longer defaults to 1) (see #5628)

pm.AsymmetricLaplace positional arguments re-ordered (see #5628)

pm.AsymmetricLaplace now requires mu to be specified (no longer defaults to 0) (see #5628)

ZeroInflatedPoisson theta parameter was renamed to mu (see #5584).

pm.GaussianRandomWalk initial distribution defaults to zero-centered normal with sigma=100 instead of flat (see #5779)

pm.AR initial distribution defaults to unit normal instead of flat (see #5779)

logpt, logpt_sum, logp_elemwiset and nojac variations were removed. Use Model.logpt(jacobian=True/False, sum=True/False) instead.

dlogp_nojact and d2logp_nojact were removed. Use Model.dlogpt and d2logpt with jacobian=False instead.

model.makefn is now called Model.compile_fn, and model.fn was removed.

Methods starting with fast_*, such as Model.fast_logp, were removed. The same applies to PointFunc classes.

Model(model=...) kwarg was removed

Model(theano_config=...) kwarg was removed

Model.size property was removed (use Model.ndim instead).

dims and coords handling:

Model.RV_dims and Model.coords are now read-only properties. To modify the coords dictionary use Model.add_coord.

dims or coordinate values that are None will be auto-completed (see #4625).

Coordinate values passed to Model.add_coord are always converted to tuples (see #5061).

Transform.forward and Transform.backward signatures changed.

Changes to the Gaussian Process (GP) submodule (see 5055):

The gp.prior(..., shape=...) kwarg was renamed to size.

Multiple methods including gp.prior now require explicit kwargs.

For all implementations, gp.Latent, gp.Marginal etc., cov_func and mean_func are required kwargs.

In the Windows test conda environment the mkl version is fixed to version 2020.4, and mkl-service is fixed to 2.3.0. This was required for gp.MarginalKron to function properly.

gp.MvStudentT uses rotated samples from StudentT directly now, instead of sampling from pm.Chi2 and then from pm.Normal.

The "jitter" parameter, or the diagonal noise term added to Gram matrices such that the Cholesky is numerically stable, is now exposed to the user instead of hard-coded. See the function gp.util.stabilize.

The is_observed argument for gp.Marginal* implementations has been deprecated.

In the gp.utils file, the kmeans_inducing_points function now passes through kmeans_kwargs to scipy's k-means function.

The replace_with_values function has been added to gp.utils.

MarginalSparse has been renamed MarginalApprox.

Removed MixtureSameFamily. Mixture is now capable of handling batched multivariate components (see #5438).

Documentation

Switched to the pydata-sphinx-theme

Updated our documentation tooling to use MyST, MyST-NB, sphinx-design, notfound.extension, sphinx-copybutton and sphinx-remove-toctrees.

Separated the builds of the example notebooks and of the versioned docs.

Restructured the documentation to facilitate learning paths

Updated API docs to document objects at the path users should use to import them

Maintenance

⚠️ Fixed old-time bug in Slice sampler that resulted in biased samples (see #5816).

Removed float128 dtype support (see #4514).

Logp method of Uniform and DiscreteUniform no longer depends on pymc.distributions.dist_math.bound for proper evaluation (see #4541).

We now include cloudpickle as a required dependency, and no longer depend on dill (see #4858).

The incomplete_beta function in pymc.distributions.dist_math was replaced by aesara.tensor.betainc (see 4857).

math.log1mexp and math.log1mexp_numpy will expect negative inputs in the future. A FutureWarning is now raised unless negative_input=True is set (see #4860).

Changed name of Lognormal distribution to LogNormal to harmonize CamelCase usage for distribution names.

Attempt to iterate over MultiTrace will raise NotImplementedError.

Removed silent normalisation of p parameters in Categorical and Multinomial distributions (see #5370).

v4.0.0b6(Mar 30, 2022)
What's Changed

Implemented default transform for Mixtures by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5636

Scope separator for netcdf by @ferrine in https://github.com/pymc-devs/pymc/pull/5663

Fix default update bug by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5667

Pandas dependency was removed by @thomasjpfan in https://github.com/pymc-devs/pymc/pull/5633

Recognize cast data in InferenceData by @zaxtax in https://github.com/pymc-devs/pymc/pull/5646

Updated docstrings of multiple distributions by @purna135 in https://github.com/pymc-devs/pymc/pull/5595, https://github.com/pymc-devs/pymc/pull/5596 and https://github.com/pymc-devs/pymc/pull/5600

Refine Interval docstrings and fix typo by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5640

Add test for interactions between missing, default and explicit updates in compile_pymc by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5645

Test reshape from observed by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5670

Upgraded all CI cache actions to v3 by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5647

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.0.0b5...v4.0.0b6
v4.0.0b5(Mar 22, 2022)
What's Changed

Generalize multinomial moment to arbitrary dimensions by @markvrma in https://github.com/pymc-devs/pymc/pull/5476

Remove sd optional kwarg from distributions by @purna135 in https://github.com/pymc-devs/pymc/pull/5583

Improve scoped models by @ferrine in https://github.com/pymc-devs/pymc/pull/5607

Add helper wrapper around Interval transform by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5347

Rename logp_transform to _get_default_transform by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5612

Do not set RNG updates inplace in compile_pymc by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5615

Refine trigger filter for both PRs and pushes by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5619

Update contributing guide with etiquette section by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5611

Combine test workflows into one by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5623

Raise ValueError if random variables are present in the logp graph by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5614

Run float32 jobs separately by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5630

Bring back sampler argument target_accept by @aloctavodia in https://github.com/pymc-devs/pymc/pull/5622

Parametrize Binomial and Categorical distributions via logit_p by @purna135 in https://github.com/pymc-devs/pymc/pull/5637

Remove SGMCMC and fix flaky mypy results by @michaelosthege in https://github.com/pymc-devs/pymc/pull/5631

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.0.0b4...v4.0.0b5
v4.0.0b4(Mar 17, 2022)
This release adds the following major improvements:

Refactor Mixture distribution for V4 by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5438

Adding NUTS sampler from blackjax to sampling_jax by @zaxtax in https://github.com/pymc-devs/pymc/pull/5477

Update aesara and aeppl dependencies to fix a memory leak in pymc models by @ricardoV94 in https://github.com/pymc-devs/pymc/pull/5582

New Contributors

@mirko-m made their first contribution in https://github.com/pymc-devs/pymc/pull/5414

@chritter made their first contribution in https://github.com/pymc-devs/pymc/pull/5491

@5hv5hvnk made their first contribution in https://github.com/pymc-devs/pymc/pull/5601

Full Changelog: https://github.com/pymc-devs/pymc/compare/v4.0.0b3...v4.0.0b4
v3.11.5(Mar 15, 2022)
PyMC 3.11.5 (15 March 2022)

This is a backport & bugfix release that eases the transition to pymc >=4.0.0.

Backports

The pm.logp(rv, x) syntax is now available and recommended to make your model code v4-ready. Note that this backport is just an alias and much less capable than what's available with pymc >=4 (see #5083).

The pm.Distribution(testval=...) kwarg was deprecated and will be replaced by pm.Distribution(initval=...) in pymc >=4 (see #5226).

The pm.sample(start=...) kwarg was deprecated and will be replaced by pm.sample(initvals=...) in pymc >=4 (see #5226).

pm.LogNormal is now available as an alias for pm.Lognormal (see #5389).

Bugfixes

The upper limit for the SciPy version is <1.8.0 and will most probably remain for all future 3.x.x releases. For compatibility with newer SciPy versions please update to pymc>=4.0.0. Also see #5448.

A hotfix is applied on import to remain compatible with NumPy 1.22 (see #5316).

v4.0.0b3(Mar 8, 2022)
Here is the full list of changes compared to 4.0.0b2.

For a current list of changes w.r.t. the upcoming v3.11.5 see RELEASE-NOTES.md.

Notable changes & features

ADVI has been ported to PyMC 4

LKJ has been ported to PyMC 4 (https://github.com/pymc-devs/pymc/pull/5382)

Dependencies have been updated

v4.0.0b2(Jan 14, 2022)
PyMC 4.0.0 beta 2

This beta release includes the removal of warnings, polishing of APIs, more distributions and internal refactorings.

Here is the full list of changes compared to 4.0.0b1.

For a current list of changes w.r.t. the upcoming v3.11.5 see RELEASE-NOTES.md.

Notable changes & features

Introduction of pm.Data(..., mutable=False/True) and corresponding pm.ConstantData/pm.MutableData wrappers (see #5295).

The warning about theano or pymc3 being installed in parallel was removed.

dims can again be specified alongside shape or size (see #5325).

pm.draw was added to draw prior samples from a variable (see #5340).

Renames of model properties & methods like Model.logpt.

A function to find a prior based on lower/upper bounds (see #5231).

v4.0.0b1(Dec 16, 2021)
PyMC 4.0.0 beta 1

⚠ This is the first beta of the next major release for PyMC 4.0.0 (formerly PyMC3). 4.0.0 is a rewrite of large parts of the PyMC code base which makes it faster, adds many new features, and introduces some breaking changes. For the most part, the API remains stable and we expect that most models will work without any changes.

Not-yet working features

We plan to get these working again, but at this point, their inner workings have not been refactored.

Timeseries distributions (see #4642)

Mixture distributions (see #4781)

Cholesky distributions (see WIP PR #4784)

Variational inference submodule (see WIP PR #4582)

Elliptical slice sampling (see #5137)

BaseStochasticGradient (see #5138)

pm.sample_posterior_predictive_w (see #4807)

Partially observed Multivariate distributions (see #5260)

Also, check out the milestones for a potentially more complete list.

Unexpected breaking changes (action needed)

New API is not available in v3.11.5.

Old API does not work in v4.0.0.

All of the above applies to:

⚠ The library is now named, installed, and imported as "pymc". For example: pip install pymc. (Use pip install pymc --pre while we are in the pre-release phase.)

⚠ Theano-PyMC has been replaced with Aesara, so all external references to theano, tt, and pymc3.theanof need to be replaced with aesara, at, and pymc.aesaraf (see 4471).

pm.Distribution(...).logp(x) is now pm.logp(pm.Distribution(...), x)

pm.Distribution(...).logcdf(x) is now pm.logcdf(pm.Distribution(...), x)

pm.Distribution(...).random() is now pm.Distribution(...).eval()

pm.draw_values(...) and pm.generate_samples(...) were removed. The tensors can now be evaluated with .eval().

pm.fast_sample_posterior_predictive was removed.

pm.sample_prior_predictive, pm.sample_posterior_predictive and pm.sample_posterior_predictive_w now return an InferenceData object by default, instead of a dictionary (see #5073).

pm.sample_prior_predictive no longer returns transformed variable values by default. Pass them by name in var_names if you want to obtain these draws (see 4769).

pm.sample(trace=...) no longer accepts MultiTrace or len(.) > 0 traces (see #5019).

The GLM submodule was removed, please use Bambi instead.

pm.Bound interface no longer accepts a callable class as an argument, instead, it requires an instantiated distribution (created via the .dist() API) to be passed as an argument. In addition, Bound no longer returns a class instance but works as a normal PyMC distribution. Finally, it is no longer possible to do predictive random sampling from Bounded variables. Please, consult the new documentation for details on how to use Bounded variables (see 4815).

pm.logpt(transformed=...) kwarg was removed (816b5f).

Model(model=...) kwarg was removed

Model(theano_config=...) kwarg was removed

Model.size property was removed (use Model.ndim instead).

dims and coords handling:

Model.RV_dims and Model.coords are now read-only properties. To modify the coords dictionary use Model.add_coord.

dims or coordinate values that are None will be auto-completed (see #4625).

Coordinate values passed to Model.add_coord are always converted to tuples (see #5061).

Model.update_start_values(...) was removed. Initial values can be set in the Model.initial_values dictionary directly.

Test values can no longer be set through pm.Distribution(testval=...) and must be assigned manually.

Transform.forward and Transform.backward signatures changed.

pm.DensityDist no longer accepts the logp as its first positional argument. It is now an optional keyword argument. If you pass a callable as the first positional argument, a TypeError will be raised (see 5026).

pm.DensityDist now accepts distribution parameters as positional arguments. Passing them as a dictionary in the observed keyword argument is no longer supported and will raise an error (see 5026).

The signature of the logp and random functions that can be passed into a pm.DensityDist has been changed (see 5026).

Changes to the Gaussian process (gp) submodule:

The gp.prior(..., shape=...) kwarg was renamed to size.

Multiple methods including gp.prior now require explicit kwargs.

Changes to the BART implementation:

A BART variable can be combined with other random variables. The inv_link argument has been removed (see 4914).

Moved BART to its own module (see 5058).

Changes to the Gaussian Process (GP) submodule (see 5055):

For all implementations, gp.Latent, gp.Marginal etc., cov_func and mean_func are required kwargs.

In the Windows test conda environment the mkl version is fixed to version 2020.4, and mkl-service is fixed to 2.3.0. This was required for gp.MarginalKron to function properly.

gp.MvStudentT uses rotated samples from StudentT directly now, instead of sampling from pm.Chi2 and then from pm.Normal.

The "jitter" parameter, or the diagonal noise term added to Gram matrices such that the Cholesky is numerically stable, is now exposed to the user instead of hard-coded. See the function gp.util.stabilize.

The is_observed argument for gp.Marginal* implementations has been deprecated.

In the gp.utils file, the kmeans_inducing_points function now passes through kmeans_kwargs to scipy's k-means function.

The replace_with_values function has been added to gp.utils.

MarginalSparse has been renamed MarginalApprox.
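
The role of the "jitter" term mentioned in the list above can be illustrated with a minimal pure-Python sketch. This is not the code behind gp.util.stabilize; the helper names cholesky and add_jitter, the default value 1e-6, and the 2x2 Gram matrix are all assumptions made for the illustration. It shows a rank-deficient Gram matrix whose Cholesky factorization fails until a small value is added to the diagonal.

```python
import math


def cholesky(matrix):
    """Plain Cholesky factorization; raises ValueError on a non-positive pivot."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def add_jitter(matrix, jitter=1e-6):
    """Return a copy of `matrix` with `jitter` added to every diagonal entry."""
    return [
        [value + (jitter if i == j else 0.0) for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


# Two identical inputs give a rank-deficient (singular) Gram matrix.
gram = [[1.0, 1.0], [1.0, 1.0]]

try:
    cholesky(gram)
    failed_without_jitter = False
except ValueError:
    failed_without_jitter = True

# With a small diagonal term the factorization goes through.
factor = cholesky(add_jitter(gram))
```

A larger jitter makes the factorization more robust but perturbs the covariance more, which is presumably why exposing the value to the user is useful.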

Expected breaks

New API was already available in v3.

Old API had deprecation warnings since at least 3.11.0 (2021-01).

Old API stops working in v4 (preferably with informative errors).

All of the above apply to:

pm.sample(return_inferencedata=True) is now the default (see #4744).

ArviZ plots and stats wrappers were removed. The functions are now just available by their original names (see #4549 and 3.11.2 release notes).

pm.sample_posterior_predictive(vars=...) kwarg was removed in favor of var_names (see #4343).

ElemwiseCategorical step method was removed (see #4701)

Ongoing deprecations

Old API still works in v4 and has a deprecation warning.

Preferably the new API should be available in v3 already

New features

The length of dims in the model is now tracked symbolically through Model.dim_lengths (see #4625).

The CAR distribution has been added to allow for use of conditional autoregressions which often are used in spatial and network models.

The dimensionality of model variables can now be parametrized through either of shape, dims or size (see #4696):

With shape the length of dimensions must be given numerically or as scalar Aesara Variables. Numeric entries in shape restrict the model variable to the exact length and re-sizing is no longer possible.

dims keeps model variables re-sizeable (for example through pm.Data) and leads to well-defined coordinates in InferenceData objects.

The size kwarg behaves as it does in Aesara/NumPy. For univariate RVs it is the same as shape, but for multivariate RVs it depends on how the RV implements broadcasting to dimensionality greater than RVOp.ndim_supp.

An Ellipsis (...) in the last position of shape or dims can be used as shorthand notation for implied dimensions.

Added a logcdf implementation for the Kumaraswamy distribution (see #4706).

The OrderedMultinomial distribution has been added for use on ordinal data which are aggregated by trial, like multinomial observations, whereas OrderedLogistic only accepts ordinal data in a disaggregated format, like categorical observations (see #4773).
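
To make the difference between the two data formats concrete, here is a small sketch using only the standard library. The response values and the helper aggregate_ordinal are made up for the illustration and are not part of PyMC. The first list is the disaggregated (categorical-style) format; the second is the same information aggregated into counts per category, as for multinomial observations.

```python
from collections import Counter


def aggregate_ordinal(responses, n_categories):
    """Count how often each ordinal category (0 .. n_categories - 1) occurs."""
    counts = Counter(responses)
    return [counts.get(category, 0) for category in range(n_categories)]


# Disaggregated format: one ordinal response per observation.
disaggregated = [0, 2, 1, 2, 2, 0, 1, 2]

# Aggregated format: one count per category for the whole trial.
aggregated = aggregate_ordinal(disaggregated, n_categories=3)
```

Here the aggregated row sums to the number of observations in the trial, which is the quantity a multinomial-style likelihood needs alongside the counts.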

The Polya-Gamma distribution has been added (see #4531). To make use of this distribution, the polyagamma>=1.3.1 library must be installed and available in the user's environment.

A small change to the mass matrix tuning methods jitter+adapt_diag (the default) and adapt_diag improves performance early on during tuning for some models (see #5004).

New experimental mass matrix tuning method jitter+adapt_diag_grad (see #5004).

pm.DensityDist can now accept an optional logcdf keyword argument to pass in a function to compute the cumulative density function of the distribution (see 5026).

pm.DensityDist can now accept an optional get_moment keyword argument to pass in a function to compute the moment of the distribution (see 5026).

New features for BART:

Added partial dependence plots and individual conditional expectation plots (see 5091).

Modify how particle weights are computed. This improves the accuracy of the modeled function (see 5177).

Improve sampling, increase the default number of particles (see 5229).

pm.Data now passes additional kwargs to aesara.shared. #5098

...

Internal changes

⚠ PyMC now requires Scipy version >= 1.4.1 (see 4857).

Removed float128 dtype support (see #4514).

Logp method of Uniform and DiscreteUniform no longer depends on pymc.distributions.dist_math.bound for proper evaluation (see #4541).

We now include cloudpickle as a required dependency, and no longer depend on dill (see #4858).

The incomplete_beta function in pymc.distributions.dist_math was replaced by aesara.tensor.betainc (see 4857).

math.log1mexp and math.log1mexp_numpy will expect negative inputs in the future. A FutureWarning is now raised unless negative_input=True is set (see #4860).
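
As a rough sketch of the negative-input convention described above, the function below computes log(1 - exp(x)) for x < 0 using only the standard library. It is not PyMC's implementation; the name log1mexp is reused only for readability, and the switch point at -log(2) is a common choice that is assumed here.

```python
import math


def log1mexp(x):
    """Compute log(1 - exp(x)) for x < 0 without catastrophic cancellation."""
    if x >= 0:
        raise ValueError("expected a negative input")
    if x > -math.log(2):
        # exp(x) is close to 1: expm1 keeps the small difference accurate.
        return math.log(-math.expm1(x))
    # exp(x) is close to 0: log1p keeps the small correction accurate.
    return math.log1p(-math.exp(x))
```

For an input such as -1e-20 the naive expression math.log(1 - math.exp(x)) fails because 1 - exp(x) rounds to zero, while the branch using expm1 still returns a finite value.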

Changed name of Lognormal distribution to LogNormal to harmonize CamelCase usage for distribution names.

Attempting to iterate over a MultiTrace will raise NotImplementedError.

...

v3.11.4(Aug 24, 2021)

v3.11.3(Aug 20, 2021)

v3.11.2(Mar 14, 2021)
PyMC3 3.11.2 (14 March 2021)

New Features

pm.math.cartesian can now handle inputs that are themselves >1D (see #4482).

Statistics and plotting functions that were removed in 3.11.0 were brought back, albeit with deprecation warnings if an old naming scheme is used (see #4536). In order to future proof your code, rename these function calls:

pm.traceplot → pm.plot_trace

pm.compareplot → pm.plot_compare (here you might need to rename some columns in the input according to the arviz.plot_compare documentation)

pm.autocorrplot → pm.plot_autocorr

pm.forestplot → pm.plot_forest

pm.kdeplot → pm.plot_kde

pm.energyplot → pm.plot_energy

pm.densityplot → pm.plot_density

pm.pairplot → pm.plot_pair
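
The renames listed above are mechanical, so they can be applied with a small text-rewriting helper. The sketch below is an illustration only: rename_plot_calls is a hypothetical helper, it assumes PyMC3 is imported under the alias pm, and because it works on plain text it will also rewrite matches inside strings and comments. It does not handle the column renames that pm.plot_compare may need.

```python
import re

# Old PyMC3 names mapped to the new names listed above.
RENAMES = {
    "traceplot": "plot_trace",
    "compareplot": "plot_compare",
    "autocorrplot": "plot_autocorr",
    "forestplot": "plot_forest",
    "kdeplot": "plot_kde",
    "energyplot": "plot_energy",
    "densityplot": "plot_density",
    "pairplot": "plot_pair",
}

# Match `pm.<old_name>` only when it is a whole attribute name.
_PATTERN = re.compile(r"\bpm\.(" + "|".join(RENAMES) + r")\b")


def rename_plot_calls(source):
    """Rewrite deprecated `pm.*plot` calls in a string of Python source."""
    return _PATTERN.sub(lambda match: "pm." + RENAMES[match.group(1)], source)
```

For example, rename_plot_calls("pm.traceplot(trace)") gives "pm.plot_trace(trace)", while names that merely start with an old name are left alone.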

Maintenance

⚠ Our memoization mechanism wasn't robust against hash collisions (#4506), sometimes resulting in incorrect values in, for example, posterior predictives. The pymc3.memoize module was removed and replaced with cachetools. The hashable function and WithMemoization class were moved to pymc3.util (see #4525).
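
Why a memoization cache that is keyed on hashes alone can return incorrect values is easy to reproduce in isolation. The sketch below is not the removed pymc3.memoize code; both decorators and the function square_plus_value are hypothetical, and the collision relies on the CPython detail that hash(-1) and hash(-2) are equal.

```python
def memoize_by_hash(func):
    """Fragile: two different arguments with equal hashes share one entry."""
    cache = {}

    def wrapper(arg):
        key = hash(arg)
        if key not in cache:
            cache[key] = func(arg)
        return cache[key]

    return wrapper


def memoize_by_value(func):
    """Robust: dict lookup compares the arguments themselves, not just hashes."""
    cache = {}

    def wrapper(arg):
        if arg not in cache:
            cache[arg] = func(arg)
        return cache[arg]

    return wrapper


def square_plus_value(x):
    return x * x + x


fragile = memoize_by_hash(square_plus_value)
robust = memoize_by_value(square_plus_value)

# In CPython, hash(-1) == hash(-2), so the fragile cache reuses the first result.
fragile_results = (fragile(-1), fragile(-2))
robust_results = (robust(-1), robust(-2))
```

Keying the cache on the arguments themselves, as the second decorator does, lets equality checks resolve collisions instead of silently returning a stale entry.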

pm.make_shared_replacements now retains broadcasting information which fixes issues with Metropolis samplers (see #4492).

Release manager for 3.11.2: Michael Osthege (@michaelosthege)
v3.11.1(Feb 12, 2021)
New Features

Automatic imputations now also work with ndarray data, not just pd.Series or pd.DataFrame (see #4439).

pymc3.sampling_jax.sample_numpyro_nuts now returns samples from transformed random variables, rather than from the unconstrained representation (see #4427).

Maintenance

We upgraded to Theano-PyMC v1.1.2 which includes bugfixes for...

⚠ a problem with tt.switch that affected the behavior of several distributions, including at least the following special cases (see #4448)

Bernoulli when all the observed values were the same (e.g., [0, 0, 0, 0, 0]).

TruncatedNormal when sigma was constant and mu was being automatically broadcasted to match the shape of observations.

Warning floods and compiledir locking (see #4444)

math.log1mexp_numpy no longer raises RuntimeWarning when given very small inputs. These were commonly observed during NUTS sampling (see #4428).

ScalarSharedVariable can now be used as an input to other RVs directly (see #4445).

pm.sample and pm.find_MAP no longer change the start argument (see #4458).

Fixed Dirichlet.logp method to work with unit batch or event shapes (see #4454).

Bugfix in logp and logcdf methods of Triangular distribution (see #4470).

Release manager for 3.11.1: Michael Osthege (@michaelosthege)
v3.11.0(Jan 21, 2021)
This release breaks some APIs w.r.t. 3.10.0. It also brings some dreadfully awaited fixes, so be sure to go through the (breaking) changes below.

Breaking Changes

⚠ Many plotting and diagnostic functions that were just aliasing ArviZ functions were removed (see 4397). This includes pm.summary, pm.traceplot, pm.ess and many more!

⚠ We now depend on Theano-PyMC version 1.1.0 exactly (see #4405). Major refactorings were done in Theano-PyMC 1.1.0. If you implement custom Ops or interact with Theano in any way yourself, make sure to read the Theano-PyMC 1.1.0 release notes.

⚠ Python 3.6 support was dropped (by no longer testing) and Python 3.9 was added (see #4332).

⚠ Changed shape behavior: No longer collapse length 1 vector shape into scalars. (see #4206 and #4214)

Applies to random variables and also the .random(size=...) kwarg!

To create scalar variables you must now use shape=None or shape=().

shape=(1,) and shape=1 now become vectors. Previously they were collapsed into scalars

0-length dimensions are now ruled illegal for random variables and raise a ValueError.
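
The shape rules above can be summarized in a few lines of plain Python. This is only a sketch of the described behavior, not PyMC3's internal code, and normalize_shape is a hypothetical helper name.

```python
def normalize_shape(shape):
    """Turn a user-supplied `shape` into a tuple, following the rules above."""
    if shape is None:
        return ()  # shape=None means a scalar variable
    if isinstance(shape, int):
        shape = (shape,)  # shape=1 is a length-1 vector, not a scalar
    shape = tuple(shape)
    if any(length == 0 for length in shape):
        raise ValueError("0-length dimensions are not allowed")
    return shape
```

Under these rules only None and () describe a scalar; 1 and (1,) both describe a vector of length one.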

In sample_prior_predictive the vars kwarg was removed in favor of var_names (see #4327).

Removed theanof.set_theano_config because it illegally changed Theano's internal state (see #4329).

New Features

Option to set check_bounds=False when instantiating pymc3.Model(). This turns off bounds checks that ensure that input parameters of distributions are valid. For correctly specified models, this is unnecessary as all parameters get automatically transformed so that all values are valid. Turning this off should lead to faster sampling (see #4377).

OrderedProbit distribution added (see #4232).

plot_posterior_predictive_glm now works with arviz.InferenceData as well (see #4234)

Add logcdf method to all univariate discrete distributions (see #4387).

Add random method to MvGaussianRandomWalk (see #4388)

AsymmetricLaplace distribution added (see #4392).

DirichletMultinomial distribution added (see #4373).

Added a new predict method to BART to compute out of sample predictions (see #4310).

Maintenance

Fixed bug whereby partial traces returned after a keyboard interrupt during parallel sampling had fewer draws than would have been available (see #4318).

Make sample_shape same across all contexts in draw_values (see #4305).

The notebook gallery has been moved to https://github.com/pymc-devs/pymc-examples (see #4348).

math.logsumexp now matches scipy.special.logsumexp when arrays contain infinite values (see #4360).
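
The infinite-value case is the one where a naive max-shifted log-sum-exp breaks, because subtracting an infinite maximum produces NaN. The following standard-library sketch shows one way to handle it; it is not the PyMC3 or SciPy implementation, and it works on a flat list of floats rather than arrays.

```python
import math


def logsumexp(values):
    """Compute log(sum(exp(v))) while tolerating infinite entries."""
    peak = max(values)
    if math.isinf(peak):
        # All entries are -inf (result -inf) or some entry is +inf (result +inf);
        # subtracting an infinite peak would produce NaN, so return it directly.
        return peak
    return peak + math.log(sum(math.exp(v - peak) for v in values))
```

With this guard, a list containing only -inf yields -inf instead of NaN, and a finite value combined with -inf is returned unchanged.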

Fixed mathematical formulation in MvStudentT random method. (see #4359)

Fix issue in logp method of HyperGeometric. It now returns -inf for invalid parameters (see 4367)

Fixed MatrixNormal random method to work with parameters as random variables. (see #4368)

Update the logcdf method of several continuous distributions to return -inf for invalid parameters and values, and raise an informative error when multiple values cannot be evaluated in a single call. (see 4393 and #4421)

Improve numerical stability in logp and logcdf methods of ExGaussian (see #4407)

Issue UserWarning when doing prior or posterior predictive sampling with models containing Potential factors (see #4419)

Dirichlet distribution's random method is now optimized and gives outputs in correct shape (see #4416)

Attempting to sample a named model with SMC will now raise a NotImplementedError. (see #4365)

Release manager for 3.11.0: Eelke Spaak (@Spaak)
v3.10.0(Dec 7, 2020)
This is a major release with many exciting new features. The biggest change is that we now rely on our own fork of Theano, Theano-PyMC. This is in line with our big announcement about our commitment to PyMC3 and Theano.

When upgrading, make sure that Theano-PyMC and not Theano is installed (the imports remain unchanged, however). If Theano is still installed, you can uninstall it:

conda remove theano

And to install:

conda install -c conda-forge theano-pymc

Or, if you are using pip (not recommended):

pip uninstall theano

And to install:

pip install theano-pymc

This new version of Theano-PyMC comes with an experimental JAX backend which, when combined with the new and experimental JAX samplers in PyMC3, can greatly speed up sampling in your model. As this is still very new, please do not use it in production yet but do test it out and let us know if anything breaks and what results you are seeing, especially speed-wise.

New features

New experimental JAX samplers in pymc3.sampling_jax (see notebook and #4247). Requires JAX and either TFP or numpyro.

Add MLDA, a new stepper for multilevel sampling. MLDA can be used when a hierarchy of approximate posteriors of varying accuracy is available, offering improved sampling efficiency especially in high-dimensional problems and/or where gradients are not available (see #3926)

Add Bayesian Additive Regression Trees (BARTs) (see #4183)

Added pymc3.gp.cov.Circular kernel for Gaussian Processes on circular domains, e.g. the unit circle (see #4082).

Added a new MixtureSameFamily distribution to handle mixtures of arbitrary dimensions in vectorized form for improved speed (see #4185).

sample_posterior_predictive_w can now feed on xarray.Dataset - e.g. from InferenceData.posterior. (see #4042)

Change SMC metropolis kernel to independent metropolis kernel (see #4115)

Add alternative parametrization to NegativeBinomial distribution in terms of n and p (see #4126)
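
The relationship between the two parametrizations can be sketched as a pair of conversion functions. The convention assumed here is that n equals alpha and p = alpha / (mu + alpha), so that the mean is n * (1 - p) / p; check the NegativeBinomial docstring before relying on it. The helper names are hypothetical and not part of PyMC3.

```python
def mu_alpha_to_n_p(mu, alpha):
    """Convert the (mu, alpha) parametrization to (n, p)."""
    return alpha, alpha / (mu + alpha)


def n_p_to_mu_alpha(n, p):
    """Convert the (n, p) parametrization back to (mu, alpha)."""
    return n * (1 - p) / p, n
```

Under this convention, mu=4 and alpha=2 correspond to n=2 and p=1/3, and converting back recovers the original mean.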

Added semantically meaningful str representations to PyMC3 objects for console, notebook, and GraphViz use (see #4076, #4065, #4159, #4217, #4243, and #4260).

Add Discrete HyperGeometric Distribution (see #4249)

Maintenance

Switch the dependency of Theano to our own fork, Theano-PyMC.

Removed non-NDArray (Text, SQLite, HDF5) backends and associated tests.

Use dill to serialize user defined logp functions in DensityDist. The previous serialization code fails if it is used in notebooks on Windows and Mac. dill is now a required dependency. (see #3844).

Fixed numerical instability in ExGaussian's logp by preventing logpow from returning -inf (see #4050).

Numerically improved stickbreaking transformation - e.g. for the Dirichlet distribution. #4129

Enabled the Multinomial distribution to handle batch sizes that have more than 2 dimensions. #4169

Test model logp before starting any MCMC chains (see #4211)

Fix bug in model.check_test_point that caused the test_point argument to be ignored. (see PR #4211)

Refactored MvNormal.random method with better handling of sample, batch and event shapes. #4207

The InverseGamma distribution now implements a logcdf. #3944

Make starting jitter methods for nuts sampling more robust by resampling values that lead to non-finite probabilities. A new optional argument jitter_max_retries can be passed to pm.sample() and pm.init_nuts() to control the maximum number of retries per chain (see #4298).

Documentation

Added a new notebook demonstrating how to incorporate sampling from a conjugate Dirichlet-multinomial posterior density in conjunction with other step methods (see #4199).

Mentioned the way to do any random walk with theano.tensor.cumsum() in GaussianRandomWalk docstrings (see #4048).

Release manager for 3.10.0: Eelke Spaak (@Spaak)
