BAyesian Model-Building Interface (Bambi) in Python.

Bambi

BAyesian Model-Building Interface in Python

Overview

Bambi is a high-level Bayesian model-building interface written in Python. It's built on top of the PyMC3 probabilistic programming framework, and is designed to make it extremely easy to fit mixed-effects models common in social sciences settings using a Bayesian approach.

Installation

Bambi requires a working Python interpreter (3.7+). We recommend installing Python and key numerical libraries using the Anaconda Distribution, which has one-click installers available on all major platforms.

Assuming a standard Python environment is installed on your machine (including pip), Bambi itself can be installed in one line using pip:

pip install bambi

Alternatively, if you want the bleeding edge version of the package you can install from GitHub:

pip install git+https://github.com/bambinos/bambi.git

Dependencies

Bambi requires working versions of numpy, pandas, matplotlib, patsy, pymc3, and theano. Dependencies are listed in requirements.txt, and should all be installed by the Bambi installer; no further action should be required.
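As a quick sanity check after installation, the sketch below reports which of the dependencies named above are installed in the current environment. The helper name installed_versions is illustrative and not part of Bambi. It relies only on the standard library's importlib.metadata, which assumes Python 3.8 or newer.

```python
from importlib import metadata

# Distribution names taken from the dependency list above. Note that a
# distribution name can differ from the import name (for example, Theano
# may be installed under a differently named distribution).
DEPENDENCIES = ["numpy", "pandas", "matplotlib", "patsy", "pymc3", "theano"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(DEPENDENCIES).items():
        print(f"{name:<12} {version or 'not installed'}")
```

A value of None only means that no distribution with that exact name was found; it does not prove the module cannot be imported.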

Documentation

The Bambi documentation can be found in the official docs

Citation

If you use Bambi and want to cite it please use arXiv

Here is the citation in BibTeX format

@misc{capretto2020,
      title={Bambi: A simple interface for fitting Bayesian linear models in Python}, 
      author={Tomás Capretto and Camen Piho and Ravin Kumar and Jacob Westfall and Tal Yarkoni and Osvaldo A. Martin},
      year={2020},
      eprint={2012.10754},
      archivePrefix={arXiv},
      primaryClass={stat.CO}
}

Contributions

Bambi is a community project and welcomes contributions. Additional information can be found in the Contributing Readme.

For a list of contributors see the GitHub contributor page

Code of Conduct

Bambi wishes to maintain a positive community. Additional details can be found in the Code of Conduct

License

MIT License

Comments
  • Radon Example Implementation

    This PR aims to solve https://github.com/bambinos/bambi/issues/439.

    I have pushed the first version of the models. Here are the formulas I used:

    • pooled_model: log_radon ~ floor
    • unpooled_model: log_radon ~ 0 + county + county:floor
    • partial_pooling_model: log_radon ~ 0 + (0 + 1|county)
    • varying_intercept_model: log_radon ~ 0 + (1|county) + floor # ! How to do complete one-hot-encoding for county?
    • varying_intercept_slope_model: log_radon ~ 0 + (floor|county) # ! How to do complete one-hot-encoding for county?

    I would like to have all the counties encoded (not removing the first one), and I am not auto-scaling the data so that the results are comparable with the original PyMC example. Does this look good? I am still getting used to the formula-like notation for hierarchical models (which is why I am working on this PR 😅).
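To make the encoding question concrete, here is a small pure-Python sketch contrasting full one-hot encoding with the reduced encoding that drops the first level. It is not how Bambi or formulae build design matrices; the one_hot helper and the toy county labels are made up for illustration.

```python
def one_hot(values, drop_first=False):
    """Return (levels, rows), where each row holds one 0/1 indicator per level."""
    levels = sorted(set(values))
    if drop_first:
        # The first level becomes the reference category and gets no column.
        levels = levels[1:]
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


# Toy labels, not the real radon data set.
counties = ["AITKIN", "ANOKA", "AITKIN", "BECKER"]

full_levels, full_rows = one_hot(counties)
reduced_levels, reduced_rows = one_hot(counties, drop_first=True)

print(full_levels)     # one column per county
print(reduced_levels)  # the first county is absorbed by the reference level
```

With the full encoding every row has exactly one indicator set, so each county gets its own coefficient. With the reduced encoding, rows for the first county are all zeros and that county is represented only through the reference level.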

    After double-checking the model specification I will continue to add text and some additional plots.

    opened by juanitorduz 79
  • Add model comparison example

    This PR adds a new notebook where Bambi is used to fit a couple of logistic regression models and then ArviZ is used to perform model comparison. I would like to have feedback on the example before merging.

    Note: This was mentioned in #210

    opened by tomicapretto 38
  • Update example notebooks

    I updated some notebooks to reflect changes in Bambi and added some extra comments and explanations where I thought they could be helpful. I added the WIP label because I think the notebooks may be modified before merging, depending on feedback and discussion.

    Note: In a chat with @aloctavodia we talked about modifying the object that is passed to az.plot_trace() in such a way that *_offset terms are not included by default. This is not included in this PR. I think we should work on it in a separate PR and then update the examples again to reflect changes.

    opened by tomicapretto 30
  • Add informative message when prior scaling fails due to non-identifiable fixed effects

    When a model with default priors and non-identifiable fixed effects is created (e.g., by calling something like add_term('condition', drop_first=False) when an intercept is already in the model), the prior scaling approach currently implemented in PriorScaler sets the random factor hyperprior variances to inf, and the sampler consequently refuses to start. We should either check for this scenario and substitute some large but sane value, or raise an exception with a more informative message rather than waiting for Theano to complain.
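The two options proposed above (substitute a fallback value, or fail early with a clearer message) could be sketched as a single guard function. This is only an illustration: check_prior_scale, its arguments, and the message text are hypothetical and are not part of PriorScaler.

```python
import math


def check_prior_scale(term, variance, fallback=None):
    """Validate a scaled prior variance (hypothetical helper).

    Returns `variance` when it is finite and positive. Otherwise returns
    `fallback` if one is given, or raises a ValueError naming the term.
    """
    if math.isfinite(variance) and variance > 0:
        return variance
    if fallback is not None:
        # Option 1: substitute a large but sane value chosen by the caller.
        return fallback
    # Option 2: fail early with a message that points at the likely cause.
    raise ValueError(
        f"Prior scaling produced variance={variance} for term '{term}'. "
        "This can happen when the fixed effects are not identifiable, e.g. "
        "a fully dummy-coded factor was added alongside an intercept."
    )


print(check_prior_scale("condition", 2.5))
print(check_prior_scale("condition", float("inf"), fallback=1e6))
```

Calling the function without a fallback on an infinite or NaN variance raises the ValueError instead of letting the sampler fail later.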

    bug 
    opened by tyarkoni 30
  • Stan

    This PR adds a new StanBackEnd that should (in theory) do almost everything that the PyMC3BackEnd does. Using it is as simple as specifying backend='stan' when initializing a new Model. The StanBackEnd wraps PyStan, but converts the sampling output into a PyMC3 MultiTrace object, which allows us to use all of the existing plotting/summarizing machinery (at least in principle; in practice, I had to hack my way around the normal MultiTrace initialization process, so some attributes are probably not set properly).

    There are some things to iron out before we can merge this. @jake-westfall, maybe you can take a pass at these? We need to do the following:

    • Test everything more thoroughly to make sure it works.

    • Modify at least a few of the existing tests to run using the Stan backend. It's probably also worth adding a couple of tests that run with both backends and make sure the resulting estimates are close. We also need to modify the travis config to install PyStan (and maybe add an optional_dependencies.txt file containing PyStan that people can pip install -r from).

    • Add support for all of the PyMC3 distributions we currently support. The trick here is to map PyMC3 distribution names and arguments onto the Stan language. This is accomplished via the dists dictionary in StanBackEnd, which maps the PyMC3 dist names onto dictionaries containing the Stan names, argument order, and (optionally) any bounds to impose. The latter is required because in Stan there are no separate distributions for clipped distributions; e.g., half-cauchy is just cauchy with a lower bound of 0. In principle I think all we need to do is add new entries to the dists dictionary; I don't think any of the PyMC3 distributions we currently support should introduce any new complications.

    • One gotcha right now is that PyStan ignores the incl_warmup argument in extract() if permuted=True. This is reasonable in one sense, because the permuted results concatenate all chains together, and it's arguably weird to include the different initialization points. But it's a problem for us, because to make results commensurable with the PyMC3BackEnd, we should really include the burn-in and let the user drop it themselves later. The solution is unfortunately probably going to require us to duplicate some of the PyStan code for reformatting the Stan output, but with the burn-in included.

    • Add support for different error distributions. Right now the StanBackEnd only supports normally distributed errors, but we should at minimum support the options we support with PyMC3. This shouldn't take much work; I think we just need to modify the specification of the yhat term and the model error.
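The name-mapping idea from the list above (PyMC3 distribution name to Stan name, argument order, and optional bounds) might look roughly like the sketch below. It is a guess at the shape of such a dictionary, not the actual contents of StanBackEnd: the DISTS entries, the stan_prior helper, and the choice of argument names are all assumptions made for illustration.

```python
# Hypothetical mapping from PyMC3 distribution names to Stan equivalents.
# "args" lists, in Stan's positional order, either the PyMC3 keyword to
# look up (a string) or a literal value to pass through unchanged.
DISTS = {
    "Normal": {"name": "normal", "args": ["mu", "sd"]},
    "Cauchy": {"name": "cauchy", "args": ["alpha", "beta"]},
    # Stan has no separate half-Cauchy: use a Cauchy located at 0 and
    # declare the parameter with a lower bound of 0.
    "HalfCauchy": {"name": "cauchy", "args": [0, "beta"], "bounds": "<lower=0>"},
}


def stan_prior(param, dist, **kwargs):
    """Return (declaration, sampling statement) Stan snippets for one parameter."""
    spec = DISTS[dist]
    values = [kwargs[a] if isinstance(a, str) else a for a in spec["args"]]
    arg_list = ", ".join(str(v) for v in values)
    declaration = f"real{spec.get('bounds', '')} {param};"
    statement = f"{param} ~ {spec['name']}({arg_list});"
    return declaration, statement


print(stan_prior("b", "Normal", mu=0, sd=10))
print(stan_prior("sigma", "HalfCauchy", beta=5))
```

Under this layout, supporting another distribution means adding one dictionary entry, which matches the expectation stated above that new distributions should not need new code paths.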

    opened by tyarkoni 28
  • ValueError: Factor on right hand side of group specific term must be a single term.

    Thanks for all the great work with Bambi. It has been a godsend for my work.

    I tried to implement the following formula:

    y ~ time * tx + (time | therapist/subjects)

    which expands to:

    y ~ time * tx + (time | therapist:subjects) + (time | therapist)

    and I get the following error:

    "ValueError: Factor on right hand side of group specific term must be a single term"

    Is it possible to implement three-level conditional hierarchical models or am I simply writing this incorrectly? If I am writing it incorrectly, how would I need to reformat it to work with bambi/formulae?
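For reference, the expansion of the nesting operator written out above can be reproduced with a few lines of string handling. This is only a sketch of the notation: expand_nested_group is a hypothetical helper, it does not call Bambi or formulae, and it does not change whether the resulting interaction term is accepted.

```python
def expand_nested_group(term):
    """Expand '(expr | a/b/...)' into one group-specific term per nesting level."""
    inner = term.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"Not a group-specific term: {term!r}")
    expr, _, grouping = inner[1:-1].partition("|")
    if not grouping.strip():
        raise ValueError(f"Missing grouping factor in: {term!r}")
    factors = [factor.strip() for factor in grouping.split("/")]
    expr = expr.strip()
    terms = []
    for depth in range(1, len(factors) + 1):
        # Each level is grouped by the interaction of all factors above it.
        group = ":".join(factors[:depth])
        terms.append(f"({expr} | {group})")
    return terms


print(" + ".join(expand_nested_group("(time | therapist/subjects)")))
```

The helper lists the outer grouping factor first; the order of the terms does not affect the meaning of the formula.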

    opened by zwelitunyiswa 24
  • Add support for numpyro and blackjax PyMC samplers

    This is to address https://github.com/bambinos/bambi/issues/522 and https://github.com/bambinos/bambi/issues/525 inspired by @zwelitunyiswa's example

    I decided to add a single new value to the fit() method which allows switching in the numpyro/blackjax samplers instead of the PyMC default. I decided against CPU/GPU flags because the device is mostly decided by whatever JAX can find, and the methods I saw for disabling GPUs are quite hacky: they involve changing your environment variables, which I feel is out of scope for a library. I've just noted this in the documentation instead.

    I've tested the samplers locally and they work on one of my personal projects, but I'll try and knock up a simple example shortly which demonstrates them all.

    One note: The PyMC 4 release blog post says:

    These samplers live in a different submodule sampling_jax but the plan is to integrate them into pymc.sample(backend="JAX").

    So we should expect the implementation here to change pretty soon, so I think it's worth keeping the implementation in bambi simple so it's easy to port-over when this happens.

    opened by markgoodhead 23
  • Documentation

    Fixes #81

    Notes on reviewing

    • The diff count shows that this PR is large, but it's quite misleading. A lot of the line count is either from
      • autogenerated files (conf.py, Make.bat, and Makefile) -- out of these one should only review conf.py if you're familiar with sphinx
      • ipython notebook diffs. I only updated slight wording and the number of octothorpes (#) in the headers, did not re-run any cells
    • All code changes in bambi/ are docstring related. I did not add or remove any documentation, but there were some spacing problems (which give sphinx a hard time)
      • the only change of note here is that I changed some docstrings in bambi/diagnostics.py from the numpy convention to the Google convention, since most of the library uses the Google convention
    • I would pretty much only look at the .rst file changes
    • Happy to hear any thoughts people have whether it be reversions, continued code changes, or to back up any decisions I made!

    Changes

    • Incorporates more information from the README.md
    • Adds example notebooks
      • Made the examples/ directory a symbolic link to the notebooks which are now housed in the documentation (similar to PyMC3)
      • Edited the headers in the notebooks since they are read in by sphinx (mostly this is making one top-level header # and making the other headers either ## or ###)
      • No other edits to notebooks were made (mention this since notebooks are hard to review)
    • API Reference now includes all classes, methods and functions from models/, priors/, results/, and diagnostics.

    To run the docs locally

    1. Grab this branch, and enter the root directory of this repository
    2. enter a python3 virtual env of your choice (I used conda create python=3.6)
    3. install the library python setup.py install
    4. install requirements pip install -r requirements-dev.txt
    5. enter the docs directory cd docs/
    6. run the build make html
    7. open the generated html in browser open _build/html/index.html (while still within the docs/ directory)

    Next steps

    • hosting on readthedocs?
    • If there is any custom CSS or images that people want on the docs?
    opened by camenpihor 21
  • Add example with categorical family

    Added a notebook with examples using the categorical family to address #436. I noticed on the original merge (#426) that @tomicapretto made several very nice examples, so I aggregated several into one notebook and added some comments to tie it together.

    Note that I omitted two of the examples, the satisfaction survey and the inhaler example. I made that choice because I think those two fall more in the domain of ordinal regression, rather than categorical. I'm happy to add them if you'd prefer to include them though.

    opened by tjburch 20
  • Unexpected Error when testing Bambi package

    I'm working on a laptop with Windows 10 and using Anaconda 64-bit. I have created an environment for working with PyMC3 and Bambi. I have been able to test PyMC3, and it works for a hierarchical linear regression model. For Bambi, I was testing the example described at this link (growth curves of pigs example) [https://bambinos.github.io/bambi/master/notebooks/multi-level_regression.html]. I ran into an error I had not seen with Bambi before (from the output it looks like it occurs in Theano, but I'm not a programmer or a developer).

    Here is my environment and the complete traceback of the error

    # packages in environment at C:\ProgramData\Anaconda3\envs\pm3env:
    #
    # Name                    Version                   Build  Channel
    argon2-cffi               20.1.0           py37hcc03f2d_2    conda-forge
    arviz                     0.11.1                   pypi_0    pypi
    async_generator           1.10                       py_0    conda-forge
    attrs                     21.2.0             pyhd8ed1ab_0    conda-forge
    backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
    backports                 1.0                        py_2    conda-forge
    backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
    bambi                     0.5.0                    pypi_0    pypi
    blas                      1.0                         mkl
    bleach                    4.0.0              pyhd8ed1ab_0    conda-forge
    ca-certificates           2021.5.30            h5b45459_0    conda-forge
    cached-property           1.5.2                    pypi_0    pypi
    cachetools                4.2.2                    pypi_0    pypi
    cairo                     1.16.0            hb19e0ff_1008    conda-forge
    certifi                   2021.5.30        py37h03978a9_0    conda-forge
    cffi                      1.14.6           py37hd8e9650_0    conda-forge
    cftime                    1.5.0                    pypi_0    pypi
    colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
    cycler                    0.10.0                     py_2    conda-forge
    decorator                 5.0.9              pyhd8ed1ab_0    conda-forge
    defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
    dill                      0.3.4                    pypi_0    pypi
    entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
    et-xmlfile                1.1.0                    pypi_0    pypi
    expat                     2.4.1                h39d44d4_0    conda-forge
    fastprogress              1.0.0                    pypi_0    pypi
    filelock                  3.0.12                   pypi_0    pypi
    font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
    font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
    font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
    font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
    fontconfig                2.13.1            h1989441_1005    conda-forge
    fonts-conda-ecosystem     1                             0    conda-forge
    fonts-conda-forge         1                             0    conda-forge
    formulae                  0.1.3                    pypi_0    pypi
    freetype                  2.10.4               h546665d_1    conda-forge
    fribidi                   1.0.10               h8d14728_0    conda-forge
    getopt-win32              0.1                  h8ffe710_0    conda-forge
    gettext                   0.19.8.1          h1a89ca6_1005    conda-forge
    graphite2                 1.3.13                     1000    conda-forge
    graphviz                  2.48.0               hefbd956_0    conda-forge
    gts                       0.7.6                h7c369d9_2    conda-forge
    h5py                      3.1.0                    pypi_0    pypi
    harfbuzz                  2.8.2                hc601d6f_0    conda-forge
    icc_rt                    2019.0.0             h0cc432a_1
    icu                       68.1                 h0e60522_0    conda-forge
    importlib-metadata        2.1.1                    pypi_0    pypi
    intel-openmp              2021.3.0          h57928b3_3372    conda-forge
    ipykernel                 5.5.5            py37h7813e69_0    conda-forge
    ipython                   7.26.0           py37h4038f58_0    conda-forge
    ipython_genutils          0.2.0                      py_1    conda-forge
    ipywidgets                7.6.3              pyhd3eb1b0_1
    jbig                      2.1               h8d14728_2003    conda-forge
    jedi                      0.18.0           py37h03978a9_2    conda-forge
    jinja2                    3.0.1              pyhd8ed1ab_0    conda-forge
    jpeg                      9d                   h8ffe710_0    conda-forge
    jsonschema                3.2.0              pyhd8ed1ab_3    conda-forge
    jupyter                   1.0.0                    py37_7
    jupyter_client            6.1.12             pyhd8ed1ab_0    conda-forge
    jupyter_console           6.4.0              pyhd8ed1ab_0    conda-forge
    jupyter_core              4.7.1            py37h03978a9_0    conda-forge
    jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
    jupyterlab_widgets        1.0.0              pyhd8ed1ab_1    conda-forge
    kiwisolver                1.3.1            py37h8c56517_1    conda-forge
    lcms2                     2.12                 h2a16943_0    conda-forge
    lerc                      2.2.1                h0e60522_0    conda-forge
    libclang                  11.1.0          default_h5c34c98_1    conda-forge
    libdeflate                1.7                  h8ffe710_5    conda-forge
    libffi                    3.3                  h0e60522_2    conda-forge
    libgd                     2.3.2                h138e682_0    conda-forge
    libglib                   2.68.3               h1e62bf3_0    conda-forge
    libiconv                  1.16                 he774522_0    conda-forge
    libpng                    1.6.37               h1d00b33_2    conda-forge
    libpython                 2.1                      py37_0
    libsodium                 1.0.18               h8d14728_1    conda-forge
    libtiff                   4.3.0                h0c97f57_1    conda-forge
    libwebp                   1.2.0                h57928b3_0    conda-forge
    libwebp-base              1.2.0                h8ffe710_2    conda-forge
    libxcb                    1.13              hcd874cb_1003    conda-forge
    libxml2                   2.9.12               hf5bbc77_0    conda-forge
    llvmlite                  0.36.0           py37habb0c8c_0    conda-forge
    lz4-c                     1.9.3                h8ffe710_1    conda-forge
    m2w64-gcc-libgfortran     5.3.0                         6    conda-forge
    m2w64-gcc-libs            5.3.0                         7    conda-forge
    m2w64-gcc-libs-core       5.3.0                         7    conda-forge
    m2w64-gmp                 6.1.0                         2    conda-forge
    m2w64-libwinpthread-git   5.0.0.4634.697f757               2    conda-forge
    markupsafe                2.0.1            py37hcc03f2d_0    conda-forge
    matplotlib                3.3.2                haa95532_0
    matplotlib-base           3.3.2            py37h3379fd5_1    conda-forge
    matplotlib-inline         0.1.2              pyhd8ed1ab_2    conda-forge
    mistune                   0.8.4           py37hcc03f2d_1004    conda-forge
    mkl                       2020.4             hb70f87d_311    conda-forge
    mkl-service               2.3.0            py37h196d8e1_0
    mkl_fft                   1.3.0            py37hda49f71_1    conda-forge
    mkl_random                1.2.0            py37h414f9d2_1    conda-forge
    mpmath                    1.2.1              pyhd8ed1ab_0    conda-forge
    msys2-conda-epoch         20160418                      1    conda-forge
    nbclient                  0.5.3              pyhd8ed1ab_0    conda-forge
    nbconvert                 6.1.0            py37h03978a9_0    conda-forge
    nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
    nest-asyncio              1.5.1              pyhd8ed1ab_0    conda-forge
    netcdf4                   1.5.7                    pypi_0    pypi
    notebook                  6.4.0              pyha770c72_0    conda-forge
    numba                     0.53.1           py37h4e635f9_0    conda-forge
    numpy                     1.19.2           py37hadc3359_0
    numpy-base                1.19.2           py37ha3acd2a_0
    olefile                   0.46               pyh9f0ad1d_1    conda-forge
    openjpeg                  2.4.0                hb211442_1    conda-forge
    openpyxl                  3.0.7                    pypi_0    pypi
    openssl                   1.1.1k               h8ffe710_0    conda-forge
    packaging                 21.0               pyhd8ed1ab_0    conda-forge
    pandas                    1.2.1            py37hf11a4ad_0
    pandoc                    2.14.1               h8ffe710_0    conda-forge
    pandocfilters             1.4.2                      py_1    conda-forge
    pango                     1.48.7               hd84fcdd_0    conda-forge
    parso                     0.8.2              pyhd8ed1ab_0    conda-forge
    patsy                     0.5.1                    pypi_0    pypi
    pcre                      8.45                 h0e60522_0    conda-forge
    pickleshare               0.7.5                   py_1003    conda-forge
    pillow                    8.3.1            py37hd7d9ad0_0    conda-forge
    pip                       20.3.3           py37haa95532_0
    pixman                    0.40.0               h8ffe710_0    conda-forge
    prometheus_client         0.11.0             pyhd8ed1ab_0    conda-forge
    prompt-toolkit            3.0.19             pyha770c72_0    conda-forge
    prompt_toolkit            3.0.19               hd8ed1ab_0    conda-forge
    pthread-stubs             0.4               hcd874cb_1001    conda-forge
    pycparser                 2.20               pyh9f0ad1d_2    conda-forge
    pygments                  2.9.0              pyhd8ed1ab_0    conda-forge
    pymc3                     3.11.2                   pypi_0    pypi
    pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
    pyqt                      5.12.3           py37h03978a9_7    conda-forge
    pyqt-impl                 5.12.3           py37hf2a7229_7    conda-forge
    pyqt5-sip                 4.19.18          py37hf2a7229_7    conda-forge
    pyqtchart                 5.12             py37hf2a7229_7    conda-forge
    pyqtwebengine             5.12.1           py37hf2a7229_7    conda-forge
    pyrsistent                0.17.3           py37hcc03f2d_2    conda-forge
    python                    3.7.9                h60c2a47_0
    python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
    python-graphviz           0.16               pyh243d235_2    conda-forge
    python_abi                3.7                     2_cp37m    conda-forge
    pytz                      2021.1             pyhd8ed1ab_0    conda-forge
    pywin32                   300              py37hcc03f2d_0    conda-forge
    pywinpty                  1.1.3            py37h7f67f24_0    conda-forge
    pyzmq                     22.2.0           py37hcce574b_0    conda-forge
    qt                        5.12.9               h5909a2a_4    conda-forge
    qtconsole                 5.1.1              pyhd8ed1ab_0    conda-forge
    qtpy                      1.9.0                      py_0    conda-forge
    scipy                     1.7.1                    pypi_0    pypi
    semver                    2.13.0                   pypi_0    pypi
    send2trash                1.7.1              pyhd8ed1ab_0    conda-forge
    setuptools                49.6.0           py37h03978a9_3    conda-forge
    six                       1.16.0             pyh6c4a22f_0    conda-forge
    sqlite                    3.36.0               h8ffe710_0    conda-forge
    statsmodels               0.12.2                   pypi_0    pypi
    sympy                     1.7.1            py37h03978a9_1    conda-forge
    terminado                 0.10.1           py37h03978a9_0    conda-forge
    testpath                  0.5.0              pyhd8ed1ab_0    conda-forge
    theano-pymc               1.1.2                    pypi_0    pypi
    tk                        8.6.10               h8ffe710_1    conda-forge
    tornado                   6.1              py37hcc03f2d_1    conda-forge
    traitlets                 5.0.5                      py_0    conda-forge
    typing                    3.7.4.3                  pypi_0    pypi
    typing_extensions         3.10.0.0           pyha770c72_0    conda-forge
    ucrt                      10.0.20348.0         h57928b3_0    conda-forge
    vc                        14.2                 hb210afc_5    conda-forge
    vs2015_runtime            14.29.30037          h902a5da_5    conda-forge
    watermark                 2.2.0                    pypi_0    pypi
    wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
    webencodings              0.5.1                      py_1    conda-forge
    wheel                     0.36.2             pyhd3deb0d_0    conda-forge
    widgetsnbextension        3.5.1            py37h03978a9_4    conda-forge
    wincertstore              0.2             py37h03978a9_1006    conda-forge
    winpty                    0.4.3                         4    conda-forge
    xarray                    0.16.2                   pypi_0    pypi
    xorg-kbproto              1.0.7             hcd874cb_1002    conda-forge
    xorg-libice               1.0.10               hcd874cb_0    conda-forge
    xorg-libsm                1.2.3             hcd874cb_1000    conda-forge
    xorg-libx11               1.7.2                hcd874cb_0    conda-forge
    xorg-libxau               1.0.9                hcd874cb_0    conda-forge
    xorg-libxdmcp             1.1.3                hcd874cb_0    conda-forge
    xorg-libxext              1.3.4                hcd874cb_1    conda-forge
    xorg-libxpm               3.5.13               hcd874cb_0    conda-forge
    xorg-libxt                1.2.1                hcd874cb_2    conda-forge
    xorg-xextproto            7.3.0             hcd874cb_1002    conda-forge
    xorg-xproto               7.0.31            hcd874cb_1007    conda-forge
    xz                        5.2.5                h62dcd97_1    conda-forge
    zeromq                    4.3.4                h0e60522_0    conda-forge
    zipp                      3.5.0              pyhd8ed1ab_0    conda-forge
    zlib                      1.2.11            h62dcd97_1010    conda-forge
    zstd                      1.5.0                h6255e5f_0    conda-forge
    

    Model I tried to run is:

    model = bmb.Model("Weight ~ Time + (Time|Pig)", data)
    results = model.fit()
    

    And the error I get is:

    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-11-99071b9bde96> in <module>
          1 model = bmb.Model("Weight ~ Time + (Time|Pig)", data)
    ----> 2 results = model.fit()
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\bambi\models.py in fit(self, omit_offsets, backend, **kwargs)
        213             )
        214 
    --> 215         return self.backend.run(omit_offsets=omit_offsets, **kwargs)
        216 
        217     def build(self, backend="pymc"):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\bambi\backends\pymc.py in run(self, start, method, init, n_init, omit_offsets, **kwargs)
        139                     n_init=n_init,
        140                     return_inferencedata=True,
    --> 141                     **kwargs,
        142                 )
        143 
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)
        502                 progressbar=progressbar,
        503                 jitter_max_retries=jitter_max_retries,
    --> 504                 **kwargs,
        505             )
        506             if start is None:
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\sampling.py in init_nuts(init, chains, n_init, model, random_seed, progressbar, jitter_max_retries, **kwargs)
       2185         raise ValueError(f"Unknown initializer: {init}.")
       2186 
    -> 2187     step = pm.NUTS(potential=potential, model=model, **kwargs)
       2188 
       2189     return start, step
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\step_methods\hmc\nuts.py in __init__(self, vars, max_treedepth, early_max_treedepth, **kwargs)
        166         `pm.sample` to the desired number of tuning steps.
        167         """
    --> 168         super().__init__(vars, **kwargs)
        169 
        170         self.max_treedepth = max_treedepth
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\step_methods\hmc\base_hmc.py in __init__(self, vars, scaling, step_scale, is_cov, model, blocked, potential, dtype, Emax, target_accept, gamma, k, t0, adapt_step_size, step_rand, **theano_kwargs)
         86         vars = inputvars(vars)
         87 
    ---> 88         super().__init__(vars, blocked=blocked, model=model, dtype=dtype, **theano_kwargs)
         89 
         90         self.adapt_step_size = adapt_step_size
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\step_methods\arraystep.py in __init__(self, vars, model, blocked, dtype, logp_dlogp_func, **theano_kwargs)
        252 
        253         if logp_dlogp_func is None:
    --> 254             func = model.logp_dlogp_function(vars, dtype=dtype, **theano_kwargs)
        255         else:
        256             func = logp_dlogp_func
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\model.py in logp_dlogp_function(self, grad_vars, tempered, **kwargs)
       1002         varnames = [var.name for var in grad_vars]
       1003         extra_vars = [var for var in self.free_RVs if var.name not in varnames]
    -> 1004         return ValueGradFunction(costs, grad_vars, extra_vars, **kwargs)
       1005 
       1006     @property
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\pymc3\model.py in __init__(self, costs, grad_vars, extra_vars, dtype, casting, compute_grads, **kwargs)
        689 
        690         if compute_grads:
    --> 691             grad = tt.grad(self._cost_joined, self._vars_joined)
        692             grad.name = "__grad"
        693             outputs = [self._cost_joined, grad]
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
        637             assert g.type.dtype in theano.tensor.float_dtypes
        638 
    --> 639     rval = _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
        640 
        641     for i in range(len(rval)):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
       1438         return grad_dict[var]
       1439 
    -> 1440     rval = [access_grad_cache(elem) for elem in wrt]
       1441 
       1442     return rval
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1438         return grad_dict[var]
       1439 
    -> 1440     rval = [access_grad_cache(elem) for elem in wrt]
       1441 
       1442     return rval
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in <listcomp>(.0)
       1059             inputs = node.inputs
       1060 
    -> 1061             output_grads = [access_grad_cache(var) for var in node.outputs]
       1062 
       1063             # list of bools indicating if each output is connected to the cost
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_grad_cache(var)
       1391                     for idx in node_to_idx[node]:
       1392 
    -> 1393                         term = access_term_cache(node)[idx]
       1394 
       1395                         if not isinstance(term, Variable):
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\gradient.py in access_term_cache(node)
       1218                             )
       1219 
    -> 1220                 input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
       1221 
       1222                 if input_grads is None:
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\tensor\elemwise.py in L_op(self, inputs, outs, ograds)
        562 
        563         # compute grad with respect to broadcasted input
    --> 564         rval = self._bgrad(inputs, outs, ograds)
        565 
        566         # TODO: make sure that zeros are clearly identifiable
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\tensor\elemwise.py in _bgrad(self, inputs, outputs, ograds)
        666                 ret.append(None)
        667                 continue
    --> 668             ret.append(transform(scalar_igrad))
        669 
        670         return ret
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\tensor\elemwise.py in transform(r)
        657                 return DimShuffle((), ["x"] * nd)(res)
        658 
    --> 659             new_r = Elemwise(node.op, {})(*[transform(ipt) for ipt in node.inputs])
        660             return new_r
        661 
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\tensor\elemwise.py in <listcomp>(.0)
        657                 return DimShuffle((), ["x"] * nd)(res)
        658 
    --> 659             new_r = Elemwise(node.op, {})(*[transform(ipt) for ipt in node.inputs])
        660             return new_r
        661 
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\tensor\elemwise.py in transform(r)
        657                 return DimShuffle((), ["x"] * nd)(res)
        658 
    --> 659             new_r = Elemwise(node.op, {})(*[transform(ipt) for ipt in node.inputs])
        660             return new_r
        661 
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\graph\op.py in __call__(self, *inputs, **kwargs)
        251 
        252         if config.compute_test_value != "off":
    --> 253             compute_test_value(node)
        254 
        255         if self.default_output is not None:
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\graph\op.py in compute_test_value(node)
        124 
        125     # Create a thunk that performs the computation
    --> 126     thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
        127     thunk.inputs = [storage_map[v] for v in node.inputs]
        128     thunk.outputs = [storage_map[v] for v in node.outputs]
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\graph\op.py in make_thunk(self, node, storage_map, compute_map, no_recycling, impl)
        632             )
        633             try:
    --> 634                 return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
        635             except (NotImplementedError, MethodNotDefined):
        636                 # We requested the c code, so don't catch the error.
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\graph\op.py in make_c_thunk(self, node, storage_map, compute_map, no_recycling)
        599                 raise NotImplementedError("float16")
        600         outputs = cl.make_thunk(
    --> 601             input_storage=node_input_storage, output_storage=node_output_storage
        602         )
        603         thunk, node_input_filters, node_output_filters = outputs
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\link\c\basic.py in make_thunk(self, input_storage, output_storage, storage_map)
       1202         init_tasks, tasks = self.get_init_tasks()
       1203         cthunk, module, in_storage, out_storage, error_storage = self.__compile__(
    -> 1204             input_storage, output_storage, storage_map
       1205         )
       1206 
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\link\c\basic.py in __compile__(self, input_storage, output_storage, storage_map)
       1140             input_storage,
       1141             output_storage,
    -> 1142             storage_map,
       1143         )
       1144         return (
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\link\c\basic.py in cthunk_factory(self, error_storage, in_storage, out_storage, storage_map)
       1632             for node in self.node_order:
       1633                 node.op.prepare_node(node, storage_map, None, "c")
    -> 1634             module = get_module_cache().module_from_key(key=key, lnk=self)
       1635 
       1636         vars = self.inputs + self.outputs + self.orphans
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\link\c\cmodule.py in module_from_key(self, key, lnk)
       1189             try:
       1190                 location = dlimport_workdir(self.dirname)
    -> 1191                 module = lnk.compile_cmodule(location)
       1192                 name = module.__file__
       1193                 assert name.startswith(location)
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\link\c\basic.py in compile_cmodule(self, location)
       1548                     lib_dirs=self.lib_dirs(),
       1549                     libs=libs,
    -> 1550                     preargs=preargs,
       1551                 )
       1552             except Exception as e:
    
    C:\ProgramData\Anaconda3\envs\pm3env\lib\site-packages\theano\link\c\cmodule.py in compile_str(module_name, src_code, location, include_dirs, lib_dirs, libs, preargs, py_module, hide_symbols)
       2545             compile_stderr = compile_stderr.replace("\n", ". ")
       2546             raise Exception(
    -> 2547                 f"Compilation failed (return status={status}): {compile_stderr}"
       2548             )
       2549         elif config.cmodule__compilation_warning and compile_stderr:
    
    Exception: ("Compilation failed (return status=1): C:\\Users\\sreedatta\\AppData\\Local\\Theano\\compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_140_Stepping_1_GenuineIntel-3.7.9-64\\tmp81eb_5sq\\mod.cpp: In member function 'int {anonymous}::__struct_compiled_op_m67599e776bb0a5edbe20464e4ef6902fada5652e9f038845aa3f408620203691::run()':. C:\\Users\\sreedatta\\AppData\\Local\\Theano\\compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_140_Stepping_1_GenuineIntel-3.7.9-64\\tmp81eb_5sq\\mod.cpp:506:39: warning: narrowing conversion of 'V5_n0' from 'npy_intp' {aka 'long long int'} to 'int' inside { } [-Wnarrowing].      int init_totals[2] = {V5_n0, V1_n1};.                                        ^. C:\\Users\\sreedatta\\AppData\\Local\\Theano\\compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_140_Stepping_1_GenuineIntel-3.7.9-64\\tmp81eb_5sq\\mod.cpp:506:39: warning: narrowing conversion of 'V1_n1' from 'npy_intp' {aka 'long long int'} to 'int' inside { } [-Wnarrowing]. C:\\Users\\sreedatta\\AppData\\Local\\Theano\\compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_140_Stepping_1_GenuineIntel-3.7.9-64\\tmp81eb_5sq\\mod.cpp:521:5: warning: narrowing conversion of 'V5_stride0' from 'ssize_t' {aka 'long long int'} to 'int' inside { } [-Wnarrowing].      };.      ^. C:\\Users\\sreedatta\\AppData\\Local\\Theano\\compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_140_Stepping_1_GenuineIntel-3.7.9-64\\tmp81eb_5sq\\mod.cpp:521:5: warning: narrowing conversion of 'V1_stride0' from 'ssize_t' {aka 'long long int'} to 'int' inside { } [-Wnarrowing]. C:\\Users\\sreedatta\\AppData\\Local\\Theano\\compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_140_Stepping_1_GenuineIntel-3.7.9-64\\tmp81eb_5sq\\mod.cpp:521:5: warning: narrowing conversion of 'V1_stride1' from 'ssize_t' {aka 'long long int'} to 'int' inside { } [-Wnarrowing]. At global scope:. cc1plus.exe: warning: unrecognized command line option '-Wno-c++11-narrowing'. 
C:\\Users\\SREEDATTA\\AppData\\Local\\Temp\\ccjNNew1.s: Assembler messages:\r. C:\\Users\\SREEDATTA\\AppData\\Local\\Temp\\ccjNNew1.s:4410: Error: invalid register for .seh_savexmm\r. ", 'FunctionGraph(Elemwise{mul}(<TensorType(float64, col)>, <TensorType(int8, (True, True))>))')
    

    Can one of you help?

    Sree

    opened by sreedat 20
  • `Model.predict()` generates unexpected out-of-sample predictions for a mixed effects model


    Hello,

    first of all, thanks for the great work on this project, I've been using bambi a lot and it has been super helpful!

    I'm currently facing a (potential) issue when trying to make out-of-sample predictions for a logistic regression model built with the following formula:

    y ~ x1 + x2 + x3 + (0 + x2|x1) + (0 + x3|x1)

    where x1 and x2 are categorical variables with two levels each and x3 is a continuous variable.

    The out-of-sample data I'm trying to make predictions for looks like this (illustrative example):

    | x1 | x2 | x3  |
    |----|----|-----|
    | 0  | 0  | 0   |
    | 0  | 0  | 0.5 |
    | 0  | 0  | 1   |
    | 0  | 0  | 1.5 |
    | 1  | 0  | 0   |
    | 1  | 0  | 0.5 |
    | 1  | 0  | 1   |
    | 1  | 0  | 1.5 |

    There was no error when running `model.predict(iData, data=out_of_sample_data, kind='mean')`. However, the spaghetti plot I generated from the posterior predictions looked off when x1 == 1: the variance was much bigger than I expected. (I noticed this because I had manually made a plot displaying the 0.5 decision boundary for x3, i.e. the mean value and HDI of x3 where there is a 50% probability of a positive outcome, and that didn't match what I saw in the spaghetti plot.)

    I then had a look at the code and noticed that the Z matrix generated in the predict method in models.py looked different from what I expected. Here's the code bit I'm referring to (last line):

            if self._design.group:
                if in_sample:
                    Z = self._design.group.design_matrix
                else:
                    Z = self._design.group._evaluate_new_data(data).design_matrix
    

    What I got for Z was the following:

    |   |   |   |   |     |     |
    |---|---|---|---|-----|-----|
    | 1 | 0 | 0 | 0 | 0   | 0   |
    | 1 | 0 | 0 | 0 | 0.5 | 0   |
    | 1 | 0 | 0 | 0 | 1   | 0   |
    | 1 | 0 | 0 | 0 | 1.5 | 0   |
    | 0 | 0 | 1 | 0 | 0   | 0   |
    | 0 | 0 | 1 | 0 | 0   | 0.5 |
    | 0 | 0 | 1 | 0 | 0   | 1   |
    | 0 | 0 | 1 | 0 | 0   | 1.5 |

    ...but what I was expecting (after trying to make sense of it) was this:

    |   |   |   |   |     |     |
    |---|---|---|---|-----|-----|
    | 1 | 0 | 0 | 0 | 0   | 0   |
    | 1 | 0 | 0 | 0 | 0.5 | 0   |
    | 1 | 0 | 0 | 0 | 1   | 0   |
    | 1 | 0 | 0 | 0 | 1.5 | 0   |
    | 0 | 1 | 0 | 0 | 0   | 0   |
    | 0 | 1 | 0 | 0 | 0   | 0.5 |
    | 0 | 1 | 0 | 0 | 0   | 1   |
    | 0 | 1 | 0 | 0 | 0   | 1.5 |

    so basically the second and third columns are swapped. I then added the following line:

    Z[:, [1, 2]] = Z[:, [2, 1]]
    

    to achieve that and the spaghetti plot I then generated matched my expectation.
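    The column swap described above can be checked in isolation with plain NumPy. This is only a sketch: the two arrays are copied from the matrices quoted in this issue, the names `Z_reported`, `Z_expected`, and `Z_fixed` are made up for illustration, and nothing here calls Bambi itself.

```python
import numpy as np

# Z as reported in the issue (8 out-of-sample rows, 6 group-specific columns).
Z_reported = np.array([
    [1, 0, 0, 0, 0.0, 0.0],
    [1, 0, 0, 0, 0.5, 0.0],
    [1, 0, 0, 0, 1.0, 0.0],
    [1, 0, 0, 0, 1.5, 0.0],
    [0, 0, 1, 0, 0.0, 0.0],
    [0, 0, 1, 0, 0.0, 0.5],
    [0, 0, 1, 0, 0.0, 1.0],
    [0, 0, 1, 0, 0.0, 1.5],
])

# Z as the reporter expected it (second and third columns exchanged).
Z_expected = np.array([
    [1, 0, 0, 0, 0.0, 0.0],
    [1, 0, 0, 0, 0.5, 0.0],
    [1, 0, 0, 0, 1.0, 0.0],
    [1, 0, 0, 0, 1.5, 0.0],
    [0, 1, 0, 0, 0.0, 0.0],
    [0, 1, 0, 0, 0.0, 0.5],
    [0, 1, 0, 0, 0.0, 1.0],
    [0, 1, 0, 0, 0.0, 1.5],
])

# Work on a copy so the reported matrix stays available for comparison.
Z_fixed = Z_reported.copy()

# Indexing with a list returns a copy of the selected columns, so the
# right-hand side is fully read before the assignment overwrites anything.
Z_fixed[:, [1, 2]] = Z_fixed[:, [2, 1]]

print(np.array_equal(Z_fixed, Z_expected))
```

    Note that this only reproduces the manual workaround. It does not show why the columns come out in that order for new data.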

    It would be great if someone could have a look at this and fix it properly (if it really is an issue and not me making a mistake). I hope it was clear enough; if not, let me know!

    bug 
    opened by LeonieMei 19
  • Odd packages being pulled in via conda-forge


    Trying to mamba install bambi into an existing environment:

      + astor                 0.8.1  pyh9f0ad1d_0         conda-forge/noarch        Cached
      + aws-c-auth           0.6.21  hd46a2a8_1           conda-forge/osx-arm64     Cached
      + aws-c-compression    0.2.16  ha56e2a8_0           conda-forge/osx-arm64     Cached
      + aws-c-http           0.6.29  hb329ca4_0           conda-forge/osx-arm64     Cached
      + aws-c-mqtt           0.7.13  h5519404_10          conda-forge/osx-arm64     Cached
      + aws-c-s3              0.2.1  hb7b86b9_2           conda-forge/osx-arm64     Cached
      + aws-c-sdkutils        0.1.7  ha56e2a8_0           conda-forge/osx-arm64     Cached
      + aws-crt-cpp         0.18.16  h8bc873b_5           conda-forge/osx-arm64     Cached
      + bambi                 0.9.3  pyhd8ed1ab_0         conda-forge/noarch        Cached
      + base58                2.1.1  pyhd8ed1ab_0         conda-forge/noarch        Cached
      + boto3               1.26.37  pyhd8ed1ab_0         conda-forge/noarch        Cached
      + botocore            1.29.37  pyhd8ed1ab_0         conda-forge/noarch        Cached
      + formulae              0.3.4  pyhd8ed1ab_0         conda-forge/noarch        Cached
      + jmespath              1.0.1  pyhd8ed1ab_0         conda-forge/noarch        Cached
      + libarrow             10.0.1  hc5f4219_3_cpu       conda-forge/osx-arm64     Cached
      + libgrpc              1.51.1  h55edf5b_0           conda-forge/osx-arm64     Cached
      + s3transfer            0.6.0  pyhd8ed1ab_0         conda-forge/noarch        Cached
    
      Change:
    ────────────────────────────────────────────────────────────────────────────────────────
    
      - cryptography         38.0.4  py310h4fe9c50_0      conda-forge
      + cryptography         38.0.4  py310hfc83b78_0      conda-forge/osx-arm64     Cached
      - curl                 7.87.0  hbe9bab4_0           conda-forge
      + curl                 7.87.0  h9049daf_0           conda-forge/osx-arm64     Cached
      - hdf5                 1.12.2  nompi_h55deafc_101   conda-forge
      + hdf5                 1.12.2  nompi_ha7af310_101   conda-forge/osx-arm64     Cached
      - krb5                 1.20.1  h127bd45_0           conda-forge
      + krb5                 1.20.1  h69eda48_0           conda-forge/osx-arm64     Cached
      - libcurl              7.87.0  hbe9bab4_0           conda-forge
      + libcurl              7.87.0  h9049daf_0           conda-forge/osx-arm64     Cached
      - libevent             2.1.10  hbae9a57_4           conda-forge
      + libevent             2.1.10  h7673551_4           conda-forge/osx-arm64     Cached
      - libnghttp2           1.47.0  h232270b_1           conda-forge
      + libnghttp2           1.47.0  h519802c_1           conda-forge/osx-arm64     Cached
      - libssh2              1.10.0  hb80f160_3           conda-forge
      + libssh2              1.10.0  h7a5bd25_3           conda-forge/osx-arm64     Cached
      - libthrift            0.16.0  h1a74c4f_2           conda-forge
      + libthrift            0.16.0  h6635e49_2           conda-forge/osx-arm64     Cached
      - libzip                1.9.2  h96606af_1           conda-forge
      + libzip                1.9.2  h76ab92c_1           conda-forge/osx-arm64     Cached
      - python               3.10.8  hf452327_0_cpython   conda-forge
      + python               3.10.8  h3ba56d0_0_cpython   conda-forge/osx-arm64     Cached
      - sigtool               0.1.3  h7747421_0           conda-forge
      + sigtool               0.1.3  h44b9a77_0           conda-forge/osx-arm64     Cached
    
      Upgrade:
    ────────────────────────────────────────────────────────────────────────────────────────
    
      - arrow-cpp             9.0.0  py310h5547a8d_2_cpu  conda-forge
      + arrow-cpp            10.0.1  h44b9a77_3_cpu       conda-forge/osx-arm64     Cached
      - aws-c-cal            0.5.11  h4530763_0           conda-forge
      + aws-c-cal            0.5.20  h7a1267a_3           conda-forge/osx-arm64     Cached
      - aws-c-common          0.6.2  h3422bc3_0           conda-forge
      + aws-c-common          0.8.5  h1a8c8d9_0           conda-forge/osx-arm64     Cached
      - aws-c-event-stream    0.2.7  h9972306_13          conda-forge
      + aws-c-event-stream   0.2.16  h828d2a8_0           conda-forge/osx-arm64     Cached
      - aws-c-io             0.10.5  hea86ef8_0           conda-forge
      + aws-c-io            0.13.11  h2ec9475_2           conda-forge/osx-arm64     Cached
      - aws-checksums        0.1.11  h487e1a8_7           conda-forge
      + aws-checksums        0.1.14  ha56e2a8_0           conda-forge/osx-arm64     Cached
      - aws-sdk-cpp         1.8.186  h392f50b_4           conda-forge
      + aws-sdk-cpp         1.9.379  he22300a_6           conda-forge/osx-arm64     Cached
      - grpc-cpp             1.46.4  haeec53e_7           conda-forge
      + grpc-cpp             1.51.1  h44b9a77_0           conda-forge/osx-arm64       21kB
      - libgoogle-cloud       2.1.0  hec15cb4_1           conda-forge
      + libgoogle-cloud       2.5.0  hcf11473_1           conda-forge/osx-arm64     Cached
      - libprotobuf          3.20.2  hb5ab8b9_0           conda-forge
      + libprotobuf         3.21.12  hb5ab8b9_0           conda-forge/osx-arm64     Cached
      - openssl              1.1.1s  h03a7124_1           conda-forge
      + openssl               3.0.7  h03a7124_1           conda-forge/osx-arm64     Cached
      - orc                   1.7.6  hb9d09c9_0           conda-forge
      + orc                   1.8.1  hef0d403_0           conda-forge/osx-arm64     Cached
      - protobuf             3.20.2  py310h0f1eb42_1      conda-forge
      + protobuf            4.21.12  py310h0f1eb42_0      conda-forge/osx-arm64      287kB
      - pyarrow               9.0.0  py310had0e577_2_cpu  conda-forge
      + pyarrow              10.0.1  py310ha7868e4_3_cpu  conda-forge/osx-arm64        3MB
    
      Downgrade:
    ────────────────────────────────────────────────────────────────────────────────────────
    
      - click                 8.1.3  unix_pyhd8ed1ab_2    conda-forge
      + click                 8.0.4  py310hbe9552e_0      conda-forge/osx-arm64      153kB
      - streamlit            1.16.0  pyhd8ed1ab_0         conda-forge
      + streamlit             1.9.0  pyhd8ed1ab_0         conda-forge/noarch        Cached
    
      Summary:
    
      Install: 17 packages
      Change: 12 packages
      Upgrade: 14 packages
      Downgrade: 2 packages
    
      Total download: 3MB
    

    Some of the downgrades (like streamlit) should not happen. I also don't see why all these aws packages should be installed.

    opened by twiecki 3
  • TruncatedNormal in Hierarchical Setting


    Hi All, Merry Christmas.

    Bounds are working:

        prior = {'x1': bmb.Prior("TruncatedNormal", mu=10, sigma=5, lower=0, upper=50)}
        bmb.model('y ~ x1 + x2 + x3')

    Bounds are not working:

        prior = {'x1|level': bmb.Prior('Normal', mu=bmb.Prior("TruncatedNormal", mu=10, sigma=5, lower=0, upper=50), sigma=5)}
        bmb.model('y ~ x1|level + x2 + x3')

    I am a noob. Any guidance is appreciated. Thanks
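    One possible reading of the second case, sketched with plain NumPy rather than Bambi: if only the mean of a Normal prior is given a TruncatedNormal hyperprior, the mean stays inside the bounds, but draws from the Normal around it are not themselves bounded. The helper `truncated_normal` below is a hypothetical rejection sampler written for this illustration. The numbers mirror the priors quoted above, and whether this matches what Bambi builds internally is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 10_000


def truncated_normal(rng, mu, sigma, lower, upper, size):
    """Draw from Normal(mu, sigma) restricted to [lower, upper] by rejection."""
    kept = np.empty(0)
    while kept.size < size:
        draws = rng.normal(mu, sigma, size=size)
        inside = draws[(draws >= lower) & (draws <= upper)]
        kept = np.concatenate([kept, inside])
    return kept[:size]


# Hyperprior: the mean of the group-specific slopes is bounded to [0, 50].
mu_draws = truncated_normal(rng, mu=10, sigma=5, lower=0, upper=50, size=n_draws)

# Group-specific slopes: Normal(mu, sigma=5) centred on that bounded mean.
# The Normal itself has no bounds, so individual slopes can leave [0, 50].
slope_draws = rng.normal(loc=mu_draws, scale=5)

frac_negative = (slope_draws < 0).mean()
print(f"smallest hyperprior mean: {mu_draws.min():.3f}")
print(f"share of slopes below 0:  {frac_negative:.3f}")
```

    Under this reading, bounding the hyperprior mean alone would not keep the group-specific slopes inside the interval.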

    opened by SuryaMudimi 1
  • bambi example of bayesian meta-analysis?


    I don't know if anyone using bambi has conducted meta-analyses, but if so I would love to see an example. As a potential example source, there's a very nice example using brms here. If not, I will try (and most likely fail) to recreate this example and then ask questions.

    documentation 
    opened by don-jil 1
  • Figure in Sleepstudy example not being reproducible


    Seems the code used to generate the scatterplot is not working anymore

    image

    I'm trying this but I don't like the end result

    idata.posterior.plot.scatter(
        x="1|Subject", y="Days|Subject",
        hue="Subject__factor_dim",
        hue_style="discrete",
        add_colorbar=False,
        add_legend=False,
        ax=ax
    )
    

    image

    I have ArviZ 0.14.0 and xarray 2022.11.0

    documentation good first issue 
    opened by tomicapretto 2
  • Add Governance


    I've been chatting with @aloctavodia about adding a Governance structure and document to the project. The current governance of ArviZ looks appealing as a starting point.

    Anyone who reads this and is interested in participating in the governance of Bambi, please comment here or send me a message. The ultimate goal is to make this a more transparent and robust project that doesn't rely completely, and by (lack of) design, on one or two self-selected core developers.

    Discussion 
    opened by tomicapretto 2
Releases(0.9.3)
  • 0.9.3(Dec 21, 2022)

  • 0.9.2(Dec 9, 2022)

    New features

    • Implement censored() (#581)
    • Add Formula class (#585)
    • Add common numpy transforms to extra_namespace (#589)
    • Add AsymmetricLaplace family for Quantile Regression (#591)
    • Add 'transforms' argument to plot_cap() (#594)
    • Add panel covariates to plot_cap() and make it more flexible (#596)
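The `censored()` entry above concerns censored responses. As a rough illustration of the general idea (not Bambi's implementation), a right-censored observation contributes the survival probability P(Y > c) to the likelihood instead of a density. The sketch below assumes a normal response; all function names are hypothetical.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a Normal(mu, sigma) at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logsf(y, mu, sigma):
    """Log of P(Y > y) for Y ~ Normal(mu, sigma)."""
    z = (y - mu) / sigma
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))


def censored_normal_loglik(values, is_right_censored, mu, sigma):
    """Sum log densities for observed values and log survival
    probabilities for right-censored ones."""
    total = 0.0
    for y, censored in zip(values, is_right_censored):
        if censored:
            total += normal_logsf(y, mu, sigma)
        else:
            total += normal_logpdf(y, mu, sigma)
    return total


# Toy data (made up): the last two values were only observed to exceed 2.0.
values = [1.2, 0.7, 2.0, 2.0]
flags = [False, False, True, True]
loglik = censored_normal_loglik(values, flags, mu=1.5, sigma=1.0)
```

The point of the sketch is only that censored and uncensored rows use different likelihood terms; how Bambi wires this into a model formula is documented in the project's own docs.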

    Maintenance and fixes

    • Reimplemented predictions to make better usage of xarray data structures (#573)
    • Keep 0 dimensional parameters as 0 dimensional instead of 1 dimensional (#575)
    • Refactor terms for modularity and extensibility (#582)
    • Remove seed argument from model.initial_point() (#592)
    • Add build check function on prior predictive and plot prior (#605)

    Documentation

    • Add quantile regression example (#608)
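For readers unfamiliar with quantile regression: maximizing an asymmetric Laplace likelihood in its location parameter is equivalent to minimizing the check (pinball) loss, whose minimizer is a quantile. The following is a minimal standard-library sketch of that loss on made-up data, with hypothetical function names; it does not use Bambi.

```python
def pinball_loss(y, q, tau):
    """Check (pinball) loss of predicting q for observation y at level tau."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def total_loss(data, q, tau):
    """Sum of pinball losses over all observations."""
    return sum(pinball_loss(y, q, tau) for y in data)


def best_constant(data, tau):
    """Constant prediction minimizing the total pinball loss.

    The loss is convex and piecewise linear in q, so a minimizer is
    attained at one of the observed values; searching those suffices.
    """
    return min(sorted(data), key=lambda q: total_loss(data, q, tau))


# Toy data: the integers 0..10 as floats.
data = [float(i) for i in range(11)]
q10 = best_constant(data, 0.1)
q50 = best_constant(data, 0.5)
q90 = best_constant(data, 0.9)
```

With `tau=0.5` the minimizer is the sample median; smaller or larger values of `tau` pull the minimizer toward the lower or upper tail.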

    Deprecation

    • Remove automatic_priors argument from Model (#603)
    • Remove string option for data input in Model (#604)
  • 0.9.1(Aug 27, 2022)

    Bambi 0.9.1

    New features

    • Add support for jax sampling via numpyro and blackjax samplers (#526)
    • Add Laplace family (#524)
    • Improve Laplace computation and integration (#555 and #563)

    Maintenance and fixes

    • Ensure order variable is preserved when plotting priors (#529)
    • Treat offset accordingly (#534)
    • Refactor tests to share data generation code (#531)

    Documentation

    • Update documentation following good inferencedata practices (#537)
    • Add logos to repo and docs (#542)

    Deprecation

    • Deprecate method argument in favor of inference_method (#554)
  • 0.9.0(Jun 6, 2022)

    New features

    • Bambi now uses PyMC 4.0 as its backend. Most, if not all, of your previous models should run the same without any changes.
    • Add Plot Conditional Adjusted Predictions plot_cap (#517)

    Maintenance and fixes

    • Group specific terms now work with numeric of multiple columns (#516)
  • 0.8.0(May 18, 2022)

    Bambi 0.8.0

    New features

    • Add VonMises ("vonmises") built-in family (#453)
    • Model.predict() gains a new argument include_group_specific to determine if group-specific effects are considered when making predictions (#470)
    • Add Multinomial ("multinomial") built-in family (#490)
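To make the `include_group_specific` entry above concrete, here is a toy sketch of what including or excluding group-specific effects means for a linear predictor. It is plain Python with made-up coefficients and a hypothetical `linear_predictor` helper, not Bambi's `Model.predict()`.

```python
def linear_predictor(x, group, common, group_offsets, include_group_specific=True):
    """Common (population-level) part plus, optionally, a per-group offset."""
    eta = common["Intercept"] + common["x"] * x
    if include_group_specific:
        # Groups without an estimated offset fall back to the common part only.
        eta += group_offsets.get(group, 0.0)
    return eta


# Made-up estimates for illustration.
common = {"Intercept": 2.0, "x": 0.5}
group_offsets = {"a": 1.0, "b": -1.0}

with_groups = linear_predictor(4.0, "a", common, group_offsets)
without_groups = linear_predictor(4.0, "a", common, group_offsets,
                                  include_group_specific=False)
```

Excluding the group-specific part yields predictions for a "typical" group, which is often what is wanted when predicting for groups not seen during fitting.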

    Maintenance and fixes

    • Add posterior predictive sampling method to "categorical" family (#458)
    • Require Python >= 3.7.2 to fix NoReturn type bug in Python (#463)
    • Fixed the built-in link given by link="inverse", which was wrong: it returned the same result as link="cloglog" (#472)
    • Replaced plain dictionaries with namedtuples when same dictionary structure was repeated many times (#472)
    • The function check_full_rank() in utils.py now checks the array is 2 dimensional (#472)
    • Removed _extract_family_prior() from bambi/families as it was unnecessary (#472)
    • Removed bambi/families/utils.py as it was unnecessary (#472)
    • Removed external links and unused datasets (#483)
    • Replaced "_coord_group_factor" with "__factor_dim" and "_coord_group_expr" with "__expr_dim" in dimension/coord names (#499)
    • Fixed a bug related to modifying the types of the columns in the original data frame (#502)
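The link fix listed above is easier to appreciate with the two functions side by side. The sketch below writes out the textbook definitions of the inverse and complementary log-log links and their inverses in plain Python; the function names are hypothetical and this is not Bambi's internal code.

```python
import math


def inverse_link(mu):
    """Inverse link: eta = 1 / mu."""
    return 1.0 / mu


def inverse_linkinv(eta):
    """Inverse of the inverse link: mu = 1 / eta."""
    return 1.0 / eta


def cloglog_link(mu):
    """Complementary log-log link: eta = log(-log(1 - mu)), for 0 < mu < 1."""
    return math.log(-math.log(1.0 - mu))


def cloglog_linkinv(eta):
    """Inverse of cloglog: mu = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


mu = 0.5
eta_inverse = inverse_link(mu)
eta_cloglog = cloglog_link(mu)
```

The two links map the same mean to different values of the linear predictor, so a model silently using one in place of the other would give different fits.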

    Documentation

    • Add circular regression example (#465)
    • Add Categorical regression example (#457)
    • Add Beta regression example (#442)
    • Add Radon Example (#440)
    • Fix typos and clear up writing in some docs (#462)
    • Documented the module bambi/defaults (#472)
    • Improved documentation and made it more consistent (#472)
    • Cleaned Strack RRR example (#479)

    Deprecation

    • Removed old default priors (#474)
    • Removed draws parameter from Model.predict() method (#504)
  • 0.7.1(Jan 15, 2022)

  • 0.7.0(Jan 11, 2022)

    This release includes a mix of new features, fixes, and new examples on our webpage.

    New features

    • Add "categorical" built-in family (#426)
    • Add include_mean argument to the method Model.fit() (#434)
    • Add .set_alias() method to Model (#435)

    Maintenance and fixes

    • Codebase for the PyMC backend has been refactored (#408)
    • Fix examples that averaged posterior values across chains (#429)
    • Fix issue #427 with automatic priors for the intercept term (#430)

    Documentation

    • Add StudentT regression example, thanks to @tjburch (#414)
    • Add B-Spline regression example with cherry blossoms dataset (#416)
    • Add hierarchical linear regression example with sleepstudy dataset (#424)
  • 0.6.3(Sep 17, 2021)

  • 0.6.2(Sep 17, 2021)

  • 0.6.1(Aug 24, 2021)

  • 0.6.0(Aug 9, 2021)

    Many changes are included in this release. Some of the most important changes are

    • New model families (StudentT, Binomial, Beta).
    • In-sample and out-of-sample predictions.
    • Improved sampling performance due to predictor centering when the model contains an intercept.
    • New default priors (similar to rstanarm default priors).
    • It's possible to use potentials.
    • There's a new function to load datasets used throughout examples
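The predictor-centering item above rests on a simple identity: centering a predictor changes the intercept but not the slope, and the original intercept can be recovered afterwards. Centering typically reduces the correlation between intercept and slope, which tends to help samplers. The sketch below shows the identity with ordinary least squares as a stand-in for the Bayesian fit; the data and the `ols_fit` helper are made up for illustration.

```python
def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Toy data with a predictor far from zero.
x = [10.0, 11.0, 12.0, 13.0, 14.0]
y = [25.1, 26.9, 29.2, 30.8, 33.0]

x_mean = sum(x) / len(x)
x_centered = [xi - x_mean for xi in x]

b0, b1 = ols_fit(x, y)                 # fit on the original scale
b0_c, b1_c = ols_fit(x_centered, y)    # fit on the centered predictor

# Map the centered intercept back to the original scale.
b0_recovered = b0_c - b1_c * x_mean
```

On the centered scale the intercept equals the mean of the response, which is usually much better identified than the value of the regression line at `x = 0`.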
  • 0.5.0(May 16, 2021)

    The main changes in this release can be summarized as follows

    • Modified the API. Now all information relative to the model is passed in Model instantiation instead of in Model.fit().
    • Fixed Gamma, Wald, and Negative Binomial families.
    • Changed theme of the webpage and now the documentation is built automatically.
  • 0.4.1(Apr 6, 2021)

    The aim of this release is to update to formulae 0.0.9, which contains several bug fixes. There are also other minor fixes and improvements that can be found in the changelog.

  • 0.4.0(Mar 8, 2021)

  • 0.2.0(Mar 19, 2020)

    This release drops Python 2 support (Python >=3.6 is required) and relies on ArviZ for all plotting, diagnostics, and stats. Support for PyStan has been deprecated. If you would like to contribute to maintaining PyStan support, please contact us. We have made many internal changes to clean up the code and make it easier to maintain.

  • 0.1.0(Apr 1, 2017)

    This release includes numerous new features and improvements: support for Stan, a revamped API, expanded random effect support, considerably better compilation and sampling performance for large models, and better parameterization of random effects, among other changes.

  • 0.0.5(Jan 19, 2017)
