Functional tensors for probabilistic programming

Overview

Build Status Latest Version Documentation Status

Funsor

Funsor is a tensor-like library for functions and distributions.

See Functional tensors for probabilistic programming for a system description.

Installing

Install using pip:

Funsor supports Python 3.6+.

pip install funsor

Install from source:

git clone [email protected]:pyro-ppl/funsor.git
cd funsor
git checkout master
pip install .

Using funsor

Funsor can be used through a number of interfaces:

  • Funsors can be used directly for probabilistic computations, using PyTorch optimizers in a standard training loop. Start with these examples: discrete_hmm, eeg_slds, kalman_filter, pcfg, sensor, slds, and vae.
  • Funsors can be used to implement custom inference algorithms within Pyro, using custom elbo implementations in standard pyro.infer.SVI training. See these examples: mixed_hmm and bart forecasting.
  • funsor.pyro provides a number of Pyro-compatible (and PyTorch-compatible) distribution classes that use funsors under the hood, as well utilities to convert between funsors and distributions.
  • funsor.minipyro provides a limited alternate backend for the Pyro probabilistic programming language, and can perform some ELBO computations exactly.

Design

See design doc.

The goal of this library is to generalize Pyro's delayed inference algorithms from discrete to continuous variables, and to create machinery to enable partially delayed sampling compatible with universality. To achieve this goal this library makes three orthogonal design choices:

  1. Open terms are objects. Funsors generalize the tensor interface to also cover arbitrary functions of multiple variables ("inputs"), where variables may be integers, real numbers, or real tensors. Function evaluation / substitution is the basic operation, generalizing tensor indexing. This allows probability distributions to be first-class Funsors and make use of existing tensor machinery, for example we can generalize tensor contraction to computing analytic integrals in conjugate probabilistic models.

  2. Support nonstandard interpretation. Funsors support user-defined interpretations, including, eager, lazy, mixed eager+lazy, memoized (like opt_einsum's sharing), and approximate interpretations like Monte Carlo approximations of integration operations (e.g. .sum() over a funsor dimension).

  3. Named dimensions. Substitution is the most basic operation of Funsors. To avoid the difficulties of broadcasting and advanced indexing in positionally-indexed tensor libraries, all Funsor dimensions are named. Indexing uses the .__call__() method and can be interpreted as substitution (with well-understood semantics). Funsors are viewed as algebraic expressions with one algebraic free variable per dimension. Each dimension is either covariant (an output) or contravariant (an input).

Using funsor we can easily implement Pyro-style delayed sampling, roughly:

trace_log_prob = 0.

def pyro_sample(name, dist, obs=None):
    assert isinstance(dist, Funsor)
    if obs is not None:
        value = obs
    elif lazy:
        # delayed sampling (like Pyro's parallel enumeration)
        value = funsor.Variable(name, dist.support)
    else:
        value = dist.sample('value')[0]['value']

    # save log_prob in trace
    trace_log_prob += dist(value)

    return value

# ...later during inference...
loss = -trace_log_prob.reduce(logaddexp)  # collapses delayed variables

See funsor/minipyro.py for complete implementation.

Related projects

Citation

If you use Funsor, please consider citing:

@article{obermeyer2019functional,
  author = {Obermeyer, Fritz and Bingham, Eli and Jankowiak, Martin and
            Phan, Du and Chen, Jonathan P},
  title = {{Functional Tensors for Probabilistic Programming}},
  journal = {arXiv preprint arXiv:1910.10775},
  year = {2019}
}
Comments
  • Profiling GaussianHMM with NumPyro backend

    Profiling GaussianHMM with NumPyro backend

    I just do profiling funsor.pyro.hmm.GaussianHMM and pyro.distributions.GaussianHMM and have some observations:

    • funsor is slow comparing to pyro, especially for small hidden/obs dims. With batch_dim, time_dim, obs_dim, hidden_dim = 5, 6, 3, 2, pyro takes 5ms while funsor takes 35ms to evaluate log_prob.
    • The overhead in funsor seems to be constant. I increased obs_dim, hidden_dim to 30, 20 and verified that .
    • torch.cat takes a large amount of time in funsor (e.g. with batch_dim, time_dim = 50, 60, this op takes 7ms per the total 70ms). Similarly, torch.pad takes a portion of time in pyro (but the time is less than torch.cat in funsor). I think the reason is we replace pad by new_zeros + cat in funsor.
    • Time for numerical calculation (except cat, pad) seems to be similar between two versions. It seems to me that at some points, funsor switch dimensions. I guess this is due to the fact that in funsor, we store dimensions in a set. For example, I looked at args of triangular solve and got (this is not important I believe)
    funsor torch.Size([5, 3, 2, 4]) torch.Size([5, 3, 2, 2])
    funsor torch.Size([5, 3, 2, 1]) torch.Size([5, 3, 2, 2])
    funsor torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    funsor torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    funsor torch.Size([1, 5, 2, 4]) torch.Size([1, 5, 2, 2])
    funsor torch.Size([1, 5, 2, 1]) torch.Size([1, 5, 2, 2])
    funsor torch.Size([5, 2, 2]) torch.Size([5, 2, 2])
    funsor torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    funsor torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    CPU times: user 287 ms, sys: 30.9 ms, total: 318 ms
    Wall time: 28.3 ms
    pyro torch.Size([5, 3, 2, 4]) torch.Size([5, 3, 2, 2])
    pyro torch.Size([5, 3, 2, 1]) torch.Size([5, 3, 2, 2])
    pyro torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 2, 2]) torch.Size([5, 2, 2])
    pyro torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    pyro torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    CPU times: user 70.4 ms, sys: 1.17 ms, total: 71.6 ms
    Wall time: 5.84 ms
    
    Profiling code
    import pyro.distributions as dist
    import pytest
    import torch
    from pyro.distributions.util import broadcast_shape
    
    from funsor.pyro.hmm import DiscreteHMM, GaussianHMM, GaussianMRF, SwitchingLinearHMM
    from funsor.testing import assert_close, random_mvn
    
    batch_dim, time_dim, obs_dim, hidden_dim = 5, 6, 3, 2
    
    init_shape = (batch_dim,)
    trans_mat_shape = trans_mvn_shape = obs_mat_shape = obs_mvn_shape = (batch_dim, time_dim)
    init_dist = random_mvn(init_shape, hidden_dim)
    trans_mat = torch.randn(trans_mat_shape + (hidden_dim, hidden_dim))
    trans_dist = random_mvn(trans_mvn_shape, hidden_dim)
    obs_mat = torch.randn(obs_mat_shape + (hidden_dim, obs_dim))
    obs_dist = random_mvn(obs_mvn_shape, obs_dim)
    
    actual_dist = GaussianHMM(init_dist, trans_mat, trans_dist, obs_mat, obs_dist)
    expected_dist = dist.GaussianHMM(init_dist, trans_mat, trans_dist, obs_mat, obs_dist)
    assert actual_dist.batch_shape == expected_dist.batch_shape
    assert actual_dist.event_shape == expected_dist.event_shape
    
    shape = broadcast_shape(init_shape + (1,),
                            trans_mat_shape, trans_mvn_shape,
                            obs_mat_shape, obs_mvn_shape)
    data = obs_dist.expand(shape).sample()
    assert data.shape == actual_dist.shape()
    
    %time actual_log_prob = actual_dist.log_prob(data)
    %time expected_log_prob = expected_dist.log_prob(data)
    

    With the above observations, I think that using jax.jit will resolve this "constant" overhead. So I would like to go ahead and implement GaussianHMM in NumPyro. This requires modifications of current funsor.pyro to make it work with NumPyro. All changes in this PR are just for profiling, not to merge. Hopefully, jax.jit will work well with lazily evaluation mechanisms in funsor.

    Tasks

    • [x] todo
    discussion 
    opened by fehiepsi 14
  • Support global backend

    Support global backend

    Addresses #207

    Follow up discussions at #317, this PR supports global backend in funsor.

    To make it easier for reviewing, I list here important changes in this PR:

    • Separate out jax and numpy implementation. I also take this chance to reorder those ops according to their names (to make it easier to compare different backends or to find some specific implementation).
    • Use numpy for default implementation of numeric ops. funsor.torch or funsor.jax are lazily imported, depending on the environment variable FUNSOR_BACKEND or the usage of funsor.set_backend(...).
    • We can change backend using the utility set_backend. There is a problem: if we have change backend to JAX, we can't revert back to numpy backend.
    • Port Pyro Tensordot implementation to funsor. Pyro implementation has some statements x.dim(), which is not available for ndarray
    • Despite that we will use numpy backend by default, I still use PyTorch names (except for amax, amin) for those ops (like the current master branch). I guess Pyro devs would like PyTorch names than NumPy names for numeric ops.

    Some small changes:

    • replace torch._C._get_tracing_state by funsor.util.get_tracing_state
    • add funsor.util.is_nn_module
    • add ops.is_numeric_array to check if a variable is a numeric array. We can rename it to ops.is_tensor to match torch.is_tensor if prefered.
    • add ops.detach to detach gradient. This is no op for numpy backend.
    • fix issues of ops.log which gives warnings at 0. value
    • support both tuple or iterator for testing utilities randn, zeros,... (because I found so many places in test files that use that pattern torch.zeros(1, 2, 3) instead of torch.zeros((1, 2, 3))).
    • dispatch for recursion_reinterpret.register(torch.Tensor) is implemented in torch.py file, instead of interpreter.py file. (similar for children, to_funsor, allclose)
    • others are cleanup and reordering changes

    Tests pass with all backends

    • [x] test_affine
    • [x] test_alpha_conversion
    • [x] test_cnf
    • [x] test_delta
    • [x] test_gaussian
    • [x] test_import
    • [x] test_integrate
    • [x] test_joint
    • [x] test_sum_product
    • [x] test_tensor
    • [x] test_terms

    Tests only work with torch/pyro

    • test/pyro/
    • test/test_minipyro.py
    • test/test_adjoint.py: requires einsum.adjoint.require_backward
    • test/test_distributions.py: refactoring
    • test/test_einsum.py
    • test/test_memoize.py: requires some einsum stuffs
    • test/test_optimizer.py: require distributions and einsum

    TODO:

    • [x] separate numpy and jax backend.
    • [x] revise DICE logic
    awaiting review 
    opened by fehiepsi 12
  • Add a single normal form for sum-product contractions

    Add a single normal form for sum-product contractions

    Resolves #156

    As outlined in #156 I'm using this as a staging branch for the refactoring PRs associated with the proposed changes there:

    • [x] #157 Original PR (diff)
    • [x] #173 Refactor Affine into Contraction (https://github.com/pyro-ppl/funsor/pull/157/commits/e20978978283998341f5ed75a755cebc76fa2bab)
    • [x] #165 Refactor optimizer to use Contraction and normalize (https://github.com/pyro-ppl/funsor/pull/157/commits/a4b2ec29b10f6649d86cf9fee24d56e574b279c8)
    • [x] #169 Refactor Joint into multivariate Delta and Contraction (https://github.com/pyro-ppl/funsor/pull/157/commits/d2c79b0c8cdab99b370ca160de53dc8d62469dd9)
    • [x] Separate unfolding from normalization (https://github.com/pyro-ppl/funsor/pull/157/commits/ffdd38176567242fde59ab1f070488ab98cce72e)
    • [x] Troubleshoot performance issues (mostly caused by repeated unfolding)
    • [x] Use #212 for pattern matching
    • [x] Remove obsolete patterns (diff)
    • [x] Update MultiDelta to keep separate log_densitys for each point (https://github.com/pyro-ppl/funsor/pull/157/commits/30cbcb1c13806991cdeb70e5618799b60cf844b9)
    • [x] Fix regressions in adjoint and einsum
    • [x] Add missing patterns for minipyro
    • [x] Update bound inference tests for Number

    Original description: This PR takes a first step towards the goals in #156 by adding two things:

    1. A Contraction term representing finitary sum-product operations that takes advantage of multipledispatch.Variadic for pattern matching. This is also necessary for #72.
    2. A normalize interpretation that uses associative and distributive properties to put linear expressions into something like DNF, grouping operations into single Contraction terms where possible.

    As a demonstration, I have included patterns and tests illustrating how Joint is equivalent to interleaving eager and normalize.

    Tested:

    • I copied smoke tests and test cases verbatim from test_joint.py, test_gaussian.py, test_affine.py and test_einsum.py
    awaiting review refactor 
    opened by eb8680 10
  • DiCE factors from importance sampling?

    DiCE factors from importance sampling?

    In Tensor.unscaled_sample, a DiCE factor is added to the sample. In Gaussian.unscaled_sample, this isn't necessary because of the parameterization of the delta function.

    So far including DiCE with the samples doesn't sit quite right conceptually. Could it make sense to instead approach this from importance sampling? The default interpretation of adjoint(p) @ f for continuous p might be to use importance sampling with some default proposal q. To maintain differentiability, the importance weights would be p / q.detach(). If samples can be drawn from p, then p can be used for the proposal, and the importance weights would be p / p.detach(), which in this context are the DiCE factors.

    Am I missing some benefits of the DiCE formalism?

    discussion 
    opened by mattwescott 10
  • Generalize partial sum product to Markov models

    Generalize partial sum product to Markov models

    This attempts to generalize partial_sum_product to support elimination of markov variables following discussion with @eb8680 and @fritzo at pyro-ppl/pyro#2687.

    TODO

    • [x] modified_partial_sum_product that eliminates first-order markov variables using parallel-scan algorithm.
    • [x] Add more tests
    • [ ] Rename modified_partial_sum_product or replace partial_sum_product by it?

    Tests

    • [x] _expected_modified_partial_sum_product unrolls one markov dim at a time
    • [x] test_modified_partial_sum_product_0
    • [x] test_modified_partial_sum_product_1
    • [x] test_modified_partial_sum_product_2
    • [x] test_modified_partial_sum_product_3
    • [x] test_modified_partial_sum_product_4
    • [x] test_modified_partial_sum_product_5
    • [x] test_modified_partial_sum_product_6
    • [x] test_modified_partial_sum_product_7: xfail x and y markov vars with different ordinals
    • [x] test_modified_partial_sum_product_8
    • [x] test_modified_partial_sum_product_9
    • [x] test_modified_partial_sum_product_10
    • [x] test_modified_partial_sum_product_11
    • [x] test_modified_partial_sum_product_12: xfail w_curr in time plate but lower ordinal than y and x markov vars
    • [x] test_modified_partial_sum_product_13
    • [x] test_modified_partial_sum_product_14
    • [x] test_modified_partial_sum_product_15: xfail y depends on two time dims time and tones
    • [x] test_modified_partial_sum_product_16: x_prev->y_curr and y_prev->x_curr

    Note

    • Markov sites cannot be within local plates inside (see tests 2 and 10)

    Update

    • Added unrolling of markov vars in local plates to handle tests 2 and 10

    Useful links

    • pyro-ppl/funsor#167
    • pyro-ppl/funsor#204
    • pyro-ppl/funsor#293
    • pyro-ppl/numpyro#686
    • pyro-ppl/funsor#399
    enhancement WIP 
    opened by ordabayevy 9
  • Make funsor.distributions backend agnostic

    Make funsor.distributions backend agnostic

    Blocked by https://github.com/pyro-ppl/pyro-api/pull/19

    This is similar to #326 but doesn't include funsor.numpyro stuff. I have resolved has_rsample issue and make the whole test_distributions backend agnostic.

    awaiting review 
    opened by fehiepsi 9
  • refactor Gaussian to use Tensor api instead of torch.Tensor

    refactor Gaussian to use Tensor api instead of torch.Tensor

    This PR follows Approach 2 in #207 to make Gaussian backend-agnostic. My plan is to

    • check if info_vec is torch.Tensor, then set self.backend = TorchTensor.
    • make backend-specific ops such as triangular_solve, cholesky static methods of their corresponding TorchTensor type. For example,
    x.triangular_solve(y)
    

    will be

    self.backend.triangular_solve(x, y)
    

    It seems that there is a lot of work to do but I hope that future refactoring will be easier after this.

    Tasks

    • ~[] subclass Tensor to TorchTensor~ this is not necessary for now
    • [x] collect and implement all required ~static methods~ ops for TorchTensor
    • [x] use the above ~static methods~ ops in Gaussian
    • [x] do the same for other classes/functions in gaussian.py
    • [x] make sure all tests pass
    • [x] make align_tensors and materialize backend agnostic
    • [x] make numpy backend pass for gaussian smoke test

    Future PRs

    • deal with torch._C._get_tracing_state() statements
    • make sure that numpy backend will work
    awaiting review 
    opened by fehiepsi 9
  • `NotImplementedError` when using `apply_optimizer`

    `NotImplementedError` when using `apply_optimizer`

    This term below comes up in test_pyroapi_funsor:

    from funsor.cnf import Contraction
    from funsor.tensor import Tensor
    import torch
    import funsor.ops as ops
    from funsor import Bint, Real
    from funsor.terms import Unary, Binary, Variable, Number, lazy, to_data
    from funsor.constant import Constant
    from funsor.delta import Delta
    import funsor
    funsor.set_backend("torch")
    
    with lazy:
        term = Contraction(ops.add, ops.mul,
         frozenset({Variable('plate_inner__BOUND_78', Bint[2]), Variable('x__BOUND_77', Real), Variable('plate_outer__BOUND_79', Bint[3])}),  # noqa
         (Unary(ops.exp,
           Contraction(ops.null, ops.add,
            frozenset(),
            (Delta(
              (('x__BOUND_77',
                (Tensor(
                  torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
                  (('plate_outer__BOUND_79',
                    Bint[3],),),
                  'real'),
                 Number(0.0),),),)),
             Constant(
              (('plate_inner__BOUND_78',
                Bint[2],),),
              Tensor(
               torch.tensor([0.0, 0.0, 0.0], dtype=torch.float64),
               (('plate_outer__BOUND_79',
                 Bint[3],),),
               'real')),))),
          Unary(ops.all,
           Binary(ops.eq,
            Tensor(
             torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
             (('plate_outer__BOUND_79',
               Bint[3],),),
             'real'),
            Tensor(
             torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
             (('plate_outer__BOUND_79',
               Bint[3],),),
             'real'))),
          Tensor(
           torch.tensor([[-3.304130052277938, -0.9255234395261538, -1.5122103473560844], [-3.217490519312117, -1.2663745664889694, -1.2900109994655682]],
    dtype=torch.float64),  # noqa
           (('plate_inner__BOUND_78',
             Bint[2],),
            ('plate_outer__BOUND_79',
             Bint[3],),),
           'real'),))
    
    x = to_data(funsor.optimizer.apply_optimizer(term))
    

    Error message:

    Traceback (most recent call last):
      File "hehe.py", line 54, in <module>
        x = to_data(funsor.optimizer.apply_optimizer(term))
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 169, in apply_optimizer
        return interpreter.reinterpret(expr)
      File "/home/ordabayev/repos/funsor/funsor/interpreter.py", line 255, in reinterpret
        return recursion_reinterpret(x)
      File "/home/ordabayev/repos/funsor/funsor/interpreter.py", line 236, in recursion_reinterpret
        return _STACK[-1].interpret(type(x), *map(recursion_reinterpret, children(x)))
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196,
    in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155,
    in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 87, in optimize_contraction_variadic
        return optimize.interpret(Contraction, r, b, v, tuple(ts))
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196,
    in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155,
    in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 145, in optimize_contract_finitary_funsor
        path_end = Contraction(
      File "/home/ordabayev/repos/funsor/funsor/terms.py", line 210, in __call__
        return interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196,
    in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155,
    in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/cnf.py", line 323, in eager_contraction_tensor
        raise NotImplementedError("TODO")
    NotImplementedError: TODO
    
    enhancement 
    opened by ordabayevy 8
  • Fix nan gradients in TraceEnum_ELBO when evaluated eagerly

    Fix nan gradients in TraceEnum_ELBO when evaluated eagerly

    Fixes nan gradients in TraceEnum_ELBO when evaluated eagerly (mentioned in https://github.com/pyro-ppl/funsor/issues/493).

    The problem seems to be that reflect.interpret in line 62 leads to substitutions that have adjoint as a base_interpretation which in turn leads to some extra expressions being added to the tape. Here I fix it by changing interpretation to _old_interpretation.

    This also solves test_adjoint.py::test_sequential_sum_product_adjoint: xfail_param(MarkovProduct, reason="mysteriously doubles adjoint values?") (blocked by #545 ).

    awaiting review 
    opened by ordabayevy 7
  • Avoid creating distributions when inferring value domain

    Avoid creating distributions when inferring value domain

    Addresses #412 pair coded with @eb8680

    This refactors logic in Distribution._infer_value_domain() to avoid creating a temporary dummy distribution if backend distributions implement a .infer_shapes() method.

    awaiting review refactor 
    opened by fritzo 7
  • Switch to sqrt(precision) representation in Gaussian

    Switch to sqrt(precision) representation in Gaussian

    Resolves #567 Adapts @fehiepsi's https://github.com/pyro-ppl/pyro/pull/2019

    This switches the internal Gaussian representation to a numerically stable and space efficient representation

    - Gaussian(info_vec, precision, inputs)
    + Gaussian(white_vec, prec_sqrt, inputs)
    

    In the new parametrization, Gaussians represent the log-density function

    Gaussian(white_vec, prec_sqrt, OrderedDict(x=Reals[n]))
      = -1/2 || x @ prec_sqrt - white_vec ||^2
    

    These two parameters are shaped to efficiently represent low-rank data:

    assert white_vec.shape == batch_shape + (rank,)
    assert prec_sqrt.shape == batch_shape + (dim, rank)
    

    reducing space complexity from O(dim(dim+1)) to O(rank(dim+1)). In my real-world example rank=1, dim=2369, and batch_shape=(1343,), so the space reduction is 30GB → 13MB.

    Computations is cheap in this representation: addition amounts to concatenation, and plate-reduction amounts to transpose and reshape. Some ops are only supported on full-rank Gaussians, and I've added checks based on the new property .is_full_rank. This partial support is ok because the Gaussian funsors that arise in Bayesian models are all full rank due to priors (notwithstanding numerical loss of rank).

    Because the Gaussian funsor is internal, the interface change should not cause breakage of most user code, since most user code uses to_funsor() and to_data() with backend-specific distributions. One broken piece of user code is Pyro's AutoGaussianFunsor which will need an update (and which will be sped up).

    As suggested by @eb8680, I've added some optional kwarg parametrizations and properties to support conversion to other Gaussian representations, e.g. g = Gaussian(mean=..., covariance=..., inputs=...) and g._mean, g._covariance. This allows more Gaussian math to live in gaussian.py.

    Tested

    • [x] added new tests
    • [x] pass existing funsor tests (linear algebra)
    • [x] pass existing funsor tests (patterns)
    • [x] pass NumPyro tests (conjugacy)
    • [x] pass Pyro tests on a branch (https://github.com/pyro-ppl/pyro/pull/2943)
    • [x] check computational cost on the pyro-cov model (results: under 12GB memory)
    awaiting review refactor 
    opened by fritzo 6
  • Feature Request: Add Support for TransformedDistribution with non-trivial event dimensions

    Feature Request: Add Support for TransformedDistribution with non-trivial event dimensions

    https://github.com/pyro-ppl/funsor/blob/1d07af18c21894dd56e2f4f877c7845430c3b729/funsor/distribution.py#L531

    Opening a feature request as per this comment by @fehiepsi

    Also note that the same issue happens when using TruncatedNormal distribution, see this comment.

    enhancement 
    opened by pankajb64 0
  • FR rules for (Gaussian + UniformPolygon).reduce(logaddexp)

    FR rules for (Gaussian + UniformPolygon).reduce(logaddexp)

    It would be great to have reduction rules for uniform likelihoods wrt Gaussian latent variables:

    • [ ] (MyContinuousDist + Uniform).reduce(logaddexp) implemented via .cdf()
    • [ ] (Gaussian + UniformPolygon).reduce(logaddexp) implemented via quadrature (MC or QMC)

    This could be used for Pyro and NumPyro probabilistic programs with constraint-like likelihoods, something like

    def model(data):
        z = pyro.sample("z", Normal(...))
        # Observe a constraint like lb < z < ub.
        pyro.sample("x", Uniform(...), obs=z)
    

    here we can be lazy and compute marginal likelihood via Normal(...).cdf([lb, ub]).

    enhancement 
    opened by fritzo 0
  • Delta integrate pattern

    Delta integrate pattern

    (Separating this from #593 PR)

    This proposes the following logic for the Delta Integrate pattern:

    • If reduced_var is in integrand.inputs then apply substitution to delta and integrand:
    delta = Delta("x",  point, log_density)
    integrand = Variable("x")
    Integrate(delta, integrand, reduced_vars="x")
      => delta(x=point).exp() * integrand(x=point)
      => log_density.exp() * point
    

    where log_density can be a Dice factor or an importance weight in general.

    • If reduced_var is not in integrand.inputs then just reduce delta:
    delta = Delta("x",  point, log_density)
    integrand = Number(3.0)
    Integrate(delta, integrand, reduced_vars="x")
      => delta.reduce(logaddexp, "x").exp() * integrand
      => 1 * 3.0
      => 3.0
    

    where delta is normalized.

    awaiting review 
    opened by ordabayevy 0
  • Provenance funsor

    Provenance funsor

    Provenance tracking implementation in funsor. Here the provenance is the set of (name, point) tuples of RV samples that were in the history of the computations of the tracked term. Substitution results in the multiplication by the density:

    >>> logp_x = Provenance(logp_value, {("x", point)})
    >>> logp_x(x=point)
    logp_value
    

    The Provenance funsor is intended to be used in https://github.com/pyro-ppl/pyro/pull/2893 as a wrapper for ProvenanceTensor.

    enhancement awaiting review 
    opened by ordabayevy 1
  • FR Thompson sampling example

    FR Thompson sampling example

    I came across this cute example of nested semiring dynamic programming in the context of Bayesian optimization. Thompson sampling first performs Bayesian regression, fitting a posterior p(θ|Xs,ys) over parameters θ given a fully-supervised dataset of (X,y) pairs, then repeatedly samples parameters θ and optimizes expected reward E[y|X,θ] over X. That is, at each step

    X_optimal <- arg max_X int_y int_θ y p(y|θ,X)  p(θ) prod_i p(yi|θ,Xi)
    # new choice                       --reward-- prior ---likelihood---
    

    For example in Pyroed p(θ) prod_i p(yi|θ,Xi) is approximated variationally, and E[y|θ,X] is just a tensor contraction (although there it is optimized via annealed Gibbs sampling rather than via dynamic programming, as an aside we could add an annealed Gibbs sampling interpretation to Funsor for sum-product contractions).

    What would be a good example here, where we might leverage Funsor's dynamic programming to implement both the integral and argmax operators?

    examples 
    opened by fritzo 0
Releases(0.4.4)
  • 0.4.4(Dec 28, 2022)

  • 0.4.3(Mar 21, 2022)

    What's Changed

    • Update black by @fritzo in https://github.com/pyro-ppl/funsor/pull/583
    • Extend partial_sum_product() to cases tractable for Gaussians by @fritzo in https://github.com/pyro-ppl/funsor/pull/584
    • update classifiers by @anukaal in https://github.com/pyro-ppl/funsor/pull/586
    • Fix provenance tensor to work with PyTorch 1.11 by @fritzo in https://github.com/pyro-ppl/funsor/pull/588
    • Update to PyTorch 1.11.0 by @fritzo in https://github.com/pyro-ppl/funsor/pull/589
    • Bump to version 0.4.3 by @fritzo in https://github.com/pyro-ppl/funsor/pull/590

    New Contributors

    • @anukaal made their first contribution in https://github.com/pyro-ppl/funsor/pull/586

    Full Changelog: https://github.com/pyro-ppl/funsor/compare/0.4.2...0.4.3

    Source code(tar.gz)
    Source code(zip)
  • 0.4.2(Dec 13, 2021)

    Enhancements

    • #543 Added ProvenanceTensor for torch backend
    • #543 Added Constant funsor
    • #549 Added funsor.recipes.forward_fiter_backward_rsample()
    • #553 Added a Precondition interpretation for Gaussians, plus funsor.recipes.forward_fiter_backward_precondition()
    • #572 Support Funsors in pprint.pprint() and pdb

    Breaking changes

    • #568 Switch to sqrt(precision) representation for Gaussian
    Source code(tar.gz)
    Source code(zip)
  • 0.4.1(Jul 2, 2021)

  • 0.4.0(Jan 24, 2021)

    • Added new sum product algorithms (contributed by @ordabayevy)
    • Added many new conjugacy patterns
    • Automated op registration and made adding new ops easier
    • #419 Support ExtendedDistribution
    • #432 Support IndependentConstraint
    • #397 Allow .reduce(op, vars_not_in_inputs)
    • #430 Add Tuple Funsor
    • #409 Change sarkka_bilmes_product interface
    Source code(tar.gz)
    Source code(zip)
  • 0.3.0(Oct 15, 2020)

    • #380 Drop support for Python 3.5 and PyTorch <=1.5
    • #352 New syntax for funsor.domain.Domains: bint(n) -> Bint[n], reals(*shape) -> Reals[shape]
    • #369 New StatefulInterpretation pattern to simplify context-dependent interpretations
    Source code(tar.gz)
    Source code(zip)
  • 0.2.0(Jul 27, 2020)

    • Supports multiple backends: numpy (default), torch, and jax. To change the backend, use funsor.set_backend("torch") or funsor.set_backend("jax").
    Source code(tar.gz)
    Source code(zip)
  • 0.1.2(Oct 24, 2019)

  • 0.1.1(Oct 24, 2019)

  • 0.1.0(Oct 23, 2019)

Owner
Pyro - Deep Universal Probabilistic Programming
Show you how to integrate Zeppelin with Airflow

Introduction This repository is to show you how to integrate Zeppelin with Airflow. The philosophy behind the ingtegration is to make the transition f

Jeff Zhang 11 Dec 30, 2022
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Dec 25, 2022
A forecasting system dedicated to smart city data

smart-city-predictions System prognostyczny dedykowany dla danych inteligentnych miast Praca inżynierska realizowana przez Michała Stawikowskiego and

Kevin Lai 1 Nov 08, 2021
Two phase pipeline + StreamlitTwo phase pipeline + Streamlit

Two phase pipeline + Streamlit This is an example project that demonstrates how to create a pipeline that consists of two phases of execution. In betw

Rick Lamers 1 Nov 17, 2021
Generates a simple report about the current Covid-19 cases and deaths in Malaysia

Generates a simple report about the current Covid-19 cases and deaths in Malaysia. Results are delay one day, data provided by the Ministry of Health Malaysia Covid-19 public data.

Yap Khai Chuen 7 Dec 15, 2022
This is an example of how to automate Ridit Analysis for a dataset with large amount of questions and many item attributes

This is an example of how to automate Ridit Analysis for a dataset with large amount of questions and many item attributes

Ishan Hegde 1 Nov 17, 2021
Open source platform for Data Science Management automation

Hydrosphere examples This repo contains demo scenarios and pre-trained models to show Hydrosphere capabilities. Data and artifacts management Some mod

hydrosphere.io 6 Aug 10, 2021
Pypeln is a simple yet powerful Python library for creating concurrent data pipelines.

Pypeln Pypeln (pronounced as "pypeline") is a simple yet powerful Python library for creating concurrent data pipelines. Main Features Simple: Pypeln

Cristian Garcia 1.4k Dec 31, 2022
OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere.

opendrift OpenDrift is a software for modeling the trajectories and fate of objects or substances drifting in the ocean, or even in the atmosphere. Do

OpenDrift 167 Dec 13, 2022
Python reader for Linked Data in HDF5 files

Linked Data are becoming more popular for user-created metadata in HDF5 files.

The HDF Group 8 May 17, 2022
Validation and inference over LinkML instance data using souffle

Translates LinkML schemas into Datalog programs and executes them using Souffle, enabling advanced validation and inference over instance data

Linked data Modeling Language 7 Aug 07, 2022
A Python Tools to imaging the shallow seismic structure

ShallowSeismicImaging Tools to imaging the shallow seismic structure, above 10 km, based on the ZH ratio measured from the ambient seismic noise, and

Xiao Xiao 9 Aug 09, 2022
Business Intelligence (BI) in Python, OLAP

Open Mining Business Intelligence (BI) Application Server written in Python Requirements Python 2.7 (Backend) Lua 5.2 or LuaJIT 5.1 (OML backend) Mong

Open Mining 1.2k Dec 27, 2022
Python beta calculator that retrieves stock and market data and provides linear regressions.

Stock and Index Beta Calculator Python script that calculates the beta (β) of a stock against the chosen index. The script retrieves the data and resa

sammuhrai 4 Jul 29, 2022
Predictive Modeling & Analytics on Home Equity Line of Credit

Predictive Modeling & Analytics on Home Equity Line of Credit Data (Python) HMEQ Data Set In this assignment we will use Python to examine a data set

Dhaval Patel 1 Jan 09, 2022
Techdegree Data Analysis Project 2

Basketball Team Stats Tool In this project you will be writing a program that reads from the "constants" data (PLAYERS and TEAMS) in constants.py. Thi

2 Oct 23, 2021
Convert monolithic Jupyter notebooks into Ploomber pipelines.

Soorgeon Join our community | Newsletter | Contact us | Blog | Website | YouTube Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeo

Ploomber 65 Dec 16, 2022
Aggregating gridded data (xarray) to polygons

A package to aggregate gridded data in xarray to polygons in geopandas using area-weighting from the relative area overlaps between pixels and polygons. Check out the binder link above for a sample c

Kevin Schwarzwald 42 Nov 09, 2022
Airflow ETL With EKS EFS Sagemaker

Airflow ETL With EKS EFS & Sagemaker (en desarrollo) Diagrama de la solución Imp

1 Feb 14, 2022
Performance analysis of predictive (alpha) stock factors

Alphalens Alphalens is a Python Library for performance analysis of predictive (alpha) stock factors. Alphalens works great with the Zipline open sour

Quantopian, Inc. 2.5k Jan 09, 2023