A library for pattern matching on symbolic expressions in Python.

Overview

MatchPy

MatchPy is a library for pattern matching on symbolic expressions in Python.

Work in progress

Latest version released on PyPi Latest version released via conda-forge Test coverage Build status of the master branch Documentation Status The Journal of Open Source Software Digital Object Identifier

Installation

MatchPy is available via PyPI, and for Conda via conda-forge. It can be installed with pip install matchpy or conda install -c conda-forge matchpy.

Overview

This package implements pattern matching in Python. Pattern matching is a powerful tool for symbolic computations, operating on symbolic expressions. Given a pattern and an expression (which is usually called subject), the goal of pattern matching is to find a substitution for all the variables in the pattern such that the pattern becomes the subject. As an example, consider the pattern f(x), where f is a function and x is a variable, and the subject f(a), where a is a constant symbol. Then the substitution that replaces x with a is a match. MatchPy supports associative and/or commutative function symbols, as well as sequence variables, similar to pattern matching in Mathematica.

A detailed example of how to use MatchPy can be found here.

MatchPy supports both one-to-one and many-to-one pattern matching. The latter makes use of similarities between patterns to efficiently find matches for multiple patterns at the same time.

A list of publications about MatchPy can be found below.

Expressions

Expressions are tree-like data structures, consisting of operations (functions, internal nodes) and symbols (constants, leaves):

>>> from matchpy import Operation, Symbol, Arity
>>> f = Operation.new('f', Arity.binary)
>>> a = Symbol('a')
>>> print(f(a, a))
f(a, a)

Patterns are expressions which may contain wildcards (variables):

>>> from matchpy import Pattern, Wildcard
>>> x = Wildcard.dot('x')
>>> print(Pattern(f(a, x)))
f(a, x_)

In the previous example, x is the name of the variable. However, it is also possible to use wildcards without names:

>>> w = Wildcard.dot()
>>> print(Pattern(f(w, w)))
f(_, _)

It is also possible to assign variable names to entire subexpressions:

>>> print(Pattern(f(w, a, variable_name='y')))
y: f(_, a)

Pattern Matching

Given a pattern and an expression (which is usually called subject), the idea of pattern matching is to find a substitution that maps wildcards to expressions such that the pattern becomes the subject. In MatchPy, a substitution is a dict that maps variable names to expressions.

>>> from matchpy import match
>>> y = Wildcard.dot('y')
>>> b = Symbol('b')
>>> subject = f(a, b)
>>> pattern = Pattern(f(x, y))
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{x ↦ a, y ↦ b}

Applying the substitution to the pattern results in the original expression.

>>> from matchpy import substitute
>>> print(substitute(pattern, substitution))
f(a, b)

Sequence Wildcards

Sequence wildcards are wildcards that can match a sequence of expressions instead of just a single expression:

>>> z = Wildcard.plus('z')
>>> pattern = Pattern(f(z))
>>> subject = f(a, b)
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{z ↦ (a, b)}

Associativity and Commutativity

MatchPy natively supports associative and/or commutative operations. Nested associative operators are automatically flattened, the operands in commutative operations are sorted:

>>> g = Operation.new('g', Arity.polyadic, associative=True, commutative=True)
>>> print(g(a, g(b, a)))
g(a, a, b)

Associativity and commutativity is also considered for pattern matching:

>>> pattern = Pattern(g(b, x))
>>> subject = g(a, a, b)
>>> print(next(match(subject, pattern)))
{x ↦ g(a, a)}
>>> h = Operation.new('h', Arity.polyadic)
>>> pattern = Pattern(h(b, x))
>>> subject = h(a, a, b)
>>> list(match(subject, pattern))
[]

Many-to-One Matching

When a fixed set of patterns is matched repeatedly against different subjects, matching can be sped up significantly by using many-to-one matching. The idea of many-to-one matching is to construct a so called discrimination net, a data structure similar to a decision tree or a finite automaton that exploits similarities between patterns. In MatchPy, there are two such data structures, implemented as classes: DiscriminationNet and ManyToOneMatcher. The DiscriminationNet class only supports syntactic pattern matching, that is, operations are neither associative nor commutative. Sequence variables are not supported either. The ManyToOneMatcher class supports associative and/or commutative matching with sequence variables. For syntactic pattern matching, the DiscriminationNet should be used, as it is usually faster.

>>> pattern1 = Pattern(f(a, x))
>>> pattern2 = Pattern(f(y, b))
>>> matcher = ManyToOneMatcher(pattern1, pattern2)
>>> subject = f(a, b)
>>> matches = matcher.match(subject)
>>> for matched_pattern, substitution in sorted(map(lambda m: (str(m[0]), str(m[1])), matches)):
...     print('{} matched with {}'.format(matched_pattern, substitution))
f(a, x_) matched with {x ↦ b}
f(y_, b) matched with {y ↦ a}

Roadmap

Besides the existing features, we plan on adding the following to MatchPy:

  • Support for Mathematica's Alternatives: For example f(a | b) would match either f(a) or f(b).
  • Support for Mathematica's Repeated: For example f(a..) would match f(a), f(a, a), f(a, a, a), etc.
  • Support pattern sequences (PatternSequence in Mathematica). These are mainly useful in combination with Alternatives or Repeated, e.g. f(a | (b, c)) would match either f(a) or f(b, c). f((a a)..) would match any f with an even number of a arguments.
  • All these additional pattern features need to be supported in the ManyToOneMatcher as well.
  • Better integration with existing types such as dict.
  • Code generation for both one-to-one and many-to-one matching. There is already an experimental implementation, but it still has some dependencies on MatchPy which can probably be removed.
  • Improving the documentation with more examples.
  • Better test coverage with more randomized tests.
  • Implementation of the matching algorithms in a lower-level language, for example C, both for performance and to make MatchPy's functionality available in other languages.

Contributing

If you have some issue or want to contribute, please feel free to open an issue or create a pull request. Help is always appreciated!

The Makefile has several tasks to help development:

  • To install all needed packages, you can use make init .
  • To run the tests you can use make test. The tests use pytest.
  • To generate the documentation you can use make docs .
  • To run the style checker (pylint) you can use make check .

If you have any questions or need help with setting things up, please open an issue and we will try the best to assist you.

Publications

Manuel Krebber and Henrik Barthels
Journal of Open Source Software, Volume 3(26), pp. 2, June 2018.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 7th Workshop on Python for High-Performance and Scientific Computing, November 2017.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 15th Python in Science Conference, July 2017.

Manuel Krebber
Master Thesis, RWTH Aachen University, May 2017

If you want to cite MatchPy, please reference the JOSS paper:

@article{krebber2018,
    author    = {Manuel Krebber and Henrik Barthels},
    title     = {{M}atch{P}y: {P}attern {M}atching in {P}ython},
    journal   = {Journal of Open Source Software},
    year      = 2018,
    pages     = 2,
    month     = jun,
    volume    = {3},
    number    = {26},
    doi       = "10.21105/joss.00670",
    web       = "http://joss.theoj.org/papers/10.21105/joss.00670",
}
Comments
  • `ManyToOneReplacer` slow for small expressions

    `ManyToOneReplacer` slow for small expressions

    I have added ~1700 ReplacementRules in ManyToOneReplacer. The .replace() has become very slow even for very small expressions(like 2*x).

    File: https://github.com/parsoyaarihant/sympy/blob/502fd311d64155560383a5f4f3b2710c318dc410/sympy/integrals/rubi/patterns.py

    opened by parsoyaarihant 38
  • Added internal implementation of the Hopcroft-Karp algorithm

    Added internal implementation of the Hopcroft-Karp algorithm

    Added Hopcroft-Karp algorithm in order to free MatchPy of its GPL dependency. Readapted from the C++ implementation in SymEngine.

    Translating the C++ implementation in SymEngine back to Python.

    I'm the author of the C++ implementation so no copyright notice mentioning SymEngine is necessary.

    opened by Upabjojr 23
  • `ManyToOneReplacer` for Rubi

    `ManyToOneReplacer` for Rubi

    Hi, I implemented ~80 rules using ManyToOneReplacer for Rubi integration and MatchPy was able to match the subjects very quickly. However, when I added ~550 rules, there was significant decrease in speed while matching even smaller subjects.

    It would be great if you could advice us on how to solve this issue.

    Here are the files for our project:

    Patterns: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py Constraint: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/constraint.py All Rubi Files: https://github.com/parsoyaarihant/sympy/tree/rubi4/sympy/rubi

    opened by parsoyaarihant 21
  • Optional arguments in machpy

    Optional arguments in machpy

    I wanted to know if matchpy supports optional arguments during match.

    Mathematica: https://reference.wolfram.com/language/ref/Optional.html

    Example:

    >>> x = Symbol('x')
    >>> a_ = Wildcard.dot('a', optional=1)
    >>> a_free = CustomConstraint(lambda a, x: str(x) not in a.symbols)
    >>> x_ = Wildcard.dot('x')
    >>> pattern = Pattern(Mul(a_, x_), a_free)
    >>> next(match(x, pattern))
    {a: 1}
    
    opened by parsoyaarihant 20
  • Fixing the code generator

    Fixing the code generator

    The code generator enforces the reading of all variables whenever a constraint has been encountered. In this PR, the constraint is check only if all variables have been read.

    Furthermore, indentation has been changes to four spaces and a related bug forcing to always use bugs has been fixed as well.

    opened by Upabjojr 15
  • Repeated definition of the same CustomConstraint

    Repeated definition of the same CustomConstraint

    In the current RUBI rules patterns in SymPy, the same constraint is defined many times. For example: CustomConstraint(lambda a, x: FreeQ(a, x)) has its definition repeated more and more, like in:

    https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L22

    and in the next rule the same constraint is redefined: https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L26

    Python defines a new object for every new lambda definition, even if the functions return the same identical value.

    Does this practice have some potential negative consequences, like:

    • the size of the generated decision tree?
    • the overall speed of MatchPy?
    opened by Upabjojr 13
  • using `int` as Wildcard

    using `int` as Wildcard

    Hi, I am trying to define class ConstantWild (similar to ConstantSymbol you suggested earlier).

    Here is the code:

    from matchpy import Wildcard
    
    class ConstantWild(Wildcard):
        def __init__(self, value):
            super(self.__class__, self).__init__(min_count=1, fixed_size=True, variable_name=str(value))
            self.value = value
    
        def __str__(self):
            return str(self.value)
    

    The above code have same attributes as Wildcard.dot but I don't understand why it is giving errors.

    opened by parsoyaarihant 9
  • Custom sorting key for substitute( )

    Custom sorting key for substitute( )

    Substitute now accepts custom sorting key to specify custom sorting for elements in Multiset.

    This should solve the compatibility problem with SymPy which does not allow comparison operators (>, >=, <, <=) to be evaluated to booleans if the truth of the expression cannot be easily determined.

    opened by Upabjojr 8
  • Need Help in code generation.

    Need Help in code generation.

    I could not find any documentation related to code generation. Last year @wheerd had written a code generation script for rubi in sympy. https://gist.github.com/wheerd/13ac5a0e9b560db6201d96136fa0bbcc

    But code generated was huge so, It cannot be used. This year I have removed the repeated definition of constraints. So code generation should work probably.

    However, I am unable to write the code generation script. Here is the new structure of rubi in sympy. Can someone update the gist accordingly?

    rules = [] #list to keep track which rules has been applied
    def cons_f1(m, x):
        return FreeQ(m, x)
    cons1 = CustomConstraint(cons_f1)
    
    def cons_f2(m):
        return NonzeroQ(m + S(1))
    cons2 = CustomConstraint(cons_f2)
    
    pattern1 = Pattern(Integral(x_**WC('m', S(1)), x_), cons1, cons2)
    def replacement1(m, x):
        rules.append(1)
        return Simp(x**(m + S(1))/(m + S(1)), x)
    rule1 = ReplacementRule(pattern1, replacement1)
    
    help wanted question 
    opened by ashishkg0022 7
  • Alternate way to define constraint

    Alternate way to define constraint

    I wanted to know if there is alternatative way to define constraint in matchpy. Example:

    
    def FreeQ(vars, x): # returns True of any `var` contains `x`
    	if any(i.contains(x) for i in vars):
    		return False
    	return True
    
    def NonzeroQ(e): # returns True if `e` is not zero.
    	return e != 0
    
    pattern = Pattern(Int(Pow(Add(a_, Mul(b_, x_)), m_), x_), FreeQ(list(a, b, m), x), NonzeroQ(Add(m, 1)))
    

    Thanks.

    opened by parsoyaarihant 7
  • GPL'd dependency: hopcroftkarp library

    GPL'd dependency: hopcroftkarp library

    The hopcroftkarp library upon which this project depends is GPL'd.

    I have re-implemented the Hopcroft-Karp algorithm in C++ as part of an attempt to make MatchPy usable in SymEngine (that is, SymPy's core rewritten in C++). I didn't perform extensive testing, so there may be bugs. That code can be translated into Python.

    enhancement 
    opened by Upabjojr 6
  • FreeQ equivalent in MatchPy?

    FreeQ equivalent in MatchPy?

    Mathematica has FreeQ to test whether an expression contains a symbol. This is very useful in pattern matching for equations as you can specify that, for example, in a * x + b == 0 the variables a and b should not contain the variable x.

    In SymPy we are currently using things like CustomConstraint(lambda a, x: not a.has(x)), where a.has(x) is a SymPy expression that tells you if x is contained in the expression tree of a.

    Would it make sense to add an optimized FreeQ-like tester that checks whether the variable is contained in the expression during the matching iteration of MatchPy?

    opened by Upabjojr 1
  • Optional wildcards to match the identity element of the current node

    Optional wildcards to match the identity element of the current node

    When using the optional=value argument in Wildcard, one can specify which value to get in case no matching is found. The value has to be given for every wildcard.

    In case of addition and multiplication, the usual optional values are the identity elements, zero and one respectively.

    It would be nice to have the possibility to have the optional value to return the identity element of the current node, if available.

    enhancement 
    opened by Upabjojr 7
  • substitute not working with SymPy

    substitute not working with SymPy

    Calling sorted( ) in https://github.com/HPAC/matchpy/blob/5cae3f275e3a1f725518516bf3c700dc3be03a56/matchpy/functions.py#L87 fails with SymPy, as SymPy objects are not comparable with operators (<, <=, >, >=). Operators are overloaded in SymPy to create instances of inequality objects.

    SymPy has a function called default_sort_key meant to deal with this problem:

    from sympy import symbols
    x, y, z = symbols("x y z")
    from sympy.core.compatibility import default_sort_key
    sorted([z, x, y], key=default_sort_key)
    

    The question is, is it possible to modify MatchPy to specify a custom sorting key so that SymPy can specify the way its objects should be sorted?

    Or is it an issue with Multiset?

    opened by Upabjojr 11
  • One-to-one matching: symbols with variable_name do not work in commutative functions

    One-to-one matching: symbols with variable_name do not work in commutative functions

    In one-to-one matching, symbols that have a variable_name do not work in commutative functions. They seem to work correctly in many-to-one matching. Example:

    from matchpy import Operation, Symbol, Arity, Pattern, match, ManyToOneMatcher
    
    f = Operation.new('f', Arity.binary, commutative=True)
    a_x = Symbol('a', variable_name='x')
    a = Symbol('a')
    b = Symbol('b')
    
    subject = f(a, b)
    pattern = Pattern(f(a_x, b))
    
    print(list(match(subject, pattern)))
    
    matcher = ManyToOneMatcher(pattern)
    print(list(matcher.match(subject)))
    

    This results in

    []
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    but it should result in

    [{'x': Symbol('a')}]
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    So far, we don't have any tests that verify this functionality. Once it is fixed, we should add some.

    bug 
    opened by hbarthels 0
  • Match arbitrary function

    Match arbitrary function

    Is there any way to match a function with a certain number of arguments? Suppose I've defined a binary function sum, is there a way to create a pattern f(a, b) that matches sum(x, y)?

    enhancement 
    opened by sidmani 4
  • Improve checking correctness of input

    Improve checking correctness of input

    To avoid issues such as #60, we should check that objects are created and functions are used correctly, at least in those cases where so far, incorrect input does not produce any errors.

    enhancement 
    opened by hbarthels 0
Releases(0.5.5)
Owner
High-Performance and Automatic Computing
Umeå University & RWTH Aachen
High-Performance and Automatic Computing
An upgraded version of extractJS

extractJS_2.0 An enhanced version of extractJS with even more functionality Features Discover JavaScript files directly from the webpage Customizable

Ali 4 Dec 21, 2022
Attempt at creating organized collection of little handy snippets of code I'm receiving along the way

ChaosCode Attempt at creating organized collection of little handy snippets of code I'm receiving along the way I always considered coding and program

INFU 4 Nov 26, 2022
🥦 Send and receive nano with 2 simple functions

easy_nano Send and receive nano (without having to understand the nano protocol).

1 Feb 14, 2022
Just a simple python script to generate graphs of salt state requisites.

saltstatevis Just a simple python script to generate graphs of salt state requisites. Installation Requirements You will need to install graphviz to r

Dwayn Matthies 3 May 04, 2022
Monitoring of lake dynamics

slamcore_utils Description This repo contains the slamcore-setup-dataset script. It can be used for installing a sample dataset for offline testing an

10 Jun 23, 2022
Yandex Media Browser

Браузер медиа для плагина Yandex Station Включайте музыку, плейлисты и радио на Яндекс.Станции из Home Assistant! Скриншот Корневой раздел: Библиотека

Alexander Ryazanov 35 Dec 19, 2022
A project to explore and provide useful code for Mango Markets

🥭 Mango Explorer A project to explore and provide useful code for Mango Markets

Blockworks Foundation 160 Dec 19, 2022
Cirq is a Python library for writing, manipulating, and optimizing quantum circuits and running them against quantum computers and simulators

Cirq is a Python library for writing, manipulating, and optimizing quantum circuits and running them against quantum computers and simulators. Install

quantumlib 3.6k Jan 07, 2023
eyes is a Public Opinion Mining System focusing on taiwanese forums such as PTT, Dcard.

eyes is a Public Opinion Mining System focusing on taiwanese forums such as PTT, Dcard. Features 🔥 Article monitor: helps you capture the trend at a

Sean 116 Dec 29, 2022
Programmatic interface to Synapse services for Python

A Python client for Sage Bionetworks' Synapse, a collaborative, open-source research platform that allows teams to share data, track analyses, and collaborate

Sage Bionetworks 54 Dec 23, 2022
Some ideas and tools to develop Python 3.8 plugins for GIMP 2.99.4

gimp-python-development Some ideas and tools to develop Python 3.8 plugins for GIMP 2.99.4. GIMP 2.99.4 is the latest unstable pre-release of GIMP 3.

Ismael Benito 53 Sep 25, 2022
Design-by-contract in Python3 with informative violation messages and inheritance

icontract icontract provides design-by-contract to Python3 with informative violation messages and inheritance. It also gives a base for a flourishing

275 Jan 02, 2023
This repository provides a set of easy to understand and tested Python samples for using Acronis Cyber Platform API.

Base Acronis Cyber Platform API operations with Python !!! info Copyright © 2019-2021 Acronis International GmbH. This is distributed under MIT licens

Acronis International GmbH 3 Aug 11, 2022
It's a repo for Cramer's rule, which is some math crap or something idk

It's a repo for Cramer's rule, which is some math crap or something idk (just a joke, it's not crap; don't take that seriously, math teachers)

Module64 0 Aug 31, 2022
Beacon Object File (BOF) to obtain a usable TGT for the current user.

Beacon Object File (BOF) to obtain a usable TGT for the current user.

Connor McGarr 109 Dec 25, 2022
This repo created to complete the task HACKTOBER 2021, contribute now and get your special T-Shirt & Sticker. TO SUPPORT OWNER PLEASE PRESS STAR BUTTON

❤ THIS REPO WILL CLOSED IN 31 OCT 00:00 ❤ This repository will automatically assign the hacktoberfest and hacktoberfest-accepted labels to all submitt

Rajendra Rakha 307 Dec 27, 2022
E5 自动续期

请选择跳转 新版本系统 (2021-2-9采用): 以后更新都在AutoApi,采用v0.0版本号覆盖式更新 AutoApi : 最新版 保留1到2个稳定的简易版,防止萌新大范围报错 AutoApi'X' : 稳定版1 ( 即本版AutpApiP ) AutoApiP ( 即v5.0,稳定版 ) —

95 Feb 15, 2021
Script de monitoramento das teclas do teclado, salvando todos os dados digitados em um arquivo de log juntamente com os dados de rede.

listenerPython Script de monitoramento das teclas do teclado, salvando todos os dados digitados em um arquivo de log juntamente com os dados de rede.

Vinícius Azevedo 4 Nov 27, 2022
Practice10 - Operasi String With Python

Operasi String MY SOSIAL MEDIA : Apa itu Python String ? String adalah urutan si

Maulana Reza Badrudin 1 Jan 05, 2022
Unofficial Valorant documentation and tools for third party developers

Valorant Third Party Toolkit This repository contains unofficial Valorant documentation and tools for third party developers. Our goal is to centraliz

Noah Kim 20 Dec 21, 2022