A functional standard library for Python.

Overview

Toolz


A set of utility functions for iterators, functions, and dictionaries.

See the PyToolz documentation at https://toolz.readthedocs.io

LICENSE

New BSD. See License File.

Install

toolz is on the Python Package Index (PyPI):

pip install toolz

Structure and Heritage

toolz is implemented in three parts:

itertoolz, for operations on iterables. Examples: groupby, unique, interpose.

functoolz, for higher-order functions. Examples: memoize, curry, compose.

dicttoolz, for operations on dictionaries. Examples: assoc, update_in, merge.

These functions come from the legacy of functional languages for list processing. They interoperate well to accomplish common complex tasks.

Read our API Documentation for more details.

Example

This builds a standard wordcount function from pieces within toolz:

>>> def stem(word):
...     """ Stem word to primitive form """
...     return word.lower().rstrip(",.!:;'-\"").lstrip("'\"")

>>> from toolz import compose, frequencies, partial
>>> from toolz.curried import map
>>> wordcount = compose(frequencies, map(stem), str.split)

>>> sentence = "This cat jumped over this other cat!"
>>> wordcount(sentence)
{'this': 2, 'cat': 2, 'jumped': 1, 'over': 1, 'other': 1}

Dependencies

toolz supports Python 3.5+ with a common codebase. It is pure Python and requires no dependencies beyond the standard library.

It is, in short, a lightweight dependency.

CyToolz

The toolz project has been reimplemented in Cython. The cytoolz project is a drop-in replacement for the pure Python implementation. See the CyToolz GitHub page for more details.

See Also

  • Underscore.js: A similar library for JavaScript
  • Enumerable: A similar library for Ruby
  • Clojure: A functional language whose standard library has several counterparts in toolz
  • itertools: The Python standard library for iterator tools
  • functools: The Python standard library for function tools

Contributions Welcome

toolz aims to be a repository for utility functions, particularly those that come from the functional programming and list processing traditions. We welcome contributions that fall within this scope.

We also try to keep the API small to keep toolz manageable. The ideal contribution is significantly different from existing functions and has precedent in a few other functional systems.

Please take a look at our issue page for contribution ideas.

Community

See our mailing list. We're friendly.

Comments
  • Cython implementation of toolz


    What do you think about having a Cython implementation of toolz that can be used as a regular C extension in CPython, or be cimport-ed by other Cython code?

    I've been messing around with Cython lately, and I became curious how much performance could be gained by implementing toolz in Cython. I am almost finished with a first-pass implementation (it goes quickly when one doesn't try to fine-tune everything), and just have half of itertoolz left to do.

    Performance increases of x2-x4 are common. Some perform even better (like x10), and a few are virtually the same. There is also less overhead when calling functions defined in Cython, which at times can be significant regardless of how things scale.

    However, performance when called from Python isn't the only consideration. A common strategy used by the scientific, mobile, and game communities to increase performance of their applications is to convert Python code that is frequently run to Cython. Developing in Cython also tends to be very imperative. A Cython version of toolz will allow fast implementations to be used in other Cython code (via cimport) while facilitating a more functional style of programming.

    Looking ahead, cython.parallel exposes OpenMP at a low level, which should allow for more efficient parallel processing.

    Thoughts? Any ideas for a name? I am thinking coolz, because ctoolz and cytoolz sound like they are utilities for C or Cython code. I can push what I currently have to a repo once it has a name. Should this be part of pytoolz?

    opened by eriknw 72
  • Join


    Here is a semi-streaming Join function, analogous to SQL Join.

    Join two sequences on common attributes

    This is a semi-streaming operation. The LEFT sequence is fully evaluated and placed into memory. The RIGHT side is evaluated lazily and so can be arbitrarily large.

    The following example joins quantities of sold fruit to the name of the quantity.

    >>> names = [(1, 'one'), (2, 'two'), (3, 'three')]
    >>> fruit = [('apple', 1), ('banana', 2), ('coconut', 2), ('orange', 1)]
    
    >>> result = join(first, second, names, fruit, apply=lambda x, y: x + y)
    >>> for row in result:
    ...     print(row)
    (1, 'one', 'apple', 1)
    (2, 'two', 'banana', 2)
    (2, 'two', 'coconut', 2)
    (1, 'one', 'orange', 1)
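The semi-streaming idea described above can be sketched with the standard library alone. This is a minimal illustration, not the implementation proposed in this issue: the name `semi_streaming_join` and its argument order are assumptions made for the example, and rows are combined by tuple concatenation to mirror the `apply=lambda x, y: x + y` call above.

```python
from collections import defaultdict
from operator import itemgetter


def semi_streaming_join(leftkey, leftseq, rightkey, rightseq):
    """Hypothetical helper: join two sequences of tuples on a computed key."""
    # The LEFT side is fully evaluated and indexed in memory.
    index = defaultdict(list)
    for left in leftseq:
        index[leftkey(left)].append(left)
    # The RIGHT side is consumed lazily, one item at a time.
    for right in rightseq:
        for left in index.get(rightkey(right), ()):
            yield left + right


names = [(1, 'one'), (2, 'two'), (3, 'three')]
fruit = [('apple', 1), ('banana', 2), ('coconut', 2), ('orange', 1)]

# iter(fruit) stands in for an arbitrarily large lazy source.
rows = list(semi_streaming_join(itemgetter(0), names, itemgetter(1), iter(fruit)))
```

Because only the left side is indexed, the output order follows the right-hand sequence, which matches the rows printed in the example above.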
    
    opened by mrocklin 46
  • Logical Operators


    I found some logical predicate functions useful recently, so I added them to toolz to complement...complement. Does it seem reasonable? The implementation and testing here isn't necessarily the final result, just a proof-of-concept.

    Also, I wasn't sure whether to call the new functions the imaginary verbs conjunct and disjunct (they're only nouns and adjectives in my dictionary) as per standard functional style or the clearer but longer conjunction and disjunction. Went with the latter for now.

    opened by karansag 27
  • Add `toolz.sandbox.EqualityHashKey`


    This builds upon the discussion and feedback from #166, which also has a faster (but harder to understand) implementation.

    EqualityHashKey creates a hash key that uses equality comparisons between items, which may be used to create hash keys for otherwise unhashable types. The trade-offs for using this are discussed in the docstring. Additional usage cases would qualify as compelling reasons to promote EqualityHashKey out of the sandbox (imho).

    @asmeurer, do you have any suggestions or additional cases where this would be applicable?

    opened by eriknw 25
  • Keyword-only args breaks toolz.curry


    Hey guys! I would expect the following behavior from toolz.curry:

    >>> @toolz.curry
    ... def kwonly_sum(a, *, b=10):
    ...     return a + b
    >>> b_is_five = kwonly_sum(b=5)  # actually raises an exception here
    >>> b_is_five(5)
    10  # what I want
    

    The exception gives a suggested solution:

    TypeError                                 Traceback (most recent call last)
    /home/mtartre/.conda/envs/std/lib/python3.4/site-packages/toolz/functoolz.py in __call__(self, *args, **kwargs)
        218         try:
    --> 219             return self._partial(*args, **kwargs)
        220         except TypeError:
    
    TypeError: kwonly_sum() missing 1 required positional argument: 'a'
    
    During handling of the above exception, another exception occurred:
    ValueError                                Traceback (most recent call last)
    <ipython-input-31-fd1daf64ecb7> in <module>()
    ----> 1 kwonly_sum(b=5)(5)
    
    /home/mtartre/.conda/envs/std/lib/python3.4/site-packages/toolz/functoolz.py in __call__(self, *args, **kwargs)
        220         except TypeError:
        221             # If there was a genuine TypeError
    --> 222             required_args = _num_required_args(self.func)
        223             if (required_args is not None and
        224                     len(args) + len(self.args) >= required_args):
    
    /home/mtartre/.conda/envs/std/lib/python3.4/site-packages/toolz/functoolz.py in _num_required_args(func)
        116         return known_numargs[func]
        117     try:
    --> 118         spec = inspect.getargspec(func)
        119         if spec.varargs:
        120             return None
    
    /home/mtartre/.conda/envs/std/lib/python3.4/inspect.py in getargspec(func)
        934         getfullargspec(func)
        935     if kwonlyargs or ann:
    --> 936         raise ValueError("Function has keyword-only arguments or annotations"
        937                          ", use getfullargspec() API which can support them")
        938     return ArgSpec(args, varargs, varkw, defaults)
    
    ValueError: Function has keyword-only arguments or annotations, use getfullargspec() API which can support them
    

    The issue is exactly the same in cytoolz. Apologies I don't have the fix in a pull request, my firm requires legal approval for that.
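The traceback points at `inspect.getargspec`, which rejects keyword-only arguments. As a rough sketch of one way to count required positional arguments instead, the standard `inspect.signature` API can be used. The helper name `num_required_positional` is made up for this example and is not the fix that toolz adopted; `functools.partial` stands in for the curried call.

```python
import functools
import inspect


def num_required_positional(func):
    """Hypothetical helper: count positional parameters that have no default.

    Returns None when the function accepts *args, since the count is then open-ended.
    """
    required = 0
    for param in inspect.signature(func).parameters.values():
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            return None
        is_positional = param.kind in (
            inspect.Parameter.POSITIONAL_ONLY,
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
        )
        if is_positional and param.default is inspect.Parameter.empty:
            required += 1
    return required


def kwonly_sum(a, *, b=10):
    return a + b


# Keyword-only parameters are simply skipped by the count above.
required = num_required_positional(kwonly_sum)

# Binding the keyword first and the positional argument later is what the issue asks for.
b_is_five = functools.partial(kwonly_sum, b=5)
result = b_is_five(5)
```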

    opened by quantology 22
  • Add support for OrderedDicts


    I like the Dicttoolz package, but for many of my use cases I need the deterministic behaviour of OrderedDict. Adapting Dicttoolz to return OrderedDict if all of its inputs are one is relatively straightforward. Is this something you would consider merging?
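One possible shape for this, sketched without toolz itself, is to let the caller choose the mapping type that collects the result. The function `merge_with_factory` and its `factory` parameter are assumptions made for illustration; the issue text does not say how the change would be implemented.

```python
from collections import OrderedDict


def merge_with_factory(*dicts, factory=dict):
    """Hypothetical helper: merge mappings into a container built by `factory`."""
    result = factory()
    for d in dicts:
        # Later mappings override earlier ones, as in a plain dict merge.
        result.update(d)
    return result


left = OrderedDict([('b', 1), ('a', 2)])
right = OrderedDict([('c', 3), ('a', 4)])
merged = merge_with_factory(left, right, factory=OrderedDict)
```

An explicit factory argument avoids guessing the output type from the inputs, at the cost of one extra keyword at the call site.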

    opened by bartvm 22
  • Tracing


    Woah!

    I was curious what it would be like to trace the input and output of toolz functions and user-defined functions. As a proof-of-concept, I created this branch:

    https://github.com/eriknw/toolz/tree/trace_with_q

    Simply do from toolz.traced import * and voilà! In another terminal, watch the output in real time via tail -f /tmp/toolz.

    To trace a user function use trace as a decorator or function.

    The results are astounding. I would paste example traces here, but I think you guys have got to try this out yourself.

    q was copied from https://github.com/zestyping/q and was slightly modified to output to "/tmp/toolz" instead of "/tmp/q".

    As I said above, this was meant as a proof-of-concept. It raises the question, though, of whether such functionality should be added to toolz, how it should behave, etc. Tracing can be very handy for debugging and as an educational tool for new users.

    If you encounter any bugs in the above branch, please post here.

    Thoughts and reactions?

    opened by eriknw 22
  • ENH: Adds excepts


    The idea is to let exception-based API functions be used alongside your normal functional code.

    For example:

    map(itemgetter('key'), seq) -> map(excepts(itemgetter('key'), KeyError), seq)
    

    This helps us get around the fact that I cannot put an except clause in a lambda. I have found this to be very useful in my own code.

    Most of this code is for fresh __name__ and __doc__ attributes.
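A stripped-down sketch of the idea follows. It is not the code from this pull request: the name `excepts_sketch`, its argument order, and the `handler` parameter are assumptions for illustration, and the `__name__` and `__doc__` handling mentioned above is left out.

```python
from operator import itemgetter


def excepts_sketch(exc, func, handler=lambda error: None):
    """Hypothetical helper: call func, routing `exc` to handler instead of raising."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except exc as error:
            return handler(error)
    return wrapper


seq = [{'key': 1}, {'other': 2}, {'key': 3}]

# Missing keys become None instead of aborting the whole map.
values = list(map(excepts_sketch(KeyError, itemgetter('key')), seq))
```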

    opened by llllllllll 21
  • Faster groupby!


    Issue #178 impressed upon me just how costly attribute resolution can be. In this case, groupby was made faster by avoiding resolving the attribute list.append.

    This implementation is also more memory efficient than the current version that uses a defaultdict that gets cast to a dict. While casting a defaultdict d to a dict as dict(d) is fast, it is still a fast copy.

    Honorable mention goes to the following implementation:

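    import collections

    def groupby_alt(func, seq):
        # Each default value is the bound append method of a fresh list
        d = collections.defaultdict(lambda: [].append)
        for item in seq:
            d[func(item)](item)
        rv = {}
        for k, v in d.items():
            # v.__self__ recovers the list that the bound method appends to
            rv[k] = v.__self__
        return rv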
    

    This alternative implementation can at times be very impressive. You should play with it!

    opened by eriknw 20
  • Smarter wrapper behavior in functoolz.curry and functoolz.memoize


    Using update_wrapper and wraps would be preferable; however, they both cause errors: pickling issues in curry, and attribute errors when memoizing a partial in Python 2. Still, more than just __name__ and __doc__ should be transferred: __module__, if present, and __qualname__ and __annotations__ in Python 3. Updating with func.__dict__ isn't possible in curry (the source of the pickling problems), but should be done in memoize.

    opened by justanr 19
  • Remove positional arg "func" from curry.__init__

    conflicted with kwargs['func']

    example.py:

    from toolz import curry
    @curry
    def foo(x, y, func=int, bar=str):
        return str(func(x*y))
    foo(bar=float)(4.2, 3.8, func=round)
    foo(func=int)(4.2, 3.8, bar=str)
    

    The last line would throw TypeError: __init__() got multiple values for keyword argument 'func' because curry.__init__(self, func, *args, **kwargs) names its first positional argument "func". This effectively prevented creating a curry object with such a kwarg. I didn't find other functions with the same problem.

    > /home/digenis/src/toolz/toolz/example.py(1)<module>()
    -> from toolz import curry
    (pdb) c
    Traceback (most recent call last):
     ...
      File "/home/digenis/src/toolz/toolz/functoolz.py", line 224, in __call__
        return curry(self._partial, *args, **kwargs)
    TypeError: __init__() got multiple values for keyword argument 'func'
    ...
    > /home/digenis/src/toolz/toolz/functoolz.py(224)__call__()
    -> return curry(self._partial, *args, **kwargs)
    (pdb) self._partial  # __init__ will receive this as the positional argument "func"
    <functools.partial object at 0x7f16874ee0a8>
    (pdb) pp args
    ()
    (pdb) pp kwargs  # but it will also receive a kwarg named "func"
    {'func': <type 'int'>}
    
    
    Post mortem debugger finished. The example.py will be restarted
    ...
    

    I abbreviated some line ranges with "..."
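The name clash can be reproduced in a few lines without toolz. This is only an illustration of the mechanism: both function names are hypothetical, and the second one shows one common workaround (taking the callable out of `*args`), which may differ in detail from the patch in this pull request.

```python
def make_partial_conflicting(func, *args, **kwargs):
    """Named first parameter: a keyword argument called 'func' collides with it."""
    return func, args, kwargs


def make_partial_fixed(*args, **kwargs):
    """The callable is taken from *args, so 'func' stays free for use as a keyword."""
    func, args = args[0], args[1:]
    return func, args, kwargs


try:
    make_partial_conflicting(str, func=int)
    conflict = None
except TypeError as error:
    # Python reports that 'func' received multiple values.
    conflict = str(error)

fixed = make_partial_fixed(str, func=int)
```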

    opened by Digenis 19
  • Bump pypa/gh-action-pypi-publish from 1.5.0 to 1.6.4


    Bumps pypa/gh-action-pypi-publish from 1.5.0 to 1.6.4.

    Release notes

    Sourced from pypa/gh-action-pypi-publish's releases.

    v1.6.4

    oh, boi! again?

    This is the last one tonight, promise! It fixes this embarrassing bug that was actually caught by the CI but got overlooked due to the lack of sleep. TL;DR GH passed $HOME from the external env into the container and that tricked the Python's site module to think that the home directory is elsewhere, adding non-existent paths to the env vars. See #115.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.3...v1.6.4

    v1.6.3

    Another Release!? Why?

    In pypa/gh-action-pypi-publish#112, it was discovered that passing a $PATH variable even breaks the shebang. So this version adds more safeguards to make sure it keeps working with a fully broken $PATH.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.2...v1.6.3

    v1.6.2

    What's Fixed

    • Made the $PATH and $PYTHONPATH environment variables resilient to broken values passed from the host runner environment, which previously allowed the users to accidentally break the container's internal runtime as reported in pypa/gh-action-pypi-publish#112

    Internal Maintenance Improvements

    New Contributors

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.1...v1.6.2

    v1.6.1

    What's happened?!

    There was a sneaky bug in v1.6.0 which caused Twine to be outside the import path in the Python runtime. It is fixed in v1.6.1 by updating $PYTHONPATH to point to a correct location of the user-global site-packages/ directory.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.0...v1.6.1

    v1.6.0

    Anything's changed?

    The only update is that the Python runtime has been upgraded from 3.9 to 3.11. There are no functional changes in this release.

    Full Changelog: https://github.com/pypa/gh-action-pypi-publish/compare/v1.5.2...v1.6.0

    v1.5.2

    What's Improved

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.5.1...v1.5.2

    v1.5.1

    What's Changed

    ... (truncated)

    Commits
    • c7f29f7 🐛 Override $HOME in the container with /root
    • 644926c 🧪 Always run smoke testing in debug mode
    • e71a4a4 Add support for verbose bash execusion w/ $DEBUG
    • e56e821 🐛 Make id always available in twine-upload
    • c879b84 🐛 Use full path to bash in shebang
    • 57e7d53 🐛Ensure the default $PATH value is pre-loaded
    • ce291dc 🎨🐛Fix the branch @ pre-commit.ci badge links
    • 102d8ab 🐛 Rehardcode devpi port for GHA srv container
    • 3a9eaef 🐛Use different ports in/out of GHA containers
    • a01fa74 🐛 Use localhost @ GHA outside the containers
    • Additional commits viewable in compare view


    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 0
  • The "collect" decorator

    Hi! I often encounter the same pattern in my code:

    def to_power(array, powers):
        result = []
        for entry in array:
            power = powers.get(entry)
            if power is not None and power >= 0:
                result.append(entry ** power)
    
        return result
    
    
    def reverse_mapping(expensive_func, values):
        result = {}
        for entry in values:
            value = expensive_func(entry)
            if value is not None:
                result[value] = entry
    
        return result
    

    The examples are somewhat simplistic, but you get the idea:

    1. create a container
    2. iterate some data, do some branching, etc and fill the container
    3. return the container

    I came up with several decorators that reduce this to:

    @collect
    def to_power(array, powers):
        for entry in array:
            power = powers.get(entry)
            if power is not None and power >= 0:
                yield entry ** power
    
    
    @composed(dict)
    def reverse_mapping(expensive_func, values):
        for entry in values:
            value = expensive_func(entry)
            if value is not None:
                yield value, entry
    

    composed(func) simply applies func to the result of the decorated function, which effectively gathers the generator. collect is just a shorter version of composed(list).
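The author's implementation is not shown in this issue, so the following is only a guess at what such decorators might look like, based on the description above. It uses the standard library only; `functools.wraps` is added here to preserve the decorated function's metadata.

```python
import functools


def composed(func):
    """Decorator factory: apply func to the result of the decorated function."""
    def decorator(generator_func):
        @functools.wraps(generator_func)
        def wrapper(*args, **kwargs):
            return func(generator_func(*args, **kwargs))
        return wrapper
    return decorator


# collect gathers a generator's items into a list.
collect = composed(list)


@collect
def to_power(array, powers):
    for entry in array:
        power = powers.get(entry)
        if power is not None and power >= 0:
            yield entry ** power


@composed(dict)
def reverse_mapping(expensive_func, values):
    for entry in values:
        value = expensive_func(entry)
        if value is not None:
            yield value, entry


powers_result = to_power([2, 3, 4], {2: 3, 3: -1})
mapping_result = reverse_mapping(lambda x: x * 10 if x else None, [0, 1, 2])
```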

    I can create a PR with my implementation, if you are interested in adding it to toolz.

    opened by maxme1 0
  • Idea: Compose class should be iterable


    It would be really nice to be able to iterate all the funcs in compose, without having to combine the first and funcs properties. It would let you immediately use Compose objects as iterables in the itertoolz functions.

    For an example, consider the simple logging strategy outlined in my gist here: https://gist.github.com/ZeroBomb/8ac470b1d4b02c11f2873c5d4e0512a1

    As written, I need to define this somewhat extraneous function

    def get_funcs(composition):
        return (composition.first,)+composition.funcs
    

    in order to map over those functions and re-compose:

    @curry
    def interleave_map(func, items):
        # [1,2,3] -> [func(1), 1, func(2), 2, func(3), 3]
        return interleave([map(func, items), items])
    
    # define a debug function that interleaves logging funcs inbetween each func in an existing composition
    debug = compose_left(get_funcs, interleave_map(passthru_log), star(compose_left))
    

    If the Compose class were iterable, I could completely eliminate the get_funcs function and comfortably feed the compose object directly into interleave:

    def debug(composition):
        return compose_left(*interleave_map(passthru_log, composition))
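To make the request concrete, here is a toy stand-in for a compose object that supports iteration. It is a sketch under assumptions: `IterableCompose` is a hypothetical class written for this example, not toolz's `Compose`, and it only borrows the `first` and `funcs` attribute names mentioned above.

```python
class IterableCompose:
    """Toy right-to-left composition that can also be iterated in call order."""

    def __init__(self, *funcs):
        # Store functions in the order they will be applied.
        ordered = tuple(reversed(funcs))
        self.first = ordered[0]
        self.funcs = ordered[1:]

    def __call__(self, *args, **kwargs):
        result = self.first(*args, **kwargs)
        for func in self.funcs:
            result = func(result)
        return result

    def __iter__(self):
        # Yield every function, so no get_funcs-style helper is needed.
        yield self.first
        yield from self.funcs


def inc(x):
    return x + 1


def double(x):
    return x * 2


composition = IterableCompose(double, inc)  # computes double(inc(x))
applied = composition(3)
members = list(composition)
```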
    
    opened by ZeroBomb 1
  • Setup pyright type-checking


    I have added

    • A basic config file for pyright
    • A CI job to run pyright
    • comments to ignore errors that pyright detects in existing code.

    This is to type-check any type hints that are added to toolz, as suggested in #496. These can be added incrementally.

    opened by LincolnPuzey 0
  • Use "yield from" in merge_sorted

    Convert these loops:

    for item in seq:
        yield item
    

    To the more modern and slightly more efficient

    yield from seq
    

    A quick benchmark (a is a list of 30 sorted lists of 60 random integers)

    # old
    In [8]: %timeit list(merge_sorted(*a))
    815 µs ± 31.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
    
    # new
    In [7]: %timeit list(merge_sorted(*a))
    766 µs ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
    
    opened by groutr 0
  • get_in function: add example of giving a string to the keys argument


    It is currently tempting to test get_in like this:

    get_in('x', {'x':5}) # returns 5
    

    and conclude that this will also work:

    get_in('test', {'test':5}) # actually returns None
    

    It does not work, because 'test' is treated as ['t','e','s','t']. In complex dictionaries, you may actually get a value, like

    get_in('xy', {'x': {'y': 5}} ) # returns 5
    

    The documentation should probably call this out explicitly, if this is the intended behavior, perhaps by giving the 'xy' example above.
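The behaviour is easy to reproduce with a small stand-in that walks the keys one at a time. The function `get_in_sketch` below is an assumption made for illustration and leaves out options the real function may have. It shows that a string is iterated character by character, and that wrapping the key in a list gives the lookup most people expect.

```python
import operator
from functools import reduce


def get_in_sketch(keys, coll, default=None):
    """Hypothetical stand-in: follow each element of `keys` into nested containers."""
    try:
        return reduce(operator.getitem, keys, coll)
    except (KeyError, IndexError, TypeError):
        return default


single_char = get_in_sketch('x', {'x': 5})         # keys are ['x']
whole_word = get_in_sketch('test', {'test': 5})    # keys are ['t', 'e', 's', 't']
nested = get_in_sketch('xy', {'x': {'y': 5}})      # keys are ['x', 'y']
as_list = get_in_sketch(['test'], {'test': 5})     # a one-element key path
```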

    I, for one, wouldn't mind an implementation where get_in('test', {'test':5}) returns 5, but I wouldn't go so far as to say that is the right approach. I'm imagining it would facilitate doing something like this:

    juxt(*map(curry(get_in), ['str1', ['str2', 'str3'], 'etc']))
    
    opened by KevinXOM 2
Releases(0.12.0)
  • 0.12.0(Jul 10, 2022)

    • Add apply (#411)
    • Support newer Python versions--up to Python 3.11-alpha (#525, #527, #533)
    • Improve warning when using toolz.compatibility (#485)
    • Improve documentation (#507, #524, #526, #530)
    • Improve performance of merge_with (#532)
    • Improve import times (#534)
    • Auto-upload new releases to PyPI (#536, #537)
  • 0.11.2(Nov 6, 2021)

  • 0.11.1(Sep 24, 2020)

  • 0.11.0(Sep 23, 2020)

    • Drop Python 2.7 support!
    • Give deprecation warning on using toolz.compatibility
    • Some doc fixes
    • First time using auto-deployment. Fingers crossed!

    Next release will probably be 1.0.0 :)

More routines for operating on iterables, beyond itertools

More Itertools Python's itertools library is a gem - you can compose elegant solutions for a variety of problems with the functions it provides. In mo

2.9k Jan 04, 2023
Make your functions return something meaningful, typed, and safe!

Make your functions return something meaningful, typed, and safe! Features Brings functional programming to Python land Provides a bunch of primitives

dry-python 2.5k Jan 05, 2023
Cython implementation of Toolz: High performance functional utilities

CyToolz Cython implementation of the toolz package, which provides high performance utility functions for iterables, functions, and dictionaries. tool

894 Jan 02, 2023
Simple, elegant, Pythonic functional programming.

Coconut Coconut (coconut-lang.org) is a variant of Python that adds on top of Python syntax new features for simple, elegant, Pythonic functional prog

Evan Hubinger 3.6k Jan 03, 2023
A fancy and practical functional tools

Funcy A collection of fancy functional tools focused on practicality. Inspired by clojure, underscore and my own abstractions. Keep reading to get an

Alexander Schepanovski 2.9k Dec 29, 2022
The Cantonese programming language.

The Cantonese programming language.

Stepfen Shawn 895 Dec 24, 2022
Functional programming in Python: implementation of missing features to enjoy FP

Fn.py: enjoy FP in Python Despite the fact that Python is not pure-functional programming language, it's multi-paradigm PL and it gives you enough fre

Oleksii Kachaiev 3.3k Jan 04, 2023