Pearpy - a Python package for writing multithreaded code and parallelizing tasks across CPU threads.

Overview


Pearpy

The Python package for (pear)allelizing your tasks across multiple CPU threads.

Installation

The latest version of Pearpy can be installed with:

pip install pearpy

To stay up to date with Pearpy's releases, visit the official page on PyPI or take a look at our GitHub Releases!

Usage

  1. Create a Pear() object. This will be a wrapper for all of your multithreaded processes.
  2. Identify the functions whose computation you would like to parallelize.
  3. Add your tasks to the Pear. If a potential race condition is detected, the target processes will be locked automatically.
  4. Run the parallelized processes simultaneously.

Example

from pearpy.pear import Pear

# First function to be parallelized
def t1(num1, num2):
    print('t1: ', num1 + num2)

# Second function to be parallelized
def t2(num):
    print('t2: ', num)

# Create pear object, add threads, and run
pear = Pear()
pear.add_thread(t1, [4, 5])
pear.add_thread(t2, 4)
pear.run()
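
For comparison, this is roughly what those same two tasks look like when wired up with Python's standard threading module directly; Pear wraps this boilerplate for you (a sketch of the idea, not necessarily how Pear is implemented internally):

import threading

# Same task functions as above
def t1(num1, num2):
    print('t1: ', num1 + num2)

def t2(num):
    print('t2: ', num)

# Create one Thread per task, start them all, then wait for them to finish
threads = [
    threading.Thread(target=t1, args=(4, 5)),
    threading.Thread(target=t2, args=(4,)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()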

Race Condition Handling

When multiple threads use the same function, Pear automatically generates a lock for each shared resource. This lets developers take advantage of Pear's multithreading without worrying about inaccurate data caused by race conditions. The following example shows how race conditions are handled:

from pearpy.pear import Pear

global_var = 10

# This function reads from and writes to a global variable
def t_duplicated(num):
    global global_var
    print('t_duplicated: ', num + global_var)
    global_var += 1

# Pear object created with two threads accessing a shared resource
# A race condition is detected and locks are generated
pear = Pear()
pear.add_thread(t_duplicated, 1) # This should print 11 because 1 + 10 = 11
pear.add_thread(t_duplicated, 1) # This should print 12 because global_var has been incremented
pear.run()

##########
# OUTPUT #
##########
t_duplicated: 11
t_duplicated: 12
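
Conceptually, the automatic locking is equivalent to guarding the shared function with a threading.Lock, as in the sketch below (an illustration of the idea, not Pear's actual implementation):

import threading

global_var = 10
lock = threading.Lock()  # one lock guarding the shared resource

def t_duplicated(num):
    global global_var
    with lock:  # only one thread may read and update global_var at a time
        print('t_duplicated: ', num + global_var)
        global_var += 1

threads = [threading.Thread(target=t_duplicated, args=(1,)) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()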

Benchmarks and Tests

Benchmarks can be examined via the make benchmark command. This will display the threaded vs. unthreaded runtimes for a fixed benchmark script, along with the percentage improvement between the two. Here is an example of what the benchmark output should look like:

----------------------------------------------------------------------
THREADED BENCHMARK
3.8507602214813232 s
----------------------------------------------------------------------
UNTHREADED BENCHMARK
13.90523624420166 s
----------------------------------------------------------------------
Improvement:  361.1036638072611 %
.
----------------------------------------------------------------------
Ran 1 test in 17.757s

OK
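
To reproduce a comparison like this outside of make benchmark, a minimal sketch using the Pear API shown above might look like the following (time.sleep stands in for an I/O-bound task, and it is assumed that pear.run() blocks until all threads complete, as in the examples above):

import time
from pearpy.pear import Pear

# Simulated I/O-bound work; replace with your own function
def work(seconds):
    time.sleep(seconds)

# Unthreaded: run the tasks one after another
start = time.perf_counter()
for _ in range(4):
    work(1)
unthreaded_time = time.perf_counter() - start

# Threaded: run the same tasks concurrently via Pear
pear = Pear()
for _ in range(4):
    pear.add_thread(work, 1)
start = time.perf_counter()
pear.run()
threaded_time = time.perf_counter() - start

print('Unthreaded: ', unthreaded_time, 's')
print('Threaded:   ', threaded_time, 's')
print('Improvement:', unthreaded_time / threaded_time * 100, '%')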

To run tests, use the make test command. This will output the results of the functions called in the /tests/test_pear.py script, along with the status of the tests themselves. The console will display 'OK' if the tests pass.
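
As a rough sketch (the real tests live in /tests/test_pear.py and may be structured differently), a unit test built only on the API shown above could look like:

import unittest
from pearpy.pear import Pear

results = []

# Hypothetical helper that records which tasks actually ran
def record(num):
    results.append(num)

class TestPear(unittest.TestCase):
    def test_all_threads_run(self):
        # Add two threads and verify that both tasks executed
        pear = Pear()
        pear.add_thread(record, 1)
        pear.add_thread(record, 2)
        pear.run()
        self.assertEqual(sorted(results), [1, 2])

if __name__ == '__main__':
    unittest.main()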

Contributing

Pear is open source and contributions from anyone are welcome. To contribute to this project, please submit issues and pull requests via GitHub. In order for a pull request to be merged, all unit tests must pass when run via make test. Thank you!
