Pearpy - a Python package for writing multithreaded code and parallelizing tasks across CPU threads.

Overview


Pearpy

The Python package for (pear)allelizing your tasks across multiple CPU threads.

Installation

The latest version of Pearpy can be installed with:

pip install pearpy

To stay up to date with Pearpy's releases, visit the official page on PyPI or take a look at our GitHub Releases!

Usage

  1. Create a Pear() object. This will be a wrapper for all of your multithreaded processes.
  2. Identify the functions whose computation you would like to parallelize.
  3. Add your tasks to the Pear. If a potential race condition is detected, the target processes will be locked automatically.
  4. Run the parallelized processes simultaneously.

Example

from pearpy.pear import Pear

# First function to be parallelized
def t1(num1, num2):
    print('t1: ', num1 + num2)

# Second function to be parallelized
def t2(num):
    print('t2: ', num)

# Create pear object, add threads, and run
pear = Pear()
pear.add_thread(t1, [4, 5])
pear.add_thread(t2, 4)
pear.run()
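
For comparison, this is roughly what those same two tasks look like when wired up with Python's standard threading module directly; Pear wraps this boilerplate for you (a sketch of the idea, not necessarily how Pear is implemented internally):

import threading

# Same task functions as above
def t1(num1, num2):
    print('t1: ', num1 + num2)

def t2(num):
    print('t2: ', num)

# Create one Thread per task, start them all, then wait for them to finish
threads = [
    threading.Thread(target=t1, args=(4, 5)),
    threading.Thread(target=t2, args=(4,)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()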

Race Condition Handling

When multiple threads use the same function, Pear automatically generates a lock for each shared resource. This lets developers take advantage of Pear's multithreading without worrying about inaccurate data caused by race conditions. The following example shows how race conditions are handled:

from pearpy.pear import Pear

global_var = 10

# This function reads from and writes to a global variable
def t_duplicated(num):
    global global_var
    print('t_duplicated: ', num + global_var)
    global_var += 1

# Pear object created with two threads accessing a shared resource
# A race condition is detected and locks are generated
pear = Pear()
pear.add_thread(t_duplicated, 1) # This should print 11 because 1 + 10 = 11
pear.add_thread(t_duplicated, 1) # This should print 12 because global_var has been incremented
pear.run()

##########
# OUTPUT #
##########
t_duplicated: 11
t_duplicated: 12
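
Conceptually, the automatic locking is equivalent to guarding the shared function with a threading.Lock, as in the sketch below (an illustration of the idea, not Pear's actual implementation):

import threading

global_var = 10
lock = threading.Lock()  # one lock guarding the shared resource

def t_duplicated(num):
    global global_var
    with lock:  # only one thread may read and update global_var at a time
        print('t_duplicated: ', num + global_var)
        global_var += 1

threads = [threading.Thread(target=t_duplicated, args=(1,)) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()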

Benchmarks and Tests

Benchmarks can be examined via the make benchmark command. This will display the threaded vs. unthreaded runtimes for a fixed benchmark script, along with the percentage improvement between the two. Here is an example of what the benchmark output should look like:

----------------------------------------------------------------------
THREADED BENCHMARK
3.8507602214813232 s
----------------------------------------------------------------------
UNTHREADED BENCHMARK
13.90523624420166 s
----------------------------------------------------------------------
Improvement:  361.1036638072611 %
.
----------------------------------------------------------------------
Ran 1 test in 17.757s

OK
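
To reproduce a comparison like this outside of make benchmark, a minimal sketch using the Pear API shown above might look like the following (time.sleep stands in for an I/O-bound task, and it is assumed that pear.run() blocks until all threads complete, as in the examples above):

import time
from pearpy.pear import Pear

# Simulated I/O-bound work; replace with your own function
def work(seconds):
    time.sleep(seconds)

# Unthreaded: run the tasks one after another
start = time.perf_counter()
for _ in range(4):
    work(1)
unthreaded_time = time.perf_counter() - start

# Threaded: run the same tasks concurrently via Pear
pear = Pear()
for _ in range(4):
    pear.add_thread(work, 1)
start = time.perf_counter()
pear.run()
threaded_time = time.perf_counter() - start

print('Unthreaded: ', unthreaded_time, 's')
print('Threaded:   ', threaded_time, 's')
print('Improvement:', unthreaded_time / threaded_time * 100, '%')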

To run tests, use the make test command. This will output the results of the functions called in the /tests/test_pear.py script, along with the status of the tests themselves. The console will display 'OK' if the tests pass.
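
As a rough sketch (the real tests live in /tests/test_pear.py and may be structured differently), a unit test built only on the API shown above could look like:

import unittest
from pearpy.pear import Pear

results = []

# Hypothetical helper that records which tasks actually ran
def record(num):
    results.append(num)

class TestPear(unittest.TestCase):
    def test_all_threads_run(self):
        # Add two threads and verify that both tasks executed
        pear = Pear()
        pear.add_thread(record, 1)
        pear.add_thread(record, 2)
        pear.run()
        self.assertEqual(sorted(results), [1, 2])

if __name__ == '__main__':
    unittest.main()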

Contributing

Pear is open source and contributions from anyone are welcome. To contribute to this project, please submit issues and pull requests via GitHub. In order for a pull request to be merged, all unit tests must pass when run via make test. Thank you!
