Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed

Overview

iterable-subprocess CircleCI Test Coverage

Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed.

Data is sent to a subprocess's standard input via an iterable, and extracted from its standard output via another iterable. This allows an external subprocess to be naturally placed in a chain of iterables for streaming processing.

Installation

pip install iterable-subprocess

Usage

A single function iterable_subprocess is exposed. The first parameter is the args argument passed to the Popen Constructor, and the second is an iterable whose items must be bytes instances and are sent to the subprocess's standard input.

Returned from the function is an iterable whose items are bytes instances of the process's standard output.

from iterable_subprocess import iterable_subprocess

def yield_input():
    # In a real case could read from the filesystem or the network
    yield b'first\n'
    yield b'second\n'
    yield b'third\n'

output = iterable_subprocess(['cat'], yield_input())

for chunk in output:
    print(chunk)

Usage: unzip the first file of a ZIP archive while downloading

While its not typically possible to completely unzip an arbitrary ZIP file on-the-fly, it is possible to unzip the first file in a ZIP archive using funzip, as in the following example.

from iterable_subprocess import iterable_subprocess
import httpx

def zipped_chunks():
    with httpx.stream('GET', 'https://www.example.com/my.zip') as r:
        yield from r.iter_bytes()

unzipped_chunks = iterable_subprocess(['funzip'], zipped_chunks())

for chunk in unzipped_chunks:
    print(chunk)

Ideally Python's zipfile module or Python's zlib module would be able to do this without calling into funzip. However, at the time of writing this does not appear easily possible.

Owner
Department for International Trade
Department for International Trade
Oracle Cloud Infrastructure Object Storage fsspec implementation

Oracle Cloud Infrastructure Object Storage fsspec implementation The Oracle Cloud Infrastructure Object Storage service is an internet-scale, high-per

Oracle 9 Dec 18, 2022
Self-hosted, easily-deployable monitoring and alerts service - like a lightweight PagerDuty

Cabot Maintainers wanted Cabot is stable and used by hundreds of companies and individuals in production, but it is not actively maintained. We would

Arachnys 5.4k Dec 23, 2022
The leading native Python SSHv2 protocol library.

Paramiko Paramiko: Python SSH module Copyright: Copyright (c) 2009 Robey Pointer 8.1k Jan 04, 2023

Some automation scripts to setup a deployable development database server (with docker).

Postgres-Docker Database Initializer This is a simple automation script that will create a Docker Postgres database with a custom username, password,

Pysogge 1 Nov 11, 2021
Wiremind Kubernetes helper

Wiremind Kubernetes helper This Python library is a high-level set of Kubernetes Helpers allowing either to manage individual standard Kubernetes cont

Wiremind 3 Oct 09, 2021
This is a tool to develop, build and test PHP extensions in Docker containers.

Develop, Build and Test PHP Extensions This is a tool to develop, build and test PHP extensions in Docker containers. Installation Clone this reposito

Suora GmbH 10 Oct 22, 2022
Oncall is a calendar tool designed for scheduling and managing on-call shifts. It can be used as source of dynamic ownership info for paging systems like http://iris.claims.

Oncall See admin docs for information on how to run and manage Oncall. Development setup Prerequisites Debian/Ubuntu - sudo apt-get install libsasl2-d

LinkedIn 928 Dec 22, 2022
A colony of interacting processes

NColony Infrastructure for running "colonies" of processes. Hacking $ tox Should DTRT -- if it passes, it means unit tests are passing, and 100% cover

23 Apr 04, 2022
pyinfra automates infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployment, configuration management and more.

pyinfra automates/provisions/manages/deploys infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployme

Nick Barrett 2.1k Dec 29, 2022
Autoscaling volumes for Kubernetes (with the help of Prometheus)

Kubernetes Volume Autoscaler (with Prometheus) This repository contains a service that automatically increases the size of a Persistent Volume Claim i

DevOps Nirvana 142 Dec 28, 2022
Ralph is the CMDB / Asset Management system for data center and back office hardware.

Ralph Ralph is full-featured Asset Management, DCIM and CMDB system for data centers and back offices. Features: keep track of assets purchases and th

Allegro Tech 1.9k Jan 01, 2023
Ajenti Core and stock plugins

Ajenti is a Linux & BSD modular server admin panel. Ajenti 2 provides a new interface and a better architecture, developed with Python3 and AngularJS.

Ajenti Project 7k Jan 03, 2023
Dockerized iCloud drive

iCloud-drive-docker is a simple iCloud drive client in Docker environment. It uses pyiCloud python library to interact with iCloud

Mandar Patil 376 Jan 01, 2023
A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts

🏃 Simple Local CI Runner 🏃 A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts ⚙️ Setup Inst

Tom Stowe 0 Jan 11, 2022
Webinar oficial Zabbix Brasil. Uma série de 4 aulas sobre API do Zabbix.

Repositório de scripts do Webinar de API do Zabbix Webinar oficial Zabbix Brasil. Uma série de 4 aulas sobre API do Zabbix. Nossos encontros [x] 04/11

Robert Silva 7 Mar 31, 2022
This repository contains useful docker-swarm-tools.

docker-swarm-tools This repository contains useful docker-swarm-tools. swarm-guardian This Docker image is intended to be used in a multihost docker e

NeuroForge GmbH & Co. KG 4 Jan 12, 2022
Daemon to ban hosts that cause multiple authentication errors

__ _ _ ___ _ / _|__ _(_) |_ ) |__ __ _ _ _ | _/ _` | | |/ /| '_ \/ _` | ' \

Fail2Ban 7.8k Jan 09, 2023
Honcho: a python clone of Foreman. For managing Procfile-based applications.

___ ___ ___ ___ ___ ___ /\__\ /\ \ /\__\ /\ \ /\__\ /\

Nick Stenning 1.5k Jan 03, 2023
Visual disk-usage analyser for docker images

whaler What? A command-line tool for visually investigating the disk usage of docker images Why? Large images are slow to move and expensive to store.

Treebeard Technologies 194 Sep 01, 2022
SSH to WebSockets Bridge

wssh wssh is a SSH to WebSockets Bridge that lets you invoke a remote shell using nothing but HTTP. The client connecting to wssh doesn't need to spea

Andrea Luzzardi 1.3k Dec 25, 2022