ZipFly is a zip archive generator based on zipfile.py

Overview

Build Status GitHub release (latest by date) Downloads

Buzon - ZipFly

ZipFly is a zip archive generator based on zipfile.py. It was created by Buzon.io to generate very large ZIP archives for immediate sending out to clients, or for writing large ZIP archives without memory inflation.

Requirements

Python 3.6+

Install

pip3 install zipfly

Basic usage, compress on-the-fly during writes

Using this library will save you from having to write the Zip to disk. Some data will be buffered by the zipfile deflater, but memory inflation is going to be very constrained. Data will be written to destination by default at regular 32KB intervals.

ZipFly defaults attributes:

  • paths: [ ]
  • mode: (write) w
  • chunksize: (bytes) 32768
  • compression: Stored
  • allowZip64: True
  • compresslevel: None
  • storesize: (bytes) 0
  • encode: utf-8

paths list of dictionaries:

.
fs Should be the path to a file on the filesystem
n (Optional) Is the name which it will have within the archive
(by default, this will be the same as fs)

    import zipfly

    paths = [
        {
            'fs': '/path/to/large/file'
        },
    ]

    zfly = zipfly.ZipFly(paths = paths)

    generator = zfly.generator()
    print (generator)
    # 
   


    with open("large.zip", "wb") as f:
        for i in generator:
            f.write(i)

Examples

Streaming multiple files in a zip with Django or Flask Send forth large files to clients with the most popular frameworks

Create paths Easy way to create the array paths from a parent folder.

Predict the size of the zip file before creating it Use the BufferPredictionSize to compute the correct size of the resulting archive before creating it.

Streaming a large file Efficient way to read a single very large binary file in python

Set a comment Your own comment in the zip file

Maintainer

Santiago Debus (@santiagodebus.com)

License

This library was created by Buzon.io and is released under the MIT. Copyright 2021 Cardallot, Inc.

Comments
  • Cannot include empty folders

    Cannot include empty folders

    Python's zipfile module supports writing empty folders to zip archive.

    when trying with zipfly i get:

    File "/.../zipfly.py", line 211, in generator
        with open( path[self.filesystem], 'rb' ) as e:
    IsADirectoryError: [Errno 21] Is a directory: '/path/to/dir'
    

    Is there an undocumented way for writing empty folders?

    opened by rnixx 2
  • Is compression possible?

    Is compression possible?

    Hi,

    I see that only uncompressed ZIP files are can be created with zipfly. What is the reason for that? I don't see any compression level specific code in ZipFly.generator, except file size estimation (which doesn't work for ZIP64 achives).

    So, is compression just an unimplemented feature? Or are there any aspects which don't allow using compression for streaming made this way?

    I understand that compression is often useless in streaming scenarios but it's not true in our case. We're looking for a replacement for zipstream-new which seems to be dead. ZipFly looks nice for us...

    opened by dezhin 2
  • ASGI breaks functionability in HttpStreamingResponse

    ASGI breaks functionability in HttpStreamingResponse

    Running Django 4.1 in async ASGI mode (channels & websockets needed)

    Zipfly functionability breaks as it tries to compile the entire zip to memory before sending zip to client, filling up memory and causing a crash

    opened by T-101 1
  • Release master branch to pypi

    Release master branch to pypi

    It looks like the last version on PyPI has been published 2 years ago and there have been many changes in the code since.

    It would be helpful if you could release the latest changes.

    PS since the project is already using GH action for testing, it may be helpful to set up GH actions for release as well.

    https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/

    https://github.com/marketplace/actions/pypi-publish

    opened by 1oglop1 1
  • Stream into stream

    Stream into stream

    Hi, thank you for the inspirational code.

    I'm wondering if it's possible to adapt your code and make it compatible with S3 storage.

    I use django-storages and boto3 client returns a streaming body (already open fp) and all metadata.

    I need a zipfile (or any other archive) to be created from the set of files during the GET request (not great but it has to be).

    So to save the memory I have 2 options.

    1. Use zipfly as is and download files into a temporary location (and remove it after the operation)
    2. Or better solution that doesn't require intermediate storage so I could pipe the content of streaming body into zipfly and return that as a streaming response.

    streaming body (many of them) -> zipfly(zip file) -> streaming response

    Do you think that option two is possible? If so, could you please point out what pieces of code I should focus on to adapt zipfly?

    Thank you

    opened by 1oglop1 1
  • Add functionality to provide file streams and zip them on the fly

    Add functionality to provide file streams and zip them on the fly

    It is a common use case to store large files remotely and not next to the code. See e.g. for django https://github.com/jschneier/django-storages. If many (large) files must be shipped synchronuosly as a zip file, it saves memory and storage to pass them through the web worker as a stream without saving anything to disk. To archive that, file buffers as input may be supported.

    I would appreciate if we could add this functionality. This pull request contains a tested initial attempt. Feel free to improve!

    This pull request is based on https://github.com/BuzonIO/zipfly/pull/75 and https://github.com/BuzonIO/zipfly/pull/76

    opened by beckedorf 0
  • Using bytes instead of actual files

    Using bytes instead of actual files

    Hi, thanks for the great work. I have some bytes generators and I want to make a zip on the fly out of them. Is there an option to do this? Sorry for my poor python knowledge. Thanks :)

    opened by TaToTanWeb 2
Owner
Buzon
Buzon.io is a Cardallot Inc. service
Buzon
A Certificate renaming tool made for IEEE CS SBC, SJCE.

PDF Batch Renamer Made for IEEE CS SBC, SJCE How to use? Before using the python script, ensure that pytesseract, pdf2image, opencv and other supporti

Ashwin Kumar U 2 Nov 14, 2021
The best way to convert files on your computer, be it .pdf to .png, .pdf to .docx, .png to .ico, or anything you can imagine.

The best way to convert files on your computer, be it .pdf to .png, .pdf to .docx, .png to .ico, or anything you can imagine.

JareBear 2 Nov 20, 2021
A tool written in python to generate basic repo files from github

A tool written in python to generate basic repo files from github

Riley 7 Dec 02, 2021
Yadl - it is a simple library for working with both dotenv files and environment variables.

Yadl Yadl - it is a simple library for working with both dotenv files and environment variables. Features Validation of whitespaces. Validation of num

Ivan Kapranov 3 Oct 19, 2021
Find potentially sensitive files

find_files Find potentially sensitive files This script searchs for potentially sensitive files based off of file name or string contained in the file

4 Aug 20, 2022
Python's Filesystem abstraction layer

PyFilesystem2 Python's Filesystem abstraction layer. Documentation Wiki API Documentation GitHub Repository Blog Introduction Think of PyFilesystem's

pyFilesystem 1.8k Jan 02, 2023
Two scripts help you to convert csv file to md file by template

Two scripts help you to convert csv file to md file by template. One help you generate multiple md files with different filenames from the first colume of csv file. Another can generate one md file w

2 Oct 15, 2022
File-manager - A basic file manager, written in Python

File Manager A basic file manager, written in Python. Installation Install Pytho

Samuel Ko 1 Feb 05, 2022
Powerful Python library for atomic file writes.

Powerful Python library for atomic file writes.

Markus Unterwaditzer 313 Oct 19, 2022
CSV-Handler written in Python3

CSVHandler This code allows you to work intelligently with CSV files. A file in CSV syntax is converted into several lists, which are combined in a to

Max Tischberger 1 Jan 13, 2022
shred - A cross-platform library for securely deleting files beyond recovery.

shred Help the project financially: Donate: https://smartlegion.github.io/donate/ Yandex Money: https://yoomoney.ru/to/4100115206129186 PayPal: https:

4 Sep 04, 2021
Python Sreamlit Duplicate Records Finder Remover

Python-Sreamlit-Duplicate-Records-Finder-Remover Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom w

RONALD KANYEPI 1 Jan 21, 2022
BREP : Binary Search in plaintext and gzip files

BREP : Binary Search in plaintext and gzip files Search large files in O(log n) time using binary search. We support plaintext and Gzipped files. Benc

Arnaud de Saint Meloir 5 Dec 24, 2021
A JupyterLab extension that allows opening files and directories with external desktop applications.

A JupyterLab extension that allows opening files and directories with external desktop applications.

martinRenou 0 Oct 14, 2021
Ini adalah program python untuk mengubah background foto dalam 1 folder, tidak perlu satu satu

Myherokuapp my web drive You can see my web drive and can request film/Application do you want in here my blog you can visit my blog RemBg ini adalah

XnuxersXploitXen 13 Dec 01, 2022
This is a junk file creator tool which creates junk files in Internal Storage

This is a junk file creator tool which creates junk files in Internal Storage

KiLL3R_xRO 3 Jun 20, 2021
Add Ranges and page numbers to IIIF Manifest from a CSV.

Add Ranges and page numbers to IIIF Manifest from CSV specific to a workflow of the Bibliotheca Hertziana.

Raffaele Viglianti 3 Apr 28, 2022
Dragon Age: Origins toolset to extract/build .erf files, patch language-specific .dlg files, and view the contents of files in the ERF or GFF format

DAOTools This is a set of tools for Dragon Age: Origins modding. It can patch the text lines of .dlg files, extract and build an .erf file, and view t

8 Dec 06, 2022
Python module that parse power builder file (PBD) and analyze code

PowerBuilder-decompile Python module that parse power builder file (PBD) and analyze code (Incomplete) this tool is composed of: pbd_dump.py pbd file

Samy Sultan 8 Dec 15, 2022
This is a file deletion program that asks you for an extension of a file (.mp3, .pdf, .docx, etc.) to delete all of the files in a dir that have that extension.

FileBulk This is a file deletion program that asks you for an extension of a file (.mp3, .pdf, .docx, etc.) to delete all of the files in a dir that h

Enoc Mena 1 Jun 26, 2022