Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

Overview

open(LARGE)

Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

Motivation

Oftentimes, especially when working with data-heavy applications, large files can proliferate in a repository. Version controlling them is an obvious next step, however, GitHub's git LFS implementation doesn't support the deletion of large files, making it easy for them to eat-up the LFS quota and explode the size of your repos.

Solution

pip install open-large

Simple example

from open_large import LargeFile

LargeFile.configure_credentials({
    "aws_region_name": "your_region_like_eu-west-2",
    "aws_access_key_id": "YOUR_ACCESS_KEY_ID",
    "aws_secret_access_key": "YOUR_VERY_SECRET_ACCESS_KEY",
    "large_files_bucket_name": "create_a_bucket_and_put_its_name_here",
})

# Creates a new version and deletes the older version leaving the 3 most recently used intact
with LargeFile("test.txt", "w", keep_last_n=3) as f:
    for i in range(100000):
        f.write('test\n')

# By default the latest version is returned
# but an optional `version` keyword argument can be provided as well
with LargeFile("test.txt", "r") as f:
    print(f.readlines()[0])

Automatically creates a file, writes to it, uploads it to S3, and then queries the most recent version of it. In this case, the latest version is already in the local cache, no download is required.

More details

LargeFile behaves like an opened file (in the background it is a temp file after all). Binary reading and writing is supported along with the different keywords open() accepts.

The local cache can be configured with these properties:

LargeFile.cache_path = Path('.cache')
LargeFile.max_cache_size = "30 GB"

I only need a path

In case you only need a path to the "remote" file, this pattern can be applied:

path_to_model = LargeFile("folder-of-my-bert-model", version=31).get()

This will first download the file/folder into your local cache folder. Then, it returns a Path object to the local version. Which can be turned into a string with str(path_to_model).

The same approach works for uploads:

LargeFile("folder-of-my-bert-model").push('path_to_local/folder_or_file')

This way, both regular files and folders can be handled. The uploaded file is called folder-of-my-bert-model, the local name is ignored.

Lastly, all version of the remote object can be deleted by calling LargeFile("my-file").delete(). It will still reside in your local cache afterwards, its deletion will happen next time your local cache has to be pruned.

Command-line example

The package can be used as a module from the command-line to give you more flexibility.

Setup

Create an .ini file (or use ~/.aws/credentials). It may look like this:

[DEFAULT]
aws_region_name = your_region_like_eu-west-2
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_VERY_SECRET_ACCESS_KEY
large_files_bucket_name = my_large_files

Just like in example secrets.

Print the expected options

python3 -m open_large --help

Upload some files

python3 -m open_large --secrets secrets.ini --push my_first_file.json folder/my_second_file my_folder

Only the filename is used as the S3 name, the rest of the path is ignored.

Download some files to the local cache

This can be useful when building a Docker image for example. This way, the files can already reside inside the container and need not be downloaded later.

python3 -m open_large --secrets ~/.aws/credentials --cache my_first_file.json:3 my_second_file my_folder:0

Versions may be specified by using :-s.

Delete remote files

python3 -m open_large --secrets ~/.aws/credentials --delete my_first_file.json
A prometheus exporter for torrent downloader like qbittorrent/transmission/deluge

downloader-exporter A prometheus exporter for qBitorrent/Transmission/Deluge. Get metrics from multiple servers and offers them in a prometheus format

Lei Shi 41 Nov 18, 2022
AI Dungeon Catalog Archive Toolkit

AI Dungeon Content Archive Toolkit (AID CAT) AID CAT is a command-line utility that will allow you to download JSON backups of: Your private and publi

Mimi 31 Oct 26, 2022
Implementation of Cross-category Video Highlight Detection via Set-based Learning (ICCV 2021).

Cross-category Video Highlight Detection via Set-based Learning Introduction This project is an implementation of ``Cross-category Video Highlight Det

Minghao (Alan) Xu 49 Dec 17, 2022
Downloads and Updates GOG Galaxy 2.0 Plugins/Integrations

GOG Galaxy Plugins Downloader Summary This program downloads GOG Galaxy 2.0 Plugins and installs them to the proper location. You probably do not want

slashbunny 253 Dec 12, 2022
Download your bandcamp collection using this python script.

bandcamp-downloader Download your Bandcamp collection using this python script. It requires you to have a browser with a logged in session of bandcamp

72 Dec 20, 2022
Python script to download (TCR) genes from IMGT/GENE-DB

IMGTgeneDL 0.1.0 Jamie Heather | CCR @ MGH | 2021 This script provides an alternative way to access TCR and IG genes stored in IMGT/GENE-DB. It's prim

Jamie Heather 1 Mar 30, 2022
Pypixiv - A fully-typed, asynchronous api wrapper for pixiv

pypixiv this library is a fully-typed, asynchronous api wrapper for pixiv. featu

DeltaLaboratory 2 Nov 16, 2022
music downloader written in python. (Uses jiosaavn API)

music downloader written in python. (Uses jiosaavn API)

Rohn Chatterjee 35 Jul 20, 2022
Neon: an add-on for making it easier to handle component interactions

Neon Neon is an add-on for Lightbulb making it easier to handle component interactions. Installation pip install git+https://github.com/neonjonn/light

Neon Jonn 9 Apr 29, 2022
Youtube playlist downloader with full metadata support

ytrake GUI tool to embed metadata for albums on Youtube with youtube-dl. Requires youtube-dl v2021.06.06. Post-processing Album metadata: Usage ytrake

28 Jul 12, 2022
Python based YouTube video Downloader GUI Application.

Youtube video Downloader Python based Youtube video Downloader GUI Application. Installation Python Dependencies Import pytube pip install pytube Im

Naem Azam 1 Jan 03, 2022
A Python script that allows you to download all of an anime's episodes at once.

BitAnime A Python script that allows you to download all of an anime's episodes at once. · Download executable version · About BitAnime BitAnime is a

sh1nobu 17 Aug 10, 2022
Get the latest updates around you as they happen

Adherent We all are different, experience various things happening around us but we stick together. We are all a part of a greater community. As human

Shreyas Daniel 1 Nov 10, 2022
A Udemy downloader that can download DRM protected videos and non-DRM protected videos.

Udemy Downloader with DRM support NOTE This program is WIP, the code is provided as-is and i am not held resposible for any legal repercussions result

Puyodead1 468 Dec 29, 2022
Python Program that downloads gaming required packages based on your Linux Distribution.

LibreGaming Python Program that downloads gaming required packages based on your Linux Distribution. Table of contents Distributions Prerequisites Dep

Ahmed Al Balochi 195 Jan 01, 2023
Tool to get Canvas cover videos from Spotify tracks.

Spotify Canvas Downloader Tool to get Canvas cover videos from Spotify tracks. ✨ Try it out Building Clone the repository git clone https://github.com

Gabriel 35 Dec 28, 2022
code for paper"3D reconstruction method based on a generative model in continuous latent space"

PyTorch implementation of 3D-VGT(3D-VAE-GAN-Transformer) This repository contains the source code for the paper "3D reconstruction method based on a g

Tong 5 Apr 25, 2022
Let's you download entire YT-playlists.

Youtube MP3 Playlist Downloader Let's you download entire youtube playlists as mp3 files. This application is basically a script that makes it easier

11 Dec 18, 2022
An Inline Telegram bot that can download YouTube videos with permanent thumbnail support

Tube (YouTube Downloader) An Inline Telegram bot that can download YouTube videos with permanent thumbnail support About Bot need to be in Inline Mode

Renjith Mangal 30 Dec 14, 2022
Download YouTube videos/music and images in MP4, JPG with this tool.

ABOUT THE TOOL Download YouTube videos, music and images in MP4, JPG with this tool, with an easy to understand interface. This tool works with both,

TrollSkull 5 Jan 02, 2023