Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

Overview

open(LARGE)

Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

Motivation

Oftentimes, especially when working with data-heavy applications, large files can proliferate in a repository. Version controlling them is an obvious next step, however, GitHub's git LFS implementation doesn't support the deletion of large files, making it easy for them to eat-up the LFS quota and explode the size of your repos.

Solution

pip install open-large

Simple example

from open_large import LargeFile

LargeFile.configure_credentials({
    "aws_region_name": "your_region_like_eu-west-2",
    "aws_access_key_id": "YOUR_ACCESS_KEY_ID",
    "aws_secret_access_key": "YOUR_VERY_SECRET_ACCESS_KEY",
    "large_files_bucket_name": "create_a_bucket_and_put_its_name_here",
})

# Creates a new version and deletes the older version leaving the 3 most recently used intact
with LargeFile("test.txt", "w", keep_last_n=3) as f:
    for i in range(100000):
        f.write('test\n')

# By default the latest version is returned
# but an optional `version` keyword argument can be provided as well
with LargeFile("test.txt", "r") as f:
    print(f.readlines()[0])

Automatically creates a file, writes to it, uploads it to S3, and then queries the most recent version of it. In this case, the latest version is already in the local cache, no download is required.

More details

LargeFile behaves like an opened file (in the background it is a temp file after all). Binary reading and writing is supported along with the different keywords open() accepts.

The local cache can be configured with these properties:

LargeFile.cache_path = Path('.cache')
LargeFile.max_cache_size = "30 GB"

I only need a path

In case you only need a path to the "remote" file, this pattern can be applied:

path_to_model = LargeFile("folder-of-my-bert-model", version=31).get()

This will first download the file/folder into your local cache folder. Then, it returns a Path object to the local version. Which can be turned into a string with str(path_to_model).

The same approach works for uploads:

LargeFile("folder-of-my-bert-model").push('path_to_local/folder_or_file')

This way, both regular files and folders can be handled. The uploaded file is called folder-of-my-bert-model, the local name is ignored.

Lastly, all version of the remote object can be deleted by calling LargeFile("my-file").delete(). It will still reside in your local cache afterwards, its deletion will happen next time your local cache has to be pruned.

Command-line example

The package can be used as a module from the command-line to give you more flexibility.

Setup

Create an .ini file (or use ~/.aws/credentials). It may look like this:

[DEFAULT]
aws_region_name = your_region_like_eu-west-2
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_VERY_SECRET_ACCESS_KEY
large_files_bucket_name = my_large_files

Just like in example secrets.

Print the expected options

python3 -m open_large --help

Upload some files

python3 -m open_large --secrets secrets.ini --push my_first_file.json folder/my_second_file my_folder

Only the filename is used as the S3 name, the rest of the path is ignored.

Download some files to the local cache

This can be useful when building a Docker image for example. This way, the files can already reside inside the container and need not be downloaded later.

python3 -m open_large --secrets ~/.aws/credentials --cache my_first_file.json:3 my_second_file my_folder:0

Versions may be specified by using :-s.

Delete remote files

python3 -m open_large --secrets ~/.aws/credentials --delete my_first_file.json
A YouTube downloader which allows you to choose which video you want

Youtube Video Downloader Download multiple videos in one go! How to Use 1.First type the video you want to download 2.On clicking the Search button yo

2 Dec 17, 2021
A Python script to download PDB files associated with a Portable Executable (PE)

A Python script to download PDB files associated with a Portable Executable (PE)

Podalirius 33 Jan 03, 2023
Download clips from youtube videos with a few clicks and a GUI!

YouClip v2.0.0 Table Of Contents: What Is YouClip Installation Usage Stuff To Fix Changelog What Is YouClip? ! IMPORTANT: The source files are a total

ador 2 Oct 05, 2021
VD Song Bot - A telegram bot that can download songs

VD Song Bot A telegram bot that can download songs Reach me on Telegram @MusicVNDbot Deploy to Heroku The easiest way to deploy this Song Bot Mandator

Venuja Thilakarathna 2 Feb 19, 2022
A simple GUI video downloader built off of the python module 'yt-dlp'

Simple-Youtube-DL-Gui Supported Operating Systems Windows 7 (x64), Windows 8 (x64), and Windows 10 (x64) How to use Main Gui Extract program from arch

12 Dec 30, 2022
A cross platform front-end GUI of the popular youtube-dl written in wxPython.

youtube-dlG A cross platform front-end GUI of the popular youtube-dl media downloader written in wxPython. Supported sites Screenshots Requirements Py

8.7k Dec 31, 2022
Youtube Downloader GUI

Python Youtube Downloader GUI This is a GUI application that allows you to download videos from Youtube. Features Download videos from Youtube in MP3

Daniel Carrillo 2 Dec 14, 2021
YouTube Video publisher using youtube-dl & ROS2🐢

YouTube-publisher-ROS2 Publish sensor_msgs/Image by "YouTube" 🤗 🤗 🤗 ! You don't have to use webcamera or your video to check demos. Purpose Quick d

Ar-Ray 5 Dec 04, 2022
Code to scrape , download and upload to youtube daily

Youtube_Automated_Channel Code to scrape , download and upload to youtube daily INSTRUCTIONS Download the Github Repository Download and install Pytho

Atsiksdong 2 Dec 19, 2021
Download all your URI Online Judge source codes and upload to GitHub with simple steps.

URI-Code-Downloader Download all your URI Online Judge source codes and upload to GitHub with simple steps. Prerequisites Python 3.x Installing Downlo

Luan Simões 9 Mar 23, 2022
Download Youtube videos in mp4 format in a fast, easy, convenient way made with Python!

yt_downloader Download Youtube videos in mp4 format in a fast, easy, convenient way made with Python! Required Modules pytube os time colorama Errors

3 Jul 02, 2022
this is udemy course downloader, before a start you know how to get access token.

udemy_downloader this is udemy course downloader, before a start you know how to get access token. To get the access_token on Google Chrome (once on U

OkUgur 18 Dec 04, 2022
YTPY Youtube Downloader Made by: Ferreira, Amarau and Rodric

YTPY Youtube Downloader Made by: Ferreira, Amarau and Rodric How to Install on Linux: sudo apt install python3 python3-pip git pip install pytube git

7 Nov 24, 2022
A youtube-dl fork with additional features and fixes

yt-dlp is a youtube-dl fork based on the now inactive youtube-dlc. The main focus of this project is adding new features and patches while also keepin

yt-dlp 37.1k Jan 03, 2023
MMDL (Mega Music Downloader) - A tool to easily download music.

mmdl - Mega Music Downloader What is mmdl ❓ TLDR: MMDL is a cli app which allows you to quickly and efficiently download one or multiple songs from Yo

techboy-coder 30 Dec 13, 2022
Python based Telegram bot. Search and download YouTube video or audio.

Python-Telegram-Youtube-Media-Bot Python based Telegram bot. Search and download YouTube video or audio. Just change settings.py and start TelegramBot

Ahmet Bohur 2 Oct 02, 2022
In this repository you will find the test carried out to enter, as a python developer, the company Keeper Solutions.

Bookmarks In this repository you will find the test carried out to enter, as a python developer, the company Keeper Solutions. First it is necessary t

0 Jan 12, 2022
Music and video downloader, Made with love by Bryan Herrera

Python-Mp3Mp4-Downloader Music and video downloader, Made with love by Bryan Herrera Requirements CHOCOLATELY windows command If your system does not

ርᚱ1ናተᛰ ᚻህᚥተპᚱ 104 Dec 27, 2022
GTK4 + Python tutorial with code examples

Taiko's GTK4 Python tutorial Wanna make apps for Linux but not sure how to start with GTK? This guide will hopefully help! The intent is to show you h

190 Jan 08, 2023
This repository contains code for a youtube-dl GUI written in PyQt.

youtube-dl-GUI This repository contains code for a youtube-dl GUI written in PyQt. It is based on youtube-dl which is a Video downloading script maint

M.Yasoob Ullah Khalid ☺ 191 Jan 02, 2023