Unified file system operation experience for different backend

Overview

megfile - Megvii FILE library

build docs Latest version Support python versions License

megfile provides a silky operation experience with different backends (currently including local file system and OSS), which enable you to focus more on the logic of your own project instead of the question of "Which backend is used for this file?"

megfile provides:

  • Almost unified file system operation experience. Target path can be easily moved from local file system to OSS.
  • Complete boundary case handling. Even the most difficult (or even you can't even think of) boundary conditions, megfile can help you easily handle it.
  • Perfect type hints and built-in documentation. You can enjoy the IDE's auto-completion and static checking.
  • Semantic version and upgrade guide, which allows you enjoy the latest features easily.

megfile's advantages are:

  • smart_open can open resources that use various protocols, including fs, s3, http(s) and stdio. Especially, reader / writer of s3 in megfile is implemented with multi-thread, which is faster than known competitors.
  • smart_glob is available on s3. And it supports zsh extended pattern syntax of [], e.g. s3://bucket/video.{mp4,avi}.
  • All-inclusive functions like smart_exists / smart_stat / smart_sync. If you don't find the functions you want, submit an issue.
  • Compatible with pathlib.Path interface, referring to S3Path and SmartPath.

Quick Start

Here's an example of writing a file to OSS, syncing to local, reading and finally deleting it.

from megfile import smart_open, smart_exists, smart_sync, smart_remove, smart_glob
from megfile.smart_path import SmartPath

# open a file in s3 bucket
with smart_open('s3://playground/refile-test', 'w') as fp:
    fp.write('refile is not silver bullet')

# test if file in s3 bucket exist
smart_exists('s3://playground/refile-test')

# copy files or directories
smart_sync('s3://playground/refile-test', '/tmp/playground')

# remove files or directories
smart_remove('s3://playground/refile-test')

# glob files or directories in s3 bucket
smart_glob('s3://playground/video-?.{mp4,avi}')

# or in local file system
smart_exists('/tmp/playground/refile-test')

# smart_open also support protocols like http / https
smart_open('https://www.google.com')

# SmartPath interface
path = SmartPath('s3://playground/megfile-test')
if path.exists():
    with path.open() as f:
        result = f.read(7)
        assert result == b'megfile'

Installation

PyPI

pip3 install megfile

You can specify megfile version as well

pip3 install "megfile~=0.0"

Build from Source

megfile can be installed from source

git clone [email protected]:megvii-research/megfile.git
cd megfile
pip3 install -U .

Development Environment

git clone [email protected]:megvii-research/megfile.git
cd megfile
sudo apt install libgl1-mesa-glx libfuse-dev fuse
pip3 install -r requirements.txt -r requirements-dev.txt

How to Contribute

  • We welcome everyone to contribute code to the megfile project, but the contributed code needs to meet the following conditions as much as possible:

    You can submit code even if the code doesn't meet conditions. The project members will evaluate and assist you in making code changes

    • Code format: Your code needs to pass code format check. megfile uses yapf as lint tool and the version is locked at 0.27.0. The version lock may be removed in the future

    • Static check: Your code needs complete type hint. megfile uses pytype as static check tool. If pytype failed in static check, use # pytype: disable=XXX to disable the error and please tell us why you disable it.

      Note : Because pytype doesn't support variable type annation, the variable type hint format introduced by py36 cannot be used.

      i.e. variable: int is invalid, replace it with variable # type: int

    • Test: Your code needs complete unit test coverage. megfile uses pyfakefs and moto as local file system and OSS virtual environment in unit tests. The newly added code should have a complete unit test to ensure the correctness

  • You can help to improve megfile in many ways:

    • Write code.
    • Improve documentation.
    • Report or investigate bugs and issues.
    • If you find any problem or have any improving suggestion, submit a new issuse as well. We will reply as soon as possible and evaluate whether to adopt.
    • Review pull requests.
    • Star megfile repo.
    • Recommend megfile to your friends.
    • Any other form of contribution is welcomed.
Owner
MEGVII Research
Power Human with AI. 持续创新拓展认知边界 非凡科技成就产品价值
MEGVII Research
Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization 0. Environment Environment: python 3.6 and cuda 10

Haitao Yang 62 Dec 30, 2022
Capsule endoscopy detection DACON challenge

capsule_endoscopy_detection (DACON Challenge) Overview Yolov5, Yolor, mmdetection기반의 모델을 사용 (총 11개 모델 앙상블) 모든 모델은 학습 시 Pretrained Weight을 yolov5, yolo

MAILAB 11 Nov 25, 2022
Code of our paper "Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning"

CCOP Code of our paper Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning Requirement Install OpenSelfSup Install Detectron2

Chenhongyi Yang 21 Dec 13, 2022
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers (arXiv2021)

Polyp-PVT by Bo Dong, Wenhai Wang, Deng-Ping Fan, Jinpeng Li, Huazhu Fu, & Ling Shao. This repo is the official implementation of "Polyp-PVT: Polyp Se

Deng-Ping Fan 102 Jan 05, 2023
Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning

Automated Side Channel Analysis of Media Software with Manifold Learning Official implementation of USENIX Security 2022 paper: Automated Side Channel

Yuanyuan Yuan 175 Jan 07, 2023
Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)

Discovering Non-monotonic Autoregressive Orderings with Variational Inference Description This package contains the source code implementation of the

Xuanlin (Simon) Li 10 Dec 29, 2022
K-Nearest Neighbor in Pytorch

Pytorch KNN CUDA 2019/11/02 This repository will no longer be maintained as pytorch supports sort() and kthvalue on tensors. git clone https://github.

Chris Choy 65 Dec 01, 2022
Official implementation of VQ-Diffusion

Vector Quantized Diffusion Model for Text-to-Image Synthesis Overview This is the official repo for the paper: [Vector Quantized Diffusion Model for T

Microsoft 592 Jan 03, 2023
Audio Source Separation is the process of separating a mixture into isolated sounds from individual sources

Audio Source Separation is the process of separating a mixture into isolated sounds from individual sources (e.g. just the lead vocals).

Victor Basu 14 Nov 07, 2022
Speed-Test - You can check your intenet speed using this tool

Speed-Test Tool By Hez_X AVAILABLE ON : Termux & Kali linux & Ubuntu (Linux E

Hez-X 3 Feb 17, 2022
A computational optimization project towards the goal of gerrymandering the results of a hypothetical election in the UK.

A computational optimization project towards the goal of gerrymandering the results of a hypothetical election in the UK.

Emma 1 Jan 18, 2022
Fast Scattering Transform with CuPy/PyTorch

Announcement 11/18 This package is no longer supported. We have now released kymatio: http://www.kymat.io/ , https://github.com/kymatio/kymatio which

Edouard Oyallon 289 Dec 07, 2022
Group-Free 3D Object Detection via Transformers

Group-Free 3D Object Detection via Transformers By Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong. This repo is the official implementation of "Group-

Ze Liu 213 Dec 07, 2022
🏖 Keras Implementation of Painting outside the box

Keras implementation of Image OutPainting This is an implementation of Painting Outside the Box: Image Outpainting paper from Standford University. So

Bendang 1.1k Dec 10, 2022
1st ranked 'driver careless behavior detection' for AI Online Competition 2021, hosted by MSIT Korea.

2021AICompetition-03 본 repo 는 mAy-I Inc. 팀으로 참가한 2021 인공지능 온라인 경진대회 중 [이미지] 운전 사고 예방을 위한 운전자 부주의 행동 검출 모델] 태스크 수행을 위한 레포지토리입니다. mAy-I 는 과학기술정보통신부가 주최하

Junhyuk Park 9 Dec 01, 2022
Learning Optical Flow from a Few Matches (CVPR 2021)

Learning Optical Flow from a Few Matches This repository contains the source code for our paper: Learning Optical Flow from a Few Matches CVPR 2021 Sh

Shihao Jiang (Zac) 159 Dec 16, 2022
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

james 641 Jan 03, 2023
Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Graph-to-Graph Transformers Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NL

Idiap Research Institute 40 Aug 14, 2022
The CLRS Algorithmic Reasoning Benchmark

Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms.

DeepMind 251 Jan 05, 2023
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [Project Website] [Replicate.ai Project] StyleGAN-NADA: CLIP-Guided Domain Adaptation

992 Dec 30, 2022