ZFS, in Python, without reading the original C.

Related tags

Storagezfsp
Overview

ZFSp

What?

ZFS, in Python, without reading the original C.

What?!

That's right.

How?

Many hours spent staring at hexdumps, and asking friends to search the internet for explanations of various features.

Why?

Why not?

It seemed like it might be a fun project.

Installation

The Pipfile lists the dependencies; there aren't many.

Python 3.5+ is required, but 3.6 is not, as PyPy runs this code much, much faster (around 4x on the test suite) and didn't support 3.6 until recently.

pipenv install -e .

N.B.: -e gets you an "editable" install; changes to the source tree will affect the installed package's behavior.

Test Suite.

Running the test suite requires one-time access to a system with ZFS, to generate the test pools.

Run tests/fixtures.sh on such a system and make the resulting directory tests/fixtures (this is where it will end up if you run fixtures.sh directly from the tests directory).

The tests themselves can be run with py.test. Most of the tests pass, but there are some known failures. Feel free to try fixing them.

The tests are heavily parameterized and attempt to run all tests against all relevant pool variations.

Usage

zexplore is the main command line interface. It's reasonably well documented (I hope).

There's a subcommand for some limited FUSE support, which depends on fusepy (developed with 2.0.4). This is not installed by default in the Pipfile.

Example commands:

$ zexplore label -p tests/fixtures/feature_large_blocks
{b'errata': 0,
 b'features_for_read': {},
 b'guid': 6168868809305637343,
 b'hostid': 8323329,
 b'hostname': b'ubuntu-zesty',
 b'name': b'feature_large_blocks',
 b'pool_guid': 16637796155898928459,
 b'state': 1,
 b'top_guid': 6168868809305637343,
 b'txg': 16,
 b'vdev_children': 1,
 b'vdev_tree': {b'ashift': 9,
                b'asize': 62390272,
                b'create_txg': 4,
                b'guid': 6168868809305637343,
                b'id': 0,
                b'is_log': 0,
                b'metaslab_array': 33,
                b'metaslab_shift': 24,
                b'path': b'/vagrant/fixtures/feature_large_blocks',
                b'type': b'file'},
 b'version': 5000}

$ zexplore objset -p tests/fixtures/nested_datasets -P 1
{'config': 27,
 'creation_version': 5000,
 'deflate': 1,
 'feature_descriptions': 30,
 'features_for_read': 28,
 'features_for_write': 29,
 'free_bpobj': 11,
 'history': 32,
 'root_dataset': 2,
 'sync_bplist': 31}

$ zexplore ls -p tests/fixtures/nested_datasets /n1/n2/n3
x
y

$ zexplore cat -p tests/fixtures/nested_datasets /gzip-9/gzip-9 | head -n 5
A
a
aa
aal
aalii

Caveats

Lots of things won't work, including:

  • raidz with more than 4 disks will probably fail with an inscrutable error. ZFS uses a magic allocation function that's a non-linear black box. I haven't figured it out. raidz where the disks are larger than a few GB will probably also fail, but I haven't tested that at all.
  • Some new feature flags have come out since I last worked on this seriously, and those features are (obviously) not supported. That's things like (incomplete list):
    • SHA-512/256
    • Skein
    • spacemap_v2
  • Some feature flags that did exist when I was actively working on this are not implemented, either because I failed to figure out how they work, or because I didn't get around to trying.
  • pyndata is the struct description library used to read all the on-disk structures. I wrote it first, and it's bad. If I was doing this again today, I'd probably build something with dataclasses.
  • There's a Rust implementation of LZJB which is not currently included, but decompression is not the bottleneck.
Owner
Colin Valliant
Colin Valliant
A generic JSON document store with sharing and synchronisation capabilities.

Kinto Kinto is a minimalist JSON storage service with synchronisation and sharing abilities. Online documentation Tutorial Issue tracker Contributing

Kinto 4.2k Dec 26, 2022
ZODB Client-Server framework

ZEO - Single-server client-server database server for ZODB ZEO is a client-server storage for ZODB for sharing a single storage among many clients. Wh

Zope 40 Nov 04, 2022
Barman - Backup and Recovery Manager for PostgreSQL

Barman, Backup and Recovery Manager for PostgreSQL Barman (Backup and Recovery Manager) is an open-source administration tool for disaster recovery of

EDB 1.5k Dec 30, 2022
Cross-platform desktop synchronization client for the Nuxeo platform.

Nuxeo Drive Desktop Synchronization Client for Nuxeo This is an ongoing development project for desktop synchronization of local folders with remote N

Nuxeo 63 Dec 16, 2022
ZFS, in Python, without reading the original C.

ZFSp What? ZFS, in Python, without reading the original C. What?! That's right. How? Many hours spent staring at hexdumps, and asking friends to searc

Colin Valliant 569 Oct 28, 2022
The Tahoe-LAFS decentralized secure filesystem.

Free and Open decentralized data store Tahoe-LAFS (Tahoe Least-Authority File Store) is the first free software / open-source storage technology that

Tahoe-LAFS 1.2k Jan 01, 2023
Automatic SQL injection and database takeover tool

sqlmap sqlmap is an open source penetration testing tool that automates the process of detecting and exploiting SQL injection flaws and taking over of

sqlmapproject 25.7k Jan 02, 2023
Synchronize local directories with Tahoe-LAFS storage grids

Gridsync Gridsync aims to provide a cross-platform, graphical user interface for Tahoe-LAFS, the Least Authority File Store. It is intended to simplif

171 Dec 16, 2022
Postgres CLI with autocompletion and syntax highlighting

A REPL for Postgres This is a postgres client that does auto-completion and syntax highlighting. Home Page: http://pgcli.com MySQL Equivalent: http://

dbcli 10.8k Dec 30, 2022
Nerd-Storage is a simple web server for sharing files on the local network.

Nerd-Storage is a simple web server for sharing files on the local network. It supports the download of files and directories, the upload of multiple files at once, making a directory, updates and de

ハル 68 Jun 07, 2022
Continuous Archiving for Postgres

WAL-E Continuous archiving for Postgres WAL-E is a program designed to perform continuous archiving of PostgreSQL WAL files and base backups. To corre

3.4k Dec 30, 2022
a full featured file system for online data storage

S3QL S3QL is a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provi

917 Dec 25, 2022
A Terminal Client for MySQL with AutoCompletion and Syntax Highlighting.

mycli A command line client for MySQL that can do auto-completion and syntax highlighting. HomePage: http://mycli.net Documentation: http://mycli.net/

dbcli 10.7k Jan 07, 2023
The next generation relational database.

What is EdgeDB? EdgeDB is an open-source object-relational database built on top of PostgreSQL. The goal of EdgeDB is to empower its users to build sa

EdgeDB 9.9k Dec 31, 2022
The web end of seafile server.

Introduction Seahub is the web frontend for Seafile. Preparation Build and deploy Seafile server from source. See http://manual.seafile.com/build_seaf

476 Dec 29, 2022
The command-line tool that gives easy access to all of the capabilities of B2 Cloud Storage

B2 Command Line Tool The command-line tool that gives easy access to all of the capabilities of B2 Cloud Storage. This program provides command-line a

Backblaze 467 Dec 08, 2022
An open source multi-tool for exploring and publishing data

Datasette An open source multi-tool for exploring and publishing data Datasette is a tool for exploring and publishing data. It helps people take data

Simon Willison 6.8k Jan 01, 2023
TrueNAS CORE/Enterprise/SCALE Middleware Git Repository

TrueNAS CORE/Enterprise/SCALE main source repo Want to contribute or collaborate? Join our Slack instance. IMPORTANT NOTE: This is the master branch o

TrueNAS 2k Jan 07, 2023