A different spin on dataclasses.

Overview

dataklasses

Dataklasses is a library that allows you to quickly define data classes using Python type hints. Here's an example of how you use it:

from dataklasses import dataklass

@dataklass
class Coordinates:
    x: int
    y: int

The resulting class works in a well civilised way, providing the usual __init__(), __repr__(), and __eq__() methods that you'd normally have to type out by hand:

>>> a = Coordinates(2, 3)
>>> a
Coordinates(2, 3)
>>> a.x
2
>>> a.y
3
>>> b = Coordinates(2, 3)
>>> a == b
True
>>>

It's easy! Almost too easy.

Wait, doesn't this already exist?

No, it doesn't. Yes, certain naysayers will be quick to point out the existence of @dataclass from the standard library. Ok, sure, THAT exists. However, it's slow and complicated. Dataklasses are neither of those things. The entire dataklasses module is less than 100 lines. The resulting classes import 15-20 times faster than dataclasses. See the perf.py file for a benchmark.

Theory of Operation

While out walking with his puppy, Dave had a certain insight about the nature of Python byte-code. Coming back to the house, he had to try it out:

>>> def __init1__(self, x, y):
...     self.x = x
...     self.y = y
...
>>> def __init2__(self, foo, bar):
...     self.foo = foo
...     self.bar = bar
...
>>> __init1__.__code__.co_code == __init2__.__code__.co_code
True
>>>

How intriguing! The underlying byte-code is exactly the same even though the functions are using different argument and attribute names. Aha! Now, we're onto something interesting.

The dataclasses module in the standard library works by collecting type hints, generating code strings, and executing them using the exec() function. This happens for every single class definition where it's used. If it sounds slow, that's because it is. In fact, it defeats any benefit of module caching in Python's import system.

Dataklasses are different. They start out in the same manner--code is first generated by collecting type hints and using exec(). However, the underlying byte-code is cached and reused in subsequent class definitions whenever possible.

A Short Story

Once upon a time, there was this programming language that I'll refer to as "Lava." Anyways, anytime you started a program written in Lava, you could just tell by the awkward silence and inactivity of your machine before the fans kicked in. "Ah shit, this is written in Lava" you'd exclaim.

Questions and Answers

Q: What methods does dataklass generate?

A: By default __init__(), __repr__(), and __eq__() methods are generated. __match_args__ is also defined to assist with pattern matching.

Q: Does dataklass enforce the specified types?

A: No. The types are merely clues about what the value might be and the Python language does not provide any enforcement on its own.

Q: Are there any additional features?

A: No. You can either have features or you can have performance. Pick one.

Q: Does dataklass use any advanced magic such as metaclasses?

A: No.

Q: How do I install dataklasses?

A: There is no setup.py file, installer, or an official release. You install it by copying the code into your own project. dataklasses.py is small. You are encouraged to modify it to your own purposes.

Q: But what if new features get added?

A: What new features? The best new features are no new features.

Q: Who maintains dataklasses?

A: If you're using it, you do. You maintain dataklasses.

Q: Who wrote this?

A: dataklasses is the work of David Beazley. http://www.dabeaz.com.

Owner
David Beazley
Author of the Python Essential Reference (Addison-Wesley), Python Cookbook (O'Reilly), and former computer science professor. Come take a class!
David Beazley
NeurIPS 2021 Datasets and Benchmarks Track

AP-10K: A Benchmark for Animal Pose Estimation in the Wild Introduction | Updates | Overview | Download | Training Code | Key Questions | License Intr

AP-10K 82 Dec 11, 2022
Code for reproducible experiments presented in KSD Aggregated Goodness-of-fit Test.

Code for KSDAgg: a KSD aggregated goodness-of-fit test This GitHub repository contains the code for the reproducible experiments presented in our pape

Antonin Schrab 5 Dec 15, 2022
Developed an optimized algorithm which finds the most optimal path between 2 points in a 3D Maze using various AI search techniques like BFS, DFS, UCS, Greedy BFS and A*

Developed an optimized algorithm which finds the most optimal path between 2 points in a 3D Maze using various AI search techniques like BFS, DFS, UCS, Greedy BFS and A*. The algorithm was extremely

1 Mar 28, 2022
Planar Prior Assisted PatchMatch Multi-View Stereo

ACMP [News] The code for ACMH is released!!! [News] The code for ACMM is released!!! About This repository contains the code for the paper Planar Prio

Qingshan Xu 127 Dec 31, 2022
FCA: Learning a 3D Full-coverage Vehicle Camouflage for Multi-view Physical Adversarial Attack

FCA: Learning a 3D Full-coverage Vehicle Camouflage for Multi-view Physical Adversarial Attack Case study of the FCA. The code can be find in FCA. Cas

IDRL 21 Dec 15, 2022
Synthetic structured data generators

Join us on What is Synthetic Data? Synthetic data is artificially generated data that is not collected from real world events. It replicates the stati

YData 850 Jan 07, 2023
Code for “ACE-HGNN: Adaptive Curvature ExplorationHyperbolic Graph Neural Network”

ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network This repository is the implementation of ACE-HGNN in PyTorch. Environment pyt

9 Nov 28, 2022
A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

Davis E. King 11.6k Jan 01, 2023
Put blind watermark into a text with python

text_blind_watermark Put blind watermark into a text. Can be used in Wechat dingding ... How to Use install pip install text_blind_watermark Alice Pu

郭飞 164 Dec 30, 2022
A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Monte Carlo Simulation to the Paper A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Sören Kohnert 0 Dec 06, 2021
Code for our paper "Graph Pre-training for AMR Parsing and Generation" in ACL2022

AMRBART An implementation for ACL2022 paper "Graph Pre-training for AMR Parsing and Generation". You may find our paper here (Arxiv). Requirements pyt

xfbai 60 Jan 03, 2023
Pyramid Scene Parsing Network, CVPR2017.

Pyramid Scene Parsing Network by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, details are in project page. Introduction This

Hengshuang Zhao 1.5k Jan 05, 2023
Code accompanying "Dynamic Neural Relational Inference" from CVPR 2020

Code accompanying "Dynamic Neural Relational Inference" This codebase accompanies the paper "Dynamic Neural Relational Inference" from CVPR 2020. This

Colin Graber 48 Dec 23, 2022
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Build Type Linux MacOS Windows Build Status OpenPose has represented the first real-time multi-person system to jointly detect human body, hand, facia

25.7k Jan 09, 2023
Source code for the paper: Variance-Aware Machine Translation Test Sets (NeurIPS 2021 Datasets and Benchmarks Track)

Variance-Aware-MT-Test-Sets Variance-Aware Machine Translation Test Sets License See LICENSE. We follow the data licensing plan as the same as the WMT

NLP2CT Lab, University of Macau 5 Dec 21, 2021
lightweight python wrapper for vowpal wabbit

vowpal_porpoise Lightweight python wrapper for vowpal_wabbit. Why: Scalable, blazingly fast machine learning. Install Install vowpal_wabbit. Clone and

Joseph Reisinger 163 Nov 24, 2022
The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding.

SuperGen The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Requirements Before running, you

Yu Meng 38 Dec 12, 2022
An ML & Correlation platform for transforming disparate data points of interest into usable intelligence.

SSIDprobeCollector An ML & Correlation platform for transforming disparate data points of interest into usable intelligence. At a High level the platf

Bill Reyor 1 Jan 30, 2022
基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

37 Jan 01, 2023
Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection Main requirements torch = 1.0 torchvision = 0.2.0 Python 3 Environm

15 Apr 04, 2022