Image augmentation library in Python for machine learning.

Overview

AugmentorLogo

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independent, which is more convenient, allows for finer grained control over augmentation, and implements the most real-world relevant augmentation techniques. It employs a stochastic approach using building blocks that allow for operations to be pieced together in a pipeline.

PyPI Supported Python Versions Documentation Status Build Status License Project Status: Active – The project has reached a stable, usable state and is being actively developed. Binder

Installation

Augmentor is written in Python. A Julia version of the package is also being developed as a sister project and is available here.

Install using pip from the command line:

pip install Augmentor

See the documentation for building from source. To upgrade from a previous version, use pip install Augmentor --upgrade.

Documentation

Complete documentation can be found on Read the Docs: http://augmentor.readthedocs.io/

Quick Start Guide and Usage

The purpose of Augmentor is to automate image augmentation (artificial data generation) in order to expand datasets as input for machine learning algorithms, especially neural networks and deep learning.

The package works by building an augmentation pipeline where you define a series of operations to perform on a set of images. Operations, such as rotations or transforms, are added one by one to create an augmentation pipeline: when complete, the pipeline can be executed and an augmented dataset is created.

To begin, instantiate a Pipeline object that points to a directory on your file system:

import Augmentor
p = Augmentor.Pipeline("/path/to/images")

You can then add operations to the Pipeline object p as follows:

p.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
p.zoom(probability=0.5, min_factor=1.1, max_factor=1.5)

Every function requires you to specify a probability, which is used to decide if an operation is applied to an image as it is passed through the augmentation pipeline.

Once you have created a pipeline, you can sample from it like so:

p.sample(10000)

which will generate 10,000 augmented images based on your specifications. By default these will be written to the disk in a directory named output relative to the path specified when initialising the p pipeline object above.

If you wish to process each image in the pipeline exactly once, use process():

p.process()

This function might be useful for resizing a dataset for example. It would make sense to create a pipeline where all of its operations have their probability set to 1 when using the process() method.

Multi-threading

Augmentor (version >=0.2.1) now uses multi-threading to increase the speed of generating images.

This may slow down some pipelines if the original images are very small. Set multi_threaded to False if slowdown is experienced:

p.sample(100, multi_threaded=False)

However, by default the sample() function uses multi-threading. This is currently only implemented when saving to disk. Generators will use multi-threading in the next version update.

Ground Truth Data

Images can be passed through the pipeline in groups of two or more so that ground truth data can be identically augmented.

Original image and mask[3] Augmented original and mask images
OriginalMask AugmentedMask

To augment ground truth data in parallel to any original data, add a ground truth directory to a pipeline using the ground_truth() function:

p = Augmentor.Pipeline("/path/to/images")
# Point to a directory containing ground truth data.
# Images with the same file names will be added as ground truth data
# and augmented in parallel to the original data.
p.ground_truth("/path/to/ground_truth_images")
# Add operations to the pipeline as normal:
p.rotate(probability=1, max_left_rotation=5, max_right_rotation=5)
p.flip_left_right(probability=0.5)
p.zoom_random(probability=0.5, percentage_area=0.8)
p.flip_top_bottom(probability=0.5)
p.sample(50)

Multiple Mask/Image Augmentation

Using the DataPipeline class (Augmentor version >= 0.2.3), images that have multiple associated masks can be augmented:

Multiple Mask Augmentation
MultipleMask

Arbitrarily long lists of images can be passed through the pipeline in groups and augmented identically using the DataPipeline class. This is useful for ground truth images that have several masks, for example.

In the example below, the images and their masks are contained in the images data structure (as lists of lists), while their labels are contained in y:

p = Augmentor.DataPipeline(images, y)
p.rotate(1, max_left_rotation=5, max_right_rotation=5)
p.flip_top_bottom(0.5)
p.zoom_random(1, percentage_area=0.5)

augmented_images, labels = p.sample(100)

The DataPipeline returns images directly (augmented_images above), and does not save them to disk, nor does it read data from the disk. Images are passed directly to DataPipeline during initialisation.

For details of the images data structure and how to create it, see the Multiple-Mask-Augmentation.ipynb Jupyter notebook.

Generators for Keras and PyTorch

If you do not wish to save to disk, you can use a generator (in this case with Keras):

g = p.keras_generator(batch_size=128)
images, labels = next(g)

which returns a batch of images of size 128 and their corresponding labels. Generators return data indefinitely, and can be used to train neural networks with augmented data on the fly.

Alternatively, you can integrate it with PyTorch:

import torchvision
transforms = torchvision.transforms.Compose([
    p.torch_transform(),
    torchvision.transforms.ToTensor(),
])

Main Features

Elastic Distortions

Using elastic distortions, one image can be used to generate many images that are real-world feasible and label preserving:

Input Image Augmented Images
eight_hand_drawn_border eights_border

The input image has a 1 pixel black border to emphasise that you are getting distortions without changing the size or aspect ratio of the original image, and without any black/transparent padding around the newly generated images.

The functionality can be more clearly seen here:

Original Image[1] Random distortions applied
Original Distorted

Perspective Transforms

There are a total of 12 different types of perspective transform available. Four of the most common are shown below.

Tilt Left Tilt Right Tilt Forward Tilt Backward
TiltLeft Original Original Original

The remaining eight types of transform are as follows:

Skew Type 0 Skew Type 1 Skew Type 2 Skew Type 3
Skew0 Skew1 Skew2 Skew3
Skew Type 4 Skew Type 5 Skew Type 6 Skew Type 7
Skew4 Skew5 Skew6 Skew7

Size Preserving Rotations

Rotations by default preserve the file size of the original images:

Original Image Rotated 10 degrees, automatically cropped
Original Rotate

Compared to rotations by other software:

Original Image Rotated 10 degrees
Original Rotate

Size Preserving Shearing

Shearing will also automatically crop the correct area from the sheared image, so that you have an image with no black space or padding.

Original image Shear (x-axis) 20 degrees Shear (y-axis) 20 degrees
Original ShearX ShearY

Compare this to how this is normally done:

Original image Shear (x-axis) 20 degrees Shear (y-axis) 20 degrees
Original ShearX ShearY

Cropping

Cropping can also be handled in a manner more suitable for machine learning image augmentation:

Original image Random crops + resize operation
Original Original

Random Erasing

Random Erasing is a technique used to make models robust to occlusion. This may be useful for training neural networks used in object detection in navigation scenarios, for example.

Original image[2] Random Erasing
Original Original

See the Pipeline.random_erasing() documentation for usage.

Chaining Operations in a Pipeline

With only a few operations, a single image can be augmented to produce large numbers of new, label-preserving samples:

Original image Distortions + mirroring
Original DistortFlipFlop

In the example above, we have applied three operations: first we randomly distort the image, then we flip it horizontally with a probability of 0.5 and then vertically with a probability of 0.5. We then sample from this pipeline 100 times to create 100 new data.

p.random_distortion(probability=1, grid_width=4, grid_height=4, magnitude=8)
p.flip_left_right(probability=0.5)
p.flip_top_bottom(probability=0.5)
p.sample(100)

Tutorial Notebooks

Integration with Keras using Generators

Augmentor can be used as a replacement for Keras' augmentation functionality. Augmentor can create a generator which produces augmented data indefinitely, according to the pipeline you have defined. See the following notebooks for details:

  • Reading images from a local directory, augmenting them at run-time, and using a generator to pass the augmented stream of images to a Keras convolutional neural network, see Augmentor_Keras.ipynb
  • Augmenting data in-memory (in array format) and using a generator to pass these new images to the Keras neural network, see Augmentor_Keras_Array_Data.ipynb

Per-Class Augmentation Strategies

Augmentor allows for pipelines to be defined per class. That is, you can define different augmentation strategies on a class-by-class basis for a given classification problem.

See an example of this in the following Jupyter notebook: Per_Class_Augmentation_Strategy.ipynb

Complete Example

Let's perform an augmentation task on a single image, demonstrating the pipeline and several features of Augmentor.

First import the package and initialise a Pipeline object by pointing it to a directory containing your images:

import Augmentor

p = Augmentor.Pipeline("/home/user/augmentor_data_tests")

Now you can begin adding operations to the pipeline object:

p.rotate90(probability=0.5)
p.rotate270(probability=0.5)
p.flip_left_right(probability=0.8)
p.flip_top_bottom(probability=0.3)
p.crop_random(probability=1, percentage_area=0.5)
p.resize(probability=1.0, width=120, height=120)

Once you have added the operations you require, you can sample images from this pipeline:

p.sample(100)

Some sample output:

Input Image[3] Augmented Images
Original Augmented

The augmented images may be useful for a boundary detection task, for example.

Licence and Acknowledgements

Augmentor is made available under the terms of the MIT Licence. See Licence.md.

[1] Checkerboard image obtained from Wikimedia Commons and is in the public domain: https://commons.wikimedia.org/wiki/File:Checkerboard_pattern.svg

[2] Street view image is in the public domain: http://stokpic.com/project/italian-city-street-with-shoppers/

[3] Skin lesion image obtained from the ISIC Archive:

You can use urllib to obtain the skin lesion image in order to reproduce the augmented images above:

>>> from urllib import urlretrieve
>>> im_url = "https://isic-archive.com:443/api/v1/image/5436e3abbae478396759f0cf/download"
>>> urlretrieve(im_url, "ISIC_0000000.jpg")
('ISIC_0000000.jpg', <httplib.HTTPMessage instance at 0x7f7bd949a950>)

Note: For Python 3, use from urllib.request import urlretrieve.

Logo created at LogoMakr.com

Tests

To run the automated tests, clone the repository and run:

$ py.test -v

from the command line. To view the CI tests that are run after each commit, see https://travis-ci.org/mdbloice/Augmentor.

Citing Augmentor

If you find this package useful and wish to cite it, you can use

Marcus D Bloice, Peter M Roth, Andreas Holzinger, Biomedical image augmentation using Augmentor, Bioinformatics, https://doi.org/10.1093/bioinformatics/btz259

Asciicast

Click the preview below to view a video demonstration of Augmentor in use:

asciicast

Owner
Marcus D. Bloice
Researcher in applied machine learning for healthcare, Medical University of Graz, Austria.
Marcus D. Bloice
利用yolov5和TensorRT从0到1实现目标检测的模型训练到模型部署全过程

写在前面 利用TensorRT加速推理速度是以时间换取精度的做法,意味着在推理速度上升的同时将会有精度的下降,不过不用太担心,精度下降微乎其微。此外,要有NVIDIA显卡,经测试,CUDA10.2可以支持20系列显卡及以下,30系列显卡需要CUDA11.x的支持,并且目前有bug。 默认你已经完成了

Helium 6 Jul 28, 2022
Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe

Traductor de señas Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe Requerimientos 🔧 Python 3.8 o inferior para evitar

Jahaziel Hernandez Hoyos 3 Nov 12, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 05, 2023
Repository relating to the CVPR21 paper TimeLens: Event-based Video Frame Interpolation

TimeLens: Event-based Video Frame Interpolation This repository is about the High Speed Event and RGB (HS-ERGB) dataset, used in the 2021 CVPR paper T

Robotics and Perception Group 544 Dec 19, 2022
Official Pytorch implementation of MixMo framework

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks Official PyTorch implementation of the MixMo framework | paper | docs Alexandr

79 Nov 07, 2022
PyoMyo - Python Opensource Myo library

PyoMyo Python module for the Thalmic Labs Myo armband. Cross platform and multithreaded and works without the Myo SDK. pip install pyomyo Documentati

PerlinWarp 81 Jan 08, 2023
Code for visualizing the loss landscape of neural nets

Visualizing the Loss Landscape of Neural Nets This repository contains the PyTorch code for the paper Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer

Tom Goldstein 2.2k Jan 09, 2023
minimizer-space de Bruijn graphs (mdBG) for whole genome assembly

rust-mdbg: Minimizer-space de Bruijn graphs (mdBG) for whole-genome assembly rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) impleme

Barış Ekim 148 Dec 01, 2022
LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping

LVI-SAM This repository contains code for a lidar-visual-inertial odometry and mapping system, which combines the advantages of LIO-SAM and Vins-Mono

Tixiao Shan 1.1k Dec 27, 2022
A benchmark framework for Tensorflow

TensorFlow benchmarks This repository contains various TensorFlow benchmarks. Currently, it consists of two projects: PerfZero: A benchmark framework

1.1k Dec 30, 2022
This is the official implementation for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents" in NeurIPS 2021.

Observe then Incentivize Experiments This is the code used for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents",

Cong Shen Research Group 0 Mar 08, 2022
Official implementation of the paper 'High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network' in CVPR 2021

LPTN Paper | Supplementary Material | Poster High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network Ji

372 Dec 26, 2022
ROMP: Monocular, One-stage, Regression of Multiple 3D People, ICCV21

Monocular, One-stage, Regression of Multiple 3D People ROMP, accepted by ICCV 2021, is a concise one-stage network for multi-person 3D mesh recovery f

Yu Sun 937 Jan 04, 2023
make ASCII Art by Deep Learning

DeepAA This is convolutional neural networks generating ASCII art. This repository is under construction. This work is accepted by NIPS 2017 Workshop,

OsciiArt 1.4k Dec 28, 2022
[CIKM 2021] Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning

Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning. This repo contains the PyTorch code and implementation for the paper E

Akuchi 18 Dec 22, 2022
Official repository of the paper Privacy-friendly Synthetic Data for the Development of Face Morphing Attack Detectors

SMDD-Synthetic-Face-Morphing-Attack-Detection-Development-dataset Official repository of the paper Privacy-friendly Synthetic Data for the Development

10 Dec 12, 2022
GPT, but made only out of gMLPs

GPT - gMLP This repository will attempt to crack long context autoregressive language modeling (GPT) using variations of gMLPs. Specifically, it will

Phil Wang 80 Dec 01, 2022
MIMO-UNet - Official Pytorch Implementation

MIMO-UNet - Official Pytorch Implementation This repository provides the official PyTorch implementation of the following paper: Rethinking Coarse-to-

Sungjin Cho 248 Jan 02, 2023
WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction"

BiRTE WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction" Requirements The main requirements are: py

9 Dec 27, 2022
MagFace: A Universal Representation for Face Recognition and Quality Assessment

MagFace MagFace: A Universal Representation for Face Recognition and Quality Assessment in IEEE Conference on Computer Vision and Pattern Recognition

Qiang Meng 523 Jan 05, 2023