Simple, efficient and flexible vision toolbox for mxnet framework.

Last update: Oct 19, 2019

Overview

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework.

MXbox is a toolbox aiming to provide a general and simple interface for vision tasks. This project is greatly inspired by PyTorch and torchvision. Detailed copyright files are on the way. Improvements and suggestions are welcome.

Installation

MXBox is now available on PyPi.

pip install mxbox

Features

Define preprocess as a flow

transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.mx.ToNdArray(),
    transforms.mx.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                            std  = [ 0.229, 0.224, 0.225 ]),
])

PS: By default, mxbox uses PIL to read and transform images. But it also supports other backends like accimage and skimage.

More usages can be found in documents and examples.

Build an multi-thread DataLoader in few lines

Common datasets such as cifar10, cifar100, SVHN, MNIST are out-of-the-box. You can simply load them from mxbox.datasets.

from mxbox import transforms, datasets, DataLoader
trans = transforms.Compose([
        transforms.mx.ToNdArray(), 
        transforms.mx.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                                std  = [ 0.229, 0.224, 0.225 ]),
])
dataset = datasets.CIFAR10('~/.mxbox/cifar10', transform=trans, download=True)

batch_size = 32
feedin_shapes = {
    'batch_size': batch_size,
    'data': [mx.io.DataDesc(name='data', shape=(batch_size, 3, 32, 32), layout='NCHW')],
    'label': [mx.io.DataDesc(name='softmax_label', shape=(batch_size, ), layout='N')]
}
loader = DataLoader(dataset, feedin_shapes, threads=8, shuffle=True)

Or you can also easily create your own, which only requires to implement __getitem__ and __len__.

class TooYoungScape(mxbox.Dataset):
    def __init__(self, root, lst, transform=None):
        self.root = root
        with open(osp.join(root, lst), 'r') as fp:
            self.lst = [line.strip().split('\t') for line in fp.readlines()]
        self.transform = transform

    def __getitem__(self, index):
        img = self.pil_loader(osp.join(self.root, self.lst[index][0]))
        if self.transform is not None:
            img = self.transform(img)
        return {'data': img, 'softmax_label': img}

    def __len__(self):
        return len(self.lst)
        
dataset = TooYoungScape('~/.mxbox/TooYoungScape', "train.lst", transform=trans)
loader = DataLoader(dataset, feedin_shapes, threads=8, shuffle=True)

Load popular model with pretrained weights

Note: current under construction, many models lack of pretrained weights and some of their definition files are missing.

vgg = mxbox.models.vgg(num_classes=10, pretrained=True)
resnet = mxbox.models.resnet152(num_classes=10, pretrained=True)

TODO list

FLAG options?
Efficient prefetch.
Common Models preparation.
More friendly error logging.

Simple, efficient and flexible vision toolbox for mxnet framework.

Related tags

Overview

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework.

Installation

Features

TODO list

Owner

Ligeng Zhu

RetinaFace: Deep Face Detection Library in TensorFlow for Python

Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning

Implementation of paper "DeepTag: A General Framework for Fiducial Marker Design and Detection"

Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling

On-device speech-to-intent engine powered by deep learning

This is an official implementation of the High-Resolution Transformer for Dense Prediction.

Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Explaining Deep Neural Networks - A comparison of different CAM methods based on an insect data set

Self-Adaptable Point Processes with Nonparametric Time Decays

A modular domain adaptation library written in PyTorch.

This is the codebase for the ICLR 2021 paper Trajectory Prediction using Equivariant Continuous Convolution

Download & Install mods for your favorit game with a few simple clicks

The source code of the paper "Understanding Graph Neural Networks from Graph Signal Denoising Perspectives"

Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification

Low Complexity Channel estimation with Neural Network Solutions

This is the repo for the paper "Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement".

MinkLoc3D-SI: 3D LiDAR place recognition with sparse convolutions,spherical coordinates, and intensity

Arabic Car License Recognition. A solution to the kaggle competition Machathon 3.0.

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation

Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose