BankNote-Net: Open dataset and encoder model for assistive currency recognition

Overview

BankNote-Net: Open Dataset for Assistive Currency Recognition

Millions of people around the world have low or no vision. Assistive software applications have been developed for a variety of day-to-day tasks, including currency recognition. To aid with this task, we present BankNote-Net, an open dataset for assistive currency recognition. The dataset consists of a total of 24,816 embeddings of banknote images captured in a variety of assistive scenarios, spanning 17 currencies and 112 denominations. These compliant embeddings were learned using supervised contrastive learning and a MobileNetV2 architecture. They can be used to train and test specialized downstream models for any currency, including currencies not covered by our dataset and currencies for which only a few real images per denomination are available (few-shot learning). We deploy a variation of this model for public use in the latest version of the Seeing AI app developed by Microsoft, which has more than 100 thousand monthly active users.

If you make use of this dataset or pre-trained model in your own project, please consider referencing this GitHub repository and citing our paper:

@article{oviedoBankNote-Net2022,
  title   = {BankNote-Net: Open Dataset for Assistive Currency Recognition},
  author  = {Felipe Oviedo and Srinivas Vinnakota and Eugene Seleznev and Hemant Malhotra and Saqib Shaikh and Juan Lavista Ferres},
  journal = {https://arxiv.org/pdf/2204.03738.pdf},
  year    = {2022},
}

Data Structure

The dataset consists of 256-dimensional vector embeddings with additional columns for currency, denomination and face labels, as explained in the data exploration notebook. The dataset is saved as a 24,826 x 258 flat table in feather and csv file formats. Figure 1 presents some of these learned embeddings.

Figure 1: t-SNE representations of the BankNote-Net embeddings for a few selected currencies.
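
As a rough illustration of this layout, the sketch below reads the csv form of the table with Python's standard csv module and separates each row into its embedding vector and its label fields. It assumes the embedding values occupy the first 256 columns and the label columns follow. The helper name split_rows and the column names in the toy table (v_0, Currency, Denomination) are hypothetical, so check the data exploration notebook for the real layout before relying on it.

```python
import csv
import io

N_EMBEDDING_DIMS = 256  # embedding size stated in this README


def split_rows(csv_file, n_dims=N_EMBEDDING_DIMS):
    """Split each row into an embedding vector and a dict of labels.

    Assumption: the first n_dims columns are embedding values and
    every remaining column is a label. The real file may differ.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    label_names = header[n_dims:]
    embeddings, labels = [], []
    for row in reader:
        embeddings.append([float(value) for value in row[:n_dims]])
        labels.append(dict(zip(label_names, row[n_dims:])))
    return embeddings, labels


# Toy table with 3 embedding dimensions instead of 256 (hypothetical names).
toy_table = io.StringIO(
    "v_0,v_1,v_2,Currency,Denomination\n"
    "0.1,0.2,0.3,AUD,5\n"
    "0.4,0.5,0.6,AUD,10\n"
)
embeddings, labels = split_rows(toy_table, n_dims=3)
print(len(embeddings), "rows,", len(embeddings[0]), "dimensions per embedding")
print(labels[0])
```

For the real table, an open file handle for the csv file can be passed in place of the toy table, leaving n_dims at its default of 256.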

Setup and Dataset Usage

  1. Install requirements.

    Use the conda environment file env.yaml to install the required dependencies.

    # Create conda environment
    conda env create -f env.yaml
    
    # Activate environment to run examples
    conda activate banknote_net
    
  2. Example 1: Train a shallow classifier directly from the dataset embeddings for a currency available in the dataset. For inference, images must first be encoded with the pre-trained Keras MobileNetV2 encoder model.

    Run train_from_embedding.py from the repository root:

    python src/train_from_embedding.py --currency AUD --bsize 128 --epochs 25 --dpath ./data/banknote_net.feather
    
      usage: train_from_embedding.py [-h] --currency
                                  {AUD,BRL,CAD,EUR,GBP,INR,JPY,MXN,PKR,SGD,TRY,USD,NZD,NNR,MYR,IDR,PHP}
                                  [--bsize BSIZE] [--epochs EPOCHS]
                                  [--dpath DPATH]
    
      Train model from embeddings.
    
      optional arguments:
      -h, --help            show this help message and exit
      --currency {AUD,BRL,CAD,EUR,GBP,INR,JPY,MXN,PKR,SGD,TRY,USD,NZD,NNR,MYR,IDR,PHP}, --c {AUD,BRL,CAD,EUR,GBP,INR,JPY,MXN,PKR,SGD,TRY,USD,NZD,NNR,MYR,IDR,PHP}
                              String of currency for which to train shallow
                              classifier
      --bsize BSIZE, --b BSIZE
                              Batch size for shallow classifier
      --epochs EPOCHS, --e EPOCHS
                              Number of epochs for training shallow top classifier
      --dpath DPATH, --d DPATH
                              Path to .feather BankNote Net embeddings
                          
    
  3. Example 2: Train a classifier on top of the BankNote-Net pre-trained encoder model using images in a custom directory. Input images must be square, 224 x 224 pixels in size. This example uses a couple dozen images spanning 8 classes of Swedish krona (SEK), structured as in the example_images/SEK directory, which contains both training and validation images.

    Run train_custom.py from the repository root:

    python src/train_custom.py --bsize 4 --epochs 25 --data_path ./data/example_images/SEK/ --enc_path ./models/banknote_net_encoder.h5
    
    usage: train_custom.py [-h] [--bsize BSIZE] [--epochs EPOCHS]
                      [--data_path DATA_PATH] [--enc_path ENC_PATH]
    
    Train model from custom image folder using pre-trained BankNote-Net encoder.
    
    optional arguments:
    -h, --help            show this help message and exit
    --bsize BSIZE, --b BSIZE
                          Batch size
    --epochs EPOCHS, --e EPOCHS
                          Number of epochs for training shallow top classifier.
    --data_path DATA_PATH, --data DATA_PATH
                          Path to folder with images.
    --enc_path ENC_PATH, --enc ENC_PATH
                          Path to .h5 file of pre-trained encoder model.                       
    
  4. Example 3: Perform inference using the SEK few-shot classifier of Example 2 on the validation images in example_images/SEK/val.

    Run predict_custom.py from the repository root. The script returns encoded predictions:

      python src/predict_custom.py --bsize 1 --data_path ./data/example_images/SEK/val/ --model_path ./src/trained_models/custom_classifier.h5
    
      usage: predict_custom.py [-h] [--bsize BSIZE] [--data_path DATA_PATH]
                              [--model_path MODEL_PATH]
    
      Perform inference using trained custom classifier.
    
      optional arguments:
      -h, --help            show this help message and exit
      --bsize BSIZE, --b BSIZE
                              Batch size
      --data_path DATA_PATH, --data DATA_PATH
                              Path to custom folder with validation images.
      --model_path MODEL_PATH, --enc MODEL_PATH
                              Path to .h5 file of trained classification model.                           
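
Example 2 requires square 224 x 224 inputs, and photos taken in the field rarely arrive in that shape. The sketch below shows one possible way to compute a centered square crop box and the scale factor needed to reach 224 pixels. Center cropping is an assumption made for illustration only; this README does not say how the authors prepared their images, and the helper names are hypothetical.

```python
TARGET_SIZE = 224  # side length required for Example 2 inputs


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def scale_to_target(width, height, target=TARGET_SIZE):
    """Return the factor that maps the cropped square side to the target."""
    return target / min(width, height)


# A 640 x 480 landscape photo: crop 80 pixels from each side, then shrink.
box = center_square_box(640, 480)
scale = scale_to_target(640, 480)
print(box, scale)
```

The box uses the (left, upper, right, lower) ordering that Pillow's Image.crop expects, so with Pillow installed an image could be prepared with image.crop(box).resize((224, 224)).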
    

License for Dataset and Model

Copyright (c) Microsoft Corporation. All rights reserved.

The dataset is open for anyone to use under the CDLA-Permissive-2.0 license. The embeddings should not be used to reconstruct high resolution banknote images.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
