Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Overview

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer

arXiv

Description

Convert offline handwritten mathematical expressions to LaTeX sequences using a bidirectionally trained transformer.

How to run

First, install the dependencies:

# clone project   
git clone https://github.com/Green-Wood/BTTR

# install project   
cd BTTR
conda create -y -n bttr python=3.7
conda activate bttr
conda install --yes -c pytorch pytorch=1.7.0 torchvision cudatoolkit=<your-cuda-version>
pip install -e .   

Next, run the training script. Training may take 6~7 hours to converge on 4 GPUs using DDP.

# module folder
cd BTTR

# train bttr model using 4 gpus and ddp
python train.py --config config.yaml  

If you are training on a single GPU, change the config.yaml file to

gpus: 1
# gpus: 4
# accelerator: ddp

Imports

This project is set up as a package, which means you can easily import any file into any other file like so:

from bttr.datamodule import CROHMEDatamodule
from bttr import LitBTTR
from pytorch_lightning import Trainer

# model
model = LitBTTR()

# data
dm = CROHMEDatamodule(test_year="2016")  # e.g. the CROHME 2016 test set

# train
trainer = Trainer()
trainer.fit(model, datamodule=dm)

# test using the best model!
trainer.test(datamodule=dm)
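
For quick single-image inference, a minimal sketch along these lines should work. It follows the load_from_checkpoint and beam_search usage shown in the comments below; the checkpoint and image paths are placeholders, and the grayscale conversion is an assumption about the expected preprocessing.

import torch
from PIL import Image
from torchvision.transforms import ToTensor

from bttr import LitBTTR

# load a trained checkpoint (placeholder path)
model = LitBTTR.load_from_checkpoint("pretrained.ckpt")
model.eval()

# read an expression image; single-channel input is assumed here
img = ToTensor()(Image.open("example.png").convert("L"))

# decode the image into a LaTeX string with beam search
with torch.no_grad():
    latex = model.beam_search(img)
print(latex)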

Note

The metrics reported during validation are not accurate.

For more accurate metrics:

  1. use test.py to generate result.zip
  2. download and install crohmelib, lgeval, and the tex2symlg tool
  3. convert the tex files to symLg files using the tex2symlg command
  4. evaluate the two folders using the evaluate command (a rough sketch of steps 3 and 4 follows below)
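
For steps 3 and 4, something along these lines may work once the tools are installed and on your PATH. The command names come from the steps above, but the argument order (input directory first, then output or ground-truth directory) and the directory names are assumptions, so check each tool's usage message.

import subprocess

# step 3: convert the predicted .tex files (unpacked from result.zip) to symLG
# assumed usage: tex2symlg <tex_dir> <symlg_dir>
subprocess.run(["tex2symlg", "result_tex", "result_symlg"], check=True)

# step 4: compare predictions against the ground-truth symLG folder with lgeval
# assumed usage: evaluate <prediction_dir> <ground_truth_dir>
subprocess.run(["evaluate", "result_symlg", "gt_symlg"], check=True)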

Citation

@article{zhao2021handwritten,
  title={Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer},
  author={Zhao, Wenqi and Gao, Liangcai and Yan, Zuoyu and Peng, Shuai and Du, Lin and Zhang, Ziyin},
  journal={arXiv preprint arXiv:2105.02412},
  year={2021}
}
Comments
  • can you provide predict.py code?

    Hi @Green-Wood,

    I'm grateful for your help. I would like a predict.py script that prints the LaTeX for an input image. If this code is provided, it will be very useful to others as well.

    Best regards.

    opened by ai-motive 17
  • val_exprate=0 and save checkpoint

    Hello! Thanks for your time! When I transfer some code in the decoder or use it directly, val_exprate is always 0.000, and I don't know why. Another problem: I noticed that this code doesn't seem to save checkpoints. Can you give me some help? Thanks again!

    opened by Ashleyyyi 6
  • Val_exprate = 0

    When I retrained the model following the instructions, val_exprate was always 0.00. Has anyone else encountered this problem? Thank you! (I have not modified any code.) @Green-Wood

    opened by qingqianshuying 4
  • test.py error occurs

    When I run test.py, the following error occurs. Can I get some help?

    In test.py: test_year = "2016", ckp_path = "pretrained model"

    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    Load data from: /home/motive/PycharmProjects/BTTR/bttr/datamodule/../../data.zip
    Extract data from: 2016, with data size: 1147
    total  1147 batch data loaded
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
    Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.01s/it]ExpRate: 0.32258063554763794
    length of total file: 1147
    Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.52it/s]
    --------------------------------------------------------------------------------
    DATALOADER:0 TEST RESULTS
    {}
    --------------------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/motive/PycharmProjects/BTTR/test.py", line 17, in <module>
        trainer.test(model, datamodule=dm)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in test
        results = self._run(model)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 759, in _run
        self.post_dispatch()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 789, in post_dispatch
        self.accelerator.teardown()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu.py", line 51, in teardown
        self.lightning_module.cpu()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/utilities/device_dtype_mixin.py", line 141, in cpu
        return super().cpu()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in cpu
        return self._apply(lambda t: t.cpu())
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
        module._apply(fn)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in _apply
        setattr(this, key, [fn(cur_v) for cur_v in current_val])
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in <listcomp>
        setattr(this, key, [fn(cur_v) for cur_v in current_val])
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in <lambda>
        return self._apply(lambda t: t.cpu())
    AttributeError: 'tuple' object has no attribute 'cpu'
    
    opened by ai-motive 3
  • How long does BTTR take to train?

    Hi, thank you for the great repository!

    How long does it take to train for your experiment in the paper? I mean training on CROHME 2014/2016/2019 on four NVIDIA 1080Ti GPUs.

    Thanks,

    opened by RyosukeFukatani 2
  • can you provide transfer learning code?

    Hi~ @Green-Wood

    I want to apply transfer learning using the pretrained model.

    However, LightningCLI() is wrapped and difficult to customize.

    Thanks & best regards.

    opened by ai-motive 1
  • How can I get a pretrained model?

    Hi, I want to test your BTTR model, but it requires a training process that takes a lot of time. So, can you give me a pretrained model link?

    Best regards.

    opened by ai-motive 1
  • Getting an error after adding a new token to the dictionary

    Hi, I am getting an error after adding a new token to dictionary.txt:

    Error(s) in loading state_dict for LitBTTR:
        size mismatch for bttr.decoder.word_embed.0.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]).
        size mismatch for bttr.decoder.proj.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]).
        size mismatch for bttr.decoder.proj.bias: copying a param with shape torch.Size([113]) from checkpoint, the shape in current model is torch.Size([115]).

    Kindly help me out: how can I fix this error?

    opened by shivankaraditi 0
  • About dataset

    Could you tell me how to generate the offline math expression images from the InkML files? My experiments show that larger images clearly improve the results, so I'd like to know whether there is a unified offline dataset for academic research.

    opened by lightflash7 0
  • predicting on gpu is slower

    Hi,

    Since this model is a bit slow on CPU compared to the existing state-of-the-art models, I tried making predictions on GPU, and surprisingly it is slower on GPU than on CPU as well.

    I am attaching a code snapshot here

    device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

    model = LitBTTR.load_from_checkpoint('pretrained-2014.ckpt', map_location=device)

    img = Image.open(img_path)
    img = ToTensor()(img)
    img.to(device)

    t1 = time.time()
    hyp = model.beam_search(img)
    t2 = time.time()

    Kindly help me out here: how can I reduce the prediction time?

    FYI: using a GPU on an AWS g4dn.xlarge machine.

    opened by Suma3 1
  • how to use TensorBoard?

    Hello, I don't know how to add scalars to TensorBoard. I want to work on this topic, hoping to improve the ExpRate, but I don't know much about Lightning's TensorBoard integration.

    opened by win5923 9
Releases(v2.0)
Owner
Wenqi Zhao
Student at Nanjing University