An application that maps an image of a LaTeX math equation to LaTeX code.

Image to LaTeX

Introduction

The problem of image-to-markup generation was tackled by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 (up to layer3) as the encoder and a Transformer as the decoder, with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model achieves reasonable performance on the test set, it performs poorly if the image quality, padding, or font size differs from that of the images in the dataset. Others who have attempted the same problem using the same dataset have observed this as well (e.g., this project, this issue and this issue). The most likely cause is the rigid preprocessing applied to the dataset (e.g., heavy downsampling).

To address this, I used the raw dataset and included image augmentation (e.g., random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Instead, I sampled them uniformly and padded them to the size of the largest image in the batch, to improve the generalizability of the model.
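The batch padding described above can be sketched in plain Python. This is only an illustration, not the project's data pipeline: the name `pad_batch`, the nested-list image representation, the zero fill value, and padding on the right and bottom are all assumptions.

```python
from typing import List

Image = List[List[float]]  # a grayscale image stored as rows of pixel values


def pad_batch(images: List[Image], pad_value: float = 0.0) -> List[Image]:
    """Pad every image to the height and width of the largest one in the batch."""
    max_h = max(len(img) for img in images)
    max_w = max(len(img[0]) for img in images)
    padded = []
    for img in images:
        h, w = len(img), len(img[0])
        # Extend each row on the right, then append blank rows at the bottom.
        rows = [row + [pad_value] * (max_w - w) for row in img]
        rows += [[pad_value] * max_w for _ in range(max_h - h)]
        padded.append(rows)
    return padded


# A 1x2 image and a 3x1 image both become 3x2 after padding.
batch = pad_batch([[[1.0, 2.0]], [[3.0], [4.0], [5.0]]])
```

Because the target size is recomputed for every batch, the amount of padding around a given equation varies from batch to batch, which is presumably part of why uniform sampling helps the model cope with different paddings.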

Additional problems that I found in the dataset:

  • Some LaTeX code produces visually identical outputs (e.g., \left( and \right) look the same as ( and )), so I normalized them.
  • Some LaTeX code is used to add space (e.g., \vspace{2px} and \hspace{0.3mm}). However, the length of the space is difficult to judge. Also, I don't want the model to generate code for blank images, so I removed these commands.
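The two cleaning steps above could look roughly like the following. The regular expressions and the `normalize` function are assumptions made for illustration; the project's actual preprocessing rules may differ.

```python
import re

# Assumed patterns (not taken from the project's code):
# \left( or \right) with optional whitespace before the parenthesis.
LEFT_RIGHT_PAREN = re.compile(r"\\(?:left|right)\s*([()])")
# \vspace{...} or \hspace{...}, optionally starred, with a brace-delimited length.
SPACING = re.compile(r"\\[vh]space\*?\s*\{[^{}]*\}")


def normalize(formula: str) -> str:
    """Drop sizing prefixes on parentheses and remove \\vspace / \\hspace commands."""
    formula = LEFT_RIGHT_PAREN.sub(r"\1", formula)
    formula = SPACING.sub(" ", formula)
    # Collapse the whitespace left behind by the removed commands.
    return " ".join(formula.split())


cleaned = normalize(r"\left( x + y \right) \hspace{0.3mm} z")
blank = normalize(r"\vspace{2px}")
```

Note that this sketch handles only parentheses. A pair such as \left( ... \right] would be left half-converted, so a fuller version would need to match delimiters in pairs.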

The best run has a character error rate (CER) of 0.17 on the test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)
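For reference, CER is commonly defined as the edit (Levenshtein) distance between the prediction and the reference, divided by the length of the reference. The sketch below follows that common definition. Whether the project computes it over characters or tokens, and how it averages across samples, is not stated here, so treat those details as assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, reference: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(prediction, reference) / len(reference)


# Two missing braces against a five-character reference.
cer = character_error_rate("x^2", "x^{2}")
```

Under this definition, a prediction that renders correctly but omits a \qquad found in the reference is still charged several character edits, which is consistent with spacing commands accounting for much of the error.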

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss was still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).
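To illustrate the beam search item in the list above, here is a minimal, framework-free sketch. Everything in it is hypothetical: `toy_model` stands in for the decoder, and the `<eos>` token name and the absence of length normalization are simplifying assumptions. Setting `beam_width=1` reduces it to greedy search.

```python
import math
from typing import Callable, Dict, List, Tuple

Scorer = Callable[[Tuple[str, ...]], Dict[str, float]]


def beam_search(next_log_probs: Scorer, beam_width: int, max_len: int,
                eos: str = "<eos>") -> List[str]:
    """Keep the beam_width highest-scoring partial sequences at each step."""
    beams = [((), 0.0)]  # (tokens so far, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, logp in next_log_probs(tokens).items():
                candidates.append((tokens + (token,), score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            # Sequences ending in eos are set aside; the rest keep expanding.
            (finished if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break
    best_tokens, _ = max(finished + beams, key=lambda c: c[1])
    return [t for t in best_tokens if t != eos]


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical stand-in for a decoder: next-token log-probabilities."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.55, "y": 0.45},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    probs = table.get(prefix, {"<eos>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy = beam_search(toy_model, beam_width=1, max_len=5)
wider = beam_search(toy_model, beam_width=2, max_len=5)
```

In this toy example, greedy search commits to "a" because it is the most likely first token and ends with a sequence of probability 0.33, whereas a beam of width 2 keeps "b" alive and finds "b z" with probability 0.36.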

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or on the command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or log in to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of <entity>/<project>/<run_id>. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run:

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.
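A W&B run path has three slash-separated parts: the entity (user or team), the project, and the run ID. The helper below is a hypothetical sketch for validating a run path before passing it to the download script. It is not part of the repository.

```python
from typing import NamedTuple


class RunPath(NamedTuple):
    entity: str
    project: str
    run_id: str


def parse_run_path(run_path: str) -> RunPath:
    """Split 'entity/project/run_id' and reject anything with a different shape."""
    parts = run_path.strip("/").split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected entity/project/run_id, got {run_path!r}")
    return RunPath(*parts)


parsed = parse_run_path("kingyiusuen/image-to-latex/1w1abmg1")
```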

Testing and Continuous Integration

The following tools are used to lint the codebase:

  • isort: Sorts and formats import statements in Python scripts.
  • black: A code formatter that adheres to PEP8.
  • flake8: A code linter that reports stylistic problems in Python scripts.
  • mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should open in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement
