An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

GIMP script to export bitmap as GRAPHICS 4 file (aka SCREEN 5)

gimpfu-msx-gr4.py GIMP script to export bitmap as GRAPHICS 4 file (aka SCREEN 5). GRAPHICS 4 specs are: 256x212 (or 256x192); 16 color palette (from 5

Pedro de Medeiros 4 Oct 17, 2022
display: a browser-based graphics server

display: a browser-based graphics server Installation Quick Start Usage Development A very lightweight display server for Torch. Best used as a remote

Szymon Jakubczak 205 Oct 17, 2022
Seaborn-image is a Python image visualization library based on matplotlib and provides a high-level API to draw attractive and informative images quickly and effectively.

seaborn-image: image data visualization Description Seaborn-image is a Python image visualization library based on matplotlib and provides a high-leve

48 Jan 05, 2023
Generate meme GIFs in which an image you choose can be viewed by the user only after they wait a whole hour.

Generate meme GIFs in which an image you choose can be viewed by the user only after they wait a whole hour.

Feliks Maak 1 Jan 31, 2022
A functional and efficient python implementation of the 3D version of Maxwell's equations

py-maxwell-fdfd Solving Maxwell's equations via A python implementation of the 3D curl-curl E-field equations. This code contains additional work to e

Nathan Zhao 12 Dec 11, 2022
Find target hash collisions for Apple's NeuralHash perceptual hash function.💣

neural-hash-collider Find target hash collisions for Apple's NeuralHash perceptual hash function. For example, starting from a picture of this cat, we

Anish Athalye 630 Jan 01, 2023
Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom

Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom. Also, rasterize shapefile vectors as corresponding label image.

Gounari Olympia 2 Nov 22, 2021
Seeks to remove text from an image in a convincing way.

Text-Removal This is a Computer Vision project that seeks to successfully remove text from an image by covering the text areas in a convincing way. He

6 Nov 22, 2022
Python script to generate vector graphics of an oriented lattice unit cell

unitcell Python script to generate vector graphics of an oriented lattice unit cell Examples unitcell --type hexagonal --eulers 12 23 34 --axes --crys

Philip Eisenlohr 2 Dec 10, 2021
Python binding to Skia Graphics Library

Skia python binding Python binding to Skia Graphics Library. Binding based on pybind11. Currently, the binding is under active development. Install Bi

Kota Yamaguchi 170 Jan 06, 2023
Repair broken bookmarks to referenced files in Apple Photos

Repair Apple Photos Bookmarks Work in progress to repair file location bookmarks in Apple Photos. Background Starting in macOS 10.15/Catalina, photos

Rhet Turnbull 10 Nov 03, 2022
粉專/IG圖文加工器

粉專/IG圖文加工器 介紹 給PS智障(ex:我)使用,用於產生圖文 腳本省去每次重複步驟 可載入圖片(方形,請先處理過,歡迎PR) 圖片簡易套用濾鏡 可將圖片切片 要求 Python 版本 3.9 安裝 安裝最新 python pip3 install -r requirement.txt 效果

Louis Tang 7 Aug 10, 2022
Create a QR-code Generator app using only Python.

QR-code_Generator Create a QR-code Generator app using only Python. This apps generated a QR code for a single link. Libraryes used in this app -- py

Soham P Phasalkar 1 Oct 17, 2021
Generate your own QR Code and scan it to see the results! Never use it for malicious purposes.

QR-Code-Generator-Python Choose the name of your generated QR .png file. If it happens to open the .py file (the application), there are specific comm

1 Dec 23, 2021
【萝莉图片算法】高损图像压缩算法!?

【萝莉图片算法】高损图像压缩算法!? 我又发明出新算法了! 这次我发明的是新型高损图像压缩算法——萝莉图片算法!为什么是萝莉图片,这是因为它是使动用法,让图片变小所以是萝莉图片,大家一定要学好语文哦! 压缩效果 太神奇了!压缩率竟然高达99.97%! 与常见压缩算法对比 在图片最终大小为1KB的情况

黄巍 49 Oct 17, 2022
Png2Jpg tool will help you convert from png image format to jpg images format.

PNG 2 JPG All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Png2Jpg too

Nguyễn Trường Lâu 2 Dec 27, 2021
Goddard Image Analysis and Navigation Tool

Copyright 2021 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. No copyright is clai

NASA 12 Dec 23, 2022
The friendly PIL fork (Python Imaging Library)

Pillow Python Imaging Library (Fork) Pillow is the friendly PIL fork by Alex Clark and Contributors. PIL is the Python Imaging Library by Fredrik Lund

Pillow 10.4k Dec 31, 2022
Fuzzware is a project for automated, self-configuring fuzzing of firmware images

Fuzzware Fuzzware is a project for automated, self-configuring fuzzing of firmware images. The idea of this project is to configure the memory ranges

190 Dec 21, 2022
AutoGiphyMovie lets you search giphy for gifs, converts them to videos, attach a soundtrack and stitches it all together into a movie!

AutoGiphyMovie lets you search giphy for gifs, converts them to videos, attach a soundtrack and stitches it all together into a movie!

Satya Mohapatra 18 Nov 13, 2022