Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Last update: Jan 03, 2023

Overview

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

This is the PyTorch companion code for the paper:

Amaia Salvador, Erhan Gundogdu, Loris Bazzani, and Michael Donoser. Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning. CVPR 2021

If you find this code useful in your research, please consider citing using the following BibTeX entry:

@inproceedings{salvador2021revamping,
    title={Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning},
    author={Salvador, Amaia and Gundogdu, Erhan and Bazzani, Loris and Donoser, Michael},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}

Cloning

This repository uses git-lfs to store model checkpoint files. Make sure to install it before cloning by following the instructions here:

Once installed, model checkpoint files will be automatically downloaded when cloning the repository with:

git clone [email protected]:amzn/image-to-recipe-transformers.git

These files can optionally be ignored by using git lfs install --skip-smudge before cloning the repository, and can be downloaded at any time using git lfs pull.

Installation

Create conda environment: conda env create -f environment.yml
Activate it with conda activate im2recipetransformers

Data preparation

Download & uncompress Recipe1M dataset. The contents of the directory DATASET_PATH should be the following:

layer1.json
layer2.json
train/
val/
test/

The directories train/, val/, and test/ must contain the image files for each split after uncompressing.

Make splits and create vocabulary by running:

python preprocessing.py --root DATASET_PATH

This process will create auxiliary files under DATASET_PATH/traindata, which will be used for training.

Training

Launch training with:

python train.py --model_name model --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

Tensorboard logging can be enabled with --tensorboard. Then, from the checkpoints directory run:

tensorboard --logdir "./" --port PORT

Run python train.py --help for the full list of available arguments.

Evaluation

Extract features from the trained model for the test set samples of Recipe1M:

python test.py --model_name model --eval_split test --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

Compute MedR and recall metrics for the extracted feature set:

python eval.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000

Pretrained models

We provide pretrained model weights under the checkpoints directory. Make sure you run git lfs pull to download the model files.
Extract the zip files. For each model, a folder named MODEL_NAME with two files, args.pkl, and model-best.ckpt is provided.
Extract features for the test set samples of Recipe1M using one of the pretrained models by running:

python test.py --model_name MODEL_NAME --eval_split test --root DATASET_PATH --save_dir ../checkpoints

A file with extracted features will be saved under ../checkpoints/MODEL_NAME.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Related tags

Overview

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Cloning

Installation

Data preparation

Training

Evaluation

Pretrained models

Security

License

Owner

Amazon

Japanese NLP Library

State of the art faster Natural Language Processing in Tensorflow 2.0 .

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Library for fast text representation and classification.

Python library for parsing resumes using natural language processing and machine learning

This is Assignment1 code for the Web Data Processing System.

Labelling platform for text using distant supervision

Textlesslib - Library for Textless Spoken Language Processing

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

Converts python code into c++ by using OpenAI CODEX.

HAIS_2GNN: 3D Visual Grounding with Graph and Attention

A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any other format

A curated list of FOSS tools to improve the Hacker News experience

DaCy: The State of the Art Danish NLP pipeline using SpaCy

Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models

Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow

This is a general repo that helps you develop fast/effective NLP classifiers using Huggingface

Backend for the Autocomplete platform. An AI assisted coding platform.

An Open-Source Package for Neural Relation Extraction (NRE)

Tensorflow Implementation of A Generative Flow for Text-to-Speech via Monotonic Alignment Search