Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020


Show, Edit and Tell: A Framework for Editing Image Captions | arXiv

This repository contains the source code for Show, Edit and Tell: A Framework for Editing Image Captions, to appear at CVPR 2020.

Requirements

  • Python 3.6 or 3.7
  • PyTorch 1.2

For evaluation, you also need:

An argument parser is not currently supported. We will add support for it soon.

Pretrained Models

You can download the pretrained models from here. Place them in the eval folder.

Download and Prepare Features

In this work, we use 36 fixed bottom-up features per image. If you wish to use the adaptive features (10-100 per image), please refer to the adaptive_features folder in this repository and follow the instructions there.

First, download the fixed features from here and unzip the file. Place the unzipped folder in the bottom-up_features folder.

Next, run:

python bottom-up_features/tsv.py

This command will create the following files:

  • An HDF5 file containing the bottom-up image features for the train and val splits, 36 per image for each split, in an (I, 36, 2048) tensor where I is the number of images in the split.
  • PKL files that map the training and validation image IDs to their indices in the HDF5 dataset created above.
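
As a rough illustration of how such an ID-to-index file might be used, the sketch below loads a pickled mapping and looks up the row that would index the (I, 36, 2048) feature tensor. It assumes the PKL file holds a plain dict of the form {image_id: row_index}; the helper names, the toy file, and the second image ID are hypothetical and not taken from this repository.

```python
import pickle
import tempfile
from pathlib import Path


def load_imgid2idx(path):
    """Load a pickled mapping, assumed to be {image_id: row_index}."""
    with open(path, "rb") as f:
        return pickle.load(f)


def feature_row(imgid2idx, image_id):
    """Return the row that would index the (I, 36, 2048) feature tensor."""
    return imgid2idx[image_id]


# Toy demonstration: write a tiny stand-in mapping and read it back.
# The real files (e.g. train36_imgid2idx.pkl) are produced by tsv.py.
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy_imgid2idx.pkl"
    with open(toy_path, "wb") as f:
        pickle.dump({522418: 0, 184613: 1}, f)  # hypothetical contents
    mapping = load_imgid2idx(toy_path)
    row = feature_row(mapping, 522418)
```

With the real data, the returned row would select one (36, 2048) slice of the feature dataset for that image.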

Download/Prepare Caption Data

You can either download all the related caption data files from here or create them yourself. The folder contains the following:

  • WORDMAP_coco: maps the words to indices
  • CAPUTIL: stores the information about the existing captions in a dictionary organized as follows: {"COCO_image_name": {"caption": "existing caption to be edited", "encoded_previous_caption": an encoded list of the words, "previous_caption_length": a list containing the length of the caption, "image_ids": the COCO image id}}
  • CAPTIONS: the encoded ground-truth captions (a list of number_images x 5 lists. For example, the Karpathy split has 113,287 training images, so the training split contains 566,435 lists)
  • CAPLENS: the lengths of the ground-truth captions (a list with number_images x 5 values)
  • NAMES: the COCO image names in the same order as the CAPTIONS
  • GENOME_DETS: the splits and image ids for loading the images in accordance with the features file created above
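
The sizes and layout described above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes that the five captions of each image are stored contiguously in CAPTIONS (the list above does not state this explicitly), and the helper names are hypothetical.

```python
NUM_TRAIN_IMAGES = 113_287   # Karpathy training split size, as stated above
CAPTIONS_PER_IMAGE = 5

# Keys of one CAPUTIL entry, as listed above.
CAPUTIL_KEYS = {
    "caption",
    "encoded_previous_caption",
    "previous_caption_length",
    "image_ids",
}


def expected_caption_count(num_images, per_image=CAPTIONS_PER_IMAGE):
    """Number of encoded captions expected in CAPTIONS (and CAPLENS)."""
    return num_images * per_image


def captions_for_image(captions, image_index, per_image=CAPTIONS_PER_IMAGE):
    """Slice out one image's captions, ASSUMING they are stored contiguously."""
    start = image_index * per_image
    return captions[start:start + per_image]


def is_valid_caputil_entry(entry):
    """Check that a CAPUTIL entry carries all four documented keys."""
    return CAPUTIL_KEYS <= set(entry)


# Toy data: two images, five single-token "captions" each.
toy_captions = [[i] for i in range(10)]
second_image = captions_for_image(toy_captions, 1)
```

A check such as len(CAPTIONS) == expected_caption_count(NUM_TRAIN_IMAGES) is a cheap way to catch a truncated download before training.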

If you'd like to create the caption data yourself, download Karpathy's Split training, validation, and test splits. This zip file contains the captions. Place the file in the caption data folder. You should also have the pkl files created in the 'Download and Prepare Features' section: train36_imgid2idx.pkl and val36_imgid2idx.pkl.

Next, run:

python preprocess_caps.py

This will dump all the files to the caption data folder.

Next, download the existing captions to be edited, and organize them in a list of dictionaries, each in the following format: {"image_id": COCO_image_id, "caption": "caption to be edited", "file_name": "split\\COCO_image_name"}. For example: {"image_id": 522418, "caption": "a woman cutting a cake with a knife", "file_name": "val2014\\COCO_val2014_000000522418.jpg"}. In our work, we use the captions produced by AoANet.
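
The format above can be produced with a short helper like the one below. This is a minimal sketch, not part of the repository: the function name is hypothetical, the file name pattern (split, backslash, COCO_split_ followed by the 12-digit zero-padded image id) is inferred from the single example above, and storing the list as JSON is an assumption. Check preprocess_existing_caps.py for the file name and location it actually expects.

```python
import json


def make_entry(image_id, caption, split="val2014"):
    """Build one existing-caption record in the format shown above."""
    # "\\" in Python source is a single backslash in the resulting string.
    file_name = f"{split}\\COCO_{split}_{image_id:012d}.jpg"
    return {"image_id": image_id, "caption": caption, "file_name": file_name}


entries = [make_entry(522418, "a woman cutting a cake with a knife")]

# Serialising to JSON escapes the backslash again, which yields the
# doubled "\\" seen in the example record.
serialized = json.dumps(entries)
```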

Next, run:

python preprocess_existing_caps.py

This will dump all the existing caption files to the caption data folder.

Prepare/Download Sequence-Level Training Data

Download the RL-data for sequence-level training used for computing metric scores from here.

Alternatively, you can prepare the data yourself by running:

python preprocess_rl.py

This will dump two files in the data folder used for computing metric scores.

Training and Validation

XE training stage:

For training DCNet, run:

python dcnet.py

For optimizing DCNet with MSE, run:

python dcnet_with_mse.py

For training EditNet, run:

python editnet.py

CIDEr-D optimization stage:

For training DCNet, run:

python dcnet_rl.py

For training EditNet, run:

python editnet_rl.py
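
Because the scripts take no command-line arguments, the two stages can be scripted with a small wrapper. The sketch below only builds the command lists; the stage keys and the helper name are hypothetical, and the order within each stage simply follows the listing above. It assumes the XE stage is run before the CIDEr-D stage, and it does not settle whether dcnet_with_mse.py is an alternative to dcnet.py or an additional step.

```python
import sys

# Script names as listed above, grouped by training stage.
STAGES = {
    "xe": ["dcnet.py", "dcnet_with_mse.py", "editnet.py"],
    "cider-d": ["dcnet_rl.py", "editnet_rl.py"],
}


def build_commands(stage, python=sys.executable):
    """Return one [interpreter, script] command per script in the stage."""
    return [[python, script] for script in STAGES[stage]]


xe_commands = build_commands("xe", python="python")
rl_commands = build_commands("cider-d", python="python")
```

Each command can then be passed to subprocess.run(cmd, check=True) from the repository root, so that a failing script stops the sequence.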

Evaluation

Refer to the eval folder for instructions. All the generated captions and scores from our model can be found in the outputs folder.

Training stage        BLEU-1   BLEU-4   CIDEr   SPICE
Cross-Entropy Loss    77.9     38.0     1.200   21.2
CIDEr Optimization    80.6     39.2     1.289   22.6

Citation

@InProceedings{Sammani_2020_CVPR,
author = {Sammani, Fawaz and Melas-Kyriazi, Luke},
title = {Show, Edit and Tell: A Framework for Editing Image Captions},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

References

Our code is mainly based on self-critical and show attend and tell. We thank both authors.
