Automatic caption evaluation metric based on typicality analysis.

Related tags

Deep LearningSMURF
Overview

SeMantic and linguistic UndeRstanding Fusion (SMURF)

made-with-python License: MIT

Automatic caption evaluation metric described in the paper "SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis" (ACL 2021).

arXiv: https://arxiv.org/abs/2106.01444

ACL Anthology: https://aclanthology.org/2021.acl-long.175/

Overview

SMURF is an automatic caption evaluation metric that combines a novel semantic evaluation algorithm (SPARCS) and novel fluency evaluation algorithms (SPURTS and MIMA) for both caption-level and system-level analysis. These evaluations were developed to be generalizable and as a result demonstrate a high correlation with human judgment across many relevant datasets. See paper for more details.

Requirements

You can run requirements/install.sh to quickly install all the requirements in an Anaconda environment. The requirements are:

  • python 3
  • torch>=1.0.0
  • numpy
  • nltk>=3.5.0
  • pandas>=1.0.1
  • matplotlib
  • transformers>=3.0.0
  • shapely
  • sklearn
  • sentencepiece

Usage

./smurf_example.py provides working examples of the following functions:

Caption-Level Scoring

Returns a dictionary with scores for semantic similarity between reference captions and candidate captions (SPARCS), style/diction quality of candidate text (SPURTS), grammar outlier penalty of candidate text (MIMA), and the fusion of these scores (SMURF). Input sentences should be preprocessed before being fed into the smurf_eval_captions object as shown in the example. Evaluations with SPARCS require a list of reference sentences while evaluations with SPURTS and MIMA do not use reference sentences.

System-Level Analysis

After reading in and standardizing caption-level scores, generates a plot that can be used to give an overall evaluation of captioner performances along with relevant system-level scores (intersection with reference captioner and total grammar outlier penalties) for each captioner. An example of such a plot is shown below:

The number of captioners you are comparing should be specified when instantiating a smurf_system_analysis object. In order to generate the plot correctly, the captions fed into the caption-level scoring for each candidate captioner (C1, C2,...) should be organized in the following format with the C1 captioner as the ground truth:

[C1 image 1 output, C2 image 1 output,..., C1 image 2 output, C2 image 2 output,...].

Author/Maintainer:

Joshua Feinglass (https://scholar.google.com/citations?user=V2h3z7oAAAAJ&hl=en)

If you find this repo useful, please cite:

@inproceedings{feinglass2021smurf,
  title={SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis},
  author={Joshua Feinglass and Yezhou Yang},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  year={2021},
  url={https://aclanthology.org/2021.acl-long.175/}
}
Owner
Joshua Feinglass
Joshua Feinglass
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
Code of TIP2021 Paper《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet and Pytorch versions.

SFace Code of TIP2021 Paper 《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet, PyTorch and Jittor versi

Zhong Yaoyao 47 Nov 25, 2022
magiCARP: Contrastive Authoring+Reviewing Pretraining

magiCARP: Contrastive Authoring+Reviewing Pretraining Welcome to the magiCARP API, the test bed used by EleutherAI for performing text/text bi-encoder

EleutherAI 43 Dec 29, 2022
Official Implementation for HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing Yuval Alaluf*, Omer Tov*, Ron Mokady, Rinon Gal, Amit H. Bermano *Denotes equ

885 Jan 06, 2023
Yolo ros - YOLO-ROS for HUAWEI ATLAS200

YOLO-ROS YOLO-ROS for NVIDIA YOLO-ROS for HUAWEI ATLAS200, please checkout for b

ChrisLiu 5 Oct 18, 2022
A Fast Monotone Rotating Shallow Water model

pyRSW A Fast Monotone Rotating Shallow Water model How fast? As fast as a sustained 2 Gflop/s per core on a 2.5 GHz cpu (or 2048 Gflop/s with 1024 cor

Guillaume Roullet 13 Sep 28, 2022
PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT.

MoCo v3 for Self-supervised ResNet and ViT Introduction This is a PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT. The original M

Facebook Research 887 Jan 08, 2023
ElegantRL is featured with lightweight, efficient and stable, for researchers and practitioners.

Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch. 🔥

AI4Finance 2.5k Jan 08, 2023
Official PyTorch implementation of RobustNet (CVPR 2021 Oral)

RobustNet (CVPR 2021 Oral): Official Project Webpage Codes and pretrained models will be released soon. This repository provides the official PyTorch

Sungha Choi 173 Dec 21, 2022
The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

PRIMER The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization. PRIMER is a pre-trained model for mu

AI2 114 Jan 06, 2023
FocusFace: Multi-task Contrastive Learning for Masked Face Recognition

FocusFace This is the official repository of "FocusFace: Multi-task Contrastive Learning for Masked Face Recognition" accepted at IEEE International C

Pedro Neto 21 Nov 17, 2022
High-Fidelity Pluralistic Image Completion with Transformers (ICCV 2021)

Image Completion Transformer (ICT) Project Page | Paper (ArXiv) | Pre-trained Models | Supplemental Material This repository is the official pytorch i

Ziyu Wan 243 Jan 03, 2023
Code for the Image similarity challenge.

ISC 2021 This repository contains code for the Image Similarity Challenge 2021. Getting started The docs subdirectory has step-by-step instructions on

Facebook Research 173 Dec 12, 2022
Goal of the project : Detecting Temporal Boundaries in Sign Language videos

MVA RecVis course final project : Goal of the project : Detecting Temporal Boundaries in Sign Language videos. Sign language automatic indexing is an

Loubna Ben Allal 6 Dec 21, 2022
Make your AirPlay devices as TTS speakers

Apple AirPlayer Home Assistant integration component, make your AirPlay devices as TTS speakers. Before Use 2021.6.X or earlier Apple Airplayer compon

George Zhao 117 Dec 15, 2022
A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019).

ClusterGCN ⠀⠀ A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019). A

Benedek Rozemberczki 697 Dec 27, 2022
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

Vítor Albiero 519 Dec 29, 2022
Sequential model-based optimization with a `scipy.optimize` interface

Scikit-Optimize Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions. It implements

Scikit-Optimize 2.5k Jan 04, 2023
This repository is an implementation of paper : Improving the Training of Graph Neural Networks with Consistency Regularization

CRGNN Paper : Improving the Training of Graph Neural Networks with Consistency Regularization Environments Implementing environment: GeForce RTX™ 3090

THUDM 28 Dec 09, 2022
Sequential Model-based Algorithm Configuration

SMAC v3 Project Copyright (C) 2016-2018 AutoML Group Attention: This package is a reimplementation of the original SMAC tool (see reference below). Ho

AutoML-Freiburg-Hannover 778 Jan 05, 2023