Lexical Substitution Framework

Overview

LexSubGen

Lexical Substitution Framework

This repository contains the code to reproduce the results from the paper:

Arefyev Nikolay, Sheludko Boris, Podolskiy Alexander, Panchenko Alexander, "Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution", Proceedings of the 28th International Conference on Computational Linguistics, 2020

Installation

Clone LexSubGen repository from github.com.

git clone https://github.com/Samsung/LexSubGen
cd LexSubGen

Setup anaconda environment

  1. Download and install conda
  2. Create new conda environment
    conda create -n lexsubgen python=3.7.4
  3. Activate conda environment
    conda activate lexsubgen
  4. Install requirements
    pip install -r requirements.txt
  5. Download spacy resources and install context2vec and word_forms from github repositories
    ./init.sh

Setup Web Application

If you do not plan to use the Web Application, skip this section and go to the next!

  1. Download and install NodeJS and npm.
  2. Run script for install dependencies and create build files.
bash web_app_setup.sh

Install lexsubgen library

python setup.py install

Results

Results of the lexical substitution task are presented in the following table. To reproduce them, follow the instructions above to install the correct dependencies.

Model SemEval COINCO
GAP [email protected] [email protected] [email protected] GAP [email protected] [email protected] [email protected]
OOC 44.65 16.82 12.83 18.36 46.3 19.58 15.03 12.99
C2V 55.82 7.79 5.92 11.03 48.32 8.01 6.63 7.54
C2V+embs 53.39 28.01 21.72 33.52 50.73 29.64 24.0 21.97
ELMo 53.66 11.58 8.55 13.88 49.47 13.58 10.86 11.35
ELMo+embs 54.16 32.0 22.2 31.82 52.22 35.96 26.62 23.8
BERT 54.42 38.39 27.73 39.57 50.5 42.56 32.64 28.73
BERT+embs 53.87 41.64 30.59 43.88 50.85 46.05 35.63 31.67
RoBERTa 56.74 32.25 24.26 36.65 50.82 35.12 27.35 25.41
RoBERTa+embs 58.74 43.19 31.19 44.61 54.6 46.54 36.17 32.1
XLNet 59.12 31.75 22.83 34.95 53.39 38.16 28.58 26.47
XLNet+embs 59.62 49.53 34.9 47.51 55.63 51.5 39.92 35.12

Results reproduction

Here we list XLNet reproduction commands that correspond to the results presented in the table above. Reproduction commands for all models you can find in scripts/lexsub-all-models.sh Besides saving to the 'run-directory' all results are saved using mlflow. To check them you can run mlflow ui in LexSubGen directory and then open the web page in a browser.

Also you can use pytest to check the reproducibility. But it may take a long time:

pytest tests/results_reproduction
  • XLNet:

XLNet Semeval07:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet.jsonnet --dataset-config-path configs/dataset_readers/lexsub/semeval_all.jsonnet --run-dir='debug/lexsub-all-models/semeval_all_xlnet' --force --experiment-name='lexsub-all-models' --run-name='semeval_all_xlnet'

XLNet CoInCo:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet.jsonnet --dataset-config-path configs/dataset_readers/lexsub/coinco.jsonnet --run-dir='debug/lexsub-all-models/coinco_xlnet' --force --experiment-name='lexsub-all-models' --run-name='coinco_xlnet'

XLNet with embeddings similarity Semeval07:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet_embs.jsonnet --dataset-config-path configs/dataset_readers/lexsub/semeval_all.jsonnet --run-dir='debug/lexsub-all-models/semeval_all_xlnet_embs' --force --experiment-name='lexsub-all-models' --run-name='semeval_all_xlnet_embs'

XLNet with embeddings similarity CoInCo:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet_embs.jsonnet --dataset-config-path configs/dataset_readers/lexsub/coinco.jsonnet --run-dir='debug/lexsub-all-models/coinco_xlnet_embs' --force --experiment-name='lexsub-all-models' --run-name='coinco_xlnet_embs'

Word Sense Induction Results

Model SemEval 2013 SemEval 2010
AVG AVG
XLNet 33.4 52.1
XLNet+embs 37.3 54.1

To reproduce these results use 2.3.0 version of transformers and the following command:

bash scripts/wsi.sh

Web application

You could use command line interface to run Web application.

# Run main server
lexsubgen-app run --host HOST 
                  --port PORT 
                  [--model-configs CONFIGS] 
                  [--start-ids START-IDS] 
                  [--start-all] 
                  [--restore-session]

Example:

# Run server and serve models BERT and XLNet. 
# For BERT create server for serving model and substitute generator instantly (load resources in memory).
# For XLNet create only server.
lexsubgen-app run --host '0.0.0.0' 
                  --port 5000 
                  --model-configs '["my_cool_configs/bert.jsonnet", "my_awesome_configs/xlnet.jsonnet"]' 
                  --start-ids '[0]'

# After shutting down server JSON file with session dumps in the '~/.cache/lexsubgen/app_session.json'.
# The content of this file looks like:
# [
#     'my_cool_configs/bert.jsonnet',
#     'my_awesome_configs/xlnet.jsonnet',
# ]
# You can restore it with flag 'restore-session'
lexsubgen-app run --host '0.0.0.0' 
                  --port 5000 
                  --restore-session
# BERT and XLNet restored now
Arguments:
Argument Default Description
--help Show this help message and exit
--host IP address of running server host
--port 5000 Port for starting the server
--model-configs [] List of file paths to the model configs.
--start-ids [] Zero-based indices of served models for which substitute generators will be created
--start-all False Whether to create substitute generators for all served models
--restore-session False Whether to restore session from previous Web application run

FAQ

  1. How to use gpu? - You can use environment variable CUDA_VISIBLE_DEVICES to use gpu for inference: export CUDA_VISIBLE_DEVICES='1' or CUDA_VISIBLE_DEVICES='1' before your command.
  2. How to run tests? - You can use pytest: pytest tests
Owner
Samsung
Samsung Electronics Co.,Ltd.
Samsung
FuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space OptimizationFuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

XCL 191 Dec 31, 2022
K-FACE Analysis Project on Pytorch

Installation Setup with Conda # create a new environment conda create --name insightKface python=3.7 # or over conda activate insightKface #install t

Jung Jun Uk 7 Nov 10, 2022
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

MAVE: : A Product Dataset for Multi-source Attribute Value Extraction The dataset contains 3 million attribute-value annotations across 1257 unique ca

Google Research Datasets 89 Jan 08, 2023
Compare neural networks by their feature similarity

PyTorch Model Compare A tiny package to compare two neural networks in PyTorch. There are many ways to compare two neural networks, but one robust and

Anand Krishnamoorthy 181 Jan 04, 2023
Aws-machine-learning-university-accelerated-tab - Machine Learning University: Accelerated Tabular Data Class

Machine Learning University: Accelerated Tabular Data Class This repository contains slides, notebooks, and datasets for the Machine Learning Universi

AWS Samples 916 Dec 23, 2022
Official PyTorch implementation of PICCOLO: Point-Cloud Centric Omnidirectional Localization (ICCV 2021)

Official PyTorch implementation of PICCOLO: Point-Cloud Centric Omnidirectional Localization (ICCV 2021)

16 Nov 19, 2022
Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

A Differentiable Recurrent Surface for Asynchronous Event-Based Data Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous

Marco Cannici 21 Oct 05, 2022
Predicting Event Memorability from Contextual Visual Semantics

Predicting Event Memorability from Contextual Visual Semantics

0 Oct 06, 2021
On Evaluation Metrics for Graph Generative Models

On Evaluation Metrics for Graph Generative Models Authors: Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor This is the offic

13 Jan 07, 2023
An implementation of MobileFormer

MobileFormer An implementation of MobileFormer proposed by Yinpeng Chen, Xiyang Dai et al. Including [1] Mobile-Former proposed in:

slwang9353 62 Dec 28, 2022
Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Zhengxia Zou 1.5k Dec 28, 2022
VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition Usage First, install PyTorch 1.7.1+, torchvision 0.8.2

40 Dec 12, 2022
MvtecAD unsupervised Anomaly Detection

MvtecAD unsupervised Anomaly Detection This respository is the unofficial implementations of DFR: Deep Feature Reconstruction for Unsupervised Anomaly

0 Feb 25, 2022
Automatic meme generation model using Tensorflow Keras.

Memefly You can find the project at MemeflyAI. Contributors Nick Buukhalter Harsh Desai Han Lee Project Overview Trello Board Product Canvas Automatic

BloomTech Labs 2 Jan 13, 2022
Tiny Object Detection in Aerial Images.

AI-TOD AI-TOD is a dataset for tiny object detection in aerial images. [Paper] [Dataset] Description AI-TOD comes with 700,621 object instances for ei

jwwangchn 116 Dec 30, 2022
A Simplied Framework of GAN Inversion

Framework of GAN Inversion Introcuction You can implement your own inversion idea using our repo. We offer a full range of tuning settings (in hparams

Kangneng Zhou 13 Sep 27, 2022
Multi-modal Vision Transformers Excel at Class-agnostic Object Detection

Multi-modal Vision Transformers Excel at Class-agnostic Object Detection

Muhammad Maaz 206 Jan 04, 2023
Lane assist for ETS2, built with the ultra-fast-lane-detection model.

Euro-Truck-Simulator-2-Lane-Assist Lane assist for ETS2, built with the ultra-fast-lane-detection model. This project was made possible by the amazing

36 Jan 05, 2023
Object tracking and object detection is applied to track golf puts in real time and display stats/games.

Putting_Game Object tracking and object detection is applied to track golf puts in real time and display stats/games. Works best with the Perfect Prac

Max 1 Dec 29, 2021
Shōgun

The SHOGUN machine learning toolbox Unified and efficient Machine Learning since 1999. Latest release: Cite Shogun: Develop branch build status: Donat

Shōgun ML 2.9k Jan 04, 2023