Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Last update: Dec 25, 2022

Requirements

torch==1.6.0
cudatoolkit==10.0.103
cudnn==7.6.5
sentence-transformers==0.3.9
transformers==3.4.0
tensorboardX==2.1
pandas==1.1.5
sentencepiece==0.1.85
matplotlib==3.4.1
apex==0.1.0
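
One way to confirm that an environment matches the pins above is a short check with the standard library's importlib.metadata (Python 3.8 or newer). This is only a sketch under a few assumptions: the helper names installed_version and find_mismatches are hypothetical, cudatoolkit and cudnn are skipped on the assumption that they are installed through conda and are not visible as Python distributions, and builds of torch or apex may report a slightly different version string (for example with a local suffix), which this exact comparison would flag.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
# cudatoolkit and cudnn are left out: they are assumed to be conda
# packages, not Python distributions visible to importlib.metadata.
PINNED = {
    "torch": "1.6.0",
    "sentence-transformers": "0.3.9",
    "transformers": "3.4.0",
    "tensorboardX": "2.1",
    "pandas": "1.1.5",
    "sentencepiece": "0.1.85",
    "matplotlib": "3.4.1",
    "apex": "0.1.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (wanted, found) tuple; found is None when not installed."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Report every package that deviates from the pinned version.
for name, (wanted, found) in find_mismatches(PINNED).items():
    print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

An empty report means every listed Python package matches its pin exactly.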

Get Started

Download a pre-trained language model (e.g. bert-base-uncased) from HuggingFace's model library
Download the STS datasets to the ./data folder using the SentEval toolkit
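
The first step can be sketched with the transformers library as follows. The function name download_model and the target folder ./bert-base-uncased are illustrative assumptions, not part of this repository; the function needs network access when it is called, and the resulting folder is what would later be passed as the pre-trained model folder.

```python
import os


def download_model(model_name="bert-base-uncased",
                   target_dir="./bert-base-uncased"):
    """Fetch a model and its tokenizer from the Hugging Face hub and
    save both into target_dir (hypothetical helper, not from the repo)."""
    # Imported lazily so that defining the function does not require
    # transformers to be installed.
    from transformers import AutoModel, AutoTokenizer

    os.makedirs(target_dir, exist_ok=True)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    # Write the vocabulary, config, and weights into one local folder.
    tokenizer.save_pretrained(target_dir)
    model.save_pretrained(target_dir)
    return target_dir
```

Calling download_model() once should leave a folder containing the config, vocabulary, and weight files.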

Run the following command to launch the unsupervised experiment:

python3 main.py \
    --no_pair --seed 1 \
    --use_apex_amp --apex_amp_opt_level O1 \
    --batch_size 96 --max_seq_length 64 --evaluation_steps 200 \
    --add_cl --cl_loss_only --cl_rate 0.15 --temperature 0.1 \
    --learning_rate 0.0000005 \
    --train_data stssick --num_epochs 10 \
    --da_final_1 feature_cutoff --da_final_2 shuffle --cutoff_rate_final_1 0.2 \
    --model_name_or_path [PRETRAINED_BERT_FOLDER] \
    --model_save_path ./output/unsup-base-feature_cutoff-shuffle \
    --force_del --no_dropout --patience 10

where [PRETRAINED_BERT_FOLDER] should be replaced with the path to the folder that contains the downloaded pre-trained language model.
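
For repeated runs it may be convenient to assemble the command in Python, so that the placeholder is filled in programmatically. The following is a minimal sketch: build_command is a hypothetical helper, the example path is made up, and every flag and value is copied unchanged from the command above.

```python
import shlex


def build_command(pretrained_dir,
                  save_path="./output/unsup-base-feature_cutoff-shuffle"):
    """Return the unsupervised training command as an argument list,
    with the pre-trained model folder substituted for the placeholder."""
    return [
        "python3", "main.py",
        "--no_pair", "--seed", "1",
        "--use_apex_amp", "--apex_amp_opt_level", "O1",
        "--batch_size", "96", "--max_seq_length", "64",
        "--evaluation_steps", "200",
        "--add_cl", "--cl_loss_only", "--cl_rate", "0.15",
        "--temperature", "0.1",
        "--learning_rate", "0.0000005",
        "--train_data", "stssick", "--num_epochs", "10",
        "--da_final_1", "feature_cutoff", "--da_final_2", "shuffle",
        "--cutoff_rate_final_1", "0.2",
        "--model_name_or_path", pretrained_dir,
        "--model_save_path", save_path,
        "--force_del", "--no_dropout", "--patience", "10",
    ]


# Print the command for inspection; the path here is only an example.
print(shlex.join(build_command("/path/to/bert-base-uncased")))
```

The returned list can be handed to subprocess.run(cmd, check=True) from the repository root to start training.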

Citation

@article{yan2021consert,
  title={ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer},
  author={Yan, Yuanmeng and Li, Rumei and Wang, Sirui and Zhang, Fuzheng and Wu, Wei and Xu, Weiran},
  journal={arXiv preprint arXiv:2105.11741},
  year={2021}
}

Owner

Yan Yuanmeng
