MCSE: Multimodal Contrastive Learning of Sentence Embeddings
This repository contains code and pre-trained models for our NAACL-2022 paper MCSE: Multimodal Contrastive Learning of Sentence Embeddings. If you find this repository useful, please consider citing our paper.
Contact: Miaoran Zhang ([email protected])
Pre-trained Models & Results
| Model | Avg. STS | 
|---|---|
| flickr-mcse-bert-base-uncased [Google Drive] | 77.70 | 
| flickr-mcse-roberta-base [Google Drive] | 78.44 | 
| coco-mcse-bert-base-uncased [Google Drive] | 77.08 | 
| coco-mcse-roberta-base [Google Drive] | 78.17 | 
Note: flickr indicates that models are trained on wiki+flickr, and coco indicates that models are trained on wiki+coco.
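The released checkpoints are standard Transformer encoders, so they can be loaded with the transformers library. The snippet below is a minimal usage sketch, not the repository's evaluation code: it assumes the checkpoint has been downloaded and extracted to a local directory (the path is a placeholder) and uses [CLS] pooling, as is common for SimCSE-style sentence encoders; see the paper for the exact pooling used in evaluation.

```python
# Minimal usage sketch (assumes the checkpoint from the table above was
# downloaded to a local directory; the path below is a placeholder).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_path = "./flickr-mcse-bert-base-uncased"  # placeholder path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path)
model.eval()

sentences = ["A man is playing a guitar.", "A man is playing an instrument."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    # Assumption: take the [CLS] token representation as the sentence embedding.
    embeddings = model(**inputs).last_hidden_state[:, 0]

print(F.cosine_similarity(embeddings[0], embeddings[1], dim=0).item())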
Quickstart
Setup
- Python 3.9.5
- PyTorch 1.7.1
- Install other packages:
```bash
pip install -r requirements.txt
```
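After installation, a quick way to confirm the pinned versions (nothing repo-specific is assumed here):

```python
# Check that the environment matches the pinned versions above.
import sys
import torch

print(sys.version)        # expected: 3.9.5
print(torch.__version__)  # expected: 1.7.1
```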
Data Preparation
Please organize the data directory as follows:
```
REPO ROOT
|
|--data
|  |--wiki1m_for_simcse.txt
|  |--flickr_random_captions.txt
|  |--flickr_resnet.hdf5
|  |--coco_random_captions.txt
|  |--coco_resnet.hdf5
```
Wiki1M
```bash
wget https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt
```
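The corpus is a plain-text file with one sentence per line. After moving it into data/ as in the layout above, a quick sanity check (just a sketch) is to count the lines, which should be about one million:

```python
# Count sentences in the downloaded Wiki1M corpus (expected: ~1,000,000 lines).
with open("data/wiki1m_for_simcse.txt", encoding="utf-8") as f:
    num_sentences = sum(1 for _ in f)
print(num_sentences)
```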
Flickr30k & MS-COCO 
You can either download the preprocessed data we used (annotation sources: flickr30k-entities and coco), or preprocess the data yourself. Taking Flickr30k as an example:
- Download the flickr30k-entities.
- Request access to the flickr-images from here. Note that the use of the images must abide by the Flickr Terms of Use.
- Run the script:
```bash
unzip ${path_to_flickr-entities}/annotations.zip
python preprocess/prepare_flickr.py \
    --flickr_entities_dir ${path_to_flickr-entities} \
    --flickr_images_dir ${path_to_flickr-images} \
    --output_dir data/ --batch_size 32
```
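After preprocessing, you can spot-check the generated files. This is only a sketch: it assumes the caption file is plain text with one caption per line and that the HDF5 file stores one ResNet feature array per top-level key; the exact layout is determined by preprocess/prepare_flickr.py.

```python
# Spot-check the prepared Flickr30k files (format assumptions noted above).
import h5py

with open("data/flickr_random_captions.txt", encoding="utf-8") as f:
    for i, line in enumerate(f):
        print(line.rstrip())
        if i == 2:
            break

with h5py.File("data/flickr_resnet.hdf5", "r") as feats:
    keys = list(feats.keys())
    print(len(keys), "entries; first key:", keys[0])
    # Assumes top-level items are feature datasets rather than groups.
    print("first entry shape:", feats[keys[0]].shape)
```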
Train & Evaluation
- Prepare the SentEval datasets for evaluation:
```bash
cd SentEval/data/downstream/
bash download_dataset.sh
```
- Run the training scripts:
```bash
# For example (more examples are given in scripts/):
sh scripts/run_wiki_flickr.sh
```

Note: In the paper we run experiments with 5 seeds (0, 1, 2, 3, 4). You can find the detailed parameter settings in the Appendix.