CRL_EGPG
PyTorch implementation of Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation.
We use the contrastive loss implemented by HobbitLong.
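For intuition, the idea is to pull the representations of a positive pair together while pushing them away from the other sentences in the batch. Below is a minimal NT-Xent-style sketch of such a batch-wise contrastive loss; it is an illustration only, not the actual SupContrast (HobbitLong) code used in this repo, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z_a, z_b, temperature=0.1):
    """Minimal NT-Xent-style contrastive loss (illustrative sketch).

    z_a[i] and z_b[i] are treated as a positive pair; every other
    embedding in the batch serves as a negative.
    """
    z_a = F.normalize(z_a, dim=1)            # cosine similarity via dot product
    z_b = F.normalize(z_b, dim=1)
    n = z_a.size(0)

    z = torch.cat([z_a, z_b], dim=0)         # (2n, d)
    sim = z @ z.t() / temperature            # (2n, 2n) similarity matrix

    # A sample must never be contrasted with itself.
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(self_mask, float("-inf"))

    # The positive of row i is row i + n (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Toy usage: 8 positive pairs of 128-dimensional representations.
loss = nt_xent_loss(torch.randn(8, 128), torch.randn(8, 128))
```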
How to train
- Download the dataset from here and put it in the project directory.
  You can directly use the preprocessed datasets (`data/`: QQP-Pos, `data2/`: ParaNMT),
  or preprocess them (Quora and Para) on your own through `quora_process.py` and `para_process.py`, respectively.
  If you take the second method, you need to set the variable `text_path` in these two scripts.
- Then train the model by running
python train.py --datasets quora --model_save_path directory_to_save_model
How to evaluate
- Firstly, generate the test target sentences by running
python evaluate.py --model_save_path your_saved_model --idx which_model_you_want_to_test
After running the command, you will find the generated target file `trg_genidx.txt` and the corresponding exemplar file `exmidx.txt`, where idx is the value you passed with --idx.
- Follow the repository provided by malllabiisc and set up the evaluation code. Then run
python -m src.evaluation.eval -i path/trg_genidx.txt -r path/test_trg.txt -t path/exmidx.txt
Replace path in each argument with the actual path to the corresponding file.
How to generate multiple paraphrases for one input
You can modify `generate.py` or just run
python generate.py
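If you want several distinct outputs for a single source sentence, one simple option (an assumption about how you might adapt `generate.py`, not a description of its actual contents) is to replace greedy decoding with stochastic top-k sampling and decode several times. The sketch below is self-contained, with a toy linear layer standing in for the model's real decoder; every name in it is illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for one decoder step; in the real model this would be the
# decoder conditioned on the source content and the exemplar's style vector.
vocab_size, hidden_size = 50, 32
decoder_step = torch.nn.Linear(hidden_size, vocab_size)

def sample_one(state, max_len=10, top_k=5, temperature=1.0):
    """Decode one candidate with top-k sampling instead of greedy argmax,
    so repeated calls yield different paraphrase candidates."""
    tokens = []
    for _ in range(max_len):
        logits = decoder_step(state) / temperature
        topk_logits, topk_ids = logits.topk(top_k)
        probs = F.softmax(topk_logits, dim=-1)
        tokens.append(int(topk_ids[torch.multinomial(probs, 1)]))
        # A real decoder would feed the sampled token back in; here we just
        # perturb the state to keep the toy example self-contained.
        state = torch.tanh(state + 0.1 * torch.randn_like(state))
    return tokens

encoded = torch.randn(hidden_size)           # stands in for the encoded input
candidates = [sample_one(encoded) for _ in range(3)]
print(candidates)                            # three different token sequences
```

Since the model is exemplar-guided, another natural way to get diverse outputs is to pair the same source sentence with several different exemplar sentences.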