Learning Dense Representations of Phrases at Scale (Lee et al., 2020)

DensePhrases

DensePhrases answers your natural language questions over the entire Wikipedia in real time. It efficiently searches for answers among 60 billion phrases in Wikipedia, and its accuracy is competitive with state-of-the-art open-domain QA models. Please see our paper Learning Dense Representations of Phrases at Scale (Lee et al., 2020) for more details.

***** You can try out our online demo of DensePhrases here! *****

Installation

# Use conda & pip
conda create -n dph python=3.7
conda activate dph
conda install pytorch cudatoolkit=11.0 -c pytorch
pip install faiss-gpu==1.6.5 h5py tqdm transformers==2.9.0 blosc ujson rouge wandb nltk flask flask_cors tornado requests-futures

# Install apex
git clone https://www.github.com/nvidia/apex
cd apex
python setup.py install

Please check your CUDA version before installing and modify the cudatoolkit version accordingly. Since it can be tricky to install a recent version of PyTorch together with the GPU version of Faiss, please open a GitHub issue if you run into any problems (see Training DensePhrases to check whether the installation is complete).

Before downloading the required files below, please set the default directories as follows and ensure that you have enough storage to download and unzip the files:

# Running config.sh will set the following three environment variables:
# DPH_DATA_DIR: for datasets (including 'kilt', 'open-qa', 'single-qa', 'truecase', 'wikidump')
# DPH_SAVE_DIR: for pre-trained models or dumps; new models and dumps will also be saved here
# DPH_CACHE_DIR: for cache files from huggingface transformers
source config.sh
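
Before downloading anything, it can help to confirm that the three variables are actually set in your current shell. The following is a minimal Python sketch that relies only on the variable names listed above; the helper name missing_dph_dirs is hypothetical and is not part of DensePhrases.

```python
import os

# Variable names taken from the comments describing config.sh above.
DPH_VARS = ("DPH_DATA_DIR", "DPH_SAVE_DIR", "DPH_CACHE_DIR")


def missing_dph_dirs(env=None):
    """Return the DPH_* variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in DPH_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_dph_dirs()
    if missing:
        print("Not set (did you run `source config.sh`?):", ", ".join(missing))
    else:
        print("All DensePhrases directories are configured.")
```

Run it from the same shell in which you sourced config.sh, since environment variables set there are not visible to other shells.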

1. Datasets

  • Datasets (6GB) - All pre-processed datasets used in our experiments, including reading comprehension, open-domain QA, slot filling, and pre-processed Wikipedia. Download and unzip them under DPH_DATA_DIR.
ls $DPH_DATA_DIR
kilt  open-qa  single-qa  truecase  wikidump

2. Pre-trained Models

  • Pre-trained models (5GB) - Pre-trained DensePhrases and cross-encoder teacher models. Download and unzip it under DPH_SAVE_DIR.
ls $DPH_SAVE_DIR
dph-nqsqd-pb2  dph-nqsqd-pb2_pq96-multi6  dph-nqsqd-pb2_pq96-nq-10  spanbert-base-cased-nq  spanbert-base-cased-sqdnq  spanbert-base-cased-squad
  • dph-nqsqd-pb2 : DensePhrases (C_phrase = {NQ, SQuAD}) before any query-side fine-tuning
  • dph-nqsqd-pb2_pq96-nq-10 : DensePhrases query-side fine-tuned on NQ (PQ index, NQ=40.9 EM)
  • dph-nqsqd-pb2_pq96-multi6 : DensePhrases query-side fine-tuned on 4 open-domain QA (NQ, WQ, TREC, TQA) + 2 slot filling datasets (PQ index, NQ=40.3 EM); Used for the demo
  • spanbert-base-cased-* : cross-encoder teacher models trained on *

3. Phrase Index

  • DensePhrases-IVFPQ96 (88GB) - Phrase index for the 20181220 version of Wikipedia. Download and unzip it under DPH_SAVE_DIR.
ls $DPH_SAVE_DIR
...  dph-nqsqd-pb2_20181220_concat

Since hosting the 320GB phrase index described in our paper (plus 500GB of original vectors for query-side fine-tuning) is costly, we provide a much smaller index that reflects our recent efforts to reduce the size of the phrase index with Product Quantization (IVFPQ). With IVFPQ, you do not need any SSDs for real-time inference (the index is loaded into RAM), and you can also reconstruct the phrase vectors from it for query-side fine-tuning (hence you do not need the additional 500GB).

For the reimplementation of DensePhrases with IVFSQ4 as described in the paper, see Training DensePhrases.

Playing with a DensePhrases Demo

There are two ways of using DensePhrases.

  1. You can simply use the demo that we are serving on our server. The running demo is using dph-nqsqd-pb2_pq96-multi6 (NQ=40.3 EM) as a query encoder and dph-nqsqd-pb2_20181220_concat as a phrase index.
  2. You can install the demo on your own server, which enables you to change the query encoder (e.g., to dph-nqsqd-pb2_pq96-nq-10) or to process multiple queries in parallel (using HTTP POST). We recommend installing your own demo as described below since our demo can be unstable due to a large number of requests. Also, query-side fine-tuning is only available to those who installed DensePhrases on their server.

The minimum resource requirement for running the demo is:

  • Single 11GB GPU
  • 125GB RAM
  • 100GB HDD

Note that, unlike previous phrase retrieval models (DenSPI, DenSPI+Sparc), you no longer need any SSDs to run the demo, but setting $DPH_SAVE_DIR to an SSD can reduce the loading time of the demo. The following commands serve exactly the same demo as our online demo, on your own machine at http://localhost:51997.

# Serve a query encoder on port 1111
make q-serve MODEL_NAME=dph-nqsqd-pb2_pq96-multi6 Q_PORT=1111

# Serve a phrase index on port 51997 (takes several minutes)
make p-serve DUMP_DIR=$DPH_SAVE_DIR/dph-nqsqd-pb2_20181220_concat/dump/ Q_PORT=1111 I_PORT=51997

You can change the query encoder or the phrase index accordingly. Once you set up the demo, the log files in $DPH_SAVE_DIR/logs/ will be automatically updated whenever a new question comes in. You can also send queries to your server using mini-batches of questions for faster inference.
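
The mini-batching mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: both helper names are hypothetical, and send_batch stands in for whatever HTTP POST call your server accepts. The endpoint path and payload format are defined by the repository's server code and are deliberately not reproduced here.

```python
def make_batches(questions, batch_size):
    """Split a list of questions into consecutive mini-batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return [questions[i:i + batch_size] for i in range(0, len(questions), batch_size)]


def answer_in_batches(questions, batch_size, send_batch):
    """Send questions batch by batch and collect the results in order.

    `send_batch` is a user-supplied callable (an assumption of this sketch)
    that takes a list of questions and returns one result per question,
    for example by issuing a single HTTP POST to your own server.
    """
    answers = []
    for batch in make_batches(questions, batch_size):
        answers.extend(send_batch(batch))
    return answers
```

Keeping the transport behind a callable means the batching logic can be tested without a running server.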

# Test on NQ test set (takes 60~90 sec)
make eval-od-req I_PORT=51997
(...)
INFO - densephrases.experiments.run_open -   {'exact_match_top1': 40.30470914127424, 'f1_score_top1': 47.18394271164363}
INFO - densephrases.experiments.run_open -   {'exact_match_top10': 63.57340720221607, 'f1_score_top10': 72.15437717099778}
INFO - densephrases.experiments.run_open -   Saving prediction file to $DPH_SAVE_DIR/pred/test_preprocessed_3610.pred
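
If you want to collect these numbers programmatically, the metric dictionaries can be pulled out of the log text. The sketch below assumes the log lines look like the sample output above (a Python dict literal at the end of the line); the function name parse_metrics is hypothetical.

```python
import ast
import re

# Matches a dict literal that ends the log line, e.g. "... -   {'exact_match_top1': 40.3}".
TRAILING_DICT = re.compile(r"(\{.*\})\s*$")


def parse_metrics(log_lines):
    """Merge all metric dicts found at the end of the given log lines."""
    metrics = {}
    for line in log_lines:
        match = TRAILING_DICT.search(line)
        if not match:
            continue  # e.g. the "Saving prediction file" line
        try:
            value = ast.literal_eval(match.group(1))
        except (ValueError, SyntaxError):
            continue  # braces that do not form a valid literal
        if isinstance(value, dict):
            metrics.update(value)
    return metrics
```

ast.literal_eval is used instead of eval so that only plain literals are accepted from the log file.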

For more details (e.g., changing the test set), please see the targets in Makefile (q-serve, p-serve, eval-od-req, etc).

DensePhrases: Training, Indexing and Inference

In this section, we introduce the steps to train DensePhrases from scratch, create phrase dumps and indexes, and run inference with the trained model (which can also be used for the demo described above). The minimum requirements are as follows:

  • Single 24GB GPU (for training)
  • up to 150GB RAM (for creating a phrase index of the entire Wikipedia)
  • up to 500GB storage (for creating a phrase dump of the entire Wikipedia)

All of our commands below are specified as Makefile targets, which include dataset paths, hyperparameter settings, etc. Before training DensePhrases, run the following command to check whether the installation is complete. If this command runs without an error, you are good to go!

# Test run for checking installation (ignore the performance)
make draft MODEL_NAME=test

Figure: DensePhrases Steps (a summary of the overall process below)

1. Training phrase and query encoders

To train DensePhrases from scratch, use train-single-nq, which trains DensePhrases on NQ (pre-processed for the reading comprehension setting). You can change the training set by modifying the dependencies of train-single-nq (e.g., nq-single-data => sqd-single-data and nq-param => sqd-param for training on SQuAD).

# Train DensePhrases on NQ with Eq. 9
make train-single-nq MODEL_NAME=dph-nq

train-single-nq consists of the following four consecutive commands (shown here for training on NQ):

  1. make train-single ...: Train DensePhrases on NQ with Eq. 9 (L = lambda1 L_single + lambda2 L_distill + lambda3 L_neg) with in-batch negatives and generated questions.
  2. make train-single ...: Load trained DensePhrases in the previous step and further train it with Eq. 9 with pre-batch negatives (dump D_small at the end).
  3. make index-sod: Create a phrase index for D_small
  4. make eval-sod ...: Evaluate the development set with D_small
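
The objective named in steps 1 and 2 is a weighted sum of three loss terms. As a small illustration of that formula only (not the repository's training code), it can be written in Python as below. The function name and the default weights of 1.0 are placeholders chosen for this sketch; the actual weights are set in the Makefile and the paper.

```python
def total_loss(l_single, l_distill, l_neg, lambda1=1.0, lambda2=1.0, lambda3=1.0):
    """Eq. 9 as stated above: L = lambda1 * L_single + lambda2 * L_distill + lambda3 * L_neg.

    The default weights are placeholders, not the values used by the authors.
    """
    return lambda1 * l_single + lambda2 * l_distill + lambda3 * l_neg
```

For example, total_loss(1.0, 0.5, 0.25, lambda1=1.0, lambda2=2.0, lambda3=4.0) weights each of the three terms so that it contributes 1.0 to the total.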

At the end of step 2, you will see the performance in the reading comprehension setting, where a gold passage is given (72.0 EM on NQ dev). Step 4 gives the performance in the semi-open-domain setting (denoted as D_small; see Table 6 in the paper), where all passages from the NQ development set are used for indexing (64.0 EM with NQ dev questions). The trained model will be saved under $DPH_SAVE_DIR/$MODEL_NAME. Note that during single-passage training on NQ, we exclude questions in the development set whose annotated answers are found in a list or a table.

2. Creating a phrase index

Now let's assume that you have a model trained on NQ + SQuAD named dph-nqsqd-pb2, which can also be downloaded from here. You can make a bigger phrase dump using dump-large as follows:

# Create large-scale phrase dumps with a trained model (default = dev_wiki)
make dump-large MODEL_NAME=dph-nqsqd-pb2 START=0 END=8

The default text corpus for creating phrase dump is dev_wiki located in $DPH_DATA_DIR/wikidump. We have three options for larger text corpora:

  • dev_wiki: 1/100 Wikipedia scale (sampled), 8 files
  • dev_wiki_noise: 1/10 Wikipedia scale (sampled), 500 files
  • 20181220_concat: full Wikipedia (20181220) scale, 5621 files

The dev_wiki* corpora contain passages from the NQ development set, so you can track how the performance of your model changes with the size of the text corpus (it usually decreases as the corpus gets larger). The phrase dump will be saved as hdf5 files in $DPH_SAVE_DIR/$(MODEL_NAME)_(data_name)/dump ($DPH_SAVE_DIR/dph-nqsqd-pb2_dev_wiki/dump in this case), which will be referred to as $DUMP_DIR.
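
The naming rule for the dump directory can be expressed as a tiny helper, which is handy when scripting over several models or corpora. This is a sketch of the path pattern stated above; the function name dump_dir is hypothetical.

```python
import posixpath


def dump_dir(save_dir, model_name, data_name):
    """Build <save_dir>/<model_name>_<data_name>/dump, following the pattern above."""
    return posixpath.join(save_dir, f"{model_name}_{data_name}", "dump")
```

For instance, dump_dir("/save", "dph-nqsqd-pb2", "dev_wiki") gives "/save/dph-nqsqd-pb2_dev_wiki/dump", where "/save" stands for the value of $DPH_SAVE_DIR.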

Parallelization

START and END specify the file indices in the corpus (e.g., START=0 END=8 for dev_wiki and START=0 END=5621 for 20181220_concat). Each run of dump-large only consumes 2GB of memory on a single GPU, so you can distribute the processes with different START and END values (use slurm or shell scripts). Distributing 28 processes on 4 24GB GPUs (each processing about 200 files) can create a phrase dump for 20181220_concat in 8 hours.
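
One way to compute the START and END values for each process is sketched below. It assumes that END is exclusive, which is inferred from the examples above (START=0 END=8 for 8 files), and it only prints the dump-large invocation shown earlier; selecting a corpus other than the default is done in the Makefile and is not covered here. The function name split_ranges is hypothetical.

```python
def split_ranges(total_files, num_processes):
    """Split file indices [0, total_files) into near-equal (start, end) ranges.

    `end` is treated as exclusive, an assumption inferred from the examples above.
    """
    if total_files < 0 or num_processes <= 0:
        raise ValueError("total_files must be >= 0 and num_processes must be > 0")
    base, extra = divmod(total_files, num_processes)
    ranges, start = [], 0
    for i in range(num_processes):
        size = base + (1 if i < extra else 0)  # the first `extra` ranges get one more file
        if size == 0:
            continue
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for start, end in split_ranges(5621, 28):
        print(f"make dump-large MODEL_NAME=dph-nqsqd-pb2 START={start} END={end}")
```

With 5621 files and 28 processes, each range covers 200 or 201 files, which is in line with the "about 200 files" per process mentioned above.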

After creating the phrase dump, you need to create a phrase index (or a MIPS index) for the sublinear time search of phrases. In our paper, we used IVFSQ4 for the phrase index.

# Create IVFSQ4 index for large indexes
make index-large DUMP_DIR=$DPH_SAVE_DIR/dph-nqsqd-pb2_dev_wiki/dump/

For dev_wiki_noise and 20181220_concat, you need to modify the number of clusters to 101,372 and 1,048,576, respectively, and also use index-add and index-merge to add phrase representations to the index (see Makefile for details). If you want to use IVFPQ, using index-large-pq is enough in any case.

For evaluating the performance of DensePhrases on these larger phrase indexes, use eval-dump.

# Evaluate on the NQ development set questions
make eval-dump MODEL_NAME=dph-nqsqd-pb2 DUMP_DIR=$DPH_SAVE_DIR/dph-nqsqd-pb2_dev_wiki/dump/

Optionally, you may want to compress the metadata (phrase dumps saved as hdf5 files) so that it can be loaded into RAM for faster inference. This is only supported for the PQ index.

# Compress metadata of the entire Wikipedia (20181220_concat)
make compress-meta DUMP_DIR=$DPH_SAVE_DIR/dph-nqsqd-pb2_20181220_concat/dump

3. Query-side fine-tuning

With a single 11GB GPU, you can easily train a query encoder to retrieve phrase-level knowledge from Wikipedia. First, you need a phrase index for the full Wikipedia (20181220_concat), which can be obtained by simply downloading from here (dph-nqsqd-pb2_20181220_concat) or by creating a custom phrase index as described above.

The following command query-side fine-tunes dph-nqsqd-pb2 on NQ.

# Query-side fine-tune on Natural Questions (model will be saved as MODEL_NAME)
make train-query MODEL_NAME=dph-nqsqd-pb2-nq DUMP_DIR=$DPH_SAVE_DIR/dph-nqsqd-pb2_20181220_concat/dump/

Note that the pre-trained encoder is specified in train-query as --query_encoder_path $(DPH_SAVE_DIR)/dph-nqsqd-pb2 and a new model will be saved as dph-nqsqd-pb2-nq as specified above. You can also train on different datasets by changing the dependency nq-open-data to *-open-data (e.g., trec-open-data).

IVFPQ vs IVFSQ4

Currently, train-query uses the IVFPQ index for query-side fine-tuning. To use the IVFSQ4 index from our paper, change the argument --index_dir start/1048576_flat_PQ96_8 to --index_dir start/1048576_flat_SQ4. For IVFPQ, training takes 2 to 3 hours per epoch for large datasets (NQ, TQA, SQuAD), and 3 to 8 minutes for small datasets (WQ, TREC). For IVFSQ4, the training time depends heavily on file I/O speed, so using SSDs is recommended.

4. Inference

With a pre-trained DensePhrases encoder (e.g., dph-nqsqd-pb2_pq96-nq-10) and a phrase index (e.g., dph-nqsqd-pb2_20181220_concat), you can test your queries as follows:

# Evaluate on Natural Questions
make eval-od MODEL_NAME=dph-nqsqd-pb2_pq96-nq-10 DUMP_DIR=$DPH_SAVE_DIR/dph-nqsqd-pb2_20181220_concat/dump/

# If the demo is being served on http://localhost:51997
make eval-od-req I_PORT=51997

Pre-processing

At the bottom of Makefile, we list commands that we used for pre-processing the datasets and Wikipedia. For training question generation models (T5-large), we used https://github.com/patil-suraj/question_generation (see also here for QG). Note that all datasets are already pre-processed including the generated questions, so you do not need to run most of these scripts. For creating test sets for custom (open-domain) questions, see preprocess-openqa in Makefile.

Reference

Please cite our paper if you use DensePhrases in your work:

@article{lee2020learning,
  title={Learning Dense Representations of Phrases at Scale},
  author={Lee, Jinhyuk and Sung, Mujeen and Kang, Jaewoo and Chen, Danqi},
  journal={arXiv preprint arXiv:2012.12624},
  year={2020}
}

License

Please see LICENSE for details.

Comments
  • Issue while creating faiss index, Command is not clear

    Hi,

    What does "all" mean in this command? I get an unrecognized-command error when I remove it.

    python build_phrase_index.py \
        $SAVE_DIR/densephrases-multi_sample/dump all \
        --replace \
        --num_clusters 32 \
        --fine_quant OPQ96 \
        --doc_sample_ratio 1.0 \
        --vec_sample_ratio 1.0 \
        --cuda
    

    I worked around that by passing --dump_dir before the path, but it is not creating anything. (The attached screenshot, "Screenshot from 2021-10-15 12-49-51", is not preserved.)

    opened by SAIVENKATARAJU 14
  • Modifying num_clusters in index-vecs

    I tried to run index-vecs with a custom wikidump, dataset, and model, but got an error (screenshot not preserved).

    Changing the num_clusters flag to 96 doesn't seem to help; the k in the error message is still 256.

    opened by light42 11
  • Unable to find folder for phrase in wikidump

    Hi, first of all, many thanks for this work.

    I am trying to test this. As per the documentation, I downloaded all 4 tar files (datasets, Wikipedia dump, pre-trained models, and phrase index), but while running it I get an error (screenshot not preserved). It seems to be looking for a phrase folder in wikidump, which is not available at all.

    Can you suggest the reason for this?

    I have given the correct path for all folders.

    opened by tiwari93 11
  • Reproduction of DensePhrase (w/ PQ, w/o qft) on SQuAD

    I've built the compressed DensePhrase index on SQuAD using OPQ96. I haven't run any query-side finetuning yet but here are the results:


    11/22/2021 19:50:57 - INFO - main - no_ans/all: 0, 10570
    11/22/2021 19:50:57 - INFO - main - Evaluating 10570 answers
    11/22/2021 19:50:58 - INFO - main - EM: 21.63, F1: 27.96
    11/22/2021 19:50:58 - INFO - main - 1) Which NFL team represented the AFC at Super Bowl 50
    11/22/2021 19:50:58 - INFO - main - => groundtruths: ['Denver Broncos', 'Denver Broncos', 'Denver Broncos'], top 5 prediction: ['Denver Broncos', 'Pittsburgh Steelers', 'Pittsburgh Steelers', 'Pittsburgh Steelers', 'Pittsburgh Steelers']
    11/22/2021 19:50:58 - INFO - main - 2) Which NFL team represented the NFC at Super Bowl 50
    11/22/2021 19:50:58 - INFO - main - => groundtruths: ['Carolina Panthers', 'Carolina Panthers', 'Carolina Panthers'], top 5 prediction: ['San Francisco 49ers', 'Chicago Bears', 'Seattle Seahawks', 'Tampa Bay Buccaneers', 'Green Bay Packers']
    11/22/2021 19:50:58 - INFO - main - 3) Where did Super Bowl 50 take place
    11/22/2021 19:50:58 - INFO - main - => groundtruths: ['Santa Clara, California', "Levi's Stadium", "Levi's Stadium in the San Francisco Bay Area at Santa Clara, California."], top 5 prediction: ['Tacoma, Washington, USA', "Levi's Stadium in Santa Clara, California", 'DeVault Vineyards in Concord, Virginia', "Levi's Stadium in Santa Clara", 'Jinan Olympic Sports Center Gymnasium in Jinan, China']
    11/22/2021 19:53:44 - INFO - main - {'exact_match_top1': 21.62724692526017, 'f1_score_top1': 27.958255585698414}
    11/22/2021 19:53:44 - INFO - main - {'exact_match_top200': 57.48344370860927, 'f1_score_top200': 73.28679644685603}
    11/22/2021 19:53:44 - INFO - main - {'redundancy of top200': 5.308987701040681}
    11/22/2021 19:53:44 - INFO - main - Saving prediction file to .//outputs/densephrases-squad-ddp/pred/test_preprocessed_10570_top200.pred
    10570it [00:23, 448.84it/s]
    11/22/2021 19:54:58 - INFO - main - avg psg len=124.84 for 10570 preds
    11/22/2021 19:54:58 - INFO - main - dump to .//outputs/densephrases-squad-ddp/pred/test_preprocessed_10570_top200_psg-top100.json
    ctx token length: 124.84 unique titles: 98.20

    Top-1 = 27.02%
    Top-5 = 42.80%
    Top-20 = 56.40%
    Top-100 = 69.20%
    Acc@1 when Acc@100 = 39.05%
    MRR@20 = 34.30
    P@20 = 8.94


    I understand that index compression results in accuracy loss w/o query-side finetuning. However, the score still looks a little bit too low to me. Could @jhyuklee confirm whether this looks alright?

    opened by alexlimh 9
  • Unable to Reproduce Passage Retrieval Results on NQ

    Hi Jinhyuk,

    I was trying to reproduce the third row of Table 1 in your paper (https://arxiv.org/pdf/2109.08133.pdf). I'm using the index and pre-trained ckpt on NQ you gave me several days ago. Here's my results:

    Top-1 = 34.32%
    Top-5 = 54.13%
    Top-20 = 66.59%
    Top-100 = 76.43%
    Acc@1 when Acc@100 = 44.91%
    MRR@20 = 43.12
    P@20 = 14.61
    

    Here's the command I use:

    make eval-index-psg MODEL_NAME=densephrases-nq-query-nq DUMP_DIR=densephrases-nq_wiki-20181220-p100/dump/ TEST_DATA=open-qa/nq-open/test_preprocessed.json
    

    Any idea what I might do wrong? Thanks in advance.

    Minghan

    opened by alexlimh 9
  • The question about reproduce RC-SQD results

    Hi~ Thanks a lot for your open-source work. When I ran your code for single-passage training on SQuAD, I got 77.3 EM and 85.7 F1. I ran the following command:

    python train_rc.py \
        --model_type bert \
        --pretrained_name_or_path SpanBERT/spanbert-base-cased \
        --data_dir densephrases/densephrases-data/single-qa \
        --cache_dir densephrases/cache \
        --train_file squad/train-v1.1_qg_ents_t5large_3500_filtered.json \
        --predict_file squad/dev-v1.1.json \
        --do_train \
        --do_eval \
        --per_gpu_train_batch_size 24 \
        --learning_rate 3e-5 \
        --num_train_epochs 3.0 \
        --max_seq_length 384 \
        --seed 42 \
        --fp16 \
        --fp16_opt_level O1 \
        --lambda_kl 4.0 \
        --lambda_neg 2.0 \
        --lambda_flt 1.0 \
        --filter_threshold -2.0 \
        --append_title \
        --evaluate_during_training \
        --overwrite_output_dir \
        --teacher_dir densephrases/outputs/spanbert-base-cased-squad

    I also trained this model for another 2 epochs, as in your Makefile, using pre-batch negatives and train-v1.1.json (the real SQuAD data), but the results are still below those in the paper.

    (1) Should I use different hyperparameters? I found that your paper uses different parameters from your script, such as the batch size (84 vs 24) or the lambda weights.
    (2) Are the results in the paper averaged over random seeds?
    (3) Do you use the whole NQ and SQuAD datasets to train the model?

    opened by kugwzk 7
  • Representations of phrases

    Hi,

    Thanks for the interesting project!

    One question: If I want to get only phrase representations from your pre-trained model, how can I do that? I plan to use them as baselines. Thank you!

    Best, Jiacheng

    opened by JiachengLi1995 6
  • How to extract phrases from Wikipedia?

    Hi!

    First of all thanks a lot for this solid project!

    I just want to figure out how to extract phrases from Wikipedia? Which script is the right one? I am a little confused when I see so many scripts in the preprocess folder.

    opened by Albert-Ma 5
  • Train custom dataset

    Hi Jhyuklee,

    Thank you for the good work and support. I have one question. I want to use my own PDF statements as the dump in place of the Wikipedia dump, so that the model retrieves information from the PDF data rather than from Wikipedia.

    Do I need to train on our whole dump from scratch, or is there a way to fine-tune the model starting from the checkpoints trained by you?

    Please guide.

    opened by tiwari93 5
  • Significance of line 174 in train_query.py code

    Hi,

    I was going through the code for query-side fine-tuning and I am not able to understand one condition in the code (highlighted in a screenshot that is not preserved).

    Is the highlighted line redundant? If not, what is its significance? (I feel we could update the encoder directly.) I just wanted to make sure that I am not missing anything.

    opened by Nishant3815 4
  • Question about faiss parameter

    Hi,

    Thanks for the amazing work! May I ask how do you choose the parameter for faiss index? Like the number of clusters and quant type OPQ96? It seems that the number of clusters varies with the number of phrases to save.

    Thanks!

    opened by PlusRoss 4
  • failed with "make draft MODEL_NAME=test"

    logs as following, thanks

    convert squad examples to features: 100%|██████████| 902/902 [00:00<00:00, 2092.37it/s]
    add example index and unique id: 100%|██████████| 902/902 [00:00<00:00, 439863.06it/s]
    06/14/2022 22:26:05 - INFO - main - Number of trainable params: 258,127,108
    Selected optimization level O1: Insert automatic casts around Pytorch functions and Tensor methods.

    Defaults for this optimization level are:
    enabled : True
    opt_level : O1
    cast_model_type : None
    patch_torch_functions : True
    keep_batchnorm_fp32 : None
    master_weights : None
    loss_scale : dynamic
    Processing user overrides (additional kwargs that are not None)...
    After processing overrides, optimization options are:
    enabled : True
    opt_level : O1
    cast_model_type : None
    patch_torch_functions : True
    keep_batchnorm_fp32 : None
    master_weights : None
    loss_scale : dynamic
    Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'")
    06/14/2022 22:26:05 - INFO - main - ***** Running training *****
    06/14/2022 22:26:05 - INFO - main - Num examples = 1218
    06/14/2022 22:26:05 - INFO - main - Num Epochs = 2
    06/14/2022 22:26:05 - INFO - main - Instantaneous batch size per GPU = 48
    06/14/2022 22:26:05 - INFO - main - Total train batch size (w. parallel, distributed & accumulation) = 384
    06/14/2022 22:26:05 - INFO - main - Gradient Accumulation steps = 1
    06/14/2022 22:26:05 - INFO - main - Total optimization steps = 8
    Epoch: 0%| | 0/2 [00:00<?, ?it/s]
    06/14/2022 22:26:05 - INFO - main - [Epoch 1]
    06/14/2022 22:26:05 - INFO - main - Initialize pre-batch of size 2 for Epoch 1
    | 0/4 [00:00<?, ?it/s]

    Traceback (most recent call last):
      File "train_rc.py", line 593, in <module>
        main()
      File "train_rc.py", line 537, in main
        global_step, tr_loss = train(args, train_dataset, model, tokenizer)
      File "train_rc.py", line 222, in train
        outputs = model(**inputs)
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
        output.reraise()
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
        raise self.exc_type(msg)
    StopIteration: Caught StopIteration in replica 0 on device 0.
    Original Traceback (most recent call last):
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
        output = module(*input, **kwargs)
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/ssd1/zhangyiming/densephrase/DensePhrases/densephrases/encoder.py", line 132, in forward
        start, end = self.embed_phrase(input_ids, attention_mask, token_type_ids)
      File "/ssd1/zhangyiming/densephrase/DensePhrases/densephrases/encoder.py", line 94, in embed_phrase
        outputs = self.phrase_encoder(
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/modeling_bert.py", line 707, in forward
        attention_mask, input_shape, self.device
      File "/ssd3/wangxiao/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/modeling_utils.py", line 113, in device
        return next(self.parameters()).device
    StopIteration

    opened by xixiaoyao 2