RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering

Authors: Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou and Caiming Xiong

Abstract

Existing KBQA approaches, despite achieving strong performance on i.i.d. test data, often struggle in generalizing to questions involving unseen KB schema items. Prior ranking-based approaches have shown some success in generalization, but suffer from the coverage issue. We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving a strong generalization capability. Our approach first uses a contrastive ranker to rank a set of candidate logical forms obtained by searching over the knowledge graph. It then introduces a tailored generation model conditioned on the question and the top-ranked candidates to compose the final logical form. We achieve new state-of-the-art results on GRAILQA and WEBQSP datasets. In particular, our method surpasses the prior state-of-the-art by a large margin on the GRAILQA leaderboard. In addition, RnG-KBQA outperforms all prior approaches on the popular WEBQSP benchmark, even including the ones that use the oracle entity linking. The experimental results demonstrate the effectiveness of the interplay between ranking and generation, which leads to the superior performance of our proposed approach across all settings with especially strong improvements in zero-shot generalization.

Paper link: https://arxiv.org/pdf/2109.08678.pdf

Requirements

The code is tested under the following environment setup:

  • python==3.8.10
  • pytorch==1.7.0
  • transformers==3.3.1
  • spacy==3.1.1
  • for other requirements, please see requirements.txt
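
As a quick sanity check before running anything, the pinned versions above can be compared with what is installed. This is a minimal sketch and not part of the repository; the helper name `find_mismatches` is hypothetical, and it assumes PyTorch is installed under its pip name `torch`.

```python
from importlib import metadata

# Versions listed in the Requirements section; "torch" is the pip name for PyTorch.
PINNED = {"torch": "1.7.0", "transformers": "3.3.1", "spacy": "3.1.1"}


def find_mismatches(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for packages that differ or are missing."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build suffixes such as "1.7.0+cu110".
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` in the target environment returns an empty dict when all three packages match the tested versions.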

System requirements:

It's recommended to use a machine with over 300G of memory to train the models, and a machine with 128G of memory for inference. However, 256G of memory is still sufficient for running inference and training all of the models (some memory-saving tricks are needed when training the ranker model for GrailQA).

General Setup

Setup Experiment Directory

Before running the scripts, please use setup.sh to set up the experiment folder. It creates some symbolic links in each exp directory.

Setup Freebase

All of the datasets use Freebase as the knowledge source. Please follow Freebase Setup to set up a Virtuoso triplestore service. After starting your Virtuoso service, if you modified the default URL, you may need to change the URL in /framework/executor/sparql_executor.py accordingly.
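
Before launching the pipeline, it can help to confirm that the SPARQL endpoint actually responds. The following is a minimal sketch using only the Python standard library; the endpoint URL and the function names are assumptions for illustration (use whatever URL is configured in sparql_executor.py), and the code is not taken from the repository.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; replace with the URL configured in sparql_executor.py.
SPARQL_ENDPOINT = "http://localhost:3001/sparql"


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL asking for JSON results."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return f"{endpoint}?{params}"


def endpoint_is_alive(endpoint=SPARQL_ENDPOINT, timeout=10):
    """Return True if the endpoint answers a trivial ASK query with true."""
    url = build_query_url(endpoint, "ASK { ?s ?p ?o }")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return bool(json.load(response).get("boolean"))
    except (OSError, ValueError):  # connection errors or a non-JSON reply
        return False
```

If `endpoint_is_alive()` returns False, the service is either not running or is listening on a different URL than the one assumed here.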

Reproducing the Results on GrailQA

Please use /GrailQA as the working directory when running experiments on GrailQA.


Prepare dataset and pretrained checkpoints

Dataset

Please download the dataset and put the files under outputs, so that the dataset is organized as outputs/grailqa_v1.0_train/dev/test.json. (Please rename the test-public split to test.)

NER Checkpoints

We use the NER system (under the directories entity_linking and entity_linker) from the Original GrailQA Code Repo. Please use the following instructions (copied from the original repo) to pull the related data

Other Checkpoints

Please download the following checkpoints for entity disambiguation, candidate ranking, and augmented generation, then unzip them and put them under the checkpoints/ directory

KB Cache

We attach the cache of query results from the KB, which can help save some time. Please download the cache file for GrailQA, unzip it, and put the contents under cache/, so that cache/grail-LinkedRelation.bin and cache/grail-TwoHopPath.bin are in place.


Running inference

Demo for Checking the Pipeline

It's recommended to run the one-click demo script first to test whether everything mentioned above is set up correctly. If it runs through successfully, you'll get a final F1 of around 0.86. Please make sure you successfully reproduce the results on this small demo set first, as inference on dev and test can take a long time.

sh scripts/walk_through_demo.sh

Step by Step Instructions

We also provide step-by-step inference instructions as below:

(i) Detecting Entities

Once the entity linker is ready, run

python detect_entity_mention.py --split <split> # e.g., --split test

This will write entity mentions to outputs/grail_<split>_entities.json. We extract up to 10 entities for each mention, which will be further disambiguated in the next step.

!! Running entity detection for the first time requires building the surface form index, which can take a long time (but it is only needed the first time).

(ii) Disambiguating Entities (Entity Linking)

We provide a pretrained ranker model for this step:

sh scripts/run_disamb.sh predict

E.g., sh scripts/run_disamb.sh predict checkpoints/grail_bert_entity_disamb test

This will write the prediction results (in the form of the selected entity index for each mention) to misc/grail_<split>_entity_linking.json.

(iii) Enumerating Logical Form Candidates

python enumerate_candidates.py --split <split> --pred_file <pred_file>

E.g., python enumerate_candidates.py --split test --pred_file misc/grail_test_entity_linking.json

This will write enumerated candidates to outputs/grail_<split>_candidates-ranking.jsonline.

(iv) Running Ranker

sh scripts/run_ranker.sh predict

E.g., sh scripts/run_ranker.sh predict checkpoints/grail_bert_ranking test

This will write the candidate logits (the logits of each candidate for each example) to misc/grail_<split>_candidates_logits.bin, and the prediction results (in the original GrailQA prediction format) to misc/grail_<split>_ranker_results.txt.

You may evaluate the ranker results with python grail_evaluate.py

E.g., python grail_evaluate.py outputs/grailqa_v1.0_dev.json misc/grail_dev_ranker_results.txt

(v) Running Generator

First, prepare the generation model inputs

python make_generation_dataset.py --split <split> --logit_file <logit_file>

E.g., python make_generation_dataset.py --split test --logit_file misc/grail_test_candidate_logits.bin.

This will read the candidates, use the logits to select the top-k candidates, and write the generation model inputs to outputs/grail_<split>_gen.json.
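
Conceptually, the selection in this step keeps the k candidates with the highest ranker logits. The sketch below illustrates that idea only; it is not the code in make_generation_dataset.py, and the function name, the toy candidate strings, and the default `k=5` are assumptions.

```python
def select_top_k(candidates, logits, k=5):
    """Return the k candidates with the highest ranker logits, best first."""
    if len(candidates) != len(logits):
        raise ValueError("each candidate needs exactly one logit")
    # Sort (logit, candidate) pairs by logit only, highest first.
    ranked = sorted(zip(logits, candidates), key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in ranked[:k]]


# Toy example with made-up logical forms and scores.
toy_candidates = ["(JOIN r1 e1)", "(JOIN r2 e1)", "(JOIN r3 e1)"]
toy_logits = [0.1, 2.0, -1.0]
top_two = select_top_k(toy_candidates, toy_logits, k=2)
```

The selected candidates are then paired with the question to form the input of the generation model.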

Second, run the generation model to get the top-k predictions

sh scripts/run_gen.sh predict

E.g., sh scripts/run_gen.sh predict checkpoints/grail_t5_generation test.

This will generate the top-k decoded logical forms, stored at misc/grail_<split>_topk_generations.json.

(vi) Final Inference Steps

With the decoded top-k predictions in hand, we go down the top-k list and execute the logical forms one by one until we find one that returns valid answers.

python eval_topk_prediction.py --split <split> --pred_file <pred_file>

E.g., python eval_topk_prediction.py --split test --pred_file misc/grail_test_topk_generations.json

This will write the prediction results (in the original GrailQA prediction format) to misc/grail_<split>_final_results.txt.
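
The fallback over the top-k list described in this step can be sketched as follows. This is an illustration, not the code in eval_topk_prediction.py: `execute` is a hypothetical callable that runs a logical form against the KB and returns a list of answers, and treating an exception as "not executable" is an assumption.

```python
def first_valid_answer(logical_forms, execute):
    """Return (logical_form, answers) for the first form that yields answers.

    `execute` is a hypothetical callable that runs a logical form against the
    KB and returns a list of answers; it may raise on malformed forms.
    """
    for form in logical_forms:
        try:
            answers = execute(form)
        except Exception:
            continue  # treat non-executable forms as invalid and move on
        if answers:
            return form, answers
    return None, []  # no form in the list produced a non-empty answer set
```

Because the list is ordered by the generator's ranking, the first executable form with a non-empty result is also the highest-ranked valid one.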

You can then use the official GrailQA evaluation script to run the evaluation

python grail_evaluate.py

E.g., python grail_evaluate.py outputs/grailqa_v1.0_dev.json misc/grail_dev_final_results.txt
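
For intuition about the F1 numbers reported in this README, an answer-set F1 for a single question can be computed as below. This is a minimal sketch under the usual definition of F1 over predicted and gold answer sets, not the official evaluation script; the function name and the convention of returning 1.0 when both sets are empty are assumptions.

```python
def answer_f1(predicted, gold):
    """F1 between a predicted and a gold answer set for one question."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0  # assumed convention: two empty sets count as a match
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)
```

A dataset-level score would then be the average of this value over all questions; the official scripts remain the reference for reported results.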


Training Models

We have already attached pretrained models ready for running inference. If you'd like to train your own models, please check out the README in the /GrailQA folder.

Reproducing the Results on WebQSP

Please use /WebQSP as the working directory when running experiments on WebQSP.


Prepare dataset and pretrained checkpoints

Dataset

Please download the WebQSP dataset and put the files under outputs, so that the dataset is organized as outputs/WebQSP.train[test].json.

Evaluation Script

Please make a copy of the official evaluation script (eval/eval.py in the WebQSP zip file) and put the script under this directory (WebQSP) with the name legacy_eval.py.

Model Checkpoints

Please download the following checkpoints for candidate ranking and augmented generation, then unzip them and put them under the checkpoints/ directory

KB Cache

Please download the cache file for WebQSP, unzip it, and put the contents under cache/, so that cache/webqsp-LinkedRelation.bin and cache/webqsp-TwoHopPath.bin are in place.


Running inference

(i) Parsing Sparql-Query to S-Expression

As stated in the paper, we generate s-expressions, which are not provided by the original dataset, so we provide scripts to parse SPARQL queries into s-expressions.

Run python parse_sparql.py, which will augment the original dataset files with s-expressions and save them in outputs as outputs/WebQSP.train.expr.json and outputs/WebQSP.dev.expr.json. Since there is no validation set, we further randomly select 200 examples from the training set for validation, yielding the ptrain and pdev splits.
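
The random hold-out of 200 training examples can be pictured as in the sketch below. It is an illustration of the idea, not the repository's splitting code; the function name and the fixed seed are assumptions, and the actual ptrain/pdev membership depends on the authors' script.

```python
import random


def split_train_dev(examples, dev_size=200, seed=0):
    """Randomly hold out `dev_size` examples; return (ptrain, pdev)."""
    if dev_size > len(examples):
        raise ValueError("dev_size exceeds the number of examples")
    shuffled = list(examples)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    return shuffled[dev_size:], shuffled[:dev_size]


# Toy usage with integer ids standing in for training examples.
ptrain, pdev = split_train_dev(range(1000))
```

Fixing the seed matters here because the pdev split is used for model selection, so it should stay the same across runs.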

(ii) Entity Detection and Linking using ELQ

This step can be skipped, as we've already included the outputs of this step (misc/webqsp_train_elq-5_mid.json, outputs/webqsp_test_elq-5_mid.json).

The scripts and config of the ELQ model can be found in elq_linking/run_elq_linker.py. If you'd like to use the script to run entity linking, please copy the run_elq_linker.py script to the ELQ model and run it there.

(iii) Enumerating Logical Form Candidates

python enumerate_candidates.py --split test

This will write enumerated candidates to outputs/webqsp_test_candidates-ranking.jsonline.

(iv) Running Ranker

sh scripts/run_ranker.sh predict checkpoints/webqsp_bert_ranking test

This will write the candidate logits (the logits of each candidate for each example) to misc/webqsp_test_candidates_logits.bin, and the prediction results (in the original GrailQA prediction format) to misc/webqsp_test_ranker_results.txt.

(v) Running Generator

First, prepare the generation model inputs

python make_generation_dataset.py --split test --logit_file misc/webqsp_test_candidate_logits.bin

This will read the candidates, use the logits to select the top-k candidates, and write the generation model inputs to outputs/webqsp_test_gen.json.

Second, run the generation model to get the top-k predictions

sh scripts/run_gen.sh predict checkpoints/webqsp_t5_generation test

This will generate the top-k decoded logical forms, stored at misc/webqsp_test_topk_generations.json.

(vi) Final Inference Steps

With the decoded top-k predictions in hand, we go down the top-k list and execute the logical forms one by one until we find one that returns valid answers.

python eval_topk_prediction.py --split test --pred_file misc/webqsp_test_topk_generations.json

The prediction results will be stored (in the GrailQA prediction format) in misc/webqsp_test_final_results.txt.

You can then use the official WebQSP evaluation script (modified only in I/O) to run the evaluation

python webqsp_evaluate.py outputs/WebQSP.test.json misc/webqsp_test_final_results.txt


Training Models

We have already attached pretrained models ready for running inference. If you'd like to train your own models, please check out the README in the /WebQSP folder.

Citation

@misc{ye2021rngkbqa,
    title={RnG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering}, 
    author={Xi Ye and Semih Yavuz and Kazuma Hashimoto and Yingbo Zhou and Caiming Xiong},
    year={2021},
    eprint={2109.08678},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Questions?

For any questions, feel free to open issues, or shoot emails to

License

The code is released under BSD 3-Clause - see LICENSE for details.
