Benchmark for Answering Existential First Order Queries with Single Free Variable

Overview

EFO-1-QA Benchmark for First Order Query Estimation on Knowledge Graphs

This repository contains an entire pipeline for the EFO-1-QA benchmark. EFO-1 stands for the Existential First Order Queries with Single Free Varibale. The related paper has been submitted to the NeurIPS 2021 track on dataset and benchmark. OpenReview Link, and appeared on arXiv

If this work helps you, please cite

@article{EFO-1-QA,
  title={Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs},
  author={Wang, Zihao and Yin, Hang and Song, Yangqiu},
  journal={arXiv preprint arXiv:2109.08925},
  year={2021}
}

The pipeline overview.

alt text

  1. Query type generation and normalization The query types are generated by the DFS iteration of the context free grammar with the bounded negation hypothesis. The generated types are also normalized to several normal forms
  2. Query grounding and answer sampling The queries are grounded on specific knowledge graphs and the answers that are non-trivial are sampled.
  3. Model training and estimation We train and evaluate the specific query structure

Query type generation and normalization

The OpsTree is represented in the nested objects of FirstOrderSetQuery class in fol/foq_v2.py. We first generate the specific OpsTree and then store then by the formula property of FirstOrderSetQuery.

The OpsTree is generated by binary_formula_iterator in fol/foq_v2.py. The overall process is managed in formula_generation.py.

To generate the formula, just run

python formula_generation.py

Then the file formula csv is generated in the outputs folder. In this paper, we use the file in outputs/test_generated_formula_anchor_node=3.csv

Query grounding and answer sampling

We first prepare the KG data and then run the sampling code

The KG data (FB15k, FB15k-237, NELL995) should be put into under 'data/' folder. We use the data provided in the KGReasoning.

The structure of the data folder should be at least

data
	|---FB15k-237-betae
	|---FB15k-betae
	|---NELL-betae	

Then we can run the benchmark sampling code on specific knowledge graph by

python benchmark_sampling.py --knowledge_graph FB15k-237 
python benchmark_sampling.py --knowledge_graph FB15k
python benchmark_sampling.py --knowledge_graph NELL

Append new forms to existing data One can append new forms to the existing dataset by

python append_new_normal_form.py --knowledge_graph FB15k-237 

Model training and estimation

Models

Examples

The detailed setting of hyper-parameters or the knowledge graph to choose are in config folder, you can modify those configurations to create your own, all the experiments are on FB15k-237 by default.

Besides, the generated benchmark, one can also use the BetaE dataset after converting to our format by running:

python transform_beta_data.py

Use one of the commands in the following, depending on the choice of models:

python main.py --config config/{data_type}_{model_name}.yaml
  • The data_type includes benchmark and beta
  • The model_name includes BetaE, LogicE, NewLook and Query2Box

If you need to evaluate on the EFO-1-QA benchmark, be sure to load from existing model checkpoint, you can train one on your own or download from here:

python main.py --config config/benchmark_beta.yaml --checkpoint_path ckpt/FB15k/Beta_full
python main.py --config config/benchmark_NewLook.yaml --checkpoint_path ckpt/FB15k/NLK_full --load_step 450000
python main.py --config config/benchmark_Logic.yaml --checkpoint_path ckpt/FB15k/Logic_full --load_step 450000

We note that the BetaE checkpoint above is trained from KGReasoning

Paper Checklist

  1. For all authors..

    (a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope? Yes

    (b) Have you read the ethics review guidelines and ensured that your paper conforms to them? Yes

    (c) Did you discuss any potential negative societal impacts of your work? No

    (d) Did you describe the limitations of your work? Yes

  2. If you are including theoretical results...

    (a) Did you state the full set of assumptions of all theoretical results? N/A

    (b) Did you include complete proofs of all theoretical results? N/A

  3. If you ran experiments...

    (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? Yes

    (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? Yes

    (c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? No

    (d) Did you include the amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? No

  4. If you are using existing assets (e.g., code, data, models) or curating/releasing new assets...

    (a) If your work uses existing assets, did you cite the creators? Yes

    (b) Did you mention the license of the assets? No

    (c) Did you include any new assets either in the supplemental material or as a URL? Yes

    (d) Did you discuss whether and how consent was obtained from people whose data you're using/curating? N/A

    (e) Did you discuss whether the data you are using/curating contains personally identifiable information or offensive content? N/A

  5. If you used crowdsourcing or conducted research with human subjects...

    (a) Did you include the full text of instructions given to participants and screenshots, if applicable? N/A

    (b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? N/A

    (c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation? N/A

Owner
HKUST-KnowComp
Knowledge Computation [email protected], led by Yangqiu Song
HKUST-KnowComp
GE2340 project source code without credentials.

GE2340-Project-Public GE2340 project source code without credentials. Run the bot.py to start the bot Telegram: @jasperwong_ge2340_bot If the bot does

0 Feb 10, 2022
FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)

FaceVerse FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset Lizhen Wang, Zhiyuan Chen, Tao Yu, Chenguang

Lizhen Wang 219 Dec 28, 2022
Reference code for the paper CAMS: Color-Aware Multi-Style Transfer.

CAMS: Color-Aware Multi-Style Transfer Mahmoud Afifi1, Abdullah Abuolaim*1, Mostafa Hussien*2, Marcus A. Brubaker1, Michael S. Brown1 1York University

Mahmoud Afifi 36 Dec 04, 2022
Image-retrieval-baseline - MUGE Multimodal Retrieval Baseline

MUGE Multimodal Retrieval Baseline This repo is implemented based on the open_cl

47 Dec 16, 2022
Object detection GUI based on PaddleDetection

PP-Tracking GUI界面测试版 本项目是基于飞桨开源的实时跟踪系统PP-Tracking开发的可视化界面 在PaddlePaddle中加入pyqt进行GUI页面研发,可使得整个训练过程可视化,并通过GUI界面进行调参,模型预测,视频输出等,通过多种类型的识别,简化整体预测流程。 GUI界面

杨毓栋 68 Jan 02, 2023
PrimitiveNet: Primitive Instance Segmentation with Local Primitive Embedding under Adversarial Metric (ICCV 2021)

PrimitiveNet Source code for the paper: Jingwei Huang, Yanfeng Zhang, Mingwei Sun. [PrimitiveNet: Primitive Instance Segmentation with Local Primitive

Jingwei Huang 47 Dec 06, 2022
Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Ibai Gorordo 35 Sep 07, 2022
Python implementation of the multistate Bennett acceptance ratio (MBAR)

pymbar Python implementation of the multistate Bennett acceptance ratio (MBAR) method for estimating expectations and free energy differences from equ

Chodera lab // Memorial Sloan Kettering Cancer Center 169 Dec 02, 2022
Official implementation of the NeurIPS 2021 paper Online Learning Of Neural Computations From Sparse Temporal Feedback

Online Learning Of Neural Computations From Sparse Temporal Feedback This repository is the official implementation of the NeurIPS 2021 paper Online L

Lukas Braun 3 Dec 15, 2021
Pytorch implementation of the paper: "SAPNet: Segmentation-Aware Progressive Network for Perceptual Contrastive Image Deraining"

SAPNet This repository contains the official Pytorch implementation of the paper: "SAPNet: Segmentation-Aware Progressive Network for Perceptual Contr

11 Oct 17, 2022
A library built upon PyTorch for building embeddings on discrete event sequences using self-supervision

pytorch-lifestream a library built upon PyTorch for building embeddings on discrete event sequences using self-supervision. It can process terabyte-si

Dmitri Babaev 103 Dec 17, 2022
An implementation of a discriminant function over a normal distribution to help classify datasets.

CS4044D Machine Learning Assignment 1 By Dev Sony, B180297CS The question, report and source code can be found here. Github Repo Solution 1 Based on t

Dev Sony 6 Nov 09, 2021
Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi

34 Nov 09, 2022
The Python code for the paper A Hybrid Quantum-Classical Algorithm for Robust Fitting

About The Python code for the paper A Hybrid Quantum-Classical Algorithm for Robust Fitting The demo program was only tested under Conda in a standard

Anh-Dzung Doan 5 Nov 28, 2022
Camera ready code repo for the NeuRIPS 2021 paper: "Impression learning: Online representation learning with synaptic plasticity".

Impression-Learning-Camera-Ready Camera ready code repo for the NeuRIPS 2021 paper: "Impression learning: Online representation learning with synaptic

2 Feb 09, 2022
Config files for my GitHub profile.

Canalyst Candas Data Science Library Name Canalyst Candas Description Built by a former PM / analyst to give anyone with a little bit of Python knowle

Canalyst Candas 13 Jun 24, 2022
Memory efficient transducer loss computation

Introduction This project implements the optimization techniques proposed in Improving RNN Transducer Modeling for End-to-End Speech Recognition to re

Fangjun Kuang 51 Nov 25, 2022
2021 credit card consuming recommendation

2021 credit card consuming recommendation

Wang, Chung-Che 7 Mar 08, 2022
Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Soubhik Sanyal 689 Dec 25, 2022
Talk covering the features of skorch

Skorch Talk Skorch - A Union of Scikit-learn and PyTorch Presentation The slides can be downloaded at: download link. Google Colab Part One - MNIST Pa

Thomas J. Fan 3 Oct 20, 2020