"Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion"(WWW 2021)

Related tags

Deep LearningStAR_KGC
Overview

STAR_KGC

This repo contains the source code of the paper accepted by WWW'2021. "Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion"(WWW 2021).

1. Thanks

The repository is partially based on huggingface transformers, KG-BERT and RotatE.

2. Installing requirement packages

  • conda create -n StAR python=3.6
  • source activate StAR
  • pip install numpy torch tensorboardX tqdm boto3 requests regex sacremoses sentencepiece matplotlib
2.1 Optional package (for mixed float Computation)

3. Dataset

  • WN18RR, FB15k-237, UMLS

    • Train and test set in ./data
    • As validation on original dev set is costly, we validated the model on dev subset during training.
    • The dev subset of WN18RR is provided in ./data/WN18RR called new_dev.dict. Use below commands to get the dev subset for WN18RR (FB15k-237 is similar without the --do_lower_case) used in training process.
     CUDA_VISIBLE_DEVICES=0 \
      python get_new_dev_dict.py \
     	--model_class bert \
     	--weight_decay 0.01 \
     	--learning_rate 5e-5 \
     	--adam_epsilon 1e-6 \
     	--max_grad_norm 0. \
     	--warmup_proportion 0.05 \
     	--do_train \
     	--num_train_epochs 7 \
     	--dataset WN18RR \
     	--max_seq_length 128 \
     	--gradient_accumulation_steps 4 \
     	--train_batch_size 16 \
     	--eval_batch_size 128 \
     	--logging_steps 100 \
     	--eval_steps -1 \
     	--save_steps 2000 \
     	--model_name_or_path bert-base-uncased \
     	--do_lower_case \
     	--output_dir ./result/WN18RR_get_dev \
     	--num_worker 12 \
     	--seed 42 \
    
     CUDA_VISIBLE_DEVICES=0 \
      python get_new_dev_dict.py \
     	--model_class bert \
     	--weight_decay 0.01 \
     	--learning_rate 5e-5 \
     	--adam_epsilon 1e-6 \
     	--max_grad_norm 0. \
     	--warmup_proportion 0.05 \
     	--do_eval \
     	--num_train_epochs 7 \
     	--dataset WN18RR \
     	--max_seq_length 128 \
     	--gradient_accumulation_steps 4 \
     	--train_batch_size 16 \
     	--eval_batch_size 128 \
     	--logging_steps 100 \
     	--eval_steps 1000 \
     	--save_steps 2000 \
     	--model_name_or_path ./result/WN18RR_get_dev \
     	--do_lower_case \
     	--output_dir ./result/WN18RR_get_dev \
     	--num_worker 12 \
     	--seed 42 \
    
  • NELL-One

    • We reformat original NELL-One as the three benchmarks above.
    • Please run the below command to get the reformatted data.
     python reformat_nell_one.py --data_dir path_to_downloaded --output_dir ./data/NELL_standard
    

4. Training and Test (StAR)

Run the below commands for reproducing results in paper. Note, all the eval_steps is set to -1 to train w/o validation and save the last checkpoint, because standard dev is very time-consuming. This can get similar results as in the paper.

4.1 WN18RR

CUDA_VISIBLE_DEVICES=0 \
python run_link_prediction.py \
    --model_class roberta \
    --weight_decay 0.01 \
    --learning_rate 1e-5 \
    --adam_betas 0.9,0.98 \
    --adam_epsilon 1e-6 \
    --max_grad_norm 0. \
    --warmup_proportion 0.05 \
    --do_train --do_eval \
    --do_prediction \
    --num_train_epochs 7 \
    --dataset WN18RR \
    --max_seq_length 128 \
    --gradient_accumulation_steps 4 \
    --train_batch_size 16 \
    --eval_batch_size 128 \
    --logging_steps 100 \
    --eval_steps 4000 \
    --save_steps 2000 \
    --model_name_or_path roberta-large \
    --output_dir ./result/WN18RR_roberta-large \
    --num_worker 12 \
    --seed 42 \
    --cls_method cls \
    --distance_metric euclidean \
CUDA_VISIBLE_DEVICES=2 \
python run_link_prediction.py \
    --model_class bert \
    --weight_decay 0.01 \
    --learning_rate 5e-5 \
    --adam_betas 0.9,0.98 \
    --adam_epsilon 1e-6 \
    --max_grad_norm 0. \
    --warmup_proportion 0.05 \
    --do_train --do_eval \
    --do_prediction \
    --num_train_epochs 7 \
    --dataset WN18RR \
    --max_seq_length 128 \
    --gradient_accumulation_steps 4 \
    --train_batch_size 16 \
    --eval_batch_size 128 \
    --logging_steps 100 \
    --eval_steps 4000 \
    --save_steps 2000 \
    --model_name_or_path bert-base-uncased \
    --do_lower_case \
    --output_dir ./result/WN18RR_bert \
    --num_worker 12 \
    --seed 42 \
    --cls_method cls \
    --distance_metric euclidean \

4.2 FB15k-237

CUDA_VISIBLE_DEVICES=0 \
python run_link_prediction.py \
    --model_class roberta \
    --weight_decay 0.01 \
    --learning_rate 1e-5 \
    --adam_betas 0.9,0.98 \
    --adam_epsilon 1e-6 \
    --max_grad_norm 0. \
    --warmup_proportion 0.05 \
    --do_train --do_eval \
    --do_prediction \
    --num_train_epochs 7. \
    --dataset FB15k-237 \
    --max_seq_length 100 \
    --gradient_accumulation_steps 4 \
    --train_batch_size 16 \
    --eval_batch_size 128 \
    --logging_steps 100 \
    --eval_steps -1 \
    --save_steps 2000 \
    --model_name_or_path roberta-large \
    --output_dir ./result/FB15k-237_roberta-large \
    --num_worker 12 \
    --seed 42 \
    --fp16 \
    --cls_method cls \
    --distance_metric euclidean \

4.3 UMLS

CUDA_VISIBLE_DEVICES=0 \
python run_link_prediction.py \
    --model_class roberta \
    --weight_decay 0.01 \
    --learning_rate 1e-5 \
    --adam_betas 0.9,0.98 \
    --adam_epsilon 1e-6 \
    --max_grad_norm 0. \
    --warmup_proportion 0.05 \
    --do_train --do_eval \
    --do_prediction \
    --num_train_epochs 20 \
    --dataset UMLS \
    --max_seq_length 16 \
    --gradient_accumulation_steps 1 \
    --train_batch_size 16 \
    --eval_batch_size 128 \
    --logging_steps 100 \
    --eval_steps -1 \
    --save_steps 200 \
    --model_name_or_path roberta-large \
    --output_dir ./result/UMLS_model \
    --num_worker 12 \
    --seed 42 \
    --cls_method cls \
    --distance_metric euclidean 

4.4 NELL-One

CUDA_VISIBLE_DEVICES=0 \
python run_link_prediction.py \
    --model_class bert \
    --do_train --do_eval \usepacka--do_prediction \
    --warmup_proportion 0.1 \
    --learning_rate 5e-5 \
    --num_train_epochs 8. \
    --dataset NELL_standard \
    --max_seq_length 32 \
    --gradient_accumulation_steps 1 \
    --train_batch_size 16 \
    --eval_batch_size 128 \
    --logging_steps 100 \
    --eval_steps -1 \
    --save_steps 2000 \
    --model_name_or_path bert-base-uncased \
    --do_lower_case \
    --output_dir ./result/NELL_model \
    --num_worker 12 \
    --seed 42 \
    --fp16 \
    --cls_method cls \
    --distance_metric euclidean 

5. StAR_Self-Adp

5.1 Data preprocessing

  • Get the trained model of RotatE, more details please refer to RotatE.

  • Run the below commands sequentially to get the training dataset of StAR_Self-Adp.

    • Run the run_get_ensemble_data.py in ./StAR
     CUDA_VISIBLE_DEVICES=0 python run_get_ensemble_data.py \
     	--dataset WN18RR \
     	--model_class roberta \
     	--model_name_or_path ./result/WN18RR_roberta-large \
     	--output_dir ./result/WN18RR_roberta-large \
     	--seed 42 \
     	--fp16 
    
    • Run the ./codes/run.py in rotate. (please replace the TRAINED_MODEL_PATH with your own trained model's path)
     CUDA_VISIBLE_DEVICES=3 python ./codes/run.py \
     	--cuda --init ./models/RotatE_wn18rr_0 \
     	--test_batch_size 16 \
     	--star_info_path /home/wangbo/workspace/StAR_KGC-master/StAR/result/WN18RR_roberta-large \
     	--get_scores --get_model_dataset 
    

5.2 Train and Test

  • Run the run.py in ./StAR/ensemble. Note the --mode should be alternate in head and tail, and perform a average operation to get the final results.
  • Note: Please replace YOUR_OUTPUT_DIR, TRAINED_MODEL_PATH and StAR_FILE_PATH in ./StAR/peach/common.py with your own paths to run the command and code.
CUDA_VISIBLE_DEVICES=2 python run.py \
--do_train --do_eval --do_prediction --seen_feature \
--mode tail \
--learning_rate 1e-3 \
--feature_method mix \
--neg_times 5 \
--num_train_epochs 3 \
--hinge_loss_margin 0.6 \
--train_batch_size 32 \
--test_batch_size 64 \
--logging_steps 100 \
--save_steps 2000 \
--eval_steps -1 \
--warmup_proportion 0 \
--output_dir /home/wangbo/workspace/StAR_KGC-master/StAR/result/WN18RR_roberta-large_ensemble  \
--dataset_dir /home/wangbo/workspace/StAR_KGC-master/StAR/result/WN18RR_roberta-large \
--context_score_path /home/wangbo/workspace/StAR_KGC-master/StAR/result/WN18RR_roberta-large \
--translation_score_path /home/wangbo/workspace/StAR_KGC-master/rotate/models/RotatE_wn18rr_0  \
--seed 42 
Owner
Bo Wang
Ph.D. student at the School of Artificial Intelligence, Jilin University.
Bo Wang
Multi-Glimpse Network With Python

Multi-Glimpse Network Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention arXiv Require

9 May 10, 2022
Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation

Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation (CVPR2019) This is a pytorch implementatio

Yawei Luo 280 Jan 01, 2023
This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

CLGo This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints An earlier

刘芮金 32 Dec 20, 2022
Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL)

Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL) This repository contains all source code used to generate the results in the article "

Charlotte Loh 3 Jul 23, 2022
Static-test - A playground to play with ideas related to testing the comparability of the code

Static test playground ⚠️ The code is just an experiment. Compiles and runs on U

Igor Bogoslavskyi 4 Feb 18, 2022
Get started learning C# with C# notebooks powered by .NET Interactive and VS Code.

.NET Interactive Notebooks for C# Welcome to the home of .NET interactive notebooks for C#! How to Install Download the .NET Coding Pack for VS Code f

.NET Platform 425 Dec 25, 2022
Semi-Autoregressive Transformer for Image Captioning

Semi-Autoregressive Transformer for Image Captioning Requirements Python 3.6 Pytorch 1.6 Prepare data Please use git clone --recurse-submodules to clo

YE Zhou 23 Dec 09, 2022
Low-code/No-code approach for deep learning inference on devices

EzEdgeAI A concept project that uses a low-code/no-code approach to implement deep learning inference on devices. It provides a componentized framewor

On-Device AI Co., Ltd. 7 Apr 05, 2022
Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers Results results on COCO val Backbone Method Lr Schd PQ Config Download

155 Dec 20, 2022
Official implementation of ETH-XGaze dataset baseline

ETH-XGaze baseline Official implementation of ETH-XGaze dataset baseline. ETH-XGaze dataset ETH-XGaze dataset is a gaze estimation dataset consisting

Xucong Zhang 134 Jan 03, 2023
Hide screen when boss is approaching.

BossSensor Hide your screen when your boss is approaching. Demo The boss stands up. He is approaching. When he is approaching, the program fetches fac

Hiroki Nakayama 6.2k Jan 07, 2023
Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord.

numpy2tfrecord Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord. Installation

Ryo Yonetani 2 Jan 16, 2022
ConformalLayers: A non-linear sequential neural network with associative layers

ConformalLayers: A non-linear sequential neural network with associative layers ConformalLayers is a conformal embedding of sequential layers of Convo

Prograf-UFF 5 Sep 28, 2022
Photo2cartoon - 人像卡通化探索项目 (photo-to-cartoon translation project)

人像卡通化 (Photo to Cartoon) 中文版 | English Version 该项目为小视科技卡通肖像探索项目。您可使用微信扫描下方二维码或搜索“AI卡通秀”小程序体验卡通化效果。

Minivision_AI 3.5k Dec 30, 2022
Data, notebooks, and articles associated with the RSNA AI Deep Learning Lab at RSNA 2021

RSNA AI Deep Learning Lab 2021 Intro Welcome Deep Learners! This document provides all the information you need to participate in the RSNA AI Deep Lea

RSNA 65 Dec 16, 2022
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)

TorchCAM: class activation explorer Simple way to leverage the class-specific activation of convolutional layers in PyTorch. Quick Tour Setting your C

F-G Fernandez 1.2k Dec 29, 2022
The Codebase for Causal Distillation for Language Models.

Causal Distillation for Language Models Zhengxuan Wu*,Atticus Geiger*, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D.

Zen 20 Dec 31, 2022
The Malware Open-source Threat Intelligence Family dataset contains 3,095 disarmed PE malware samples from 454 families

MOTIF Dataset The Malware Open-source Threat Intelligence Family (MOTIF) dataset contains 3,095 disarmed PE malware samples from 454 families, labeled

Booz Allen Hamilton 112 Dec 13, 2022
Simple implementation of OpenAI CLIP model in PyTorch.

It was in January of 2021 that OpenAI announced two new models: DALL-E and CLIP, both multi-modality models connecting texts and images in some way. In this article we are going to implement CLIP mod

Moein Shariatnia 226 Jan 05, 2023
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Focal Frequency Loss - Official PyTorch Implementation This repository provides the official PyTorch implementation for the following paper: Focal Fre

Liming Jiang 460 Jan 04, 2023