RuleBERT: Teaching Soft Rules to Pre-Trained Language Models

(Paper) (Slides) (Video)

RuleBERT reasons over Natural Language

RuleBERT is a pre-trained language model that has been fine-tuned on soft logical results. This repo contains the required code for running the experiments of the associated paper.

Installation

0. Clone Repo

git clone https://github.com/MhmdSaiid/RuleBert
cd RuleBert

1. Create virtual env and install reqs

# (optional) create and activate a virtual environment
python -m venv RuleBERT
source RuleBERT/bin/activate
pip install -r requirements.txt

2. Download Data

The datasets can be found here. (DISCLAIMER: ~25 GB on disk)

You can also run:

bash download_datasets.sh

Run Experiments

When an experiment is complete, the model, the tokenizer, and the results are stored in models/**timestamp**.

i) Single Rules

bash experiments/single_rules/SR.sh data/single_rules 

ii) Rule Union Experiment

bash experiments/union_rules/UR.sh data/union_rules 

iii) Rule Chain Experiment

bash experiments/chain_rules/CR.sh data/chain_rules 

iv) External Datasets

Generate Your Own Data

You can generate your own data for a single rule, a union of rules sharing the same rule head, or a chain of rules.

First, make sure you are in the correct directory.

cd data_generation

1) Single Rule

There are two ways to generate data for a single rule:

i) Pass Data through Arguments

python DataGeneration.py \
       --rule 'spouse(A,B) :- child(A,B).' \
       --pool_list "[['Anne', 'Bob', 'Charlie'], ['Frank', 'Gary', 'Paul']]" \
       --rule_support 0.67
  • --rule : The rule in string format. Consult here to see how to write a rule.
  • --pool_list : For every variable in the rule, we include a list of possible instantiations.
  • --rule_support : A float representing the rule support. If not specified, the rule defaults to a hard rule.
  • --max_num_facts : Maximum number of facts in a generated theory.
  • --num : Total number of theories per generated (rule,facts).
  • --TWL : When set, three-way logic is used instead of negation as failure. Unsatisfied predicates are no longer considered False.
  • --complementary_rules : A string of complementary rules to add.
  • --p_bar : Boolean to show a progress bar. Defaults to True.
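
To make the role of --pool_list concrete, here is a minimal sketch of how one pool per variable could be expanded into grounded rules. It is an illustration only, not the code in DataGeneration.py; the helper names ground_rule and substitute are hypothetical, and the sketch assumes variables are single uppercase letters as in the example rule.

```python
import re
from itertools import product


def ground_rule(variables, pool_list):
    """Yield one {variable: name} binding per combination of pool entries."""
    if len(variables) != len(pool_list):
        raise ValueError("need exactly one pool per rule variable")
    for names in product(*pool_list):
        yield dict(zip(variables, names))


def substitute(rule, binding):
    """Replace single-letter uppercase variables in the rule with bound names."""
    return re.sub(
        r"\b([A-Z])\b",
        lambda m: binding.get(m.group(1), m.group(1)),
        rule,
    )


rule = "spouse(A,B) :- child(A,B)."
pool_list = [["Anne", "Bob", "Charlie"], ["Frank", "Gary", "Paul"]]

# Variable A draws from the first pool, variable B from the second.
groundings = list(ground_rule(["A", "B"], pool_list))
grounded_rules = [substitute(rule, g) for g in groundings]
```

With two pools of three names each, this produces nine bindings, one per (A, B) pair.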

ii) Pass a JSON file

This is more convenient when rules are long or when there are multiple rules. The JSON file specifies the rule(s), pool list(s), and rule support(s). It is passed as an argument.

python DataGeneration.py --rule_json r1.jsonl
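
As a rough sketch of what such a file might look like, the snippet below writes and reads back a JSON Lines file with one rule per line. The key names ("rule", "pool_list", "rule_support") are assumptions that simply mirror the command-line flags; check the repository's example files for the actual schema before relying on them.

```python
import json
import os
import tempfile

# Hypothetical schema: keys mirror the CLI flags and are NOT confirmed
# by the repository.
rules = [
    {
        "rule": "spouse(A,B) :- child(A,B).",
        "pool_list": [["Anne", "Bob", "Charlie"], ["Frank", "Gary", "Paul"]],
        "rule_support": 0.67,
    },
]


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write to a temporary directory so the sketch has no side effects.
path = os.path.join(tempfile.mkdtemp(), "r1.jsonl")
write_jsonl(path, rules)
loaded = read_jsonl(path)
```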

2) Union of Rules

For a union of rules sharing the same rule-head predicate, we pass the command a JSON file that contains rules with overlapping rule-head predicates.

python DataGeneration.py --rule_json Multi_rule.json \
                         --type union

--type selects the data generation method. For a union of rules, use --type union. With --type single, single-rule data generation is run for each rule in the file.
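
The idea of rules "sharing the same rule head" can be checked before generating data. The following is a minimal sketch, not part of the repository: it groups rule strings by the predicate to the left of ":-". The helper names are hypothetical, and the second and third rules are made up for illustration.

```python
from collections import defaultdict


def head_predicate(rule):
    """Return the predicate name of the rule head (the part before ':-')."""
    head, _, _ = rule.partition(":-")
    return head.split("(", 1)[0].strip()


def group_by_head(rules):
    """Map each head predicate to the list of rules that derive it."""
    groups = defaultdict(list)
    for rule in rules:
        groups[head_predicate(rule)].append(rule)
    return dict(groups)


rules = [
    "spouse(A,B) :- child(A,B).",
    "spouse(A,B) :- parent(A,C), parent(B,C).",  # illustrative rule
    "relative(A,B) :- spouse(A,B).",             # illustrative rule
]

groups = group_by_head(rules)
# Only heads derived by more than one rule form a union.
union_candidates = {h: rs for h, rs in groups.items() if len(rs) > 1}
```

Here only the two spouse rules form a union; the relative rule has a head of its own.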

3) Chained Rules

For a chain of rules, the JSON file should include rules that can be chained together.

python DataGeneration.py --rule_json chain_rules.json \
                         --type chain

The chain depth defaults to 5 (--chain_depth 5).
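
One way to picture "rules that can be chained" is that the head predicate of one rule appears in the body of the next. The sketch below searches for the longest such chain among a set of rules. It is an assumption-laden illustration rather than the repository's algorithm: the example rules are invented, the helper names are hypothetical, and reading chain depth as the number of rules in the chain is a guess.

```python
import re


def split_rule(rule):
    """Return (head predicate, set of body predicates) for a rule string."""
    head, _, body = rule.partition(":-")
    head_pred = head.split("(", 1)[0].strip()
    body_preds = set(re.findall(r"(\w+)\s*\(", body))
    return head_pred, body_preds


def longest_chain(rules, max_depth=5):
    """Find the longest sequence where each head feeds the next rule's body."""
    parsed = [split_rule(r) for r in rules]

    def dfs(i, visited):
        best = [i]
        if len(visited) < max_depth:  # assumed meaning: at most max_depth rules
            head_i = parsed[i][0]
            for j, (_, body_j) in enumerate(parsed):
                if j not in visited and head_i in body_j:
                    candidate = [i] + dfs(j, visited | {j})
                    if len(candidate) > len(best):
                        best = candidate
        return best

    chains = [dfs(i, {i}) for i in range(len(rules))]
    best = max(chains, key=len) if chains else []
    return [rules[i] for i in best]


# Illustrative rules only; they are not taken from the repository.
rules = [
    "parent(A,B) :- child(B,A).",
    "grandparent(A,C) :- parent(A,B), parent(B,C).",
    "relative(A,C) :- grandparent(A,C).",
]
chain = longest_chain(rules)
```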

Train your Own Model

To fine-tune the model, run:

# train
python trainer.py --data-dir data/R1/ \
                  --epochs 3 \
                  --verbose

When complete, the model and tokenizer are saved in models/**timestamp**.

To test the model, run:

# test
python tester.py --test_data_dir data/test_R1/ \
                 --model_dir models/**timestamp** \
                 --verbose

A JSON file will be saved in model_dir containing the results.
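
Since the name and layout of the results file are not stated here, a cautious way to inspect it is to load every JSON file found in model_dir and look at the keys. This is a hedged sketch with a hypothetical helper, load_results; note that a saved model directory may also hold configuration and tokenizer JSON files, which this loads too.

```python
import json
from pathlib import Path


def load_results(model_dir):
    """Return {filename: parsed JSON} for every *.json file in model_dir."""
    results = {}
    for path in sorted(Path(model_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            results[path.name] = json.load(f)
    return results
```

Calling load_results on the model directory and printing the returned keys shows which file holds the test results.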

Contact Us

For any inquiries, feel free to contact us or raise an issue on GitHub.

Reference

You can cite our work:

@inproceedings{saeed-etal-2021-rulebert,
    title = "{R}ule{BERT}: Teaching Soft Rules to Pre-Trained Language Models",
    author = "Saeed, Mohammed  and
      Ahmadi, Naser  and
      Nakov, Preslav  and
      Papotti, Paolo",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.110",
    pages = "1460--1476",
    abstract = "While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a classification task where, given facts and soft rules, the PLM should return a prediction with a probability for a given hypothesis. We release the first dataset for this task, and we propose a revised loss function that enables the PLM to learn how to predict precise probabilities for the task. Our evaluation results show that the resulting fine-tuned models achieve very high performance, even on logical rules that were unseen at training. Moreover, we demonstrate that logical notions expressed by the rules are transferred to the fine-tuned model, yielding state-of-the-art results on external datasets.",
}

License

MIT
