InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Last update: Nov 25, 2022

Overview

This is the official code base for our ICLR 2021 paper:

"InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective".

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Usage

Prepare your environment

Install the required packages:

pip install -r requirements.txt
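
After installation, it can be useful to confirm that every package named in requirements.txt is actually present in the active environment. The following is a minimal sketch, not part of the official code base: the function names are hypothetical, and the parser assumes simple requirement lines of the form "name" followed by an optional version specifier (lines with pip options or direct URLs are skipped).

```python
import re
from importlib import metadata


def parse_requirement_names(lines):
    """Return the distribution names found in requirements-style lines."""
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        # Skip blanks, pip options such as "-r other.txt", and direct URLs.
        if not line or line.startswith("-") or "://" in line:
            continue
        # The name is the leading run of characters before any specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that importlib.metadata cannot find."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_requirements_file(path="requirements.txt"):
    """Hypothetical helper: list packages from `path` that are not installed."""
    with open(path, encoding="utf-8") as handle:
        return missing_packages(parse_requirement_names(handle))
```

Calling check_requirements_file() from the repository root returns a list of package names that are not installed; an empty list suggests the pip command above completed for every listed package. The check looks only at package presence, not at whether the installed versions satisfy the pinned specifiers.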

ANLI and TextFooler

To run the ANLI and TextFooler experiments, refer to the README in the ANLI directory.

SQuAD

We will upload the code for the SQuAD experiments soon.

Citation

@inproceedings{wang2021infobert,
  title={InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective},
  author={Wang, Boxin and Wang, Shuohang and Cheng, Yu and Gan, Zhe and Jia, Ruoxi and Li, Bo and Liu, Jingjing},
  booktitle={International Conference on Learning Representations},
  year={2021}
}

Owner

AI Secure

UIUC Secure Learning Lab
