SimBiber - A tool for simplifying bibtex with official info

Related tags

MiscellaneousSimBiber
Overview

SimBiber: A tool for simplifying bibtex with official info.

We often need to simplify the official bib that consists of many information into a shorter version that only maintains necessary information (e.g., author, title, conference/journal name and etc) due to page limitation.

We introduce SimBiber, a simple tool in Python to simplify them automatically. Hope it's helpful for you.

We also highly recommend another wonderful tool for you Rebiber, which is a tool for normalizing bibtex with official info.

Changelog

  • 2021.12.31 We build the first version and release it.

Installation

git clone https://github.com/MLNLP-World/Simbiber.git
pip install bibtexparser

Usage(v0.1.0)

python SimBiberParser.py --input_path data/bibtex.bib --output_path out/bibtex.bib --config_path parserConfig.json --if_append_output False --cache_num 100
argument usage
--input_path The path to the input bib file that you want to simplify
--output_path The path to the output bib file that you want to save.
--config_path The path to the mapper config file
--if_append_output Whether append simplified data to output bib file.
--cache_num The number of bib items you want to simplify at once.
PLEASE ATTENTION: If you want to simplify a huge bib file, you'd better change it to achieve satisfactory speed.

Example Input and Output

An example simplified output entry with the official information:

@inproceedings{qin-etal-2019-stack,
    title = "A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding",
    author = "Qin, Libo  and
      Che, Wanxiang  and
      Li, Yangming  and
      Wen, Haoyang  and
      Liu, Ting",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1214",
    doi = "10.18653/v1/D19-1214",
    pages = "2078--2087",
    abstract = "Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. The two tasks are closely tied and the slots often highly depend on the intent. In this paper, we propose a novel framework for SLU to better incorporate the intent information, which further guiding the slot filling. In our framework, we adopt a joint model with Stack-Propagation which can directly use the intent information as input for slot filling, thus to capture the intent semantic knowledge. In addition, to further alleviate the error propagation, we perform the token-level intent detection for the Stack-Propagation framework. Experiments on two publicly datasets show that our model achieves the state-of-the-art performance and outperforms other previous methods by a large margin. Finally, we use the Bidirectional Encoder Representation from Transformer (BERT) model in our framework, which further boost our performance in SLU task.",
}

An example simplified output entry from the official information:

@inproceedings{qin-etal-2019-stack,
    author = {Qin, Libo  and
     Che, Wanxiang  and
     Li, Yangming  and
     Wen, Haoyang  and
     Liu, Ting},
    booktitle = {Proc. of EMNLP},
    title = {A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding},
    year = {2019}
}

Supported Conferences

The parserConfig.json contains a list of converted json files of the mapper between official full name and simplified name.

Name Full Name
AAAI Association for the Advance of Artificial Intelligence
ACL Association for Computational Linguistics
CCL Chinese Computational Linguistics
COLING International Conference on Computational Linguistics
EMNLP Empirical Methods in Natural Language Processing
ICASSP International Conference on Acoustics, Speech and Signal Processing
ICLR International Conference on Learning Representations
ICML International Conference on Machine Learning
LREC Language Resources and Evaluation Conference
NeurIPS Neural Information Processing Systems
NLPCC Natural Language Processing and Chinese Computing
SemEval International Workshop on Semantic Evaluation
SIGDIAL SIGdial Meeting on Discourse and Dialogue

Adding a new conference

You can manually add any conferences from DBLP to config map.

Take ICLR as an example:

  • Step 1: Go to DBLP
  • Step 2: Find the full name of Conference
  • Step 3: Add map to parserConfig.json
{"International Conference on Learning Representations": "ICLR"}

Contact

Please email or [email protected] or [email protected] create Github issues here if you have any questions or suggestions.

Contributor

Thanks to the contributors:

Libo Qin; Qiguang Chen; Qian Liu

A web app for presenting my research in BEM(building energy model) simulation

BEM(building energy model)-SIM-APP The is a web app presenting my research in BEM(building energy model) calibration. You can play around with some pa

8 Sep 03, 2021
A simple panel with IP, CNPJ, CEP and PLACA queries

Painel mpm Um painel simples com consultas de IP, CNPJ, CEP e PLACA Início 🌐 apt update && apt upgrade -y pkg i python git pip install requests Insta

MrDiniz 4 Nov 04, 2022
Unofficial Valorant documentation and tools for third party developers

Valorant Third Party Toolkit This repository contains unofficial Valorant documentation and tools for third party developers. Our goal is to centraliz

Noah Kim 20 Dec 21, 2022
A python server markup language

PSML - Python server markup language How to install: python install.py

LMFS 6 May 18, 2022
Usando Multi Player Perceptron e Regressão Logistica para classificação de SPAM

Relatório dos procedimentos executados e resultados obtidos. Objetivos Treinar um modelo para classificação de SPAM usando o dataset train_data. Class

André Mediote 1 Feb 02, 2022
Virtual Assistant Using Python

-Virtual-Assistant-Using-Python Virtual desktop assistant is an awesome thing. If you want your machine to run on your command like Jarvis did for Ton

Bade om 1 Nov 13, 2021
Repository voor verhalen over de woningbouw-opgave in Nederland

Analyse plancapaciteit woningen In deze notebook zetten we cijfers op een rij om de woningbouwplannen van Nederlandse gemeenten in kaart te kunnen bre

Follow the Money 10 Jun 30, 2022
Short, introductory guide for the Python programming language

100 Page Python Intro This book is a short, introductory guide for the Python programming language.

Sundeep Agarwal 185 Dec 26, 2022
Programming labs for 6.S060 (Foundations of Computer Security).

6.S060 Labs This git repository contains the code for the labs in 6.S060. In these labs, you will add a series of security features to a photo-sharing

MIT PDOS 10 Nov 02, 2022
Convert a .vcf file to 'aa_table.tsv', including depth & alt frequency info

Produce an 'amino acid table' file from a vcf, including depth and alt frequency info.

Dan Fornika 1 Oct 16, 2021
2021华为软件精英挑战赛 程序输出分析器

AutoGrader 0.2.0更新:加入资源分配溢出检测,如果发生资源溢出会输出溢出发生的位置。 如果通过检测,会显示通过符号 如果没有通过检测,会显示警告,并输出溢出发生的位置和操作

54 Aug 14, 2022
A tool to determine optimal projects for Gridcoin crunchers. Maximize your magnitude!

FindTheMag FindTheMag helps optimize your BOINC client for Gridcoin mining. You can group BOINC projects into two groups: "preferred" projects and "mi

7 Oct 04, 2022
Small projects for python beginners.

Python Mini Projects For Beginners I recently started doing the #100DaysOfCode Challenge in Python. I've used Python before, but I had switched to JS

Sreekesh Iyer 10 Dec 12, 2022
A corona information module

A corona information module

Fayas Noushad 3 Nov 28, 2021
Today I Commit (1일 1커밋) 챌린지 알림 봇

Today I Commit Challenge 1일1커밋 챌린지를 위한 알림 봇 config.py github_token = "github private access key" slack_token = "slack authorization token" channel = "

sunho 4 Nov 08, 2021
This is the core of the program which takes 5k SYMBOLS and looks back N years to pull in the daily OHLC data of those symbols and saves them to disc.

This is the core of the program which takes 5k SYMBOLS and looks back N years to pull in the daily OHLC data of those symbols and saves them to disc.

Daniel Caine 1 Jan 31, 2022
PIP Manager written in python Tkinter

PIP Manager About PIP Manager is designed to make Python Package handling easier by just a click of a button!! Available Features Installing packages

Will Payne 9 Dec 09, 2022
Curses frontend for Canto daemon

Canto Curses The curses (text) client for canto-daemon. Canto-daemon is required to work and is found at: http://github.com/themoken/canto-next Requir

Jack Miller 86 Dec 28, 2022
Coursework project for DIP class. The goal is to use vision to guide the Dashgo robot through two traffic cones in bright color.

Coursework project for DIP class. The goal is to use vision to guide the Dashgo robot through two traffic cones in bright color.

Yueqian Liu 3 Oct 24, 2022
A simple assembly- and brainfuck-inspired stack-based language

asm-stackfuck A simple assembly- and brainfuck-inspired stack-based language. The language has a few goals: Be stack-based Look like assembly Have a s

Nils Trinity 1 Feb 06, 2022