TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech

Last update: Feb 07, 2022

Named entity recognition (NER), which aims to identify real-world entity mentions in text, is a fundamental task in natural language processing with a wide range of applications. Previous approaches mainly operate on the raw sentence alone, but part-of-speech (POS) tags carry rich semantic information and contribute to the success of natural language processing tasks. To further improve NER performance, we propose five methods that fuse POS tags with the original tokens on top of the BERT model: concatenating tokens and POS tags as one sentence, concatenating them as two sentences, adding a POS embedding as one of the embedding elements, model ensembling, and multi-attention between the token representations and the POS representations. We evaluate on the CoNLL-2003 and Groningen Meaning Bank (GMB) datasets, both of which provide NER tags and POS tags. In our experiments on these two datasets, some of the proposed methods improve on the baseline methods.
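To make the two concatenation variants more concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the function names `fuse_one_sentence` and `fuse_two_sentences`, the choice to interleave each token with its tag in the one-sentence case, and the segment-id layout are all assumptions made for illustration. The description above only says that tokens and POS tags are concatenated "as one or two sentences".

```python
from typing import List, Tuple

# BERT-style special tokens (the exact strings depend on the tokenizer used).
CLS, SEP = "[CLS]", "[SEP]"


def fuse_one_sentence(tokens: List[str], pos_tags: List[str]) -> List[str]:
    """Hypothetical one-sentence fusion: interleave each token with its POS tag."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    fused = [CLS]
    for token, tag in zip(tokens, pos_tags):
        fused.extend([token, tag])
    fused.append(SEP)
    return fused


def fuse_two_sentences(
    tokens: List[str], pos_tags: List[str]
) -> Tuple[List[str], List[int]]:
    """Hypothetical two-sentence fusion: tokens as segment A, POS tags as segment B."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    sequence = [CLS] + tokens + [SEP] + pos_tags + [SEP]
    # Segment ids follow the usual BERT sentence-pair convention:
    # 0 for [CLS], segment A and its [SEP]; 1 for segment B and its [SEP].
    segment_ids = [0] * (len(tokens) + 2) + [1] * (len(pos_tags) + 1)
    return sequence, segment_ids


if __name__ == "__main__":
    # Illustrative sentence with Penn Treebank style POS tags.
    words = ["EU", "rejects", "German", "call"]
    tags = ["NNP", "VBZ", "JJ", "NN"]
    print(fuse_one_sentence(words, tags))
    print(fuse_two_sentences(words, tags))
```

In a real pipeline the POS tags would also have to be mapped to vocabulary entries (for example by adding them as new tokens), and word-piece splitting would need to be aligned with the tags. Both steps are left out of this sketch.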

This is a project I worked on with Haoqing Tang, an extraordinary computer scientist in the CV and NLP areas, during my interesting and memorable Master's studies.

Get list of common stop words in various languages in Python

Knowledge Management for Humans using Machine Learning & Tags

NLTK Source

An A-SOUL Text Generator Based on CPM-Distill.

Library for fast text representation and classification.

Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

A Telegram bot to add notes to Flomo.

Text-Summarization-using-NLP - Text summarization using NLP that fetches BBC News articles and summarizes their text; also includes custom article summarization

BookNLP, a natural language processing pipeline for books

Simple Python script to scrape the YouTube channels of "Parity Technologies and Web3 Foundation" and translate them to braille or any other language

Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Chinese Pre-Trained Language Models (CPM-LM) Version-I

Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers

Keras implementation of transformers for humans

EMNLP 2021 paper "Pre-train or Annotate? Domain Adaptation with a Constrained Budget".

Natural language Understanding Toolkit

A programming language with logic of Python, and syntax of all languages.