Training code for Korean multi-class sentiment analysis

Last update: Dec 02, 2022

Related tags

Text Data & NLP KoSentimentAnalysis

Overview

KoSentimentAnalysis

Bert implementation for the Korean multi-class sentiment analysis

왜 한국어 감정 다중분류 모델은 거의 없는 것일까?에서 시작된 프로젝트

Environment: Pytorch, Dataset: 감성 대화 말뭉치
Install dependencies:

!pip install mxnet
!pip install gluonnlp pandas tqdm
!pip install sentencepiece
!pip install transformers==3.0.2
!pip install torch
!pip install git+https://[email protected]/SKTBrain/[email protected]

Populate data into *.tsv (we in this code assume 6-class classification tasks, based on Ekman's sentiment model)
Train (assuming gpu device is used, drop device otherwise)
Validate & Use (See below # test comment)

Owner

Donghoon Shin

ECE, Information undergrad @ Seoul National University, Seoul, Korea

GitHub Repository

Toward Model Interpretability in Medical NLP

Toward Model Interpretability in Medical NLP LING380: Topics in Computational Linguistics Final Project James Cross ( 1 Mar 04, 2022

Named Entity Recognition API used by TEI Publisher

TEI Publisher Named Entity Recognition API This repository contains the API used by TEI Publisher's web-annotation editor to detect entities in the in

14 Nov 15, 2022

Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021

Mask-Align: Self-Supervised Neural Word Alignment This is the implementation of our work Mask-Align: Self-Supervised Neural Word Alignment. @inproceed

46 Dec 15, 2022

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

English|简体中文 ERNIE是百度开创性提出的基于知识增强的持续学习语义理解框架，该框架将大数据预训练与多源丰富知识相结合，通过持续学习技术，不断吸收海量文本数据中词汇、结构、语义等方面的知识，实现模型效果不断进化。ERNIE在累积 40 余个典型 NLP 任务取得 SOTA 效果，并在 G

5.4k Jan 03, 2023

texlive expressions for documents

tex2nix Generate Texlive environment containing all dependencies for your document rather than downloading gigabytes of texlive packages. Installation

70 Dec 26, 2022

Document processing using transformers

Doc Transformers Document processing using transformers. This is still in developmental phase, currently supports only extraction of form data i.e (ke

13 Dec 21, 2022

Korean stereoypte detector with TUNiB-Electra and K-StereoSet

Korean Stereotype Detector Korean stereotype sentence classifier using K-StereoSet with TUNiB-Electra Web demo you can test this model easily in demo

11 Feb 18, 2022

FedNLP: A Benchmarking Framework for Federated Learning in Natural Language Processing

FedNLP is a research-oriented benchmarking framework for advancing federated learning (FL) in natural language processing (NLP). It uses FedML repository as the git submodule. In other words, FedNLP

216 Nov 27, 2022

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

740 Dec 24, 2022

AI_Assistant - This is a Python based Voice Assistant.

This is a Python based Voice Assistant. This was programmed to increase my understanding of python and also how the in-general Voice Assistants work.

1 Jan 06, 2022

Text-Based zombie apocalyptic decision-making game in Python

Inspiration We shared university first year game coursework.[to gauge previous experience and start brainstorming] Adapted a particular nuclear fallou

2 Feb 17, 2022

Prompt tuning toolkit for GPT-2 and GPT-Neo

mkultra mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo. Prompt tuning injects a string of 20-100 special tokens into the context in order to

61 Jan 01, 2023

Kerberoast with ACL abuse capabilities

targetedKerberoast targetedKerberoast is a Python script that can, like many others (e.g. GetUserSPNs.py), print "kerberoast" hashes for user accounts

213 Dec 22, 2022

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Language Models are Few-shot Multilingual Learners Paper This is the source code of the paper [Arxiv] [ACL Anthology]: This code has been written usin

45 Nov 21, 2022

A Structured Self-attentive Sentence Embedding

Structured Self-attentive sentence embeddings Implementation for the paper A Structured Self-Attentive Sentence Embedding, which was published in ICLR

488 Nov 28, 2022

Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Jan 08, 2023

PyJPBoatRace: Python-based Japanese boatrace tools 🚤

pyjpboatrace :speedboat: provides you with useful tools for data analysis and auto-betting for boatrace.

5 Oct 29, 2022

📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation

Well-formed Limericks and Haikus with GPT2 📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation In collaboration with Matthew Korahais &

2 May 26, 2022

Python implementation of TextRank for phrase extraction and summarization of text documents

PyTextRank PyTextRank is a Python implementation of TextRank as a spaCy pipeline extension, used to: extract the top-ranked phrases from text document

1.9k Jan 06, 2023

List of GSoC organisations with number of times they have been selected.

Welcome to GSoC Organisation Frequency And Details 👋 List of GSoC organisations with number of times they have been selected, techonologies, topics,

41 Oct 01, 2022