KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch

Overview

KoRean based ELECTRA (KR-ELECTRA)

This is a release of KR-ELECTRA, a Korean-specific ELECTRA model developed by the Computational Linguistics Lab at Seoul National University. Its performance is comparable to or better than that of the baseline models listed under Experimental Results. The model performs especially well on tasks involving informal text, such as review documents, while remaining comparable on other kinds of tasks.

Released Model

We pre-trained KR-ELECTRA following the base-scale ELECTRA model. Training was done with Tensorflow-v1 on a v3-8 TPU on Google Cloud Platform.

Model Details

We followed the training parameters of the base-scale model of ELECTRA.

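Hyperparameters

| model | # of layers | embedding size | hidden size | # of heads |
| --- | --- | --- | --- | --- |
| Discriminator | 12 | 768 | 768 | 12 |
| Generator | 12 | 768 | 256 | 4 |

Pretraining

| batch size | train steps | learning rates | max sequence length | generator size |
| --- | --- | --- | --- | --- |
| 256 | 700000 | 2e-4 | 128 | 0.33333 |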
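
The hyperparameters above imply a few derived quantities that can be checked with simple arithmetic. The following is a minimal sketch, not part of the released code. The variable names are hypothetical, and it assumes that "generator size" is the ratio of the generator's hidden size to the discriminator's hidden size, as in the original ELECTRA pre-training configuration.

```python
# Values copied from the hyperparameter tables above.
discriminator = {"layers": 12, "embedding_size": 768, "hidden_size": 768, "heads": 12}
generator = {"layers": 12, "embedding_size": 768, "hidden_size": 256, "heads": 4}
batch_size = 256
train_steps = 700_000
max_seq_length = 128
generator_size = 0.33333

# Per-head dimension = hidden size / number of attention heads.
disc_head_dim = discriminator["hidden_size"] // discriminator["heads"]
gen_head_dim = generator["hidden_size"] // generator["heads"]

# Assumption: "generator size" scales the discriminator hidden size.
implied_gen_hidden = round(discriminator["hidden_size"] * generator_size)

# Upper bound on token positions seen during pre-training, reached only
# if every sequence in every batch fills the maximum length.
tokens_per_step = batch_size * max_seq_length
max_token_positions = tokens_per_step * train_steps

print("head dims:", disc_head_dim, gen_head_dim)
print("implied generator hidden size:", implied_gen_hidden)
print("token positions per step / total:", tokens_per_step, max_token_positions)
```

Under that assumption, the implied generator hidden size agrees with the 256 listed in the table, and both networks use the same per-head dimension.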

Training Dataset

34 GB of Korean text, including Wikipedia documents, news articles, legal texts, news comments, product reviews, and so on. The corpus is balanced, with equal proportions of written and spoken data.

Vocabulary

Vocab size: 30,000

We used morpheme-based unit tokens for our vocabulary based on the Mecab-Ko morpheme analyzer.

Download Link

  • Tensorflow-v1 model (download)

  • PyTorch models on HuggingFace

from transformers import ElectraModel, ElectraTokenizer

model = ElectraModel.from_pretrained("snunlp/KR-ELECTRA-discriminator")
tokenizer = ElectraTokenizer.from_pretrained("snunlp/KR-ELECTRA-discriminator")

Finetuning

We used the fine-tuning code from KoELECTRA with slight edits and additionally adjusted hyperparameters. You can download the code and config files that we used for our model.

python3 run_seq_cls.py --task nsmc --config_file kr-electra.json
python3 run_seq_cls.py --task kornli --config_file kr-electra.json
python3 run_seq_cls.py --task paws --config_file kr-electra.json
python3 run_seq_cls.py --task question-pair --config_file kr-electra.json
python3 run_seq_cls.py --task korsts --config_file kr-electra.json
python3 run_ner.py --task naver-ner --config_file kr-electra.json
python3 run_squad.py --task korquad --config_file kr-electra.json
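
To run every task in sequence, the distinct commands above can be driven from a small script. This is a sketch rather than part of the released fine-tuning code: `RUNS`, `build_command`, and `run_all` are hypothetical names, and it assumes the script is started from the directory that contains the fine-tuning scripts and `kr-electra.json`.

```python
import subprocess

CONFIG_FILE = "kr-electra.json"

# (script, task) pairs taken from the commands listed above.
RUNS = [
    ("run_seq_cls.py", "nsmc"),
    ("run_seq_cls.py", "kornli"),
    ("run_seq_cls.py", "paws"),
    ("run_seq_cls.py", "question-pair"),
    ("run_seq_cls.py", "korsts"),
    ("run_ner.py", "naver-ner"),
    ("run_squad.py", "korquad"),
]


def build_command(script, task, config_file=CONFIG_FILE):
    """Return the argument list for one fine-tuning run."""
    return ["python3", script, "--task", task, "--config_file", config_file]


def run_all(dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for script, task in RUNS:
        command = build_command(script, task)
        print(" ".join(command))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(command, check=True)


# Dry run by default: only prints the commands.
run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` launches the runs one after another and stops at the first one that exits with an error.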

Experimental Results

| model | NSMC (acc) | Naver NER (F1) | PAWS (acc) | KorNLI (acc) | KorSTS (spearman) | Question Pair (acc) | KorQuaD (Dev) (EM/F1) | Korean-Hate-Speech (Dev) (F1) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KoBERT | 89.59 | 87.92 | 81.25 | 79.62 | 81.59 | 94.85 | 51.75 / 79.15 | 66.21 |
| XLM-Roberta-Base | 89.03 | 86.65 | 82.80 | 80.23 | 78.45 | 93.80 | 64.70 / 88.94 | 64.06 |
| HanBERT | 90.06 | 87.70 | 82.95 | 80.32 | 82.73 | 94.72 | 78.74 / 92.02 | 68.32 |
| KoELECTRA-Base | 90.33 | 87.18 | 81.70 | 80.64 | 82.00 | 93.54 | 60.86 / 89.28 | 66.09 |
| KoELECTRA-Base-v2 | 89.56 | 87.16 | 80.70 | 80.72 | 82.30 | 94.85 | 84.01 / 92.40 | 67.45 |
| KoELECTRA-Base-v3 | 90.63 | 88.11 | 84.45 | 82.24 | 85.53 | 95.25 | 84.83 / 93.45 | 67.61 |
| KR-ELECTRA (ours) | 91.168 | 87.90 | 82.05 | 82.51 | 85.41 | 95.51 | 84.93 / 93.04 | 74.50 |

The baseline results are taken from KoELECTRA.
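
One way to read the results is to compare KR-ELECTRA with the best baseline on each task. The sketch below does this for the single-score tasks (KorQuaD, which reports two scores, is left out for brevity). The scores are copied from the results above; the names `TASKS`, `BASELINES`, `KR_ELECTRA`, and `margins` are illustrative only.

```python
TASKS = ["NSMC", "Naver NER", "PAWS", "KorNLI", "KorSTS",
         "Question Pair", "Korean-Hate-Speech"]

# Baseline scores per task, in the order given by TASKS.
BASELINES = {
    "KoBERT":            [89.59, 87.92, 81.25, 79.62, 81.59, 94.85, 66.21],
    "XLM-Roberta-Base":  [89.03, 86.65, 82.80, 80.23, 78.45, 93.80, 64.06],
    "HanBERT":           [90.06, 87.70, 82.95, 80.32, 82.73, 94.72, 68.32],
    "KoELECTRA-Base":    [90.33, 87.18, 81.70, 80.64, 82.00, 93.54, 66.09],
    "KoELECTRA-Base-v2": [89.56, 87.16, 80.70, 80.72, 82.30, 94.85, 67.45],
    "KoELECTRA-Base-v3": [90.63, 88.11, 84.45, 82.24, 85.53, 95.25, 67.61],
}
KR_ELECTRA = [91.168, 87.90, 82.05, 82.51, 85.41, 95.51, 74.50]

# Margin = KR-ELECTRA score minus the best baseline score on that task.
margins = {}
for i, task in enumerate(TASKS):
    best_baseline = max(scores[i] for scores in BASELINES.values())
    margins[task] = round(KR_ELECTRA[i] - best_baseline, 3)

# Print tasks from the largest margin to the smallest.
for task, margin in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task:20s} {margin:+.3f}")
```

A positive margin means KR-ELECTRA scores above every listed baseline on that task; a negative margin means at least one baseline scores higher.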

Citation

@misc{kr-electra,
  author = {Lee, Sangah and Shin, Hyopil},
  title = {KR-ELECTRA: a KoRean-based ELECTRA model},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/snunlp/KR-ELECTRA}}
}