CJK computer science terms comparison / 中日韓電腦科學術語對照 / 日中韓のコンピュータ科学の用語対照 / 한·중·일 전산학 용어 대조

Last update: Dec 23, 2022

Overview

CJK computer science terms comparison

This repository contains the source code of the website. You can view the website at the following link:

Greater China, Japan, and Korea, collectively known as the Sinosphere (漢字文化圈; literally "Chinese character cultural sphere"), have borrowed many concepts from the West since the modern era and rendered them in Sino-Xenic vocabulary. Each region coined its own translations for some of these concepts and imported translations from its neighbors for others. Some translations combine native and foreign stems. As a result, the regions of the Sinosphere share a large vocabulary, yet each also has words of its own. Computer science terminology is no exception.

This page contains comparison tables showing how computer science terms, most of them derived from English, are translated and named in the different regions of the Sinosphere.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Introduction

Cognates

Cognates are words that derive from one another or share a common etymology.

For example, the English word computer and the Korean word 컴퓨터 are cognates, as are the Japanese word 計算科学 (keisan kagaku) and the Chinese word 計算科學 (jìsuàn kēxué), both of which mean computational science.

Cognates are indicated by borders of the same color.
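The border coloring described above can be sketched as a small grouping step. This is only an illustration: the palette, the `assign_borders` helper, and the way groups are represented are assumptions made for this example, not the website's actual implementation.

```python
from itertools import cycle

# Hypothetical palette; the colors used on the real page are not specified here.
PALETTE = ["crimson", "royalblue", "seagreen"]


def assign_borders(groups):
    """Give every word in the same cognate group the same border color."""
    colors = {}
    # zip() stops when the groups run out, so cycling the palette is safe.
    for color, group in zip(cycle(PALETTE), groups):
        for word in group:
            colors[word] = color
    return colors


# The two cognate groups used as examples in this section.
cognate_groups = [
    ["computer", "컴퓨터"],
    ["計算科学", "計算科學"],
]

borders = assign_borders(cognate_groups)
for word, color in borders.items():
    print(f"{word}: {color}")
```

Words in one group share a color, and consecutive groups get different colors as long as the palette has more than one entry.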

Calque (loan translation)

A calque is a word or phrase borrowed from another language by literal word-for-word or root-for-root translation.

For example, the Chinese word 軟件 is a calque of the English word software: it translates soft as 軟 (ruǎn; soft or flexible) and ware as 件 (jiàn; clothes or item).

Words or roots that match between languages in this way are underlined with the same color and shape.
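One way to picture a calque is as an ordered list of root pairs. The sketch below models the 軟件 example that way; the `RootPair` class and the `assemble` helper are hypothetical names introduced for illustration and do not come from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootPair:
    source: str   # root in the source language (English here)
    target: str   # root that translates it
    reading: str  # romanized reading of the target root


def assemble(pairs):
    """Join the target roots in order to form the calqued word."""
    return "".join(pair.target for pair in pairs)


# software -> 軟件, root by root, as in the example above.
software = [
    RootPair("soft", "軟", "ruǎn"),
    RootPair("ware", "件", "jiàn"),
]

print(assemble(software))
for pair in software:
    print(f"{pair.source} -> {pair.target} ({pair.reading})")
```

Keeping the pairs aligned like this is what would let a renderer underline each matching root with the same color and shape.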

Homophonic translations

When a root is transcribed from a foreign word, the original word is displayed on the root.

For example, because the Japanese word コンピュータ (konpyu-ta) is a transcription of the English word computer, it is displayed as コンピュータ annotated with computer.
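An annotation placed on a root like this is commonly expressed in HTML with the `ruby` and `rt` elements. The sketch below assumes that approach; whether the website uses exactly this markup is not confirmed here, and the `ruby` helper is a hypothetical name.

```python
from html import escape


def ruby(base: str, annotation: str) -> str:
    """Wrap a transcribed root and its original word in HTML ruby markup."""
    return f"<ruby>{escape(base)}<rt>{escape(annotation)}</rt></ruby>"


print(ruby("コンピュータ", "computer"))
# <ruby>コンピュータ<rt>computer</rt></ruby>
```

Both arguments are escaped so that a root or annotation containing characters such as `<` or `&` cannot break the surrounding markup.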

Romanized pronunciation

The pronunciation of each word is shown in Latin letters in parentheses below the word. The transcription system for each language is as follows:

Mandarin (China & Taiwan): Hanyu Pinyin

Cantonese (Hong Kong): Jyutping (Linguistic Society of Hong Kong Cantonese Romanization Scheme)

Japanese: Hepburn romanization

Korean: Revised Romanization of Korean
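The list above amounts to a lookup from language to romanization system. A minimal sketch follows, assuming a plain dictionary; the `ROMANIZATION` table and both helpers are hypothetical and only mirror the list. Note that the page shows the reading below the word, whereas this sketch simply puts it inline.

```python
# Romanization system per language, copied from the list above.
ROMANIZATION = {
    "Mandarin": "Hanyu Pinyin",
    "Cantonese": "Jyutping",
    "Japanese": "Hepburn romanization",
    "Korean": "Revised Romanization of Korean",
}


def with_reading(word: str, reading: str) -> str:
    """Return the word followed by its romanized reading in parentheses."""
    return f"{word} ({reading})"


def describe(language: str, word: str, reading: str) -> str:
    """Label a word with its language and the romanization system in use."""
    system = ROMANIZATION[language]
    return f"{language} [{system}]: {with_reading(word, reading)}"


print(describe("Japanese", "計算科学", "keisan kagaku"))
# Japanese [Hepburn romanization]: 計算科学 (keisan kagaku)
```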

Basic terms

Units

Fields of study

Computer programming

Tools

Theory of computation

*[CJK]: Chinese, Japanese, and Korean languages

Owner

Hong Minhee (洪民憙)
