Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

Last update: Dec 29, 2022

Related tags

Deep Learning wechsel

Overview

WECHSEL

Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

arXiv: https://arxiv.org/abs/2112.06598

Models from the paper are available on the HuggingFace Hub:

Installation

We distribute a Python Package via PyPI:

pip install wechsel

Alternatively, clone the repository, install requirements.txt and run the code in wechsel/.

Example usage

Transferring English roberta-base to Swahili:

import torch
from transformers import AutoModel, AutoTokenizer
from datasets import load_dataset
from wechsel import WECHSEL, load_embeddings

source_tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")

target_tokenizer = source_tokenizer.train_new_from_iterator(
    load_dataset("oscar", "unshuffled_deduplicated_sw", split="train")["text"],
    vocab_size=len(source_tokenizer)
)

wechsel = WECHSEL(
    load_embeddings("en"),
    load_embeddings("sw"),
    bilingual_dictionary="swahili"
)

target_embeddings, info = wechsel.apply(
    source_tokenizer,
    target_tokenizer,
    model.get_input_embeddings().weight.detach().numpy(),
)

model.get_input_embeddings().weight.data = torch.from_numpy(target_embeddings)

# use `model` and `target_tokenizer` to continue training in Swahili!

Bilingual dictionaries

We distribute 3276 bilingual dictionaries from English to other languages for use with WECHSEL in dicts/.

Citation

Please cite WECHSEL as

@misc{minixhofer2021wechsel,
      title={WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models}, 
      author={Benjamin Minixhofer and Fabian Paischer and Navid Rekabsaz},
      year={2021},
      eprint={2112.06598},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

Related tags

Overview

WECHSEL

Installation

Example usage

Bilingual dictionaries

Citation

Owner

Institute of Computational Perception

[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval

[Preprint] ConvMLP: Hierarchical Convolutional MLPs for Vision, 2021

Codebase for the self-supervised goal reaching benchmark introduced in the LEXA paper

An intelligent, flexible grammar of machine learning.

Official Pytorch Implementation of Adversarial Instance Augmentation for Building Change Detection in Remote Sensing Images.

Classification models 1D Zoo - Keras and TF.Keras

Problem-943.-ACMP - Problem 943. ACMP

MMdet2-based reposity about lightweight detection model: Nanodet, PicoDet.

Course on computational design, non-linear optimization, and dynamics of soft systems at UIUC.

Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! Very tiny! Stock Market Financial Technical Analysis Python library . Quant Trading automation or cryptocoin exchange

ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

Python and C++ implementation of "MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation". Accepted at LXCV @ CVPR 2021.

Sample Code for "Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL"

E-Ink Magic Calendar that automatically syncs to Google Calendar and runs off a battery powered Raspberry Pi Zero

Code release for SLIP Self-supervision meets Language-Image Pre-training

Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

This is a project based on retinaface face detection, including ghostnet and mobilenetv3

2D Time independent Schrodinger equation solver for arbitrary shape of well

Generative Query Network (GQN) in PyTorch as described in "Neural Scene Representation and Rendering"

Super Resolution for images using deep learning.