EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

Related tags

Deep LearningEncT5
Overview

EncT5

(Unofficial) Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

About

  • Finetune T5 model for classification & regression by only using the encoder layers.
  • Implemented of Tokenizer and Model for EncT5.
  • Add BOS Token () for tokenizer, and use this token for classification & regression.
    • Need to resize embedding as vocab size is changed. (model.resize_token_embeddings())
  • BOS and EOS token will be automatically added as below.
    • single sequence: X
    • pair of sequences: A B

Requirements

Highly recommend to use the same version of transformers.

transformers==4.15.0
torch==1.8.1
sentencepiece==0.1.96
datasets==1.17.0
scikit-learn==0.24.2

How to Use

from enc_t5 import EncT5ForSequenceClassification, EncT5Tokenizer

model = EncT5ForSequenceClassification.from_pretrained("t5-base")
tokenizer = EncT5Tokenizer.from_pretrained("t5-base")

# Resize embedding size as we added bos token
if model.config.vocab_size < len(tokenizer.get_vocab()):
    model.resize_token_embeddings(len(tokenizer.get_vocab()))

Finetune on GLUE

Setup

  • Use T5 1.1 base for finetuning.
  • Evaluate on TPU. See run_glue_tpu.sh for more details.
  • Use AdamW optimizer instead of Adafactor.
  • Check best checkpoint on every epoch by using EarlyStoppingCallback.

Results

Metric Result (Paper) Result (Implementation)
CoLA Matthew 53.1 52.4
SST-2 Acc 94.0 94.5
MRPC F1/Acc 91.5/88.3 91.7/88.0
STS-B PCC/SCC 80.5/79.3 88.0/88.3
QQP F1/Acc 72.9/89.8 88.4/91.3
MNLI Mis/Matched 88.0/86.7 87.5/88.1
QNLI Acc 93.3 93.2
RTE Acc 67.8 69.7
You might also like...
Black-Box-Tuning - Black-Box Tuning for Language-Model-as-a-Service

Black-Box-Tuning Source code for paper "Black-Box Tuning for Language-Model-as-a

Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning.

xTune Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning. Environment DockerFile: dancingsoul/pytorch:xTune Install the f

 Cartoon-StyleGan2 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation
Cartoon-StyleGan2 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation

Fine-tuning StyleGAN2 for Cartoon Face Generation

Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Legged Robots that Keep on Learning Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, whic

Fine-tuning StyleGAN2 for Cartoon Face Generation
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).
This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).

Core-tuning This repository is the official implementation of ``Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regular

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker
Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker This repository contai

Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning This repository is official Tensorflow implementation of paper: Ensemb

Comments
  • Enable tokenizer to be loaded by sentence-transformer

    Enable tokenizer to be loaded by sentence-transformer

    🚀 Feature Request

    Integration into sentence-transformer library.

    📎 Additional context

    I tried to load this tokenizer with sentence-transformer library but it failed. AutoTokenizer couldn't load this tokenizer. So, I simply added code to override save_pretrained and its dependencies so that this tokenizer is saved as T5Tokenizer, its super class.

            def save_pretrained(
            self,
            save_directory,
            legacy_format: Optional[bool] = None,
            filename_prefix: Optional[str] = None,
            push_to_hub: bool = False,
            **kwargs,
        ):
            if os.path.isfile(save_directory):
                logger.error(f"Provided path ({save_directory}) should be a directory, not a file")
                return
    
            if push_to_hub:
                commit_message = kwargs.pop("commit_message", None)
                repo = self._create_or_get_repo(save_directory, **kwargs)
    
            os.makedirs(save_directory, exist_ok=True)
    
            special_tokens_map_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + SPECIAL_TOKENS_MAP_FILE
            )
            tokenizer_config_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + TOKENIZER_CONFIG_FILE
            )
    
            tokenizer_config = copy.deepcopy(self.init_kwargs)
            if len(self.init_inputs) > 0:
                tokenizer_config["init_inputs"] = copy.deepcopy(self.init_inputs)
            for file_id in self.vocab_files_names.keys():
                tokenizer_config.pop(file_id, None)
    
            # Sanitize AddedTokens
            def convert_added_tokens(obj: Union[AddedToken, Any], add_type_field=True):
                if isinstance(obj, AddedToken):
                    out = obj.__getstate__()
                    if add_type_field:
                        out["__type"] = "AddedToken"
                    return out
                elif isinstance(obj, (list, tuple)):
                    return list(convert_added_tokens(o, add_type_field=add_type_field) for o in obj)
                elif isinstance(obj, dict):
                    return {k: convert_added_tokens(v, add_type_field=add_type_field) for k, v in obj.items()}
                return obj
    
            # add_type_field=True to allow dicts in the kwargs / differentiate from AddedToken serialization
            tokenizer_config = convert_added_tokens(tokenizer_config, add_type_field=True)
    
            # Add tokenizer class to the tokenizer config to be able to reload it with from_pretrained
            ############################################################################
            tokenizer_class = self.__class__.__base__.__name__
            ############################################################################
            # Remove the Fast at the end unless we have a special `PreTrainedTokenizerFast`
            if tokenizer_class.endswith("Fast") and tokenizer_class != "PreTrainedTokenizerFast":
                tokenizer_class = tokenizer_class[:-4]
            tokenizer_config["tokenizer_class"] = tokenizer_class
            if getattr(self, "_auto_map", None) is not None:
                tokenizer_config["auto_map"] = self._auto_map
            if getattr(self, "_processor_class", None) is not None:
                tokenizer_config["processor_class"] = self._processor_class
    
            # If we have a custom model, we copy the file defining it in the folder and set the attributes so it can be
            # loaded from the Hub.
            if self._auto_class is not None:
                custom_object_save(self, save_directory, config=tokenizer_config)
    
            with open(tokenizer_config_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(tokenizer_config, ensure_ascii=False))
            logger.info(f"tokenizer config file saved in {tokenizer_config_file}")
    
            # Sanitize AddedTokens in special_tokens_map
            write_dict = convert_added_tokens(self.special_tokens_map_extended, add_type_field=False)
            with open(special_tokens_map_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(write_dict, ensure_ascii=False))
            logger.info(f"Special tokens file saved in {special_tokens_map_file}")
    
            file_names = (tokenizer_config_file, special_tokens_map_file)
    
            save_files = self._save_pretrained(
                save_directory=save_directory,
                file_names=file_names,
                legacy_format=legacy_format,
                filename_prefix=filename_prefix,
            )
    
            if push_to_hub:
                url = self._push_to_hub(repo, commit_message=commit_message)
                logger.info(f"Tokenizer pushed to the hub in this commit: {url}")
    
            return save_files
    
    enhancement 
    opened by kwonmha 0
Releases(v1.0.0)
  • v1.0.0(Jan 22, 2022)

    What’s Changed

    :rocket: Features

    • Add GLUE Trainer (#2) @monologg
    • Add Template & EncT5 model and tokenizer (#1) @monologg

    :pencil: Documentation

    • Add readme & script (#3) @monologg
    Source code(tar.gz)
    Source code(zip)
Owner
Jangwon Park
Jangwon Park
Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic

NAVER/LINE Vision 30 Dec 06, 2022
Advanced Signal Processing Notebooks and Tutorials

Advanced Digital Signal Processing Notebooks and Tutorials Prof. Dr. -Ing. Gerald Schuller Jupyter Notebooks and Videos: Renato Profeta Applied Media

Guitars.AI 115 Dec 13, 2022
Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

Tengfei Wang 371 Dec 30, 2022
A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation.

TiSASRec.paddle A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation. Introduction 论文:Time Interval Aware Sel

Paddorch 2 Nov 28, 2021
Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

statsmodels 8.1k Jan 02, 2023
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022
Annotate datasets with a semi-trained or fully trained YOLOv5 model

YOLOv5 Auto Annotator Annotate datasets with a semi-trained or fully trained YOLOv5 model Prerequisites Ubuntu =20.04 Python =3.7 System dependencie

Akash James 3 May 14, 2022
Mixed Transformer UNet for Medical Image Segmentation

MT-UNet Update 2022/01/05 By another round of training based on previous weights, our model also achieved a better performance on ACDC (91.61% DSC). W

dotman 92 Dec 25, 2022
Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

Knowledge Distillation for BERT Unsupervised Domain Adaptation Official PyTorch implementation | Paper Abstract A pre-trained language model, BERT, ha

Minho Ryu 29 Nov 30, 2022
A curated list of awesome projects and resources related fastai

A curated list of awesome projects and resources related fastai

Tanishq Abraham 138 Dec 22, 2022
Source codes for the paper "Local Additivity Based Data Augmentation for Semi-supervised NER"

LADA This repo contains codes for the following paper: Jiaao Chen*, Zhenghui Wang*, Ran Tian, Zichao Yang, Diyi Yang: Local Additivity Based Data Augm

GT-SALT 36 Dec 02, 2022
Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020

XDVioDet Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020. The proj

peng 64 Dec 12, 2022
A multi-entity Transformer for multi-agent spatiotemporal modeling.

baller2vec This is the repository for the paper: Michael A. Alcorn and Anh Nguyen. baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotempor

Michael A. Alcorn 56 Nov 15, 2022
Tensorflow Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE

SMU A Tensorflow Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE arXiv https://arxiv.org/abs/211

Fuhang 5 Jan 18, 2022
This is the source code for generating the ASL-Skeleton3D and ASL-Phono datasets. Check out the README.md for more details.

ASL-Skeleton3D and ASL-Phono Datasets Generator The ASL-Skeleton3D contains a representation based on mapping into the three-dimensional space the coo

Cleison Amorim 5 Nov 20, 2022
A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Monte Carlo Simulation to the Paper A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Sören Kohnert 0 Dec 06, 2021
This is a simple framework to make object detection dataset very quickly

FastAnnotation Table of contents General info Requirements Setup General info This is a simple framework to make object detection dataset very quickly

Serena Tetart 1 Jan 24, 2022
Gif-caption - A straightforward GIF Captioner written in Python

Broksy's GIF Captioner Have you ever wanted to easily caption a GIF without havi

3 Apr 09, 2022
🕵 Artificial Intelligence for social control of public administration

Non-tech crash course into Operação Serenata de Amor Tech crash course into Operação Serenata de Amor Contributing with code and tech skills Supportin

Open Knowledge Brasil - Rede pelo Conhecimento Livre 4.4k Dec 31, 2022