State-of-the-art, faster Natural Language Processing in TensorFlow 2.0.

Last update: Dec 05, 2022

Overview

tf-transformers: faster and easier state-of-the-art NLP in TensorFlow 2.0


A new version with many updates and major changes is releasing soon. Please stay tuned.

tf-transformers is designed to harness the full power of TensorFlow 2, making it much faster and simpler than existing TensorFlow-based NLP architectures. On average, there is an 80% improvement over existing TensorFlow-based libraries on text generation and other tasks. You can find more details in the Benchmarks section.

Most NLP downstream tasks can be integrated into Transformer-based models with ease. All models can be trained using model.fit, which supports GPU, multi-GPU, and TPU.

Unique Features

Faster autoregressive decoding using TensorFlow 2. Faster than PyTorch in most experiments (V100 GPU). 80% faster than existing TF-based libraries (relative difference). Refer to the benchmark code.
Complete TFLite support for BERT, RoBERTa, T5, ALBERT, and mT5 for all downstream tasks except text generation
Faster sentence-piece alignment (no more LCS overhead)
Variable batch text generation for Encoder only models like GPT2
No more hassle of writing long code for TFRecords. Minimal and simple.
Off-the-shelf support for auto-batching tf.data.Dataset or tf.ragged tensors
Pass dictionary outputs directly to loss functions inside tf.keras.Model.fit using model.compile2. Refer to the examples or blog.
Multiple mask modes like causal, user-defined, and prefix by changing one argument. Refer to the examples or blog.
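
To make the difference between the causal and prefix mask modes concrete, here is a minimal pure-Python sketch of the two attention masks. The function names `causal_mask` and `prefix_mask` are hypothetical and are not the tf-transformers API; the library selects the mode through an argument, as noted above.

```python
from typing import List

Mask = List[List[int]]  # mask[i][j] == 1 means position i may attend to position j


def causal_mask(seq_len: int) -> Mask:
    """Position i may attend only to positions 0..i (lower-triangular mask)."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


def prefix_mask(seq_len: int, prefix_len: int) -> Mask:
    """The first `prefix_len` positions are visible to every position;
    the remaining positions follow the causal rule."""
    return [
        [1 if (j < prefix_len or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


# A sequence of 4 tokens whose first 2 tokens form the prefix.
for row in prefix_mask(4, 2):
    print(row)
```

With `prefix_len=0` the prefix mask reduces to the causal mask, which is one way to see why a single argument can switch between the two modes.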

Performance Benchmarks

Evaluating performance benchmarks is tricky. I evaluated tf-transformers primarily on text-generation tasks with GPT2 small and T5 small against the amazing HuggingFace, as it is the go-to library for NLP right now. Text generation tasks require efficient caching to make use of past key and value pairs.
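
As a rough illustration of why key/value caching matters for generation, the toy sketch below counts how many key/value projections are computed with and without a cache. `CountingProjector` and both decode functions are hypothetical stand-ins written for this example; they are not code from tf-transformers or HuggingFace.

```python
from typing import List, Tuple


class CountingProjector:
    """Toy stand-in for a layer's key/value projections; counts how often it runs."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, token: int) -> Tuple[float, float]:
        self.calls += 1
        return float(token), 2.0 * float(token)  # (key, value), arbitrary toy numbers


def decode_without_cache(tokens: List[int]) -> int:
    """Re-project the whole prefix at every decoding step."""
    project = CountingProjector()
    for step in range(1, len(tokens) + 1):
        _ = [project(t) for t in tokens[:step]]
    return project.calls


def decode_with_cache(tokens: List[int]) -> int:
    """Project only the newest token and append its key/value to a cache."""
    project = CountingProjector()
    cache: List[Tuple[float, float]] = []
    for token in tokens:
        cache.append(project(token))
        # Attention for this step would read every entry of `cache` here.
    return project.calls


tokens = list(range(10))
print(decode_without_cache(tokens), decode_with_cache(tokens))  # 55 10
```

For 10 tokens the uncached loop performs 55 projections while the cached loop performs 10, so the gap grows quadratically with sequence length.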

On average, tf-transformers is 80% faster than the HuggingFace TensorFlow implementation, and in most cases it is comparable to or faster than PyTorch.

1. GPT2 benchmark

The evaluation is based on the average of 5 runs with different batch_size, beams, sequence_length, etc. So there is quite a large number of combinations when it comes to beam and top-k decoding. The figures show 10 randomly chosen samples, but you can see the full code and figures in the repo.
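
For readers who want to run a similar measurement, averaging over 5 runs could be sketched as below. This is an assumption about the general approach, not the benchmark code from the repo; `average_runtime` is a hypothetical helper and the workload is a stand-in for a model's generation call.

```python
import statistics
import time
from typing import Callable


def average_runtime(fn: Callable[[], object], runs: int = 5) -> float:
    """Return the mean wall-clock time of `fn` over `runs` calls, in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Stand-in workload; a real benchmark would call the text-generation function here.
mean_seconds = average_runtime(lambda: sum(range(100_000)), runs=5)
print(f"mean over 5 runs: {mean_seconds:.6f} s")
```

When timing TensorFlow functions, a warm-up call before measuring is usually worthwhile so that graph tracing is not counted in the average.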

GPT2 greedy

GPT2 beam

GPT2 top-k top-p

GPT2 greedy histogram

Codes to reproduce GPT2 benchmark experiments

Codes to reproduce T5 benchmark experiments

QuickStart

I am providing some basic tutorials here, which cover the basics of tf-transformers and how we can use it for other downstream tasks. Most tutorials have the following structure:

Introduction to the Problem
Prepare Training Data
Load Model and Associated Downstream Tasks
Define Optimizer, Loss
Train using Keras and CustomTrainer
Evaluate Using Dev Data
In Production - this section defines how we can use tf.saved_model in production + pipelines

Production Ready Tutorials

Start by converting HuggingFace models (base models only) to tf-transformers models.

Here are a few examples : Jupyter Notebooks:

Basics of tf-transformers
Convert HuggingFace Models (BERT, Albert, Roberta, GPT2, T5, mT5) to tf-transformers checkpoints
Named Entity Recognition + Albert + TFlite + Joint Loss + Pipeline
Squad v1.1 + Roberta + TFlite + Pipeline
Roberta2Roberta Encoder Decoder + XSUM + Summarisation
Squad v1.1 + T5 + Text Generation
Squad v1.1 + T5 + Span Selection + TFlite + Pipeline
Albert + GLUE + Joint Loss - Glue Score 81.0 on 14 M parameter + 5 layers
Albert + Squad + Joint Loss - EM/F1 78.1/87.0 on 14 M parameter + 5 layers
Squad v1.1 + GPT2 + Causal Masking EM/F1 37.36/50.20 (Coming Soon)
Squad v1.1 + GPT2 + Prefix Masking EM/F1 47.52/63.20 (Coming Soon)
BERT + STS-B + Regression (Coming Soon)
BERT + COLA + Text Classification + TFlite + Pipeline

Why should I use tf-transformers?

Use state-of-the-art models in Production, with less than 10 lines of code.
- High performance models, better than all official Tensorflow based models
- Very simple classes for all downstream tasks
- Complete TFlite support for all tasks except text-generation
Make industry-based experience available to students and the community through clear tutorials
Train any model on GPU, multi-GPU, TPU with amazing tf.keras.Model.fit
- Train state-of-the-art models in few lines of code.
- All models are completely serializable.
Customize any models or pipelines with minimal or no code change.

Do we really need to distill? Joint Loss is all we need.

1. GLUE

We conducted a few experiments to squeeze the most out of Albert base models (the concept is applicable to any model, and in tf-transformers it works out of the box).

The idea is to minimize the loss for the specified task at each layer of your model and check predictions at each layer. In our experiments, we were able to get the best small model (thanks to Albert), and from layer 4 onwards we beat all the smaller models on the GLUE benchmark. By layer 6, we got a GLUE score of 81.0, which is 4 points ahead of DistilBERT with a GLUE score of 77 and MobileBERT with a GLUE score of 78.

The Albert model has 14 million parameters, and by using layer 6, we were able to speed up the computation by 50%.

The concept is applicable to all the models.
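
The joint loss idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the text does not say whether the per-layer losses are summed or averaged, so averaging is an assumption here, and `cross_entropy`, `joint_loss`, and `predict_at_layer` are hypothetical names rather than tf-transformers functions.

```python
import math
from typing import List, Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one example (log-sum-exp for numerical stability)."""
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[label]


def joint_loss(per_layer_logits: List[Sequence[float]], label: int) -> float:
    """Average the task loss over the predictions made at every layer (assumed reduction)."""
    losses = [cross_entropy(logits, label) for logits in per_layer_logits]
    return sum(losses) / len(losses)


def predict_at_layer(per_layer_logits: List[Sequence[float]], layer: int) -> int:
    """Read the prediction from one chosen layer, e.g. an early one for faster inference."""
    logits = per_layer_logits[layer]
    return max(range(len(logits)), key=logits.__getitem__)


# Logits of a 2-class task as produced by three layers of a toy model.
layer_logits = [[0.2, 0.1], [0.1, 0.9], [-1.0, 2.0]]
print(joint_loss(layer_logits, label=1), predict_at_layer(layer_logits, layer=1))
```

Because every layer is trained to solve the task directly, inference can stop at an intermediate layer and still return a usable prediction, which is where the speed-up comes from.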

Codes to reproduce GLUE Joint Loss experiments

Benchmark Results

GLUE score ( not including WNLI )

2. SQUAD v1.1

We trained Squad v1.1 with joint loss. At layer 6 we achieved the same performance as DistilBERT (EM 78.1 and F1 86.2), but slightly worse than MobileBERT.

Benchmark Results

Codes to reproduce Squad v1.1 Joint Loss experiments

Note: We have a new model in pipeline. :-)

Installation

With pip

This repository is tested on Python 3.7+ and TensorFlow 2.3.1.

Using a virtual environment is recommended.

Assuming TensorFlow 2.0 is installed:

pip install tf-transformers

From Github

Assuming poetry is installed. If not, run pip install poetry.

git clone https://github.com/legacyai/tf-transformers.git

cd tf-transformers

poetry install

Pipeline

Pipeline in tf-transformers is different from HuggingFace. Here, the pipeline for a specific task expects a model and a tokenizer_fn, because the library cannot know in advance what kind of pre-processing you want to apply to your inputs. Please refer to the tutorial notebooks above for examples.

Token Classification Pipeline (NER)

from tf_transformers.pipeline import Token_Classification_Pipeline

def tokenizer_fn(feature):
    """
    feature: tokenized text (tokenizer.tokenize)
    """
    result = {}
    result["input_ids"] = tokenizer.convert_tokens_to_ids([tokenizer.cls_token] +  feature['input_ids'] + [tokenizer.bos_token])
    result["input_mask"] = [1] * len(result["input_ids"])
    result["input_type_ids"] = [0] * len(result["input_ids"])
    return result

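# `tokenizer` and SPIECE_UNDERLINE are assumed to be defined already (see the tutorial notebooks)
# Load Keras / serialized model
model_ner = ...  # placeholder: load your model here
slot_map_reverse = ...  # placeholder: dictionary mapping index -> entity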
pipeline = Token_Classification_Pipeline( model = model_ner,
                tokenizer = tokenizer,
                tokenizer_fn = tokenizer_fn,
                SPECIAL_PIECE = SPIECE_UNDERLINE,
                label_map = slot_map_reverse,
                max_seq_length = 128,
                batch_size=32)

sentences = ['I would love to listen to Carnatic music by Yesudas',
            'Play Carnatic Fusion by Various Artists',
            'Please book 2 tickets from Bangalore to Kerala']
result = pipeline(sentences)
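
The pipeline receives a SPECIAL_PIECE marker because sub-word predictions have to be mapped back to whole words. How Token_Classification_Pipeline does this internally is not shown here; the sketch below only illustrates one common convention (keep the label of the first piece of each word). The helper `word_level_labels`, the tokenization, and the labels are made up for illustration.

```python
from typing import List

SPIECE_UNDERLINE = "\u2581"  # the marker SentencePiece puts at the start of a word


def word_level_labels(pieces: List[str], piece_labels: List[str]) -> List[str]:
    """Keep the label predicted for the first sub-word piece of every word."""
    if len(pieces) != len(piece_labels):
        raise ValueError("pieces and piece_labels must have the same length")
    return [
        label
        for piece, label in zip(pieces, piece_labels)
        if piece.startswith(SPIECE_UNDERLINE)
    ]


# Illustrative pieces for "Play Carnatic Fusion"; a real tokenizer may split differently.
pieces = ["\u2581Play", "\u2581Car", "natic", "\u2581Fusion"]
labels = ["O", "B-genre", "I-genre", "I-genre"]
print(word_level_labels(pieces, labels))  # ['O', 'B-genre', 'I-genre']
```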

Span Selection Pipeline (QA)

import tensorflow as tf  # needed for tf.zeros_like / tf.ones_like below
from tf_transformers.pipeline import Span_Extraction_Pipeline

def tokenizer_fn(features):
    """
    features: dict of tokenized text
    Convert them into ids
    """

    result = {}
    input_ids = tokenizer.convert_tokens_to_ids(features['input_ids'])
    input_type_ids = tf.zeros_like(input_ids).numpy().tolist()
    input_mask = tf.ones_like(input_ids).numpy().tolist()
    result['input_ids'] = input_ids
    result['input_type_ids'] = input_type_ids
    result['input_mask'] = input_mask
    return result

model = ...  # placeholder: load your Keras / saved_model here
# Span Extraction Pipeline
pipeline = Span_Extraction_Pipeline(model = model,
                tokenizer = tokenizer,
                tokenizer_fn = tokenizer_fn,
                SPECIAL_PIECE = ROBERTA_SPECIAL_PEICE,
                n_best_size = 20,
                n_best = 5,
                max_answer_length = 30,
                max_seq_length = 384,
                max_query_length=64,
                doc_stride=20)


questions = ['When was Kerala formed?']
contexts = ['''Kerala (English: /ˈkɛrələ/; Malayalam: [ke:ɾɐɭɐm]) is a state on the southwestern Malabar Coast of India. It was formed on 1 November 1956, following the passage of the States Reorganisation Act, by combining Malayalam-speaking regions of the erstwhile states of Travancore-Cochin and Madras. Spread over 38,863 km2 (15,005 sq mi), Kerala is the twenty-first largest Indian state by area. It is bordered by Karnataka to the north and northeast, Tamil Nadu to the east and south, and the Lakshadweep Sea to the west. With 33,387,677 inhabitants as per the 2011 Census, Kerala is the thirteenth-largest Indian state by population. It is divided into 14 districts with the capital being Thiruvananthapuram. Malayalam is the most widely spoken language and is also the official language of the state.''']
result = pipeline(questions=questions, contexts=contexts)
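
The max_seq_length and doc_stride arguments matter when a context is longer than the model can read at once. Whether Span_Extraction_Pipeline interprets doc_stride exactly this way is not shown here; the sketch follows the convention of the original BERT SQuAD script, where doc_stride is the step between the starts of consecutive windows. `sliding_windows` is a hypothetical helper.

```python
from typing import List, Tuple


def sliding_windows(num_tokens: int, window_size: int, doc_stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) token spans that cover a long context with overlapping windows.

    Each new window starts `doc_stride` tokens after the previous one.
    """
    if window_size <= 0 or doc_stride <= 0:
        raise ValueError("window_size and doc_stride must be positive")
    spans = []
    start = 0
    while True:
        end = min(start + window_size, num_tokens)
        spans.append((start, end))
        if end == num_tokens:
            break
        start += min(window_size, doc_stride)
    return spans


# A 10-token context read through windows of 4 tokens with a stride of 2.
print(sliding_windows(10, window_size=4, doc_stride=2))
```

In practice the window size is smaller than max_seq_length, because the question tokens and special tokens share the same input sequence.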

Classification Model Pipeline

from transformers import BertTokenizer  # assumed: the HuggingFace tokenizer class
from tf_transformers.pipeline import Classification_Pipeline
from tf_transformers.data import pad_dataset_normal

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
max_seq_length = 128

@pad_dataset_normal
def tokenizer_fn(texts):
    """
    texts: list of raw text strings
    pad_dataset_normal will automatically pad the result.
    """
    input_ids = []
    input_type_ids = []
    input_mask = []
    for text in texts:
        input_ids_ex = [tokenizer.cls_token] + tokenizer.tokenize(text)[: max_seq_length-2] + [tokenizer.sep_token] # -2 to add CLS and SEP
        input_ids_ex = tokenizer.convert_tokens_to_ids(input_ids_ex)
        input_mask_ex = [1] * len(input_ids_ex)
        input_type_ids_ex = [0] * len(input_ids_ex)

        input_ids.append(input_ids_ex)
        input_type_ids.append(input_type_ids_ex)
        input_mask.append(input_mask_ex)

    result = {}
    result['input_ids'] = input_ids
    result['input_type_ids'] = input_type_ids
    result['input_mask'] = input_mask
    return result

model = ...  # placeholder: load your Keras / saved_model here
label_map_reverse = {0: 'unacceptable', 1: 'acceptable'}
pipeline = Classification_Pipeline( model = model,
                tokenizer_fn = tokenizer_fn,
                label_map = label_map_reverse,
                batch_size=32)

sentences = ['In which way is Sandy very anxious to see if the students will be able to solve the homework problem?',
            'The book was written by John.',
            'Play Carnatic Fusion by Various Artists',
            'She voted herself.']
result = pipeline(sentences)
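
The tokenizer_fn above returns sequences of different lengths and relies on the pad_dataset_normal decorator to pad them. The decorator's implementation is not shown here; as a minimal sketch of what padding such a batch generally involves, consider the hypothetical helper `pad_batch` below (the pad value of 0 is an assumption).

```python
from typing import Dict, List


def pad_batch(batch: Dict[str, List[List[int]]], pad_value: int = 0) -> Dict[str, List[List[int]]]:
    """Right-pad every sequence under every key to the longest sequence of that key."""
    padded = {}
    for name, sequences in batch.items():
        max_len = max((len(seq) for seq in sequences), default=0)
        padded[name] = [seq + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded


batch = {
    "input_ids": [[101, 7, 102], [101, 102]],
    "input_mask": [[1, 1, 1], [1, 1]],
}
print(pad_batch(batch))
```

Padding input_mask with 0 keeps the padded positions out of attention, which is why the same pad value can be used for all three inputs.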

Supported Model Architectures

tf-transformers currently provides the following architectures.

ALBERT (from Google Research and the Toyota Technological Institute at Chicago) released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
BERT For Sequence Generation (from Google) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
ELECTRA (from Google Research/Stanford University) released with the paper ELECTRA: Pre-training text encoders as discriminators rather than generators by Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.
GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
MT5 (from Google AI) released with the paper mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
RoBERTa (from Facebook), released together with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
T5 (from Google AI) released with the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.

Note

tf-transformers is a personal project and is not affiliated with any organization. So, I might not be able to host equivalent checkpoints of all base models. As a result, there is a conversion notebook to convert the above-mentioned architectures from HuggingFace to tf-transformers.

Credits

I want to give credit to the official TensorFlow NLP repository. I used the November 2019 version of the master branch (where tf.keras.Network was used for models). I have modified it to a large extent since then.

Apart from that, I have used many common scripts from many open repos. I might not be able to recall all of them, but credit goes to them too.

Citation

:-)

Comments

Where is the benchmark about 90 times faster than HF transformers?

You said this library is 90 times faster than HF transformers, but there is no benchmark about it. https://github.com/legacyai/tf-transformers/tree/main/benchmarks
question

opened by hyunwoongko 11
Colab

This is great work!!! I have problem with TF2+HF with too many errors, reported to TF2, I aim to switch to tf-transformers. Though library did not work in colab, I guess there are some missing files? Thanks.

opened by Rababalkhalifa 5
HF models are not using key-value caching?

I was reading the code for the HF GPT2 benchmark, and it seems like key-value caching is not being used? This is pretty important for any kind of autoregressive generation and would greatly speed up the decoding time. HF models have had support for key-value caching for a while, see config arguments use_cache and past_key_values here: https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.GPT2LMHeadModel.

I think it would be important for this project to re-benchmark the HF models with key-value caching enabled, as that is standard practice and without it the HF numbers are being handicapped.
question

opened by abhi-mosaic 2
Bump urllib3 from 1.26.3 to 1.26.5
Bumps urllib3 from 1.26.3 to 1.26.5.

Release notes

Sourced from urllib3's releases.

1.26.5

:warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

Fixed deprecation warnings emitted in Python 3.10.

Updated vendored six library to 1.16.0.

Improved performance of URL parser when splitting the authority component.

If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

1.26.4

:warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap

Changed behavior of the default SSLContext when connecting to HTTPS proxy during HTTPS requests. The default SSLContext now sets check_hostname=True.

If you or your organization rely on urllib3 consider supporting us via GitHub Sponsors

Changelog

Sourced from urllib3's changelog.

1.26.5 (2021-05-26)

Fixed deprecation warnings emitted in Python 3.10.

Updated vendored six library to 1.16.0.

Improved performance of URL parser when splitting the authority component.

1.26.4 (2021-03-15)

Changed behavior of the default SSLContext when connecting to HTTPS proxy during HTTPS requests. The default SSLContext now sets check_hostname=True.

Commits

d161647 Release 1.26.5

2d4a3fe Improve performance of sub-authority splitting in URL

2698537 Update vendored six to 1.16.0

07bed79 Fix deprecation warnings for Python 3.10 ssl module

d725a9b Add Python 3.10 to GitHub Actions

339ad34 Use pytest==6.2.4 on Python 3.10+

f271c9c Apply latest Black formatting

1884878 [1.26] Properly proxy EOF on the SSLTransport test suite

a891304 Release 1.26.4

8d65ea1 Merge pull request from GHSA-5phf-pp7p-vc2r

Additional commits viewable in compare view


dependencies
opened by dependabot[bot] 2

enable flexible tf version for tf.keras.mix_precision global_policy feature

according to this post https://stackoverflow.com/questions/67037067/attributeerror-module-tensorflow-keras-mixed-precision-has-no-attribute-set

global_policy is no longer experimental but a feature after tensorflow 2.4
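
A version-tolerant lookup of the kind this PR describes could be sketched as follows. This is not necessarily the change made in the PR; `resolve_global_policy` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for the newer and older module layouts so that the sketch runs without TensorFlow installed.

```python
from types import SimpleNamespace


def resolve_global_policy(mixed_precision):
    """Return the current policy from a `tf.keras.mixed_precision`-like module.

    Prefers the stable `global_policy` and falls back to `experimental.global_policy`.
    """
    if hasattr(mixed_precision, "global_policy"):
        return mixed_precision.global_policy()
    return mixed_precision.experimental.global_policy()


# Stand-ins for the two module layouts (hypothetical return values).
newer = SimpleNamespace(global_policy=lambda: "stable-api")
older = SimpleNamespace(experimental=SimpleNamespace(global_policy=lambda: "experimental-api"))
print(resolve_global_policy(newer), resolve_global_policy(older))
```

With TensorFlow available, the call would be `resolve_global_policy(tf.keras.mixed_precision)`.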

This PR would provide users with flexibility of TensorFlow versions, otherwise, the following error would occur:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_1331/2018827728.py in <module>
      4 
      5 # Initializing a model from the original configuration
----> 6 model = RobertaModel.from_config(configuration)

/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta_model.py in from_config(cls, config, return_layer, use_mlm_layer, **kwargs)
    155         # Just create a model and return it with random_weights
    156         # (Distribute strategy fails)
--> 157         model_layer = Encoder(config_dict, **kwargs_copy)
    158         if use_mlm_layer:
    159             model_layer = MaskedLMModel(model_layer, config_dict["embedding_size"], config_dict["layer_norm_epsilon"])

/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py in __init__(self, config, mask_mode, name, use_dropout, is_training, use_auto_regressive, use_decoder, batch_size, sequence_length, return_all_layer_outputs, **kwargs)
    147         self.call_fn = self.get_call_method(self._config_dict)
    148         # Initialize model
--> 149         self.model_inputs, self.model_outputs = self.get_model(initialize_only=True)
    150 
    151     def get_model(self: LegacyLayer, initialize_only: bool = False):

/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py in get_model(self, initialize_only)
    242                 del inputs["past_length"]
    243 
--> 244         layer_outputs = self(inputs)
    245         if initialize_only:
    246             return inputs, layer_outputs

/opt/conda/lib/python3.8/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py in tf__call(self, inputs)
      8                 do_return = False
      9                 retval_ = ag__.UndefinedReturnValue()
---> 10                 outputs = ag__.converted_call(ag__.ld(self).call_fn, (ag__.ld(inputs),), None, fscope)
     11                 try:
     12                     do_return = True

/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py in tf__call_encoder(self, inputs)
    121                 i = ag__.Undefined('i')
    122                 layer = ag__.Undefined('layer')
--> 123                 ag__.for_stmt(ag__.converted_call(ag__.ld(range), (ag__.ld(self)._config_dict['num_hidden_layers'],), None, fscope), None, loop_body, get_state_5, set_state_5, ('embeddings',), {'iterate_names': 'i'})
    124                 cls_token_tensor = ag__.converted_call(ag__.converted_call(ag__.ld(tf).keras.layers.Lambda, (ag__.autograph_artifact((lambda x: ag__.converted_call(ag__.ld(tf).squeeze, (ag__.ld(x)[:, 0:1, :],), dict(axis=1), fscope))),), None, fscope), (ag__.ld(encoder_outputs)[(- 1)],), None, fscope)
    125                 cls_output = ag__.converted_call(ag__.ld(self)._pooler_layer, (ag__.ld(cls_token_tensor),), None, fscope)

/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py in loop_body(itr)
    116                     i = itr
    117                     layer = ag__.ld(self)._transformer_layers[ag__.ld(i)]
--> 118                     (embeddings, _, _) = ag__.converted_call(ag__.ld(layer), ([ag__.ld(embeddings), ag__.ld(attention_mask)],), None, fscope)
    119                     ag__.converted_call(ag__.ld(encoder_outputs).append, (ag__.ld(embeddings),), None, fscope)
    120                 _ = ag__.Undefined('_')

/opt/conda/lib/python3.8/site-packages/tf_transformers/layers/transformer/bert_transformer.py in tf__call(self, inputs, mode, cache_key, cache_value)
     26                     outputs = ag__.converted_call(ag__.ld(self).call_encoder, (ag__.ld(inputs),), dict(cache_key=ag__.ld(cache_key), cache_value=ag__.ld(cache_value)), fscope)
     27                 outputs = ag__.Undefined('outputs')
---> 28                 ag__.if_stmt(ag__.ld(self)._use_decoder, if_body, else_body, get_state, set_state, ('outputs',), 1)
     29                 try:
     30                     do_return = True

/opt/conda/lib/python3.8/site-packages/tf_transformers/layers/transformer/bert_transformer.py in else_body()
     24                 def else_body():
     25                     nonlocal outputs
---> 26                     outputs = ag__.converted_call(ag__.ld(self).call_encoder, (ag__.ld(inputs),), dict(cache_key=ag__.ld(cache_key), cache_value=ag__.ld(cache_value)), fscope)
     27                 outputs = ag__.Undefined('outputs')
     28                 ag__.if_stmt(ag__.ld(self)._use_decoder, if_body, else_body, get_state, set_state, ('outputs',), 1)

/opt/conda/lib/python3.8/site-packages/tf_transformers/layers/transformer/bert_transformer.py in tf__call_encoder(self, inputs, cache_key, cache_value)
     29                 attention_output = ag__.converted_call(ag__.ld(self)._attention_dropout, (ag__.ld(attention_output),), dict(training=ag__.ld(self)._use_dropout), fscope)
     30                 attention_output = ag__.converted_call(ag__.ld(self)._attention_layer_norm, ((ag__.ld(input_tensor) + ag__.ld(attention_output)),), None, fscope)
---> 31                 attention_output = ag__.converted_call(ag__.ld(tf).cast, (ag__.ld(attention_output),), dict(dtype=ag__.converted_call(ag__.ld(tf_utils).get_dtype, (), None, fscope)), fscope)
     32                 intermediate_output = ag__.converted_call(ag__.ld(self)._intermediate_dense, (ag__.ld(attention_output),), None, fscope)
     33                 layer_output = ag__.converted_call(ag__.ld(self)._output_dense, (ag__.ld(intermediate_output),), None, fscope)

/opt/conda/lib/python3.8/site-packages/tf_transformers/utils/tf_utils.py in tf__get_dtype()
     10                 retval_ = ag__.UndefinedReturnValue()
     11                 dtype = ag__.ld(tf).float32
---> 12                 policy = ag__.converted_call(ag__.ld(tf).keras.mixed_precision.experimental.global_policy, (), None, fscope)
     13 
     14                 def get_state():

AttributeError: Exception encountered when calling layer "tf_transformers/roberta" (type RobertaEncoder).

in user code:

    File "/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py", line 718, in call  *
        outputs = self.call_fn(inputs)
    File "/opt/conda/lib/python3.8/site-packages/tf_transformers/models/roberta/roberta.py", line 290, in call_encoder  *
        embeddings, _, _ = layer([embeddings, attention_mask])
    File "/opt/conda/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_fileflkmddlu.py", line 28, in tf__call
        ag__.if_stmt(ag__.ld(self)._use_decoder, if_body, else_body, get_state, set_state, ('outputs',), 1)
    File "/tmp/__autograph_generated_fileflkmddlu.py", line 26, in else_body
        outputs = ag__.converted_call(ag__.ld(self).call_encoder, (ag__.ld(inputs),), dict(cache_key=ag__.ld(cache_key), cache_value=ag__.ld(cache_value)), fscope)
    File "/tmp/__autograph_generated_filetd9bb7wo.py", line 31, in tf__call_encoder
        attention_output = ag__.converted_call(ag__.ld(tf).cast, (ag__.ld(attention_output),), dict(dtype=ag__.converted_call(ag__.ld(tf_utils).get_dtype, (), None, fscope)), fscope)
    File "/tmp/__autograph_generated_file9o5z35_o.py", line 12, in tf__get_dtype
        policy = ag__.converted_call(ag__.ld(tf).keras.mixed_precision.experimental.global_policy, (), None, fscope)

    AttributeError: Exception encountered when calling layer "transformer/layer_0" (type TransformerBERT).
    
    in user code:
    
        File "/opt/conda/lib/python3.8/site-packages/tf_transformers/layers/transformer/bert_transformer.py", line 331, in call  *
            outputs = self.call_encoder(inputs, cache_key=cache_key, cache_value=cache_value)
        File "/opt/conda/lib/python3.8/site-packages/tf_transformers/layers/transformer/bert_transformer.py", line 256, in call_encoder  *
            attention_output = tf.cast(attention_output, dtype=tf_utils.get_dtype())
        File "/opt/conda/lib/python3.8/site-packages/tf_transformers/utils/tf_utils.py", line 178, in get_dtype  *
            policy = tf.keras.mixed_precision.experimental.global_policy()
    
        AttributeError: module 'keras.api._v2.keras.mixed_precision' has no attribute 'experimental'
    
    
    Call arguments received by layer "transformer/layer_0" (type TransformerBERT):
      • inputs=['tf.Tensor(shape=(None, None, 768), dtype=float32)', 'tf.Tensor(shape=(None, None, None), dtype=float32)']
      • mode=encoder
      • cache_key=None
      • cache_value=None


Call arguments received by layer "tf_transformers/roberta" (type RobertaEncoder):
  • inputs={'input_ids': 'tf.Tensor(shape=(None, None), dtype=int32)', 'input_mask': 'tf.Tensor(shape=(None, None), dtype=int32)', 'input_type_ids': 'tf.Tensor(shape=(None, None), dtype=int32)'}

This is shown after following the official example

from tf_transformers.models import RobertaConfig, RobertaModel
# Initializing an bert-base-uncased style configuration
configuration = RobertaConfig()

# Initializing a model from the original configuration
model = RobertaModel.from_config(configuration)

settings:

tf_transformers version: 2.0.0
tensorflow text version: 2.9.0
sentencepiece version: 0.1.97
tensorflow version: 2.9.1

opened by kerrychu 0

Releases(v2.0.0)

v2.0.0(Apr 8, 2022)
This is the first stable version of tf-transformers.

What's Changed

Added new tutorials + docs by @legacyai in https://github.com/legacyai/tf-transformers/pull/35

Added Code Translation tutorial by @legacyai in https://github.com/legacyai/tf-transformers/pull/36

Fixed docs , tutorials and patch by @legacyai in https://github.com/legacyai/tf-transformers/pull/37

Ready to release v2.0.0 by @legacyai in https://github.com/legacyai/tf-transformers/pull/38

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.21...v2.0.0
Source code(tar.gz)
Source code(zip)
v1.0.21(Apr 3, 2022)

Source code(tar.gz)
Source code(zip)
v1.0.20(Apr 2, 2022)
What's Changed

More tutorials + Documentaion by @legacyai in https://github.com/legacyai/tf-transformers/pull/27

Moved some documentation by @legacyai in https://github.com/legacyai/tf-transformers/pull/28

feat: Added tutorial for vit image classification by @legacyai in https://github.com/legacyai/tf-transformers/pull/29

Added Image Classification tutorial by @legacyai in https://github.com/legacyai/tf-transformers/pull/30

Added more tutorials . by @legacyai in https://github.com/legacyai/tf-transformers/pull/31

Added README.MD with more info by @legacyai in https://github.com/legacyai/tf-transformers/pull/32

Added Sentence Transformers + Model Usage + docs by @legacyai in https://github.com/legacyai/tf-transformers/pull/33

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.19...v1.0.20
Source code(tar.gz)
Source code(zip)
v1.0.19(Mar 10, 2022)
What's Changed

Fixing workflows by @legacyai in https://github.com/legacyai/tf-transformers/pull/24

Fix workflow in cd.yaml by @legacyai in https://github.com/legacyai/tf-transformers/pull/25

Added new tutorials and docs by @legacyai in https://github.com/legacyai/tf-transformers/pull/26

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.18...v1.0.19
Source code(tar.gz)
Source code(zip)
v1.0.18(Mar 3, 2022)
What's Changed

Merging some major changes by @legacyai in https://github.com/legacyai/tf-transformers/pull/23

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.17...v1.0.18
Source code(tar.gz)
Source code(zip)
v1.0.17(Jan 12, 2022)
What's Changed

fix: Added patch fix by @legacyai in https://github.com/legacyai/tf-transformers/pull/22

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.16...v1.0.17
Source code(tar.gz)
Source code(zip)
v1.0.16(Jan 12, 2022)
What's Changed

fix: Added patch and ref by @legacyai in https://github.com/legacyai/tf-transformers/pull/21

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.13...v1.0.16
Source code(tar.gz)
Source code(zip)
v1.0.15(Jan 12, 2022)

Source code(tar.gz)
Source code(zip)
v1.0.14(Jan 12, 2022)
What's Changed

fix: So many fixes by @legacyai in https://github.com/legacyai/tf-transformers/pull/18

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.13...v1.0.14
Source code(tar.gz)
Source code(zip)
v1.0.13(Jan 12, 2022)
What's Changed

fix: Added patch by @legacyai in https://github.com/legacyai/tf-transformers/pull/20

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.15...v1.0.13
v1.0.12(Jan 12, 2022)
Test

What's Changed

fix: patch script + push tag on release.yaml by @legacyai in https://github.com/legacyai/tf-transformers/pull/16

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.11...v1.0.12
v1.0.11(Jan 12, 2022)
Test workflow

What's Changed

Fixed yaml file by @legacyai in https://github.com/legacyai/tf-transformers/pull/15

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.10...v1.0.11
v1.0.10(Jan 12, 2022)

Another test

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.9...v1.0.10
v1.0.9(Jan 12, 2022)
This is a test to check the workflow in release.yaml

What's Changed

fix: Added workflow tests by @legacyai in https://github.com/legacyai/tf-transformers/pull/14

Full Changelog: https://github.com/legacyai/tf-transformers/compare/v1.0.8...v1.0.9
v1.0.4(Jan 5, 2022)

v1.0.3(Mar 21, 2021)

v1.0.2(Mar 21, 2021)

v1.0.1(Mar 15, 2021)

This is the first official release of tf-transformers: NLP with TensorFlow 2.0 and TFLite. Added many tutorials and the best model using Joint Loss.


Korean Simple Contrastive Learning of Sentence Embeddings using SKT KoBERT and kakaobrain KorNLU dataset

KoSimCSE Korean Simple Contrastive Learning of Sentence Embeddings implementation using pytorch SimCSE Installation git clone https://github.com/BM-K/

34 Nov 24, 2022

A simple version of DeTR

DeTR-Lite A simple version of DeTR Before you enjoy this DeTR-Lite The purpose of this project is to allow you to learn the basic knowledge of DeTR. P

11 Jun 13, 2022

This repository contains the code, models and datasets discussed in our paper "Few-Shot Question Answering by Pretraining Span Selection"

Splinter This repository contains the code, models and datasets discussed in our paper "Few-Shot Question Answering by Pretraining Span Selection", to

88 Dec 31, 2022

A tool that helps build a talk preview image by combining the given background image and talk event description

talk-preview-img-builder A tool that helps build a talk preview image by combining the given background image and talk event description Installation and U

4 Aug 20, 2022

Data pipelines, 2021.

This repo illustrates a simple process for automating data transformation and modeling through a pipeline using Luigi. Main stack

8 May 19, 2022

This repository contains the code for "Generating Datasets with Pretrained Language Models".

Datasets from Instructions (DINO 🦕 ) This repository contains the code for Generating Datasets with Pretrained Language Models. The paper introduces

154 Jan 01, 2023

Binaural Speech Synthesis

Binaural Speech Synthesis This repository contains code to train a mono-to-binaural neural sound renderer. If you use this code or the provided datase

135 Dec 18, 2022

A single model that parses Universal Dependencies across 75 languages.

A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology tags, lemmas, and dependency trees.

189 Nov 29, 2022

🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.

pySBD: Python Sentence Boundary Disambiguation (SBD) pySBD - python Sentence Boundary Disambiguation (SBD) - is a rule-based sentence boundary detecti

549 Jan 06, 2023

An open collection of annotated voices in Japanese language

Koniwa (声庭): An open collection of annotated voices in Japanese language. Overview: Koniwa (声庭) is open speech and annotations that are free to use, modify, and redistribute

32 Dec 14, 2022

The Sudachi synonym dictionary in Solr format.

solr-sudachi-synonyms The Sudachi synonym dictionary in Solr format. Summary Run a script that checks for updates to the Sudachi dictionary every hou

3 Aug 19, 2022

📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation

Well-formed Limericks and Haikus with GPT2 📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation In collaboration with Matthew Korahais &

2 May 26, 2022

Contains descriptions and code of the mini-projects developed in various programming languages

TexttoSpeechAndLanguageTranslator-project introduction A pleasant application where the client will be given buttons like play, reset and exit. The cli

1 Dec 22, 2021

In this notebook I've built some machine-learning and deep-learning models to classify coronavirus tweets, in both multi-class and binary classification.

Hello, this notebook contains an example of coronavirus tweet multi-class classification. - The classes are: Extremely Positive, Positive, Extremely Negativ

3 Dec 06, 2022

Text-Summarization-using-NLP - Text summarization using NLP that fetches BBC News articles and summarizes their text; it also includes custom article summarization

Text-Summarization-using-NLP Text Summarization using NLP to fetch BBC News Arti

21 Aug 06, 2022

Code for evaluating Japanese pretrained models provided by NTT Ltd.

Mastering Transformers, published by Packt

Fine-tune GPT-3 with a Google Chat conversation history

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.