Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Overview




Texar-PyTorch is a toolkit aiming to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing a wide variety of models and algorithms. The toolkit is designed for both researchers and practitioners, enabling fast prototyping and experimentation. Texar-PyTorch was originally developed, and is actively maintained, by Petuum and CMU in collaboration with other institutes. A mirror of this repository is maintained by Petuum Open Source.

Texar-PyTorch integrates many of the best features of TensorFlow into PyTorch, delivering highly usable and customizable modules superior to PyTorch native ones.

Key Features

  • Two Versions, (Mostly) Same Interfaces. Texar-PyTorch (this repo) and Texar-TF have mostly the same interfaces. Both combine the best designs of TF and PyTorch:
    • Interfaces and variable sharing follow PyTorch conventions
    • Excellent factorization and rich functionality follow TF conventions
  • Versatile to support broad needs:
    • data processing, model architectures, loss functions, training and inference algorithms, evaluation, ...
    • encoder(s) to decoder(s), sequential- and self-attentions, memory, hierarchical models, classifiers, ...
    • maximum likelihood learning, reinforcement learning, adversarial learning, probabilistic modeling, ...
  • Fully Customizable at multiple abstraction levels -- both novice-friendly and expert-friendly.
    • Plug in any external modules freely, since Texar is fully compatible with the native PyTorch APIs.
  • Modularized for maximal re-use and clean APIs, based on principled decomposition of Learning-Inference-Model Architecture.
  • Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc., for encoding, classification, generation, and composing complex models with other Texar components (see the sketch right after this list)!
  • Clean, detailed documentation and rich examples.
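
For example, here is a minimal sketch of loading a pre-trained module through these uniform interfaces (module and argument names follow the Texar-PyTorch docs; the toy inputs are illustrative assumptions):

import torch
import texar.torch as tx

# Load a pre-trained BERT encoder; weights are downloaded automatically.
encoder = tx.modules.BERTEncoder(pretrained_model_name="bert-base-uncased")

# Toy batch: 2 sequences of token ids with their true lengths.
input_ids = torch.randint(0, 30000, (2, 16))
lengths = torch.tensor([16, 12])

outputs, pooled = encoder(inputs=input_ids, sequence_length=lengths)
print(outputs.shape)  # [batch_size, max_time, hidden_dim]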




Library API Example

A code example that builds and trains a Conditional GPT2 model (e.g., for machine translation and text summarization):

import torch
import torch.nn as nn

import texar.torch as tx
from texar.torch.run import *

# Hyperparameter dicts (emb_hparams, enc_hparams, data_hparams) and the
# BOS token id are assumed to be defined elsewhere.

# (1) Modeling
class ConditionalGPT2Model(nn.Module):
  """An encoder-decoder model with GPT-2 as the decoder."""
  def __init__(self, vocab_size):
    super().__init__()
    # Use hyperparameter dict for model configuration
    self.embedder = tx.modules.WordEmbedder(vocab_size, hparams=emb_hparams)
    self.encoder = tx.modules.TransformerEncoder(hparams=enc_hparams)
    self.decoder = tx.modules.GPT2Decoder("gpt2-small")  # With pre-trained weights

  def _get_decoder_output(self, batch, train=True):
    """Perform model inference, i.e., decoding."""
    enc_states = self.encoder(inputs=self.embedder(batch['source_text_ids']),
                              sequence_length=batch['source_length'])
    if train:  # Teacher-forcing decoding at training time
      return self.decoder(
          inputs=batch['target_text_ids'], sequence_length=batch['target_length'] - 1,
          memory=enc_states, memory_sequence_length=batch['source_length'])
    else:      # Beam search decoding at prediction time
      start_tokens = torch.full_like(batch['source_text_ids'][:, 0], BOS)
      return self.decoder(
          beam_width=5, start_tokens=start_tokens,
          memory=enc_states, memory_sequence_length=batch['source_length'])

  def forward(self, batch):
    """Compute training loss."""
    outputs = self._get_decoder_output(batch)
    loss = tx.losses.sequence_sparse_softmax_cross_entropy(  # Sequence loss
        labels=batch['target_text_ids'][:, 1:], logits=outputs.logits,
        sequence_length=batch['target_length'] - 1)  # Automatic masking
    return {"loss": loss}

  def predict(self, batch):
    """Compute model predictions."""
    sequence, _ = self._get_decoder_output(batch, train=False)
    return {"gen_text_ids": sequence}

  
# (2) Data
# Create dataset splits using built-in data loaders
datasets = {split: tx.data.PairedTextData(hparams=data_hparams[split])
            for split in ["train", "valid", "test"]}

model = ConditionalGPT2Model(datasets["train"].target_vocab.size)

# (3) Training
# Manage the train-eval loop with the Executor API
executor = Executor(
  model=model, datasets=datasets,
  optimizer={"type": torch.optim.Adam, "kwargs": {"lr": 5e-4}},
  stop_training_on=cond.epoch(20),
  log_every=cond.iteration(100),
  validate_every=cond.epoch(1),
  train_metric=("loss", metric.RunningAverage(10, pred_name="loss")),
  valid_metric=metric.BLEU(pred_name="gen_text_ids", label_name="target_text_ids"),
  save_every=cond.validation(better=True),
  checkpoint_dir="outputs/saved_models/")
executor.train()
executor.test(datasets["test"])

Many more examples are available here.

Installation

Texar-PyTorch requires:

  • python == 3.6 or 3.7
  • torch >= 1.0.0. Please follow the official instructions to install the appropriate version.

After torch is installed, install Texar from PyPI:

pip install texar-pytorch

To use cutting-edge features or develop locally, install from source:

git clone https://github.com/asyml/texar-pytorch.git
cd texar-pytorch
pip install .

To use TensorBoard support with Executor, install tensorboardX:

pip install tensorboardX

Getting Started

Reference

If you use Texar, please cite the tech report with the following BibTeX entry:

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wanrong Zhu, Devendra Sachan and Eric Xing
ACL 2019

@inproceedings{hu2019texar,
  title={Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation},
  author={Hu, Zhiting and Shi, Haoran and Tan, Bowen and Wang, Wentao and Yang, Zichao and Zhao, Tiancheng and He, Junxian and Qin, Lianhui and Wang, Di and others},
  booktitle={ACL 2019, System Demonstrations},
  year={2019}
}

License

Apache License 2.0

Companies and Universities Supporting Texar

Comments
  • Design decision query: recommended pattern for auxiliary loss terms that should be ignored during evaluation?


    Hello!

    I came across texar-pytorch while in the process of writing my own version of the Executor, and am really happy that someone's already done the work for it, so firstly, thanks for the excellent repository.

    Broad Query: One of the main questions I have pertains to the design requirement of including the loss as an item in the dictionary returned by the forward method of the model. Essentially, I'm wondering what the recommended pattern is for including terms in the loss (e.g. regularization terms) that ought not to be part of the validation loss. Conceptually, I think of the forward pass as being completed once the model has made its predictions, which is what I believe the predict() method is for. However, making the computation of the loss a responsibility of the forward pass could lead to certain problems.

    Context: Ideally, the training loss ought to be computed in a separate forward pass after the training epoch is completed. I'm aware that most people use an average over the training batches as an approximation of the training loss. However, this becomes an issue when comparing against a validation loss curve, where the difference between the training and validation curves indicates generalization error. This is for two reasons in the typical case:

    1. The model changes at the end of each batch when the optimizer takes a step, so it's an unfair comparison against the evaluation setting. The model might also overfit on a single batch.
    2. The model might have dropout and batch norm turned on during training which behave differently during evaluation.

    As far as I can tell, the training loss is computed within the training loop in the executor, as opposed to an additional forward pass over the training set with model.eval(). Is my understanding correct?
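
    Concretely, what I'd like to be able to run after each training epoch is something like the following sketch (train_loader is a hypothetical iterator over the training set; the {"loss": ...} return format is the Executor convention):

    import torch

    model.eval()  # dropout / batch norm switch to evaluation behavior
    with torch.no_grad():
        losses = [model(batch)["loss"].item() for batch in train_loader]
    train_loss = sum(losses) / len(losses)
    model.train()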

    On regularization: Typically, any regularization term over the model parameters is added to the loss before the call to backward. With the given interface, it seems like the right place to do this would be in the forward pass. However, I find it a little weird to make the model responsible for regularizing itself. I usually have a separate nn.Module subclass responsible for regularizing a model and dependency-inject the model into this class. That way I can swap out different regularizers without changing the model, in compliance with the Open-Closed SOLID principle.
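
    A minimal sketch of the pattern I have in mind (all names here are hypothetical, not part of Texar):

    import torch.nn as nn

    class RegularizedModel(nn.Module):
        """Hypothetical wrapper: adds an L2 penalty to the wrapped model's training loss."""

        def __init__(self, model: nn.Module, weight: float = 1e-4):
            super().__init__()
            self.model = model
            self.weight = weight

        def forward(self, batch):
            ret = self.model(batch)  # returns {"loss": ...} as the Executor expects
            penalty = sum(p.pow(2).sum() for p in self.model.parameters())
            ret["loss"] = ret["loss"] + self.weight * penalty
            return ret

        def predict(self, batch):
            return self.model.predict(batch)  # no regularization at prediction time

    This lets me swap regularizers without touching the model.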

    Could you please explain how to achieve these two things (loss computation over the training set after the train loop, and computing auxiliary loss terms outside the model) with the current setup of the Executor? This seems to be a direct consequence of requiring the model to compute the loss, which seems a little problematic. These are currently the two main problems precluding my use of this otherwise awesome repo, so I'd appreciate any insight.

    Thanks!

    topic: executor discussion 
    opened by chiragraman 14
  • Change feature type names in `RecordData`


    The feature types in RecordData (FixedLenFeature, FixedLenSequenceFeature, and VarLenFeature) are directly borrowed from TensorFlow and might cause confusion for PyTorch users. Based on my discussion with @AvinashBukkittu yesterday, we think it might be worthwhile to also introduce a set of aliases for these types, and use them in the examples.

    Here's the meaning behind each feature type, and what happens when we collate such features (a short sketch follows the list):

    • FixedLenFeature represents tensor features that have a fixed shape. These features are stacked together (torch.stack(features, dim=0)) while collating.
    • FixedLenSequenceFeature represents a list of tensor features where each list element has the same fixed shape. These features are padded and batched (tx.data.padded_batch(features)) while collating.
    • VarLenFeature can be tensors of any shape, or other objects that are not tensors. These features are simply put into a list (list(features)) while collating.
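
    For reference, a minimal sketch of these three collation behaviors in plain PyTorch (pad_sequence stands in for tx.data.padded_batch here):

    import torch
    from torch.nn.utils.rnn import pad_sequence

    # FixedLenFeature: identical shapes, stacked into a single tensor.
    fixed = [torch.ones(3, 4) for _ in range(8)]
    print(torch.stack(fixed, dim=0).shape)             # torch.Size([8, 3, 4])

    # FixedLenSequenceFeature: variable numbers of fixed-shape elements,
    # padded to a common length and then batched.
    seqs = [torch.ones(n, 4) for n in (2, 5, 3)]
    print(pad_sequence(seqs, batch_first=True).shape)  # torch.Size([3, 5, 4])

    # VarLenFeature: arbitrary objects, simply collected into a list.
    varlen = [torch.ones(2), "a string", {"any": "object"}]
    print(list(varlen))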

    I think we can rename the feature types to either:

    • describe the object types of the input features...
      • tensor, tensor_seq, and any/other.
    • ... or, describe how they're collated.
      • tensor, pad_tensor, and list.

    But yeah, any suggestions are welcome.


    Also, for the feature_original_types format: https://github.com/asyml/texar-pytorch/blob/43967ee238a5da2e996f4f71644a940a86cad009/texar/torch/data/data/record_data.py#L444-L470

    Since VarLenFeature can be any object, it does not really make sense to ask the user to write a dtype. Maybe we can also accept None here.

    enhancement topic: data 
    opened by huzecong 10
  • Add vae_text example


    • Port vae_text from texar-TF.

    • Add external distribution MultivariateNormalDiag.

    • Add preprocessing for data batch.

    • Modify None checking condition for initial_state in RNNDecoderBase.

    • Modify max_pos for config_trans_yahoo.py.

    • Modify connectors mlp function.

    • Refactor vae_text training & generation decoder.

    • Refactor vae_text decoder embeddings.

    • Refactor to import texar.torch.

    • Polish code.

    opened by TomNong 10
  • Workflow and code update for numpy versions from 1.15 to 1.21


    This PR fixes https://github.com/asyml/texar-pytorch/issues/333 and fixes https://github.com/asyml/texar-pytorch/issues/341

    Code is updated to pass the mypy test with numpy>=1.20 (which fixes #333)

    Workflow is updated to enumerate numpy versions (from 1.15 to 1.21), and further include pytorch version (1.7.1 and 1.8.1), which fixes #341

    Note: In mypy.ini, I changed warn_unused_ignores to False (here) for the enumeration of numpy versions. I'm not sure if there is a better way to do it, as with mypy-torch. It looks like adding the same line under [mypy-numpy] doesn't work.

    opened by tanyuqian 9
  • Update RoBERTa vocabulary files


    import torch
    roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
    roberta.eval()
    
    tokens = roberta.encode('Hello world!')
    print(tokens)  # [    0, 31414,   232,   328,     2]
    
    import texar.torch as tx
    tokenizer = tx.data.RoBERTaTokenizer(pretrained_model_name='roberta-base')
    
    input_ids, _ = tokenizer.encode_text('Hello world!', max_seq_length=5)
    print(input_ids)  # [0, 31414, 232, 328, 2]
    
    opened by gpengzhi 9
  • Resolve issue #196: Data module enhancement


    1. Removed the device moving operations (.to(device)) in collate methods of built-in data modules and examples.

    2. Added a call to move_memory in __next__ of the dataset iterators before returning the batch. move_memory internally calls map_structure to recursively move all tensors to GPU memory (a standalone sketch of this recursion follows the list).

      It is worth noting that we could have combined this into the pin_memory thread of the dataloader. We don't do this for two reasons:

      • It is pretty difficult to modify code in PyTorch without basically copy-pasting everything.
      • pin_memory is called for every tensor in the prefetched queue. Moving all of them to CUDA memory might result in excessive memory usage.

      It is also worth noting that this is better than the previous approach: without modifying any other code, the Transformer example now runs ~15% faster (172 ex/s to 199 ex/s).

    3. The DataBase class is now renamed to DatasetBase to avoid confusion. A default collate implementation is not added, since the PyTorch default_collate function has undesirable behaviors (for instance, if each example contains a list of integers, the batch will be collated as a list of tensors), and it'll be inconsistent if we introduce some different default behavior.
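
    For illustration, a standalone approximation of the recursion described in item 2 (the actual implementation goes through map_structure; this sketch assumes only plain PyTorch):

    import torch

    def move_memory(batch, device):
        """Recursively move all tensors in a nested structure to `device`."""
        if isinstance(batch, torch.Tensor):
            return batch.to(device, non_blocking=True)
        if isinstance(batch, dict):
            return {k: move_memory(v, device) for k, v in batch.items()}
        if isinstance(batch, (list, tuple)):
            return type(batch)(move_memory(v, device) for v in batch)
        return batch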

    opened by huzecong 9
  • Add Tokenizer Module (Pre-trained Tokenizer)


    The design of the pre-trained tokenizer module borrows ideas from both pytorch-transformer and our pretrained module.

    To instantiate a tokenizer, you can specify pretrained_model_name

    tokenizer = BERTTokenizer(pretrained_model_name='bert-base-uncased')
    

    or specify hparams

    tokenizer = BERTTokenizer(hparams={'pretrained_model_name': 'bert-base-uncased'})
    

    or directly load from the vocab file

    tokenizer = BERTTokenizer(hparams={'pretrained_model_name': None, 'vocab_file': 'path_to_vocab'})
    

    For the downstream tasks, you can add new tokens to the tokenizer

    tokenizer.add_tokens([...])
    

    and get the latest vocabulary size, which can be used to define the downstream model

    current_vocab_size = len(tokenizer)
    

    You can also save the tokenizer to a directory (vocab file, special token file, added token file, and config file will be saved)

    tokenizer.save('path-to-directory')
    

    or load one from a directory

    tokenizer = BERTTokenizer.load('path-to-directory')
    

    Basically, we provide four core functions for each tokenizer: text-to-token, text-to-id, id-to-token, and id-to-text.

    tokens = tokenizer(inputs=text, task='text-to-token')
    ids = tokenizer(inputs=text, task='text-to-id')
    tokens = tokenizer(inputs=ids, task='id-to-token')
    text = tokenizer(inputs=ids, task='id-to-text')
    
    opened by gpengzhi 9
  • Not support Windows system?


    When I run the code on Windows, I encounter the error

    RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got CUDAType instead

    but when I run the same code on Linux, everything is OK.

    I checked that the PyTorch versions were both 1.1.0.

    bug 
    opened by Codle 9
  • Seq2Seq Example with GPU Support


    Hello,

    How can I run the Seq2Seq example with my GPU?

    I already modified the training data to use the cuda device as well as the model:

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 
    
    train_data = tx.data.PairedTextData(hparams=config_data.train, device=device)
    val_data = tx.data.PairedTextData(hparams=config_data.val, device=device)
    test_data = tx.data.PairedTextData(hparams=config_data.test, device=device)
    
    model = Seq2SeqAttn(train_data)
    model.to(device)
    
    bug priority: high 
    opened by bigabig 9
  • Cannot download dataset in bert example


    When I run

    python data/download_glue_data.py --tasks=MRPC
    

    I got

    dyld: Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib
      Referenced from: /usr/local/bin/wget
      Reason: image not found
    dyld: Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib
      Referenced from: /usr/local/bin/wget
      Reason: image not found
    Processing MRPC...
    Traceback (most recent call last):
      File "data/download_glue_data.py", line 152, in <module>
        sys.exit(main(sys.argv[1:]))
      File "data/download_glue_data.py", line 142, in main
        format_mrpc(args.data_dir, args.path_to_mrpc)
      File "data/download_glue_data.py", line 55, in format_mrpc
        f"Train data not found at {mrpc_train_file}"
    AssertionError: Train data not found at data/MRPC/msr_paraphrase_train.txt
    

    Since you are testing the BERT example, can you investigate this issue? @atif93

    bug topic: examples 
    opened by gpengzhi 7
  • Question about the usage of helper in TransformerDecoder


    Hi~ I want to implement step-by-step decoding with TransformerDecoder using a TrainingHelper(), but I don't know how to call the forward function at each step as with an RNN, e.g.

    outputs, hidden = self.gru(embedded, hidden) # forward for every step

    Is this done in the step method of the Helper class? Hoping for your help!

    question 
    opened by ha-lins 6
  • Code blocks in docstring are not fully rendered as sphinx doc.


    The code blocks in the docstring are not rendered as expected in the Sphinx docs.

    The code blocks in the following example (and a few that follow) are not rendered correctly in the resulting documentation. Probably there is a slight indentation mismatch here.

    bug topic: docs 
    opened by hunterhector 0
  • GPU memory usage when doing beam search


    A BART model (https://arxiv.org/pdf/1910.13461.pdf) is implemented here: https://github.com/tanyuqian/texar-pytorch/tree/master/examples/bart

    It has passed the test of text classification (MNLI) and summarization (CNN/DM) with greedy decoding, but it fails to run CNN/DM with beam search on a single GTX 1080Ti because it runs out of GPU memory, even with batch_size=1, beam_width=2, max_decoding_length=140.

    A script to show this issue is here: https://github.com/tanyuqian/texar-pytorch/blob/master/examples/bart/bart_cnn.py (run this code after downloading CNN/DM data following README)

    Note that in this fork, two more hyperparameters are added in TransformerDecoder ('normalize_before' and 'final_layer_norm'): https://github.com/tanyuqian/texar-pytorch/blob/master/texar/torch/modules/decoders/transformer_decoders.py#L290

    question topic: modules 
    opened by tanyuqian 0
  • Error when decoder has more than 1 layer.


    https://github.com/asyml/texar-pytorch/blob/0ba18bff28cd8fff2640021354c15dfd4aef2f72/examples/vae_text/config_lstm_yahoo.py#L62

    The output is the following: RuntimeError: Input batch size 128 doesn't match hidden[0] batch size 256

    The issue is due to the initial_state=lstm_states passed when the decoder is forwarded.

    question topic: examples 
    opened by pajola 0
  • Incorporating copy mechanism in decoder


    I'm really enjoying this library, thanks for your work. Just curious, are there any plans to implement some sort of copying mechanism for decoding, e.g. CopyNet (https://arxiv.org/abs/1603.06393)?

    enhancement topic: modules 
    opened by roemmele 2
  • Add ELMo modules


    Add texar-styled ELMo encoder adapted from allennlp. The corresponding tokenizer will be in another PR.

    Resolve some comments in #298

    I checked the implementation of ELMo in allennlp. It seems that they use a customized LSTM, so we cannot use our LSTM module to implement it directly. The Highway module they use is also different from our HighwayWrapper. I feel that it is better to directly use their implementations, whose correctness is guaranteed by their unit tests. Please let me know your thoughts @huzecong

    opened by gpengzhi 2
Releases(v0.1.4)
  • v0.1.4(Apr 14, 2022)

    • Add tests for python3.8 and python3.9 https://github.com/asyml/texar-pytorch/pull/340
    • Workflow and code update for numpy versions from 1.15 to 1.21 https://github.com/asyml/texar-pytorch/pull/352
    • Move out HParams and SpecialTokens to asyml-utilities https://github.com/asyml/texar-pytorch/pull/353
    • Several bug fixes https://github.com/asyml/texar-pytorch/issues/335 https://github.com/asyml/texar-pytorch/pull/345 https://github.com/asyml/texar-pytorch/pull/351
    Source code(tar.gz)
    Source code(zip)
  • v0.1.2(Mar 29, 2021)

    New features

    1. Integrated texar-pytorch with NNI and AdaptDL for distributed adaptive API (https://github.com/asyml/texar-pytorch/pull/331)
    2. Integrated with NNI for hyperparameter tuning (https://github.com/asyml/texar-pytorch/pull/324)
    3. Add Information Loss based on KL divergence (https://github.com/asyml/texar-pytorch/pull/328)
    4. Allow WordpieceTokenizer to return the original character spans (https://github.com/asyml/texar-pytorch/pull/332)
    5. Add a few modules: RNN Classifier (https://github.com/asyml/texar-pytorch/pull/303), SpanBERT (https://github.com/asyml/texar-pytorch/pull/300)

    Feature improvements:

    1. Fix a few documentation issues.

    Fixes

    1. Fix a type error in Beam Search top-k index (https://github.com/asyml/texar-pytorch/pull/330)
    2. Fix a file operation error in Executor (https://github.com/asyml/texar-pytorch/pull/323)
    3. Fix some evaluator bugs and related file handling (https://github.com/asyml/texar-pytorch/pull/320)
    4. Fix a problem where metrics cannot be pickled (https://github.com/asyml/texar-pytorch/pull/319)
    Source code(tar.gz)
    Source code(zip)
  • v0.1.1(Feb 7, 2020)

    New features

    • Support PyTorch 1.3. (#249)
    • Add T5 modules (T5Encoder, T5Decoder, and T5EncoderDecoder). (#280)
    • Add T5Tokenizer. (#283)
    • Support PyTorch 1.4. (#291)

    Feature improvements

    • Refactor the interface of GPT2 modules. (#238)
    • Support gpt2-xl checkpoint file in GPT2 modules. (#242)
    • Add code coverage check in CI. (#245)
    • Update the vocabulary files of RoBERTa modules. (#255)
    • Disable codecov/patch check in CI. (#265)
    • Provide option to freeze the embedding parameters. (#271)
    • Add encode_text_for_generation function in XLNetTokenizer. (#278)
    • Use warning instead of error in map_token_to_id function. (#285)
    • Add copyright header to unit tests. (#287)
    • Remove duplicated pytest in CI. (#289)
    • Update the versions of pylint, flake8, and mypy in CI. (#292)

    Fixes

    • Fix the documentation issues in SentencePieceTokenizer. (#236)
    • Fix the bugs in RoBERTa checkpoint file loading procedure. (#241)
    • Fix the documentation issues in Executor. (#244)
    • Fix the documentation issues in gpt-2 example. (#250)
    • Fix the bugs in bidirectional_dynamic_rnn and dynamic_rnn functions. (#252)
    • Fix the bugs in vae_text example. (#253)
    • Fix the bugs in sentence_classifier example. (#262)
    • Fix the path error when installing texar-pytorch in Windows. (#268)
    • Fix the bugs in XLNetTokenizer. (#273)
    • Fix the bugs in download_checkpoint function. (#274)
    • Fix the bugs in google drive downloading function. (#275)
    • Fix the bugs in the unit test of GPT2Decoder. (#288)
    • Fix the documentation issues in Decoder module. (#290)
    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Oct 15, 2019)

  • v0.0.1(Aug 2, 2019)

Owner
ASYML
Machine Learning as Machine Assembly, part of the CASL project https://casl-project.ai/