Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Overview



pypi Build Status codecov Documentation Status License

Texar is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar provides a library of easy-to-use ML modules and functionalities for composing whatever models and algorithms. The tool is designed for both researchers and practitioners for fast prototyping and experimentation.

Texar was originally developed and is actively contributed by Petuum and CMU in collaboration with other institutes. A mirror of this repository is maintained by Petuum Open Source.

Key Features

  • Two Versions, (Mostly) Same Interfaces. Texar-TensorFlow (this repo) and Texar-PyTorch have mostly the same interfaces. Both further combine the best design of TF and PyTorch:
    • Interfaces and variable sharing in PyTorch convention
    • Excellent factorization and rich functionalities in TF convention.
  • Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc, for encoding, classification, generation, and composing complex models with other Texar components!
  • Fully Customizable at multiple abstraction level -- both novice-friendly and expert-friendly.
    • Free to plug in whatever external modules, since Texar is fully compatible with the native TF/PyTorch APIs.
  • Versatile to support broad tasks, models, algorithms, data processing, evaluation, etc.
    • encoder(s) to decoder(s), sequential- and self-attentions, memory, hierarchical models, classifiers...
    • maximum likelihood learning, reinforcement learning, adversarial learning, probabilistic modeling, ...
  • Modularized for maximal re-use and clean APIs, based on principled decomposition of Learning-Inference-Model Architecture.
  • Distributed model training with multiple GPUs.
  • Clean, detailed documentation and rich examples.


Library API Example

Builds an encoder-decoder model, with maximum likelihood learning:

import texar.tf as tx

# Data 
data = tx.data.PairedTextData(hparams=hparams_data) # a dict of hyperparameters 
iterator = tx.data.DataIterator(data)
batch = iterator.get_next()                         # get a data mini-batch

# Model architecture
embedder = tx.modules.WordEmbedder(data.target_vocab.size, hparams=hparams_emb)
encoder = tx.modules.TransformerEncoder(hparams=hparams_enc)
outputs_enc = encoder(inputs=embedder(batch['source_text_ids']),  # call as a function
                      sequence_length=batch['source_length'])
                      
decoder = tx.modules.TransformerDecoder(
    output_layer=tf.transpose(embedder.embedding) # tie input embedding w/ output layer
    hparams=hparams_decoder)
outputs, _, _ = decoder(memory=output_enc, 
                        memory_sequence_length=batch['source_length'],
                        inputs=embedder(batch['target_text_ids']),
                        sequence_length=batch['target_length']-1,
                        decoding_strategy='greedy_train')    # teacher-forcing decoding
                        
# Loss for maximum likelihood learning
loss = tx.losses.sequence_sparse_softmax_cross_entropy(
    labels=batch['target_text_ids'][:, 1:],
    logits=outputs.logits,
    sequence_length=batch['target_length']-1)  # automatic sequence masks

# Beam search decoding
outputs_bs, _, _ = tx.modules.beam_search_decode(
    decoder,
    embedding=embedder,
    start_tokens=[data.target_vocab.bos_token_id]*num_samples,
    end_token=data.target_vocab.eos_token_id)

The same model, but with adversarial learning:

helper = tx.modules.GumbelSoftmaxTraingHelper( # Gumbel-softmax decoding
    start_tokens=[BOS]*batch_size, end_token=EOS, embedding=embedder)
outputs, _ = decoder(helper=helper)            # automatic re-use of the decoder variables

discriminator = tx.modules.BertClassifier(hparams=hparams_bert)        # pre-trained model

G_loss, D_loss = tx.losses.binary_adversarial_losses(
    real_data=data['target_text_ids'][:, 1:],
    fake_data=outputs.sample_id,
    discriminator_fn=discriminator)

The same model, but with RL policy gradient learning:

agent = tx.agents.SeqPGAgent(samples=outputs.sample_id,
                             logits=outputs.logits,
                             sequence_length=batch['target_length']-1,
                             hparams=config_model.agent)

Many more examples are available here

Installation

(Note: Texar>0.2.3 requires Python 3.6 or 3.7. To use with older Python versions, please use Texar<=0.2.3)

Texar requires:

After tensorflow and tensorflow_probability are installed, install Texar from PyPI:

pip install texar

To use cutting-edge features or develop locally, install from source:

git clone https://github.com/asyml/texar.git
cd texar
pip install .

Getting Started

Reference

If you use Texar, please cite the tech report with the following BibTex entry:

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wanrong Zhu, Devendra Sachan and Eric Xing
ACL 2019

@inproceedings{hu2019texar,
  title={Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation},
  author={Hu, Zhiting and Shi, Haoran and Tan, Bowen and Wang, Wentao and Yang, Zichao and Zhao, Tiancheng and He, Junxian and Qin, Lianhui and Wang, Di and others},
  booktitle={ACL 2019, System Demonstrations},
  year={2019}
}

License

Apache License 2.0

Companies and Universities Supporting Texar

                  

Comments
  • Potential issue with mlp transform

    Potential issue with mlp transform

    Hi,

    Currently, there is an issue in _mlp_transform https://github.com/asyml/texar/blob/master/texar/modules/connectors/connectors.py#L73 , thanks @huzecong who found this issue.

    For input with shape [batch_size, max_time, dim], current _mlp_transform reshapes it to [batch_size, max_time * dim] and processes linear layer transform. The transform matrix has shape [max_time * dim, mlp_output_dim], which is equivalent to max_time number of smaller matrixes with size [dim, mlp_output_dim]. However, regarding same-time-vector should transform with same matrix, current _mlp_transform can not make transform in that way. Should _mlp_transform be modified?

    enhancement 
    opened by TomNong 9
  • ValueError: Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.

    ValueError: Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.

    when i try to do model_utils.init_bert_checkpoint(init_checkpoint) I get ValueError:Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.

    This was working fine earlier ,could this be due to the new changes introduced in texar ,as I have not changed my code .

    opened by Vibha111094 9
  • Transformer decoder is giving the same out for every example

    Transformer decoder is giving the same out for every example

    I have been using texar library to solve summarization problem. I have replaced encoder part with Bert and decode is Transformer Decoder with beam size 5. My loss got decreased from 11 to around 3 after 50 k iteration but when I try to infer on test data it gives me same output for every example any thoughts on this? Link to the code: https://github.com/santhoshkolloju/bert_summ

    opened by santhoshkolloju 8
  • problem with 'allow_smaller_final_batch' and 'beam_search_decode'

    problem with 'allow_smaller_final_batch' and 'beam_search_decode'

    Hi there, I encountered an error when trying the seq2seq_attn example with iwslt14 dataset. The full error log is appended at the end of the post.

    The error occurs when operating this line: https://github.com/asyml/texar/blob/9c699e8143fd8ecb5d65a41ceef09c45832b9258/examples/seq2seq_attn/seq2seq_attn.py#L125

    It indicates that the training stage is ok, and the bug occurs in the validation stage.

    Here are two pieces of error logs:

    2018-09-11 16:04:10.567407: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/fw/fw/dynamic_rnn/input_0_32362: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    2018-09-11 16:04:10.567928: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/bw/bw/dynamic_rnn/input_0_32364: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    ......
    
    Caused by op 'attention_rnn_decoder_5/tile_batch_1/Reshape', defined at:
      File "seq2seq_attn.py", line 161, in <module>
        main()
      File "seq2seq_attn.py", line 93, in main
        train_op, infer_outputs = build_model(batch, train_data)
      File "seq2seq_attn.py", line 77, in build_model
        max_decoding_length=60)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/beam_search_decode.py", line 193, in beam_search_decode
        cell = decoder_or_cell._get_beam_search_cell(beam_width=beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/rnn_decoders.py", line 545, in _get_beam_search_cell
        memory_seq_length, beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 122, in tile_batch
        return nest.map_structure(lambda t_: _tile_batch(t_, multiplier), t)
    ......
    

    As there is no problem in training stage, I guess there might be something wrong in the implementation of beam_search_decode.

    The error can be described as: the tensor expects a dimension of 32 while we feed 23 instead. And I find that the batch size is 32, and there are 887 validation examples in valid.de, where 887 % 32 == 23. Also, add 'allow_smaller_final_batch': False to the val and test item of config_iwslt14.py can get rid of the error.

    But this "fix" is not what we really want. Theoretically, we are supposed to run validation and test on all the dev/test samples.

    I am using tensorflow 1.8. Please let me know if I need to provide any other environment information.

    The full logs:

    $ python seq2seq_attn.py --config_model config_model --config_data config_iwslt14
    2018-09-11 15:50:03.793617: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
    2018-09-11 15:50:04.038875: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
    name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
    pciBusID: 0000:02:00.0
    totalMemory: 22.38GiB freeMemory: 22.21GiB
    2018-09-11 15:50:04.038945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
    2018-09-11 15:50:04.384688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
    2018-09-11 15:50:04.384752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0
    2018-09-11 15:50:04.385091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N
    2018-09-11 15:50:04.385631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21549 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:02:00.0, compute capability: 6.1)
    step=0, loss=481.5847
    step=500, loss=101.2404
    step=1000, loss=75.9185
    step=1500, loss=102.7388
    step=2000, loss=81.9897
    step=2500, loss=64.7623
    step=3000, loss=76.1445
    step=3500, loss=81.1186
    step=4000, loss=48.0918
    step=4500, loss=54.7355
    step=5000, loss=74.8126
    2018-09-11 16:04:10.567407: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/fw/fw/dynamic_rnn/input_0_32362: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    2018-09-11 16:04:10.567928: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/bw/bw/dynamic_rnn/input_0_32364: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    Traceback (most recent call last):
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
        return fn(*args)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
        options, feed_dict, fetch_list, target_list, run_metadata)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 230 values, but the requested shape has 320
    	 [[Node: attention_rnn_decoder_5/tile_batch_1/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attention_rnn_decoder_5/tile_batch_1/Tile/_4547, attention_rnn_decoder_5/tile_batch_1/concat)]]
    	 [[Node: attention_rnn_decoder_6/decoder/while/LoopCond/_4655 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_790_attention_rnn_decoder_6/decoder/while/LoopCond", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopattention_rnn_decoder_6/decoder/while/BeamSearchDecoderStep/next_beam_word_ids/y/_4501)]]
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "seq2seq_attn.py", line 161, in <module>
        main()
      File "seq2seq_attn.py", line 149, in main
        val_bleu = _eval_epoch(sess, 'val')
      File "seq2seq_attn.py", line 125, in _eval_epoch
        sess.run(fetches, feed_dict=feed_dict)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
        run_metadata_ptr)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
        feed_dict_tensor, options, run_metadata)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
        run_metadata)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 230 values, but the requested shape has 320
    	 [[Node: attention_rnn_decoder_5/tile_batch_1/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attention_rnn_decoder_5/tile_batch_1/Tile/_4547, attention_rnn_decoder_5/tile_batch_1/concat)]]
    	 [[Node: attention_rnn_decoder_6/decoder/while/LoopCond/_4655 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_790_attention_rnn_decoder_6/decoder/while/LoopCond", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopattention_rnn_decoder_6/decoder/while/BeamSearchDecoderStep/next_beam_word_ids/y/_4501)]]
    
    Caused by op 'attention_rnn_decoder_5/tile_batch_1/Reshape', defined at:
      File "seq2seq_attn.py", line 161, in <module>
        main()
      File "seq2seq_attn.py", line 93, in main
        train_op, infer_outputs = build_model(batch, train_data)
      File "seq2seq_attn.py", line 77, in build_model
        max_decoding_length=60)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/beam_search_decode.py", line 193, in beam_search_decode
        cell = decoder_or_cell._get_beam_search_cell(beam_width=beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/rnn_decoders.py", line 545, in _get_beam_search_cell
        memory_seq_length, beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 122, in tile_batch
        return nest.map_structure(lambda t_: _tile_batch(t_, multiplier), t)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 375, in map_structure
        structure[0], [func(*x) for x in entries])
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 375, in <listcomp>
        structure[0], [func(*x) for x in entries])
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 122, in <lambda>
        return nest.map_structure(lambda t_: _tile_batch(t_, multiplier), t)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 90, in _tile_batch
        ([shape_t[0] * multiplier], shape_t[1:]), 0))
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 6113, in reshape
        "Reshape", tensor=tensor, shape=shape, name=name)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
        op_def=op_def)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
        op_def=op_def)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
        self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
    
    InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 230 values, but the requested shape has 320
    	 [[Node: attention_rnn_decoder_5/tile_batch_1/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attention_rnn_decoder_5/tile_batch_1/Tile/_4547, attention_rnn_decoder_5/tile_batch_1/concat)]]
    	 [[Node: attention_rnn_decoder_6/decoder/while/LoopCond/_4655 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_790_attention_rnn_decoder_6/decoder/while/LoopCond", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopattention_rnn_decoder_6/decoder/while/BeamSearchDecoderStep/next_beam_word_ids/y/_4501)]]
    
    opened by Luolc 8
  • import texar.tf as tx .No module named 'texar.tf' via pip install

    import texar.tf as tx .No module named 'texar.tf' via pip install

    Hi guys

    I am very new to texar. I have installed it via pip. When trying to run one of the examples, namely seq2seq_exposure_bias. I get the following: File "<stdin>", line 1, in <module> ModuleNotFoundError: No module named 'texar.tf' . I have tried to run it via jupyter and bash, and neither works, nor a direct bash command python import texar.tf as tx.

    Then I have tried to run the example from the git cloned texar repo. The texar.tf is imported successfully in that case, but then I get tensorflow.python.framework.errors_impl.NotFoundError: ./data/iwslt14/vocab.de; No such file or directory when running python interpolation_main.py \ --config_model configs.config_model \ --config_data configs.config_iwslt14 \ --lambdas_init [0.04,0.96,0.0] \ --delta_lambda_self 0.06 \ --delta_lambda_reward 0.06 \ --lambda_reward_steps 4

    Tensorflow is installed and working.

    question 
    opened by pol690 7
  • Transformer Training fails on custom dataset

    Transformer Training fails on custom dataset

    Hi, I'm training Transformer on a custom dataset on CPU. I've used spm encoding and have followed instructions to a T, but the training always fails with the below error trace. The same error occurs regardless of BPE, SPM or raw encodings. Kindly help!

    step: 400, loss: 5.6811
    step: 500, loss: 5.3085
    Traceback (most recent call last):
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
        return fn(*args)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
        options, feed_dict, fetch_list, target_list, run_metadata)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
    	 [[{{node sinusoid_posisiton_embedder/embedding_lookup}}]]
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "transformer_main.py", line 308, in <module>
        main()
      File "transformer_main.py", line 293, in main
        step = _train_epoch(sess, epoch, step, smry_writer)
      File "transformer_main.py", line 273, in _train_epoch
        _eval_epoch(sess, epoch, mode='eval')
      File "transformer_main.py", line 193, in _eval_epoch
        fetches_ = sess.run(fetches, feed_dict=feed_dict)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
        run_metadata_ptr)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
        feed_dict_tensor, options, run_metadata)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
        run_metadata)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
    	 [[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]
    
    Caused by op 'sinusoid_posisiton_embedder/embedding_lookup', defined at:
      File "transformer_main.py", line 308, in <module>
        main()
      File "transformer_main.py", line 96, in main
        src_pos_embeds = pos_embedder(sequence_length=src_seq_len)
      File "/home/karthik/installs/texar/texar/module_base.py", line 116, in __call__
        return self._template(*args, **kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 360, in __call__
        return self._call_func(args, kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 311, in _call_func
        result = self._func(*args, **kwargs)
      File "/home/karthik/installs/texar/texar/modules/embedders/position_embedders.py", line 327, in _build
        outputs = tf.nn.embedding_lookup(embedding, inputs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 316, in embedding_lookup
        transform_fn=None)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 133, in _embedding_lookup_and_transform
        result = _clip(array_ops.gather(params[0], ids, name=name),
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
        return target(*args, **kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 3273, in gather
        return gen_array_ops.gather_v2(params, indices, axis, name=name)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3748, in gather_v2
        "GatherV2", params=params, indices=indices, axis=axis, name=name)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
        op_def=op_def)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
        op_def=op_def)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
        self._traceback = tf_stack.extract_stack()
    
    InvalidArgumentError (see above for traceback): indices[55,128] = 128 is not in [0, 128)
    	 [[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]
    
    question 
    opened by karthikb23 7
  • Error running gpt2_generate_main.py

    Error running gpt2_generate_main.py

    When I try to run the gpt2_generate_main.py file, I face the following error,

    ValueError: The shape for transformer_decoder_1/transformer_decoder/while/Merge_27:0 is not an invariant for the loop. It enters the loop with shape (1, 768), but has shape (?, 768) after one iteration. Provide shape invariants using either the shape_invariants argument of tf.while_loop or set_shape() on the loop variables.

    Also, how to use this model for conditioned text generation tasks? I am working on Reading Comprehension task that takes in a single stream input (Passage + ": " + Question + "? " + Answer) and am using a custom mask to extract loss between the answer start and sequence length indices. Is there a more elegant way to get this done?

    Here is the entire list of callbacks:

    Traceback (most recent call last): File "gpt2_generate_main.py", line 210, in tf.app.run() File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run _sys.exit(main(argv)) File "gpt2_generate_main.py", line 144, in main mode=tf.estimator.ModeKeys.PREDICT) File "/home1/deepak/RaviTej/texar/texar/module_base.py", line 116, in call return self._template(*args, **kwargs) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 455, in call result = self._call_func(args, kwargs) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 406, in _call_func result = self._func(*args, **kwargs) File "/home1/deepak/RaviTej/texar/texar/modules/decoders/transformer_decoders.py", line 569, in _build scope=self.variable_scope) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 309, in dynamic_decode swap_memory=swap_memory) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3202, in while_loop result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2940, in BuildLoop pred, body, original_loop_vars, loop_vars, shape_invariants) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2914, in _BuildLoop next_vars.append(_AddNextAndBackEdge(m, v)) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 688, in _AddNextAndBackEdge _EnforceShapeInvariant(m, v) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 632, in _EnforceShapeInvariant (merge_var.name, m_shape, n_shape)) ValueError: The shape for transformer_decoder_1/transformer_decoder/while/Merge_27:0 is not an invariant for the loop. It enters the loop with shape (1, 768), but has shape (?, 768) after one iteration. Provide shape invariants using either the shape_invariants argument of tf.while_loop or set_shape() on the loop variables.

    originally defined at: File "gpt2_generate_main.py", line 133, in main hparams=gpt2_config.decoder) File "/home1/deepak/RaviTej/texar/texar/modules/decoders/transformer_decoders.py", line 98, in init ModuleBase.init(self, hparams) File "/home1/deepak/RaviTej/texar/texar/module_base.py", line 73, in init create_scope_now_=True) File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 153, in make_template **kwargs)

    opened by Akella17 7
  • Making Input Different From Text

    Making Input Different From Text

    Hello,

    I want to train an image captioning model with the transformer. How should I configure the input (i.e. VGG imagenet features) so that I can have an encoder-decoder model that is an image captioning model?

    Please share any links or examples to implement this.

    opened by ghost 7
  • Need some help to train gpt-2. Thank you very much!

    Need some help to train gpt-2. Thank you very much!

    The figure below is from the README of gpt-2.

    image

    source_hidden_states and some other variables are missing, if I coding based on gpt2_generate_main.py.

    Hope you could please provide a more detailed example? or only an related example link of other model is OK.

    Thank you!

    opened by guotong1988 7
  • Move Texar-TF under `texar/tf/`

    Move Texar-TF under `texar/tf/`

    This PR includes the following changes:

    • Move entire Texar-TF codebase under texar/tf/
    • Add DeprecationWarning when user imports from texar but not texar.tf in Python 3
    opened by huzecong 6
  • A Question About the Example -- sentence_classifier

    A Question About the Example -- sentence_classifier

    I am a new user of texar, when I try one of the examples named "sentence_classifier"(https://github.com/asyml/texar/tree/master/examples/sentence_classifier), I find some problems hard to resolve. In the example, the language of the data is english however I want to try it in chinese. I just replace the SSTdata with my data. Here is the form of the data I use: 2 但是您的所作所为不合适。 1 我觉得这个苹果味道一般。 0 今天终于修正了一个错误。 About the vocab ,I use my own word segmenter for chinese sentence. The program displays an error: image I guess it is because the size of the batch. So I change the batch size from 50 to 64 in config_kim.py. Then there is another error appears: image I don't know how to explain it so I hope someone can tell me if I use Texar incorrectly or there are other problems exist.

    opened by mahaoxiang822 6
  • Update dataset_utils_test.py

    Update dataset_utils_test.py

    Hello,this issue shows the reason that I commit this PR. I sincerely wish my PR will help you.And if you think my PR has a little work,Hoping you could merge it. Thank you,my friend.

    opened by DLPerf 1
  •  Performance issues in the program

    Performance issues in the program

    Hello,I found a performance issue in the definition of test_make_chained_transformation , asyml/texar/blob/master/tests/data/data/dataset_utils_test.py, dataset.map was called without num_parallel_calls. I think it will increase the efficiency of your program if you add this.

    Here is the documemtation of tensorflow to support this thing.

    opened by DLPerf 0
  • Question about the installation about the version 0.2.1

    Question about the installation about the version 0.2.1

    Hello, I am trying to reproduce a project , there say i need to install texar 0.2.1,Because the texar 0.2.1 is saved there . https://github.com/fabrahman/Emo-Aware-Storytelling/tree/master/third_party/texar And i run -> pip install . And the system show Successfully installed texar-0.2.1 ; But when i run python to enter python environment and run import texar to test if it is successfully downloaded. Something happened! It says :>>> import texar Traceback (most recent call last): File "", line 1, in File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/init.py", line 29, in from texar import modules File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/modules/init.py", line 24, in from texar.modules.networks import * File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/modules/networks/init.py", line 24, in from texar.modules.networks.network_base import * File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/modules/networks/network_base.py", line 26, in from texar.core.layers import get_layer File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/core/init.py", line 24, in from texar.core.layers import * File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/core/layers.py", line 25, in import tensorflow.contrib.rnn as rnn File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/init.py", line 38, in from tensorflow.contrib import cloud File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/init.py", line 24, in from tensorflow.contrib.cloud.python.ops.bigquery_reader_ops import * File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/python/ops/bigquery_reader_ops.py", line 21, in from tensorflow.contrib.cloud.python.ops import gen_bigquery_reader_ops File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/python/ops/gen_bigquery_reader_ops.py", line 307, in _op_def_lib = _InitOpDefLibrary(b"\n\355\001\n\016BigQueryReader\032\024\n\rreader_handle\030\007\200\001\001"\027\n\tcontainer\022\006string\032\002\022\000"\031\n\013shared_name\022\006string\032\002\022\000"\024\n\nproject_id\022\006string"\024\n\ndataset_id\022\006string"\022\n\010table_id\022\006string"\027\n\007columns\022\014list(string)"\027\n\020timestamp_millis\022\003int"\034\n\016test_end_point\022\006string\032\002\022\000\210\001\001\n\331\001\n GenerateBigQueryReaderPartitions\032\016\n\npartitions\030\007"\024\n\nproject_id\022\006string"\024\n\ndataset_id\022\006string"\022\n\010table_id\022\006string"\027\n\007columns\022\014list(string)"\027\n\020timestamp_millis\022\003int"\025\n\016num_partitions\022\003int"\034\n\016test_end_point\022\006string\032\002\022\000") File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/python/ops/gen_bigquery_reader_ops.py", line 215, in _InitOpDefLibrary _op_def_registry.register_op_list(op_list) AttributeError: module 'tensorflow.python.framework.op_def_registry' has no attribute 'register_op_list'

    Can you give me some useful advide ,I am so sorry to bother you ! @ZhitingHu

    opened by shaoniana1997 1
  • Output is not as expected(style does not getting transferred)

    Output is not as expected(style does not getting transferred)

    I have followed your tutorial to generate the output fully. I have not changed a tiny bit. Did whatever has been mentioned in your github tutorial. But not getting the expected output. Please let me know how you can help me to track the issue? As I am doing exactly the same thing you mentioned and same data and all, but still let me know if I have to upload any file from my side to track the issue? Thank you.

    Output Sample:

    you can find lots of gifts that are rare and the price is right ! you can find lots of gifts that are rare and the price is right ! do n't bother with this place . do n't bother with this place . the customer service is awful . the customer service is awful . it is n't expensive at all and the staff is so nice . it is n't expensive at all and the staff is so nice . jill was very patient with me and my receipts . haha was very patient with me and my quotes . went for breakfast this morning , was n't very busy and service was horrible . went for breakfast this morning , was n't very busy and service was horrible . price was more than right , less than he quoted me . price was more than right , less than he quoted me . got my drink correct and quick ! got my drink correct and quick ! so good ! so good !

    opened by SOURO 0
  • ASYML project suggestions

    ASYML project suggestions

    As open-sourced projects, we love to hear the voices of the community and learn what features you are expecting from machine learning libraries and NLP toolkits. To facilitate this, we will maintain a master list of project suggestions of all of our ASYML projects from the communities, which can help future contributors and us to decide what to work on. We will maintain a high-level list in this issue:

    • If you have an idea/direction, you can raise it here. Use the issues if you have a concrete feature request or bug report.
    • If you are interested in working on any of these directions, you can open issues and comment here with the issue link.

    Texar (issue list)

    • Tensorflow-2.0 support.

    Texar (issue list)

    • Support distributed training with AdaptDL.

    Forte (issue list)

    • Support Automatic Data Augmentation functions.

    Stave (issue list)

    • Easy configurable active learning for annotation projects.
    • Allow Embed Stave visualization to webpages.
    opened by hunterhector 0
Releases(v0.2.4)
  • v0.2.4(Nov 19, 2019)

    New features

    • Support only Python 3.6 and 3.7. Drop support of older Python versions. (#211)
    • Add Tokenizers including tokenizers for pretrained models (BERTTokenizer, XLNetTokenizer, etc). (#225)
    • Add GPT2 modules (GPT2Encoder, GPT2Decoder, GPT2Classifier, etc). (#228)

    Feature improvements

    • Update embedder modules dropout_strategy=='item' to support TensorFlow v1.15. (#231)
    • Update .gitignore and add .gitignore files to all examples. (#233)
    • Polish code style according to flake8. (#234)
    • Add GPT2 XL pretrained checkpoint. (#243)

    Fixes

    • Fix examples/transformer/scripts/wmt14_en_de.sh to create output dir automatically. (#238)
    • Fix variable scope issue in texar.tf.modules.decoders.dynamic_decode. (#246)
    Source code(tar.gz)
    Source code(zip)
  • v0.2.3(Sep 26, 2019)

  • v0.2.2(Aug 5, 2019)

  • v0.2.1(Jul 28, 2019)

    New features

    • Add support for GPT-2 345M model in examples/gpt-2. (#156)
    • Add BERT modules, including texar.modules.BERTEncoder (doc) and texar.modules.BERTClassifier (doc). (#167)

    Feature improvements

    • Refactor TransformerEncoder and TransformerDecoder to separate position embeddings from the modules. (#126)
    • Allow passing a Tensor to output_layer of decoders' constructors -- used for weight tie b/w the output layer and input embedding matrix. (#126)
    • TransformerDecoder constructor interface made exact the same with RNN decoders constructor interfaces. (#126)
    • Refactor decoder Helpers to allow two-argument embedding_fn (supporting for position embedding). (#126)
    • Refactor SinusoidsPositionEmbedder to enable infinite large or negative position indexes. (#176)

    Fixes

    • Fix texar.losses.reduce_batch_time when sequence has dtype other than tf.float32. (#143)
    • Fix texar.losses.reduce_dimensions when average_axes or sum_axes is int. (#141)
    • Fix GPT-2 tokenization loading path. (#165)
    • Fix examples/vae_text EOS bug. (#168)
    • Fix transformer bleu_tool.py when translation_length is 0. (#176)
    • Fix StochasticConnector and ReparameterizedStochasticConnector when transform=False. (#179)
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Apr 9, 2019)

    New features

    • TFRecordData: A new data module for reading and processing TFRecord data, with support for, e.g., image data, feature data, etc. (#107)
    • GPT-2: OpenAI pretrained language model. (#91, example)
    • TopKSampleEmbeddingHelper to perform top_k random sample decoding. (baa09ff)

    Feature improvements

    • Refactor BERT example using TFRecordData data module.
    • TransformerDecoder supports helper arguments to specify decoding strategy. (#76)

    Fixes

    • Fix variable collection bug in examples/seqgan. (#110)
    • Fix error when beam_search_decode with output_layer=tf.identity (#77)
    • Fix readthedocs compilation error (#85)
    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Feb 6, 2019)

Owner
ASYML
Machine Learning as Machine Assembly, part of the CASL project https://casl-project.ai/
ASYML
Contains the code and data for our #ICSE2022 paper titled as "CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Naming Sequences"

CodeFill This repository contains the code for our paper titled as "CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Namin

Software Analytics Lab 11 Oct 31, 2022
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

NL-Augmenter 🦎 → 🐍 The NL-Augmenter is a collaborative effort intended to add transformations of datasets dealing with natural language. Transformat

684 Jan 09, 2023
Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

1.1k Dec 27, 2022
Twitter Sentiment Analysis using #tag, words and username

Twitter Sentment Analysis Web App using #tag, words and username to fetch data finds Insides of data and Tells Sentiment of the perticular #tag, words or username.

Kumar Saksham 26 Dec 25, 2022
Code for papers "Generation-Augmented Retrieval for Open-Domain Question Answering" and "Reader-Guided Passage Reranking for Open-Domain Question Answering", ACL 2021

This repo provides the code of the following papers: (GAR) "Generation-Augmented Retrieval for Open-domain Question Answering", ACL 2021 (RIDER) "Read

morning 49 Dec 26, 2022
The aim of this task is to predict someone's English proficiency based on a text input.

English_proficiency_prediction_NLP The aim of this task is to predict someone's English proficiency based on a text input. Using the The NICT JLE Corp

1 Dec 13, 2021
An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.)

VizSeq is a Python toolkit for visual analysis on text generation tasks like machine translation, summarization, image captioning, speech translation

Facebook Research 409 Oct 28, 2022
Pretty-doc - Composable text objects with python

pretty-doc from __future__ import annotations from dataclasses import dataclass

Taine Zhao 2 Jan 17, 2022
無料で使える中品質なテキスト読み上げソフトウェア、VOICEVOXの音声合成エンジン

VOICEVOX ENGINE VOICEVOXの音声合成エンジン。 実態は HTTP サーバーなので、リクエストを送信すればテキスト音声合成できます。 API ドキュメント VOICEVOX ソフトウェアを起動した状態で、ブラウザから

Hiroshiba 3 Jul 05, 2022
A repository to run gpt-j-6b on low vram machines (4.2 gb minimum vram for 2000 token context, 3.5 gb for 1000 token context). Model loading takes 12gb free ram.

Basic-UI-for-GPT-J-6B-with-low-vram A repository to run GPT-J-6B on low vram systems by using both ram, vram and pinned memory. There seem to be some

90 Dec 25, 2022
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc

GUOKUN LAI 197 Dec 11, 2022
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization (ACL 2021)

Structured Super Lottery Tickets in BERT This repo contains our codes for the paper "Super Tickets in Pre-Trained Language Models: From Model Compress

Chen Liang 16 Dec 11, 2022
NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles

NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles NewsMTSC is a dataset for target-dependent sentiment classification (TSC)

Felix Hamborg 79 Dec 30, 2022
Code for Editing Factual Knowledge in Language Models

KnowledgeEditor Code for Editing Factual Knowledge in Language Models (https://arxiv.org/abs/2104.08164). @inproceedings{decao2021editing, title={Ed

Nicola De Cao 86 Nov 28, 2022
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch

Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch. Topics: Face detection with Detectron 2, Time Series anomaly detection with LSTM Autoenc

Venelin Valkov 1.8k Dec 31, 2022
Segmenter - Transformer for Semantic Segmentation

Segmenter - Transformer for Semantic Segmentation

592 Dec 27, 2022
Sample data associated with the Aurora-BP study

The Aurora-BP Study and Dataset This repository contains sample code, sample data, and explanatory information for working with the Aurora-BP dataset

Microsoft 16 Dec 12, 2022
中文問句產生器;使用台達電閱讀理解資料集(DRCD)

Transformer QG on DRCD The inputs of the model refers to we integrate C and A into a new C' in the following form. C' = [c1, c2, ..., [HL], a1, ..., a

Philip 1 Oct 22, 2021
Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch

N-Grammer - Pytorch Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch Install $ pip install n-grammer-pytorch Usage

Phil Wang 66 Dec 29, 2022