Empower Sequence Labeling with Task-Aware Language Model

Overview

LM-LSTM-CRF

Check Our New NER Toolkit 🚀 🚀 🚀

  • Inference:
    • LightNER: inference w. models pre-trained / trained w. any following tools, efficiently.
  • Training:
    • LD-Net: train NER models w. efficient contextualized representations.
    • VanillaNER: train vanilla NER models w. pre-trained embedding.
  • Distant Training:
    • AutoNER: train NER models w.o. line-by-line annotations and get competitive performance.

This project provides high-performance character-aware sequence labeling tools, including Training, Evaluation and Prediction.

Details about LM-LSTM-CRF can be accessed here, and the implementation is based on the PyTorch library.

Important: A serious bug was found in the bioes_to_span function of the original implementation. Please refer to the numbers reported in the Benchmarks section as the accurate performance.

The documentation is available here.

Model Notes

As visualized above, we use a conditional random field (CRF) to capture label dependencies and adopt a hierarchical LSTM to leverage both char-level and word-level inputs. The char-level structure is further guided by a language model, while pre-trained word embeddings are leveraged at the word level. The language model and the sequence labeling model are trained at the same time, and both make predictions at the word level. Highway networks transform the output of the char-level LSTM into different semantic spaces, thereby mediating between these two tasks and allowing the language model to empower sequence labeling.
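
To make the role of the highway networks more concrete, here is a minimal pure-Python sketch of a single highway layer. It is not the repository's implementation: the function name `highway`, the list-based weights, and the choice of a ReLU transform are assumptions made for illustration. The gate decides, per dimension, how much of the transformed signal replaces the original input.

```python
import math


def highway(x, w_gate, b_gate, w_trans, b_trans):
    """One highway layer: y = g * relu(W_t x + b_t) + (1 - g) * x.

    x is a list of floats; w_* are square matrices (lists of rows) and
    b_* are bias vectors of the same length as x. All names are hypothetical.
    """
    def affine(w, b):
        # Plain matrix-vector product plus bias.
        return [sum(wij * xj for wij, xj in zip(row, x)) + bi
                for row, bi in zip(w, b)]

    # Sigmoid gate in (0, 1) for every output dimension.
    gate = [1.0 / (1.0 + math.exp(-v)) for v in affine(w_gate, b_gate)]
    # Non-linear transform of the input (ReLU is an assumption here).
    trans = [max(0.0, v) for v in affine(w_trans, b_trans)]
    # Mix the transformed signal and the untouched input.
    return [g * t + (1.0 - g) * xi for g, t, xi in zip(gate, trans, x)]


zeros = [[0.0, 0.0], [0.0, 0.0]]
print(highway([2.0, -4.0], zeros, [0.0, 0.0], zeros, [0.0, 0.0]))
```

With all-zero weights the gate is 0.5 and the transform is 0, so the layer simply halves its input; trained weights let each task pick its own mixture.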

Installation

For training, a GPU is strongly recommended for speed. CPU is supported but training could be extremely slow.

PyTorch

The code is based on PyTorch and currently supports PyTorch 0.4. You can find installation instructions here.

Dependencies

The code is written in Python 3.6. Its dependencies are summarized in the file requirements.txt. You can install these dependencies like this:

pip3 install -r requirements.txt

Data

We mainly focus on the CoNLL 2003 NER dataset, and the code takes its original format as input. However, due to licensing restrictions, we cannot distribute this dataset. You should be able to get it here. You may also want to search online (e.g., GitHub), since someone may have released it accidentally.

Format

We assume the corpus is formatted in the same way as the CoNLL 2003 NER dataset. More specifically, empty lines separate sentences, and documents are separated by the special line below.

-DOCSTART- -X- -X- -X- O

Other lines contain words, labels and other fields. The word must be the first field, the label must be the last, and fields are separated by spaces. For example, the first several lines of the WSJ portion of the PTB POS tagging corpus should look like the following snippet.

-DOCSTART- -X- -X- -X- O

Pierre NNP
Vinken NNP
, ,
61 CD
years NNS
old JJ
, ,
will MD
join VB
the DT
board NN
as IN
a DT
nonexecutive JJ
director NN
Nov. NNP
29 CD
. .
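
The format rules above can be sketched as a small reader. This is only an illustration of the rules as stated in this section, not the loader used by the training scripts; the function name `read_conll` is hypothetical.

```python
def read_conll(lines):
    """Group CoNLL-style lines into (words, labels) sentence pairs.

    Empty lines end a sentence, -DOCSTART- lines mark document boundaries
    and are skipped, the word is the first field and the label is the last.
    """
    sentences, words, labels = [], [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            # Sentence or document boundary: flush the current sentence.
            if words:
                sentences.append((words, labels))
                words, labels = [], []
            continue
        fields = line.split()
        words.append(fields[0])
        labels.append(fields[-1])
    if words:
        # The file may not end with an empty line.
        sentences.append((words, labels))
    return sentences


sample = ["-DOCSTART- -X- -X- -X- O", "", "Pierre NNP", "Vinken NNP", ", ,", "", "will MD"]
print(read_conll(sample))
```

To read a file instead of a list, pass an open file object, for example `read_conll(open(path, encoding="utf-8"))`, where `path` points to your own copy of the corpus.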


Usage

Here we provide implementations of two models: LM-LSTM-CRF and its variant, LSTM-CRF, which contains only the word-level structure and the CRF. train_wc.py and eval_wc.py are the scripts for LM-LSTM-CRF, while train_w.py and eval_w.py are the scripts for LSTM-CRF. The usage of each script can be displayed with the -h parameter, i.e.,

python train_wc.py -h
python train_w.py -h
python eval_wc.py -h
python eval_w.py -h

The default commands for NER, POS tagging, and NP chunking are:

  • Named Entity Recognition (NER):
python train_wc.py --train_file ./data/ner/train.txt --dev_file ./data/ner/testa.txt --test_file ./data/ner/testb.txt --checkpoint ./checkpoint/ner_ --caseless --fine_tune --high_way --co_train --least_iters 100
  • Part-of-Speech (POS) Tagging:
python train_wc.py --train_file ./data/pos/train.txt --dev_file ./data/pos/testa.txt --test_file ./data/pos/testb.txt --eva_matrix a --checkpoint ./checkpoint/pos_ --caseless --fine_tune --high_way --co_train
  • Noun Phrase (NP) Chunking:
python train_wc.py --train_file ./data/np/train.txt.iobes --dev_file ./data/np/testa.txt.iobes --test_file ./data/np/testb.txt.iobes --checkpoint ./checkpoint/np_ --caseless --fine_tune --high_way --co_train --least_iters 100

For other datasets or tasks, you may want to try different stopping parameters. In particular, for smaller datasets you may want to set least_iters to a larger value, and for tasks where the loss decreases too slowly you may want to increase lr.
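
As a rough illustration of how such stopping parameters typically interact, the sketch below assumes that least_iters is a minimum number of epochs and that a patience value counts epochs without improvement on the development set. This reading is an assumption, not a description of train_wc.py; the function names and the toy scores are hypothetical.

```python
def should_stop(epoch, epochs_since_best, least_iters, patience):
    """Stop only after least_iters epochs AND patience epochs without improvement."""
    return epoch >= least_iters and epochs_since_best >= patience


def train_until_stop(dev_scores, least_iters, patience):
    """Return the epoch at which training would stop for a given score history."""
    best, since_best = float("-inf"), 0
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best:
            best, since_best = score, 0   # new best dev score, reset the counter
        else:
            since_best += 1               # no improvement this epoch
        if should_stop(epoch, since_best, least_iters, patience):
            return epoch
    return len(dev_scores)


toy_scores = [1, 2, 2, 2, 2, 2]  # made-up dev scores, for illustration only
print(train_until_stop(toy_scores, least_iters=2, patience=2))
print(train_until_stop(toy_scores, least_iters=6, patience=2))
```

Under this reading, a larger least_iters keeps training alive through early plateaus, which is why it can help on small datasets where the development score is noisy.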

Benchmarks

Here we compare LM-LSTM-CRF with recent state-of-the-art models on the CoNLL 2000 Chunking dataset, the CoNLL 2003 NER dataset, and the WSJ portion of the PTB POS Tagging dataset. All experiments are conducted on a GTX 1080 GPU.

A serious bug was found in the bioes_to_span function of the original implementation. Please refer to the following numbers as the accurate performance.

NER

When models are only trained on the CoNLL 2003 NER dataset, the results are summarized as below.

Model          Max(F1)  Mean(F1)  Std(F1)  Time(h)
LM-LSTM-CRF    91.35    91.24     0.12     4
-- HighWay     90.87    90.79     0.07     4
-- Co-Train    91.23    90.95     0.34     2

POS

When models are only trained on the WSJ portion of the PTB POS Tagging dataset, the results are summarized as below.

Model               Max(Acc)  Mean(Acc)  Std(Acc)  Reported(Acc)  Time(h)
Lample et al. 2016  97.51     97.35      0.09      N/A            37
Ma et al. 2016      97.46     97.42      0.04      97.55          21
LM-LSTM-CRF         97.59     97.53      0.03                     16

Pretrained Model

Evaluation

We released pre-trained models on these three tasks. The checkpoint file can be downloaded at the following links. Notice that the NER model and Chunking model (coming soon) are trained on both the training set and the development set:

WSJ-PTB POS Tagging    CoNLL03 NER
Args                   Args
Model                  Model

Also, eval_wc.py is provided to load and run these checkpoints. Its usage can be accessed by command python eval_wc.py -h, and a running command example is provided below:

python eval_wc.py --load_arg checkpoint/ner/ner_4_cwlm_lstm_crf.json --load_check_point checkpoint/ner_ner_4_cwlm_lstm_crf.model --gpu 0 --dev_file ./data/ner/testa.txt --test_file ./data/ner/testb.txt

Prediction

To annotate raw text, seq_wc.py is provided. Its usage can be displayed with the command python seq_wc.py -h, and an example command is provided below:

python seq_wc.py --load_arg checkpoint/ner/ner_4_cwlm_lstm_crf.json --load_check_point checkpoint/ner_ner_4_cwlm_lstm_crf.model --gpu 0 --input_file ./data/ner2003/test.txt --output_file output.txt

The input format is similar to CoNLL, but each line must contain exactly one field, the token. For example, an input file could be:

-DOCSTART-

But
China
saw
their
luck
desert
them
in
the
second
match
of
the
group
,
crashing
to
a
surprise
2-0
defeat
to
newcomers
Uzbekistan
.

and the corresponding output is:

-DOCSTART- -DOCSTART- -DOCSTART-

But <LOC> China </LOC> saw their luck desert them in the second match of the group , crashing to a surprise 2-0 defeat to newcomers <LOC> Uzbekistan </LOC> . 
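
The inline output format can be reproduced from per-token labels with a few lines of Python. The sketch below assumes IOBES-style labels such as S-LOC or B-PER ... E-PER; it is not the code used by seq_wc.py, and the function name `render_spans` is hypothetical.

```python
def render_spans(tokens, labels):
    """Wrap labeled spans in <TYPE> ... </TYPE> markers, one space between items.

    Labels are assumed to follow the IOBES scheme: B- begins a span, I- is
    inside, E- ends it, S- is a single-token span, and O is outside any span.
    """
    out = []
    for token, label in zip(tokens, labels):
        prefix, _, kind = label.partition("-")
        if prefix in ("B", "S"):
            out.append("<%s>" % kind)    # open a span before its first token
        out.append(token)
        if prefix in ("E", "S"):
            out.append("</%s>" % kind)   # close a span after its last token
    return " ".join(out)


print(render_spans(["But", "China", "saw"], ["O", "S-LOC", "O"]))
# prints: But <LOC> China </LOC> saw
```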

Reference

@inproceedings{2017arXiv170904109L,
  title = "{Empower Sequence Labeling with Task-Aware Neural Language Model}", 
  author = {{Liu}, L. and {Shang}, J. and {Xu}, F. and {Ren}, X. and {Gui}, H. and {Peng}, J. and {Han}, J.}, 
  booktitle={AAAI},
  year = 2018, 
}
Comments
  • Mismatch of performance between this repository and the paper

    Hi, I have carefully read your paper and this repository, but I found that the reported mean and max F1 scores for NER differ between the paper and this repository. Is there any reason behind this? Thanks!

    opened by jind11 6
  • RuntimeError (matrix and matrix expected) while training

    Hello, thanks for making this tool available.

    I am using an IBM Power PC machine with Ubuntu 16.03 and I am getting an error while trying to train a POS tagging model. Here is the command I am running: python train_wc.py --train_file cmc_test_twitter.txt --dev_file cmc_test_twitter.txt --test_file cmc_test_twitter.txt --eva_matrix a --checkpoint ./checkpoint/pos_ --lr 0.015 --caseless --fine_tune --high_way --co_train

    And here is the output:

    train
    setting:
    Namespace(batch_size=10, caseless=True, char_dim=30, char_hidden=300, char_layers=1, checkpoint='./checkpoint/pos_', clip_grad=5.0, co_train=True, dev_file='empirist_gold_cmc/tagged/cmc_test_twitter.txt', drop_out=0.5, emb_file='./embedding/glove.6B.100d.txt', epoch=200, eva_matrix='a', fine_tune=False, gpu=0, high_way=True, highway_layers=1, lambda0=1, least_iters=50, load_check_point='', load_opt=False, lr=0.015, lr_decay=0.05, mini_count=5, momentum=0.9, patience=15, rand_embedding=False, shrink_embedding=False, small_crf=True, start_epoch=0, test_file='empirist_gold_cmc/tagged/cmc_test_twitter.txt', train_file='empirist_gold_cmc/tagged/cmc_test_twitter.txt', unk='unk', update='sgd', word_dim=100, word_hidden=300, word_layers=1)
    loading corpus
    constructing coding table
    feature size: '44'
    loading embedding
    embedding size: '400005'
    constructing dataset
    building model
    device: 0
    Traceback (most recent call last):       
      File "train_wc.py", line 188, in <module>
        scores = ner_model(f_f, f_p, b_f, b_p, w_f)
      File "/home/eray/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/eray/projects/LM-LSTM-CRF/model/lm_lstm_crf.py", line 235, in forward
        char_out = self.fb2char(fb_lstm_out)
      File "/home/eray/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/eray/projects/LM-LSTM-CRF/model/highway.py", line 53, in forward
        g = nn.functional.sigmoid(self.gate[0](x))
      File "/home/eray/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/eray/anaconda3/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 54, in forward
        return self._backend.Linear.apply(input, self.weight, self.bias)
      File "/home/eray/anaconda3/lib/python3.5/site-packages/torch/nn/_functions/linear.py", line 12, in forward
        output.addmm_(0, 1, input, weight.t())
    RuntimeError: matrix and matrix expected at /home/eray/pytorch/torch/lib/THC/generic/THCTensorMathBlas.cu:237
    
    opened by erayyildiz 6
  • RuntimeError: expand(torch.LongTensor{[50, 1]}, size=[50]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

    I was trying to run the program with:

    python3 train_wc.py --gpu -1 --train_file ./data/ner/train.txt --dev_file ./data/ner/testa.txt --test_file ./data/ner/testb.txt --checkpoint ./checkpoint/ner_ --caseless --fine_tune --high_way --co_train --least_iters 100

    I got the following error:

    embedding size: '400060'
    constructing dataset
    building model
    /usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.55 and num_layers=1
      "num_layers={}".format(dropout, num_layers))
    /home/yankai/weixiao/LM-LSTM-CRF/model/utils.py:805: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
      nn.init.uniform(input_linear.weight, -bias, bias)
    /home/yankai/weixiao/LM-LSTM-CRF/model/utils.py:816: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
      nn.init.uniform(weight, -bias, bias)
    /home/yankai/weixiao/LM-LSTM-CRF/model/utils.py:819: UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
      nn.init.uniform(weight, -bias, bias)
    Tot it 1406 (epoch 0): 0it [00:00, ?it/s]
    train_wc.py:201: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
      nn.utils.clip_grad_norm(ner_model.parameters(), args.clip_grad)
    Traceback (most recent call last):
      File "train_wc.py", line 212, in <module>
        dev_f1, dev_pre, dev_rec, dev_acc = evaluator.calc_score(ner_model, dev_dataset_loader)
      File "/home/yankai/weixiao/LM-LSTM-CRF/model/evaluator.py", line 209, in calc_score
        decoded = self.decoder.decode(scores.data, mask_v.data)
      File "/home/yankai/weixiao/LM-LSTM-CRF/model/crf.py", line 379, in decode
        decode_idx[idx] = pointer
    RuntimeError: expand(torch.LongTensor{[50, 1]}, size=[50]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

    opened by weixiao12345678 4
  • how to annotation text after get the trained ner_cwlm_lstm_crf.model model ?

    hi,

    After a long run, the final model file, named ner_cwlm_lstm_crf.model, was generated in the checkpoint folder. The next step is to use it to annotate some text. Is there a script, called test or something similar, that can do the annotation job?

    thanks in advance .

    opened by cloudtrends 3
  • about batch training

    Hello, I have a question: why is there no call to 'pack_padded_sequence' before 'word_lstm'? I would really appreciate it if you could tell me the reason. Thank you!

    opened by ZhixiuYe 3
  • train_w.py Error

    hi Liyuan :) I get an error when I run train_w.py.

    On PyTorch 0.3.0:

    Traceback (most recent call last):
      File "train_w.py", line 193, in <module>
        loss.backward()
      File "/home/dungdx4/anaconda2/envs/python_anaconda_3/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
      File "/home/dungdx4/anaconda2/envs/python_anaconda_3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
        variables, grad_variables, retain_graph)
    RuntimeError: dimension specified as 0 but tensor has no dimensions

    On PyTorch 0.4:

    Traceback (most recent call last):
      File "train_w.py", line 189, in <module>
        fea_v, tg_v, mask_v = packer.repack_vb(feature, tg, mask)
      File "/home/dungdx4/LM-LSTM-CRF-master-0.4/model/crf.py", line 107, in repack_vb
        fea_v = torch.Tensor(feature.transpose(0, 1)).cuda()
    TypeError: expected torch.FloatTensor (got torch.LongTensor)

    opened by dungdx34 2
  • OMG! RuntimeError: $ Torch: not enough memory: you tried to allocate 3421GB. Buy new RAM!?

    I was trying to run the program with:

    python train_wc.py --train_file ./data/eng.train --dev_file ./data/eng.testa --test_file ./data/eng.testb --checkpoint ./checkpoint/ner_ --caseless --fine_tune --high_way --co_train --least_iters 100 > record.txt

    I got the following error:

    Traceback (most recent call last):
      File "train_wc.py", line 136, in <module>
        ner_model = LM_LSTM_CRF(len(l_map), len(c_map), args.char_dim, args.char_hidden, args.char_layers, args.word_dim, args.word_hidden, args.word_layers, len(f_map), args.drop_out, large_CRF=args.small_crf, if_highway=args.high_way, in_doc_words=in_doc_words, highway_layers = args.highway_layers)
      File "/nas/home/xhuang/project_ner/LM-LSTM-CRF/model/lm_lstm_crf.py", line 63, in __init__
        self.crf = crf.CRF_L(word_hidden_dim, tagset_size)
      File "/nas/home/xhuang/project_ner/LM-LSTM-CRF/model/crf.py", line 29, in __init__
        self.hidden2tag = nn.Linear(hidden_dim, self.tagset_size * self.tagset_size, bias=if_bias)
      File "/nas/home/xhuang/anaconda3/envs/ner/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 41, in __init__
        self.weight = Parameter(torch.Tensor(out_features, in_features))
    RuntimeError: $ Torch: not enough memory: you tried to allocate 3421GB. Buy new RAM! at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/TH/THGeneral.c:218

    opened by xiaoleihuang 2
  • Question abuout parameter in code

    I want to ask a question about the parameter shrink_embedding in train_w.py. If I don't use external resources, can this parameter be set to False without affecting the final result? As for the other parameter, fine_tune, I can't fully understand its meaning. Can you explain it for me? Thank you very much!

    if args.fine_tune:              # which means does not do fine-tune  
            f_map = {'<eof>': 0}
    

    I think this code may be wrong: if I set fine_tune=True, this branch is executed, and then the pre-trained embedding dictionary cannot be fine-tuned.

    opened by airship-explorer 2
  • About Xuezhe's result

    Hello, I replicated the LSTM-CNN-CRF model, and my best result is 91.23, which is close to Xuezhe's reported result. I wonder why, in your paper, the mean result is better than Xuezhe's reported result for the LSTM-CNN-CRF model. Is it because you modified Xuezhe's code, or is there another reason?

    Thank you very much if you can tell me about this.

    opened by ZhixiuYe 2
  • Bump numpy from 1.13.1 to 1.21.0

    Bumps numpy from 1.13.1 to 1.21.0.

    dependencies 
    opened by dependabot[bot] 1
  • Line 530 in utils.py is too slow with huge datasets

    Line 530 in construct_bucket_vb_wc function in utils.py is too slow with huge datasets. It even freezes if dataset is larger than 300k objects.

    I propose to change line

    forw_corpus = [pad_char_feature] + list(reduce(lambda x, y: x + [pad_char_feature] + y, forw_features)) + [pad_char_feature]

    to

    forw_corpus = [pad_char_feature]
    for forw_feature in forw_features:
       forw_corpus.extend(forw_feature + [pad_char_feature])
    

    Which works considerably faster with no freezes.

    opened by andreybondarb 1
  • dropout

    python train_wc.py --train_file ./data/train_data.txt --dev_file ./data/dev_data.txt --test_file ./data/test_data.txt --caseless --fine_tune --high_way --co_train --least_iters 100 --dropout 0.5

    setting:
    Namespace(batch_size=10, caseless=True, char_dim=30, char_hidden=100, char_layers=1, checkpoint='./checkpoint/', clip_grad=5.0, co_train=True, dev_file='./data/dev_data.txt', dropout=0.5, emb_file='./embedding/glove.6B.100d.txt', epoch=200, eva_matrix='fa', fine_tune=False, gpu=0, high_way=True, highway_layers=2, lambda0=1, least_iters=100, load_check_point='', load_opt=False, lr=0.01, lr_decay=0.05, mini_count=5, momentum=0.9, patience=15, rand_embedding=False, small_crf=True, start_epoch=0, test_file='./data/test_data.txt', train_file='./data/train_data.txt', unk='unk', update='sgd', word_dim=100, word_hidden=100, word_layers=1)
    loading corpus
    constructing coding table
    feature size: '3240'
    loading embedding
    embedding size: '3712'
    constructing dataset
    building model
    Traceback (most recent call last):
      File "train_wc.py", line 138, in <module>
        if_highway=args.high_way, in_doc_words=in_doc_words, highway_layers=args.highway_layers)
      File "/home/usr/Downloads/arabic-ner-master/lmbilstmcrf/lm_lstm_crf.py", line 27, in __init__
        dropout=dropout_ratio)
      File "/home/usr/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 425, in __init__
        super(LSTM, self).__init__('LSTM', *args, **kwargs)
      File "/home/usr/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 31, in __init__
        raise ValueError("dropout should be a number in range [0, 1] "
    ValueError: dropout should be a number in range [0, 1] representing the probability of an element being zeroed

    opened by zakarianamikaz 0
  • RuntimeError:

    RuntimeError: The expanded size of the tensor (300) must match the existing size (100) at non-singleton dimension 0. Target sizes: [300]. Tensor sizes: [100]

    opened by zakarianamikaz 1
  • Missing "eval_batch" in train_w.py line 163

    The model is very useful, and I want to reload a model to continue training, but I get an error at train_w.py line 163. I searched the whole project for the function "eval_batch" and found it only in model/evaluator, which is the wrong file. Did you forget to write this function, or is something else going on? How can I resolve this problem?

    opened by RichardHWD 1
  • About the score given a sequence and a target

    Dear Author,

    Thank you for sharing the code. I have a question about the forward() in CRFLoss_vb() in crf.py, it calculates the score of the golden state by:

    tg_energy = torch.gather(scores.view(seq_len, bat_size, -1), 2, target).view(seq_len, bat_size)
    tg_energy = tg_energy.masked_select(mask).sum()

    However, it seems that each tag of the target sequence is handled separately and I don't really see the transitions like tag1->tag2->tag3->... Can you explain a little bit?

    opened by helenxu 0
Releases(v0.1)
Owner
Liyuan Liu
Ph.D. Student @ DMG, UIUC