✨Fast Coreference Resolution in spaCy with Neural Networks

Overview

NeuralCoref 4.0: Coreference Resolution in spaCy with Neural Networks.

NeuralCoref is a pipeline extension for spaCy 2.1+ that annotates and resolves coreference clusters using a neural network. NeuralCoref is production-ready, integrated into spaCy's NLP pipeline, and extensible to new training datasets.

For a brief introduction to coreference resolution and NeuralCoref, please refer to our blog post. NeuralCoref is written in Python/Cython and comes with a pre-trained statistical model for English only.

NeuralCoref is accompanied by a visualization client NeuralCoref-Viz, a web interface powered by a REST server that can be tried online. NeuralCoref is released under the MIT license.

Version 4.0 out now! Available on pip and compatible with spaCy 2.1+.


  • Operating system: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
  • Python version: Python 3.6+ (only 64 bit)
  • Package managers: pip

Install NeuralCoref

Install NeuralCoref with pip

This is the easiest way to install NeuralCoref.

pip install neuralcoref

spacy.strings.StringStore size changed error

If you have an error mentioning "spacy.strings.StringStore size changed, may indicate binary incompatibility" when loading NeuralCoref with import neuralcoref, it means you'll have to install NeuralCoref from the distribution's sources instead of the wheels to get NeuralCoref to build against the version of spaCy installed on your system.

In this case, simply re-install neuralcoref as follows:

pip uninstall neuralcoref
pip install neuralcoref --no-binary neuralcoref

Installing spaCy's model

To use NeuralCoref you will also need an English model for spaCy.

You can use any English model that works for your application, but note that the performance of NeuralCoref depends strongly on the performance of the spaCy model, in particular on the performance of its tagger, parser and NER components. A larger spaCy English model will thus also improve the quality of the coreference resolution (see some details in the Internals and Model section below).

Here is an example of how you can install spaCy and a (small) English model for spaCy; more information can be found on spaCy's website:

pip install -U spacy
python -m spacy download en
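
Since NeuralCoref's quality tracks the quality of the underlying tagger, parser and NER, you may want a larger English model than the small default. For example (the model name below assumes a spaCy 2.x installation):

python -m spacy download en_core_web_lg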

Install NeuralCoref from source

You can also install NeuralCoref from sources. You will need to install the dependencies first, which include Cython and spaCy.

Here is the process:

python -m venv .env
source .env/bin/activate
git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
pip install -r requirements.txt
pip install -e .
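
Once the build has finished, a quick sanity check is to import the package and print its version (run it from outside the source directory so that Python picks up the installed build):

python -c "import neuralcoref; print(neuralcoref.__version__)"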

Internals and Model

NeuralCoref is made of two sub-modules:

  • a rule-based mention-detection module which uses spaCy's tagger, parser and NER annotations to identify a set of potential coreference mentions, and
  • a feed-forward neural network that computes a coreference score for each pair of potential mentions.

The first time you import NeuralCoref in Python, it will download the weights of the neural-network model into a cache folder.

The cache folder defaults to ~/.neuralcoref_cache (see file_utils.py), but this behavior can be overridden by setting the environment variable NEURALCOREF_CACHE to point to another location.

The cache folder can be safely deleted at any time; the module will download the model again the next time it is loaded.
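
For instance, to relocate the cache you can set the variable before the first import; the path below is only an example:

import os
os.environ['NEURALCOREF_CACHE'] = '/tmp/neuralcoref_cache'  # must be set before importing neuralcoref
import neuralcoref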

You can get more information on the location, download and caching process of the internal model by activating Python's logging module before loading NeuralCoref, as follows:

import logging
logging.basicConfig(level=logging.INFO)
import neuralcoref
>>> INFO:neuralcoref:Getting model from https://s3.amazonaws.com/models.huggingface.co/neuralcoref/neuralcoref.tar.gz or cache
>>> INFO:neuralcoref.file_utils:https://s3.amazonaws.com/models.huggingface.co/neuralcoref/neuralcoref.tar.gz not found in cache, downloading to /var/folders/yx/cw8n_njx3js5jksyw_qlp8p00000gn/T/tmp_8y5_52m
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40155833/40155833 [00:06<00:00, 6679263.76B/s]
>>> INFO:neuralcoref.file_utils:copying /var/folders/yx/cw8n_njx3js5jksyw_qlp8p00000gn/T/tmp_8y5_52m to cache at /Users/thomaswolf/.neuralcoref_cache/f46bc05a4bfba2ae0d11ffd41c4777683fa78ed357dc04a23c67137abf675e14.7d6f9a6fecf5cf09e74b65f85c7d6896b21decadb2554d486474f63b95ec4633
>>> INFO:neuralcoref.file_utils:creating metadata file for /Users/thomaswolf/.neuralcoref_cache/f46bc05a4bfba2ae0d11ffd41c4777683fa78ed357dc04a23c67137abf675e14.7d6f9a6fecf5cf09e74b65f85c7d6896b21decadb2554d486474f63b95ec4633
>>> INFO:neuralcoref.file_utils:removing temp file /var/folders/yx/cw8n_njx3js5jksyw_qlp8p00000gn/T/tmp_8y5_52m
>>> INFO:neuralcoref:extracting archive file /Users/thomaswolf/.neuralcoref_cache/f46bc05a4bfba2ae0d11ffd41c4777683fa78ed357dc04a23c67137abf675e14.7d6f9a6fecf5cf09e74b65f85c7d6896b21decadb2554d486474f63b95ec4633 to dir /Users/thomaswolf/.neuralcoref_cache/neuralcoref

Loading NeuralCoref

Adding NeuralCoref to the pipe of an English spaCy Language

Here is the recommended way to instantiate NeuralCoref and add it to spaCy's pipeline of annotations:

# Load your usual spaCy model (one of the spaCy English models)
import spacy
nlp = spacy.load('en')

# Add neural coref to spaCy's pipe
import neuralcoref
neuralcoref.add_to_pipe(nlp)

# You're done. You can now use NeuralCoref the same way you usually manipulate a spaCy document and its annotations.
doc = nlp(u'My sister has a dog. She loves him.')

doc._.has_coref
# >>> True
doc._.coref_clusters
# >>> [My sister: [My sister, She], a dog: [a dog, him]]

Loading NeuralCoref and adding it manually to the pipe of an English spaCy Language

An equivalent way of adding NeuralCoref to a spaCy pipeline is to instantiate the NeuralCoref class first and then add it manually to the pipe of the spaCy Language model.

# Load your usual spaCy model (one of the spaCy English models)
import spacy
nlp = spacy.load('en')

# Load NeuralCoref and add it to the pipe of spaCy's model
import neuralcoref
coref = neuralcoref.NeuralCoref(nlp.vocab)
nlp.add_pipe(coref, name='neuralcoref')

# You're done. You can now use NeuralCoref the same way you usually manipulate a spaCy document and its annotations.
doc = nlp(u'My sister has a dog. She loves him.')

doc._.has_coref
# >>> True
doc._.coref_clusters
# >>> [My sister: [My sister, She], a dog: [a dog, him]]

Using NeuralCoref

NeuralCoref will resolve the coreferences and annotate them as extension attributes in the spaCy Doc, Span and Token objects under the ._. dictionary.

Here is the list of the annotations:

| Attribute | Type | Description |
|---|---|---|
| doc._.has_coref | boolean | Whether any coreference has been resolved in the Doc |
| doc._.coref_clusters | list of Cluster | All the clusters of coreferring mentions in the doc |
| doc._.coref_resolved | unicode | Unicode representation of the doc where each coreferring mention is replaced by the main mention in the associated cluster |
| doc._.coref_scores | Dict of Dict | Scores of the coreference resolution between mentions |
| span._.is_coref | boolean | Whether the span has at least one coreferring mention |
| span._.coref_cluster | Cluster | Cluster of mentions that corefer with the span |
| span._.coref_scores | Dict | Scores of the coreference resolution of the span with other mentions (if applicable) |
| token._.in_coref | boolean | Whether the token is inside at least one coreferring mention |
| token._.coref_clusters | list of Cluster | All the clusters of coreferring mentions that contain the token |

A Cluster is a cluster of coreferring mentions which has 3 attributes and a few methods to simplify the navigation inside a cluster:

| Attribute or method | Type / Return type | Description |
|---|---|---|
| i | int | Index of the cluster in the Doc |
| main | Span | Span of the most representative mention in the cluster |
| mentions | list of Span | List of all the mentions in the cluster |
| `__getitem__` | returns Span | Access a mention in the cluster |
| `__iter__` | yields Span | Iterate over mentions in the cluster |
| `__len__` | returns int | Number of mentions in the cluster |
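
For instance, assuming doc has been processed by a pipeline that includes NeuralCoref (as in the examples below), a cluster can be inspected like this (a minimal sketch):

cluster = doc._.coref_clusters[0]
cluster.i                    # index of the cluster in the doc
cluster.main                 # most representative mention (a Span)
len(cluster)                 # number of mentions in the cluster
cluster[0]                   # access a mention by index (a Span)
[m.text for m in cluster]    # iterate over the mentions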

Navigating the coreference cluster chains

You can also easily navigate the coreference cluster chains and display clusters and mentions.

Here are some examples; try them out for yourself.

import spacy
import neuralcoref
nlp = spacy.load('en')
neuralcoref.add_to_pipe(nlp)

doc = nlp(u'My sister has a dog. She loves him')

doc._.coref_clusters
# >>> [My sister: [My sister, She], a dog: [a dog, him]]
doc._.coref_clusters[1].mentions
# >>> [a dog, him]
doc._.coref_clusters[1].mentions[-1]
# >>> him
doc._.coref_clusters[1].mentions[-1]._.coref_cluster.main
# >>> a dog

token = doc[-1]
token._.in_coref
# >>> True
token._.coref_clusters
# >>> [a dog: [a dog, him]]

span = doc[-1:]
span._.is_coref
# >>> True
span._.coref_cluster.main
# >>> a dog
span._.coref_cluster.main._.coref_cluster
# >>> a dog: [a dog, him]

Important: NeuralCoref mentions are spaCy Span objects, which means you can access all the usual Span attributes like span.start (index of the first token of the span in the document), span.end (index of the first token after the span in the document), etc.

Ex: doc._.coref_clusters[1].mentions[-1].start will give you the index of the first token of the last mention of the second coreference cluster in the document.
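
As a small illustration, here is how you could recover the text of that mention from its token offsets (reusing the doc from the examples above):

mention = doc._.coref_clusters[1].mentions[-1]
mention.start                          # index of the mention's first token in the doc
doc[mention.start:mention.end].text    # rebuild the mention text from the token offsets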

Parameters

You can pass several additional parameters to neuralcoref.add_to_pipe or NeuralCoref() to control the behavior of NeuralCoref.

Here is the full list of these parameters and their descriptions:

| Parameter | Type | Description |
|---|---|---|
| greedyness | float | A number between 0 and 1 determining how greedy the model is about making coreference decisions (more greedy means more coreference links). The default value is 0.5. |
| max_dist | int | How many mentions back to look when considering possible antecedents of the current mention. Decreasing the value will cause the system to run faster but less accurately. The default value is 50. |
| max_dist_match | int | The system will consider linking the current mention to a preceding one further than max_dist away if they share a noun or proper noun. In this case, it looks max_dist_match mentions away instead. The default value is 500. |
| blacklist | boolean | Whether to exclude the pronouns ["i", "me", "my", "you", "your"] from coreference resolution. The default value is True (coreferences are not resolved for these pronouns). |
| store_scores | boolean | Whether the system should store the coreference scores in the annotations. The default value is True. |
| conv_dict | dict(str, list(str)) | A conversion dictionary that you can use to replace the embeddings of rare words (keys) by an average of the embeddings of a list of common words (values). Ex: conv_dict={"Angela": ["woman", "girl"]} will help resolve coreferences for Angela by using the embeddings of the more common woman and girl instead of the embedding of Angela. This currently only works for single words (not word groups). |

How to change a parameter

import spacy
import neuralcoref

# Let's load a SpaCy model
nlp = spacy.load('en')

# First way we can control a parameter
neuralcoref.add_to_pipe(nlp, greedyness=0.75)

# Another way we can control a parameter
nlp.remove_pipe("neuralcoref")  # This removes the current neuralcoref instance from spaCy's pipeline
coref = neuralcoref.NeuralCoref(nlp.vocab, greedyness=0.75)
nlp.add_pipe(coref, name='neuralcoref')

Using the conversion dictionary parameter to help resolve rare words

Here is an example of how to use the conv_dict parameter to help resolve coreferences for a rare word such as a name:

import spacy
import neuralcoref

nlp = spacy.load('en')

# Let's try before using the conversion dictionary:
neuralcoref.add_to_pipe(nlp)
doc = nlp(u'Deepika has a dog. She loves him. The movie star has always been fond of animals')
doc._.coref_clusters
doc._.coref_resolved
# >>> [Deepika: [Deepika, She, him, The movie star]]
# >>> 'Deepika has a dog. Deepika loves Deepika. Deepika has always been fond of animals'
# >>> Not very good...

# Here are three ways we can add the conversion dictionary
nlp.remove_pipe("neuralcoref")
neuralcoref.add_to_pipe(nlp, conv_dict={'Deepika': ['woman', 'actress']})
# or
nlp.remove_pipe("neuralcoref")
coref = neuralcoref.NeuralCoref(nlp.vocab, conv_dict={'Deepika': ['woman', 'actress']})
nlp.add_pipe(coref, name='neuralcoref')
# or, after NeuralCoref is already in spaCy's pipe, by modifying NeuralCoref in the pipeline
nlp.get_pipe('neuralcoref').set_conv_dict({'Deepika': ['woman', 'actress']})

# Let's try again with the conversion dictionary:
doc = nlp(u'Deepika has a dog. She loves him. The movie star has always been fond of animals')
doc._.coref_clusters
doc._.coref_resolved
# >>> [Deepika: [Deepika, She, The movie star], a dog: [a dog, him]]
# >>> 'Deepika has a dog. Deepika loves a dog. Deepika has always been fond of animals'
# >>> A lot better!

Using NeuralCoref as a server

A simple server script for integrating NeuralCoref in a REST API is provided in examples/server.py.

To use it you need to install falcon first:

pip install falcon

You can then start the server as follows:

cd examples
python ./server.py

And query the server like this:

curl --data-urlencode "text=My sister has a dog. She loves him." -G localhost:8000
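
If you prefer to query the server from Python, here is a minimal equivalent of the curl call above (a sketch assuming the requests package is installed and the server is running locally):

import requests
params = {'text': 'My sister has a dog. She loves him.'}
print(requests.get('http://localhost:8000', params=params).json())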

There are many other ways you can manage and deploy NeuralCoref. Some examples can be found in spaCy Universe.

Re-train the model / Extend to another language

If you want to retrain the model or train it on another language, see our training instructions as well as our blog post.

Comments
  • binary incompatibility

    I'm using the current spaCy from the master branch, and getting this error:

    RuntimeWarning: spacy.tokens.span.Span size changed, may indicate binary incompatibility. Expected 72 from C header, got 80 from PyObject

    I'm assuming this happens because span.pxd has changed after the 2.1 release: https://github.com/explosion/spaCy/commits/master/spacy/tokens/span.pxd

    I tried reinstalling with

    pip install neuralcoref --no-binary neuralcoref

    But the warning remains and the program crashes when I run nlp(doc):

    Process finished with exit code -1073741819 (0xC0000005)

    Any idea on how to fix this? I'm compiling spaCy from sources too, so I was hoping not to have to do the same for neuralcoref ...

    upgrade install 
    opened by svlandeg 29
  • ExtraData: unpack(b) received extra data.

    I get the following error while loading a custom model with:

    ...
    neuralcoref.add_to_pipe(nlp)
    
    model in init = True
     ExtraData: unpack(b) received extra data. 
    ---------------------------------------------------------------------------
    ExtraData                                 Traceback (most recent call last)
    <ipython-input-6-3f11485ad4f8> in <module>
    ----> 1 neuralcoref.add_to_pipe(nlp)
    
    /workspace/neuralcoref_02/neuralcoref_with_training_mods/neuralcoref/__init__.py in add_to_pipe(nlp, **kwargs)
         40 
         41 def add_to_pipe(nlp, **kwargs):
    ---> 42     coref = NeuralCoref(nlp.vocab, **kwargs)
         43     nlp.add_pipe(coref, name="neuralcoref")
         44     return nlp
    
    neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.__init__()
    
    neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.from_disk()
    
    /opt/conda/lib/python3.6/site-packages/thinc/neural/_classes/model.py in from_bytes(self, bytes_data)
        353 
        354     def from_bytes(self, bytes_data):
    --> 355         data = srsly.msgpack_loads(bytes_data)
        356         weights = data[b"weights"]
        357         queue = [self]
    
    /opt/conda/lib/python3.6/site-packages/srsly/_msgpack_api.py in msgpack_loads(data, use_list)
         27     # msgpack-python docs suggest disabling gc before unpacking large messages
         28     gc.disable()
    ---> 29     msg = msgpack.loads(data, raw=False, use_list=use_list)
         30     gc.enable()
         31     return msg
    
    /opt/conda/lib/python3.6/site-packages/srsly/msgpack/__init__.py in unpackb(packed, **kwargs)
         58         object_hook = kwargs.get('object_hook')
         59         kwargs['object_hook'] = functools.partial(_decode_numpy, chain=object_hook)
    ---> 60     return _unpackb(packed, **kwargs)
         61 
         62 
    
    _unpacker.pyx in srsly.msgpack._unpacker.unpackb()
    
    ExtraData: unpack(b) received extra data.
    

    This is what my model folder looks like (screenshot omitted).

    The model was generated as explained below (see mail from chieter).

    wontfix usage 
    opened by SimonF89 23
  • NeuralCoref-3.0 can't load the new spacy model

    I couldn't load the spaCy model en-coref-sm. I installed both neuralcoref-3.0 and en-coref-sm by downloading them and running setup.py, and I also tried pip install for both. Once the installation completed, trying to load the spaCy model throws the exception below.

    Traceback (most recent call last):
      File "/home/extraction/CoreferenceResolver.py", line 5, in <module>
        from neuralcoref import Coref
      File "/usr/local/lib/python2.7/dist-packages/neuralcoref-3.0-py2.7-linux-x86_64.egg/neuralcoref/__init__.py", line 3, in <module>
        from .neuralcoref import NeuralCoref
      File "neuralcoref.pyx", line 101, in init neuralcoref.neuralcoref
    TypeError: must be char, not unicode

    Please provide clear steps to begin with the new neuralcoref.

    ubuntu 
    opened by Praveenabiginfo 21
  • spacy.strings.StringStore has the wrong size, try recompiling

    spaCy works perfectly fine for me with the usual spaCy-provided models, but trying to load en_coref_md or en_coref_lg fails with the following message:

    $ pip install https://github.com/huggingface/neuralcoref-models/releases/download/en_coref_md-3.0.0/en_coref_md-3.0.0.tar.gz
    $ python
    Python 3.7.0 (default, Jun 28 2018, 07:39:16)
    [Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    >>> import spacy
    >>> spacy.load('en_coref_md')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/spacy/__init__.py", line 17, in load
        return util.load_model(name, **overrides)
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/spacy/util.py", line 114, in load_model
        return load_model_from_package(name, **overrides)
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/spacy/util.py", line 134, in load_model_from_package
        cls = importlib.import_module(name)
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/en_coref_md/__init__.py", line 6, in <module>
        from en_coref_md.neuralcoref import NeuralCoref
      File "/yyy/miniconda2/envs/xxx/lib/python3.7/site-packages/en_coref_md/neuralcoref/__init__.py", line 1, in <module>
        from .neuralcoref import NeuralCoref
      File "strings.pxd", line 23, in init en_coref_md.neuralcoref.neuralcoref
    ValueError: spacy.strings.StringStore has the wrong size, try recompiling. Expected 88, got 112
    
    >>> spacy.__version__
    '2.0.11'
    

    My environment:

    python 3.7
    spacy 2.0.11
    mac
    

    Not sure if this makes any difference, but spacy was installed via conda while coref was installed via pip. This is part of the output of conda list

    spacy                     2.0.11           py37h6440ff4_2
    en-coref-md               3.0.0                     <pip>
    
    upgrade install 
    opened by fersarr 19
  • Training Neuralcoref for Dutch does not work

    Dear guys,

    Firstly, thank you guys so much for this interesting work. I'm training the neuralcoref model for the Dutch language using the SoNaR corpus. At first I used this script to convert the MMAX format to CoNLL format. After that, I trained a w2v model to prepare the static_word_embeddings files. I have a few questions that I could not answer myself and could not find anywhere else.

    • I don't know what the tuned_word_embedding files are; whenever I ran conllparser.py, it complained about those files missing. Looking deeper into the original tuned_word_embeddings, I could see that they are similar to the static_word_embeddings; however, some words appear in both the static and tuned word embeddings, and some words appear only in the tuned_word_embeddings. For this reason, I just used exactly the same word-embeddings file for both static and tuned. It seemed to work (at least it did not throw any complaint, but I'm not sure whether it works or not).
    • I have no idea how you constructed the MISSING and the UNK tokens in those static/tuned word embeddings.
    • When I ran the training code, it ran quite well at first but then displayed this error, which I think comes from Perl (screenshot omitted).

    I came across many topics and posted questions on many threads, but I still got no help or guidance. Thank you so much for any help any of you can provide.

    With best regards, Eric

    wontfix training 
    opened by EricLe-dev 18
  • Using OntoNotes 5.0 to generate CoNLL files

    Description: I am currently stuck at the "Get the data" section for training the neural coreference model. As a newbie, I have little understanding of how to convert the skeleton files to conll files. Here are the commands specified in the guide:

    skeleton2conll.sh -D [path_to_ontonotes_train_folder] [path_to_skeleton_train_folder]
    skeleton2conll.sh -D [path_to_ontonotes_test_folder] [path_to_skeleton_test_folder]
    skeleton2conll.sh -D [path_to_ontonotes_dev_folder] [path_to_skeleton_dev_folder]

    Result

    Here is my command and its output (screenshot omitted):

    $ ".\conll-2012-scripts\conll-2012\v3\scripts\skeleton2conll.sh" -D ".\ontonotes-release-5.0\data\files\data\" ".\conll-2012-train\conll-2012\" please make sure that you are pointing to the directory 'conll-2012'

    Data:
    • OntoNotes 5.0 from LDC (through email)
    • Training and Development data (both are v4)
    • Test Data (Official, v9)
    • CoNLL 2012 scripts (v3)
    The last four are from this link.

    Steps to reproduce

    1. Download the data
    2. Extract the data
    3. Run the command skeleton2conll.sh -D [path/to/conll-2012-train-v0/data/files/data] [path/to/conll-2012]

    Build/Platform: Windows 10, Git Bash (mingw64), Python 3.6, CPU (no CUDA)

    Alternatively, if someone knows how to use the CoNLL-formatted OntoNotes 5.0, I can also open an issue about that.

    wontfix training 
    opened by vrian 15
  • Attribute Error

    Code:

    import spacy
    import en_coref_md

    nlp = en_coref_md.load()
    doc = nlp(u'My sister has a dog. She loves him.')

    doc._.has_coref
    doc._.coref_clusters

    Error:

    AttributeError                            Traceback (most recent call last)
    <ipython-input> in <module>()
          2 import en_coref_md
          3
    ----> 4 nlp = en_coref_md.load()
          5 doc = nlp(u'My sister has a dog. She loves him.')
          6

    ~\Anaconda3\Scripts\en_coref_md\__init__.py in load(**overrides)
         13     overrides['disable'] = disable + ['neuralcoref']
         14     nlp = load_model_from_init_py(__file__, **overrides)
    ---> 15     coref = neuralcoref.NeuralCoref(nlp.vocab)
         16     coref.from_disk(nlp.path / 'neuralcoref')
         17     nlp.add_pipe(coref, name='neuralcoref')

    AttributeError: module 'neuralcoref' has no attribute 'NeuralCoref'

    windows 
    opened by humehta 15
  • Python stopped Working

    Hi,

    I am a Windows 10 user, working with spaCy 2.1.4 and the English web-lg model (v2.1.0). After adding neuralcoref to the pipeline, I get a "Python stopped working" error as soon as I parse.

    Wanted to know what is causing this error.

    upgrade install 
    opened by RandomForestGump 14
  • #include "ios" error in mac Mojave

    Hi, I encounter the following error when I try to install the models:

    Processing ./en_coref_sm-3.0.0.tar.gz
    Requirement already satisfied: spacy>=>=2.0.0a18 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from en-coref-sm==3.0.0) (2.0.12)
    Requirement already satisfied: numpy>=1.7 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.15.2)
    Collecting regex==2017.4.5 (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0)
    Requirement already satisfied: requests<3.0.0,>=2.13.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (2.18.4)
    Requirement already satisfied: preshed<2.0.0,>=1.0.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.0.0)
    Requirement already satisfied: murmurhash<0.29,>=0.28 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.28.0)
    Requirement already satisfied: plac<1.0.0,>=0.9.6 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.9.6)
    Requirement already satisfied: ujson>=1.35 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.35)
    Requirement already satisfied: cymem<1.32,>=1.30 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.31.2)
    Requirement already satisfied: dill<0.3,>=0.2 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.2.7.1)
    Requirement already satisfied: thinc<6.11.0,>=6.10.3 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (6.10.3)
    Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (3.0.4)
    Requirement already satisfied: idna<2.7,>=2.5 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (2.6)
    Requirement already satisfied: urllib3<1.23,>=1.21.1 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.22)
    Requirement already satisfied: certifi>=2017.4.17 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (2018.8.24)
    Requirement already satisfied: six<2.0.0,>=1.10.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.11.0)
    Requirement already satisfied: cytoolz<0.10,>=0.9.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.9.0.1)
    Requirement already satisfied: tqdm<5.0.0,>=4.10.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (4.22.0)
    Requirement already satisfied: msgpack-numpy<1.0.0,>=0.4.1 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.4.1)
    Requirement already satisfied: wrapt<1.11.0,>=1.10.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (1.10.11)
    Requirement already satisfied: msgpack<1.0.0,>=0.5.6 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.5.6)
    Requirement already satisfied: toolz>=0.8.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from cytoolz<0.10,>=0.9.0->thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.9.0)
    Requirement already satisfied: msgpack-python>=0.3.0 in /Users/kyoungrok/anaconda/lib/python3.6/site-packages (from msgpack-numpy<1.0.0,>=0.4.1->thinc<6.11.0,>=6.10.3->spacy>=>=2.0.0a18->en-coref-sm==3.0.0) (0.5.4)
    Building wheels for collected packages: en-coref-sm
      Running setup.py bdist_wheel for en-coref-sm ... error
      Complete output from command /Users/kyoungrok/anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-wheel-ww8axwrm --python-tag cp36:
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-10.7-x86_64-3.6
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      copying en_coref_sm/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/neuralcoref/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/__init__.pxd -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0
      copying en_coref_sm/en_coref_sm-3.0.0/tokenizer -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0
      copying en_coref_sm/en_coref_sm-3.0.0/meta.json -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/lower_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/moves -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/tok2vec_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      copying en_coref_sm/en_coref_sm-3.0.0/ner/upper_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/ner
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/lower_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/moves -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/tok2vec_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      copying en_coref_sm/en_coref_sm-3.0.0/parser/upper_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/parser
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/vectors -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/lexemes.bin -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/strings.json -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      copying en_coref_sm/en_coref_sm-3.0.0/vocab/key2row -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/vocab
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/single_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/pairs_model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors/vectors -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors/key2row -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/static_vectors
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors/vectors -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors
      copying en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors/key2row -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/neuralcoref/tuned_vectors
      creating build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/en_coref_sm-3.0.0/tagger/tag_map -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/en_coref_sm-3.0.0/tagger/cfg -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/en_coref_sm-3.0.0/tagger/model -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/en_coref_sm-3.0.0/tagger
      copying en_coref_sm/meta.json -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm
      copying en_coref_sm/neuralcoref/neuralcoref.pyx -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/neuralcoref/__init__.pxd -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      copying en_coref_sm/neuralcoref/neuralcoref.pxd -> build/lib.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      running build_ext
      building 'en_coref_sm.neuralcoref.neuralcoref' extension
      creating build/temp.macosx-10.7-x86_64-3.6
      creating build/temp.macosx-10.7-x86_64-3.6/en_coref_sm
      creating build/temp.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref
      gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/kyoungrok/anaconda/include -arch x86_64 -I/Users/kyoungrok/anaconda/include -arch x86_64 -I/Users/kyoungrok/anaconda/include/python3.6m -I/private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include -I/Users/kyoungrok/anaconda/include/python3.6m -c en_coref_sm/neuralcoref/neuralcoref.cpp -o build/temp.macosx-10.7-x86_64-3.6/en_coref_sm/neuralcoref/neuralcoref.o
      warning: include path for stdlibc++ headers not found; pass '-std=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
      In file included from en_coref_sm/neuralcoref/neuralcoref.cpp:580:
      In file included from /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/arrayobject.h:15:
      In file included from /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/ndarrayobject.h:17:
      In file included from /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/ndarraytypes.h:1728:
      /private/var/folders/8n/v5p4940n2xbcn7svgbrqb8zh0000gn/T/pip-req-build-cor5g7o5/include/numpy/npy_deprecated_api.h:11:2: warning: "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
      #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION"
       ^
      en_coref_sm/neuralcoref/neuralcoref.cpp:583:10: fatal error: 'ios' file not found
      #include "ios"
               ^~~~~
      2 warnings and 1 error generated.
      error: command 'gcc' failed with exit status 1
    
      ----------------------------------------
      Failed building wheel for en-coref-sm
    
    opened by kyoungrok0517 14
  • Unable to import modules

    Hi,

    I get the following error when I try to run either of the simple examples in your README file:

    Traceback (most recent call last):
      File "/Users/maximild/src/MaxQA/src/test.py", line 1, in <module>
        import en_coref_md
      File "/Users/maximild/anaconda3/lib/python3.6/site-packages/en_coref_md/__init__.py", line 6, in <module>
        from en_coref_md.neuralcoref import NeuralCoref
      File "/Users/maximild/anaconda3/lib/python3.6/site-packages/en_coref_md/neuralcoref/__init__.py", line 1, in <module>
        from .neuralcoref import NeuralCoref
      File "strings.pxd", line 23, in init en_coref_md.neuralcoref.neuralcoref
    ValueError: spacy.strings.StringStore has the wrong size, try recompiling. Expected 88, got 112

    I appear to have successfully downloaded the en_coref_md model, but I am unable to import it. I'm using spaCy 2.0.11 and Python 3.6 if that helps.

    Any suggestions on what might be wrong?

    Thanks!

    opened by BBCMax 14
  • Extension 'has_coref' already exists on Doc.

    My code:

    import spacy
    import en_coref_sm

    nlp = en_coref_sm.load()
    doc = nlp(u'The lungs are located in the chest.They are conical in shape.')

    print(doc._.has_coref)
    print(doc._.coref_clusters)

    Hey, I ran into the following error when I input my own sentence:

    ValueError                                Traceback (most recent call last)
    <ipython-input> in <module>()
          2 import en_coref_sm
          3
    ----> 4 nlp = en_coref_sm.load()
          5 doc = nlp(u'The lungs are located in the chest.They are conical in shape.')
          6

    ~\Anaconda3\lib\site-packages\en_coref_sm\__init__.py in load(**overrides)
         13     overrides['disable'] = disable + ['neuralcoref']
         14     nlp = load_model_from_init_py(__file__, **overrides)
    ---> 15     coref = NeuralCoref(nlp.vocab)
         16     coref.from_disk(nlp.path / 'neuralcoref')
         17     nlp.add_pipe(coref, name='neuralcoref')

    neuralcoref.pyx in en_coref_sm.neuralcoref.neuralcoref.NeuralCoref.__init__()

    doc.pyx in spacy.tokens.doc.Doc.set_extension()

    ValueError: [E090] Extension 'has_coref' already exists on Doc. To overwrite the existing extension, set force=True on Doc.set_extension.

    opened by humehta 14
  • Regarding finetuning neuralcoref

    So, I have my own spaCy model for custom NER and I want to incorporate coreference resolution for my detected entities. Would the existing pretrained model work, or would I have to create a new dataset for it?

    opened by Tanmay98 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks that all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.
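
    For reference, the kind of check such a patch adds looks roughly like the following sketch (not the exact patch submitted; the helper name is illustrative):

    import os
    import tarfile

    def safe_extractall(tar: tarfile.TarFile, path="."):
        # refuse to extract if any member would land outside the target directory
        base = os.path.realpath(path)
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(path, member.name))
            if os.path.commonpath([base, target]) != base:
                raise Exception("Attempted path traversal in tar file")
        tar.extractall(path)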

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • Results completely differ from web-demo

    When using neuralcoref master with spaCy==2.1.0 I can use neuralcoref just fine. However, the results drastically differ from the version deployed at huggingface.co/neuralcoref.

    "She is close to the habour" yields: grafik

    Whereas the same text executed via examples/server.py yields an empty reply

    ❯ curl --data-urlencode "text=she is close to the habour" -G localhost:8000
    {}%
    

    I can confirm that my curl call succeeds with other prompts.

    {
      "mentions": [
        {"start": 0, "end": 3, "text": "she", "resolved": "she"},
        {"start": 40, "end": 43, "text": "she", "resolved": "she"}
      ],
      "clusters": [["she", "she"]],
      "resolved": "she is close to the habour. where might she be heading?"
    }

    It seems like NOMINAL is missing somehow.

    opened by chris-aeviator 0
  • Facing a problem while executing: python -m neuralcoref.train.learn --train ./data/train/ --eval ./data/dev/

    . . .
    🌋 Construct test file
    Writing in C:\Users\sk136\neuralcoref\neuralcoref\train\test_mentions.txt
    🌋 Computing score
    Error during the scoring
    Command '['perl', 'C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl', 'muc', './data/dev//key.txt', 'C:\Users\sk136\neuralcoref\neuralcoref\train\test_mentions.txt']' returned non-zero exit status 2.
    Can't locate CorScorer.pm in @INC (you may need to install the CorScorer module) (@INC contains: scorer/lib /usr/lib/perl5/site_perl /usr/share/perl5/site_perl /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5/core_perl /usr/share/perl5/core_perl) at C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl line 16.
    BEGIN failed--compilation aborted at C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl line 16.

    Traceback (most recent call last):
      File "C:\Users\sk136\anaconda3\lib\runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "C:\Users\sk136\anaconda3\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\learn.py", line 565, in <module>
        run_model(args)
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\learn.py", line 175, in run_model
        eval_evaluator.test_model()
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\evaluator.py", line 180, in test_model
        self.get_score(file_path=ALL_MENTIONS_PATH)
      File "C:\Users\sk136\neuralcoref\neuralcoref\train\evaluator.py", line 283, in get_score
        scorer_out = subprocess.check_output(
      File "C:\Users\sk136\anaconda3\lib\subprocess.py", line 424, in check_output
        return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
      File "C:\Users\sk136\anaconda3\lib\subprocess.py", line 528, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['perl', 'C:\Users\sk136\neuralcoref\neuralcoref\train\scorer_wrapper.pl', 'muc', './data/dev//key.txt', 'C:\Users\sk136\neuralcoref\neuralcoref\train\test_mentions.txt']' returned non-zero exit status 2.

    perl related issue

    opened by sandeep16064 0
  • GPU support - cuda 11.1 - TypeError: Unsupported type <class 'numpy.ndarray'>

    Example:

    import spacy
    
    spacy.require_gpu()
    >> True
    
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("this is my example text")
    print(doc)
    >> this is my example text
    
    import neuralcoref
    neuralcoref.add_to_pipe(nlp)
    >> <spacy.lang.en.English object at 0x7f53d9da6d60>
    
    doc =  nlp("this is my example text")
    >> Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/brj/.local/share/virtualenvs/spacy-ozIRu_0L/lib/python3.8/site-packages/spacy/language.py", line 445, in __call__
        doc = proc(doc, **component_cfg.get(name, {}))
      File "neuralcoref.pyx", line 593, in neuralcoref.neuralcoref.NeuralCoref.__call__
      File "neuralcoref.pyx", line 720, in neuralcoref.neuralcoref.NeuralCoref.predict
      File "neuralcoref.pyx", line 908, in neuralcoref.neuralcoref.NeuralCoref.get_mention_embeddings
      File "neuralcoref.pyx", line 899, in neuralcoref.neuralcoref.NeuralCoref.get_average_embedding
      File "cupy/_core/core.pyx", line 1591, in cupy._core.core.ndarray.__array_ufunc__
      File "cupy/_core/_kernel.pyx", line 1218, in cupy._core._kernel.ufunc.__call__
      File "cupy/_core/_kernel.pyx", line 138, in cupy._core._kernel._preprocess_args
      File "cupy/_core/_kernel.pyx", line 124, in cupy._core._kernel._preprocess_arg
    TypeError: Unsupported type <class 'numpy.ndarray'>
    
    # printing versions
    import cupy
    spacy.__version__
    >> 2.3.7
    neuralcoref.__version__
    >> 4.1.0
    cupy.__version__
    >> 10.4.0
    

    Everything works fine if I run this without spacy.require_gpu().

    opened by bryanjohns 0