spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines

Overview

spaCy-wrap is a minimal library for wrapping fine-tuned transformers from the Hugging Face model hub in your spaCy pipeline, so that existing models can be used within spaCy workflows.

As far as possible, it follows an API similar to that of spacy-transformers.

Installation

Installing spacy-wrap is simple using pip:

pip install spacy_wrap

There is no reason to update from GitHub as the version on PyPI should always be the same as on GitHub.
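
To check which version is actually installed, for example before comparing it with the latest release, the standard library can be queried. This is a minimal sketch: the helper name `installed_version` is illustrative and not part of spacy-wrap, and the distribution name is assumed to match the one used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "spacy_wrap" is the name used in the pip command above.
print(installed_version("spacy_wrap"))
```

If this prints `None`, the package is not installed in the active environment.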

Example

The following shows a simple example of how you can quickly add a fine-tuned transformer model from the Hugging Face model hub. In this example, we will use the sentiment model by Barbieri et al. (2020) to classify whether a tweet is positive, negative, or neutral. We will add this model to a blank English pipeline:

import spacy
import spacy_wrap

nlp = spacy.blank("en")

config = {
    "doc_extension_trf_data": "clf_trf_data",  # document extension for the forward pass
    "doc_extension_prediction": "sentiment",  # document extension for the prediction
    "labels": ["negative", "neutral", "positive"],
    "model": {
        "name": "cardiffnlp/twitter-roberta-base-sentiment",  # name or path of the Hugging Face model
    },
}

transformer = nlp.add_pipe("classification_transformer", config=config)
transformer.model.initialize()

doc = nlp("spaCy is a wonderful tool")

print(doc._.clf_trf_data)
# TransformerData(wordpieces=...
print(doc._.sentiment)
# 'positive'
print(doc._.sentiment_prob)
# {'prob': array([0.004, 0.028, 0.969], dtype=float32), 'labels': ['negative', 'neutral', 'positive']}
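
The relation between `doc._.sentiment_prob` and `doc._.sentiment` can be illustrated without loading a model. The sketch below assumes that the predicted label is the one with the highest probability and that the probabilities come from a softmax over the model's logits. The logits are made up for illustration, and the helper names are hypothetical, not part of spacy-wrap.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits: List[float], labels: List[str]) -> Dict[str, object]:
    """Build a dict shaped like the `sentiment_prob` output, plus the top label."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"prob": probs, "labels": labels, "label": labels[best]}


labels = ["negative", "neutral", "positive"]
prediction = to_prediction([-2.0, 0.0, 3.5], labels)  # made-up logits
print(prediction["label"])
# positive
```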

These pipelines can also easily be applied to multiple documents using nlp.pipe, as one would expect from a spaCy component:

docs = nlp.pipe(
    [
        "I hate wrapping my own models",
        "Isn't there a tool for this?",
        "spacy-wrap is great for wrapping models",
    ]
)

for doc in docs:
    print(doc._.sentiment)
# 'negative'
# 'neutral'
# 'positive'

More Examples

It is always nice to have more than one example. Here is another one, where we add the hate speech model for Danish to a blank Danish pipeline:

import spacy
import spacy_wrap

nlp = spacy.blank("da")

config = {
    "doc_extension_trf_data": "clf_trf_data",  # document extension for the forward pass
    "doc_extension_prediction": "hate_speech",  # document extension for the prediction
    "labels": ["Not hate Speech", "Hate speech"],
    "model": {
        "name": "DaNLP/da-bert-hatespeech-detection",  # name or path of the Hugging Face model
    },
}

transformer = nlp.add_pipe("classification_transformer", config=config)
transformer.model.initialize()

doc = nlp("Senile gamle idiot")  # old senile idiot

doc._.clf_trf_data
# TransformerData(wordpieces=...
doc._.hate_speech
# "Hate speech"
doc._.hate_speech_prob
# {'prob': array([0.013, 0.987], dtype=float32), 'labels': ['Not hate Speech', 'Hate speech']}
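
For a binary classifier such as this one, it can be useful to apply a custom decision threshold to the probabilities instead of taking the most probable label. This is a minimal sketch, assuming a dictionary shaped like the `hate_speech_prob` output shown above. The function name and the 0.9 threshold are illustrative choices, not part of spacy-wrap, and plain Python floats stand in for the NumPy array.

```python
from typing import Dict, List, Sequence, Union

ProbDict = Dict[str, Union[Sequence[float], List[str]]]


def label_above_threshold(
    prob_dict: ProbDict, target: str, threshold: float, fallback: str
) -> str:
    """Return `target` only if its probability reaches `threshold`."""
    labels = list(prob_dict["labels"])
    probs = list(prob_dict["prob"])
    target_prob = probs[labels.index(target)]
    return target if target_prob >= threshold else fallback


# Values copied from the example output above.
example = {"prob": [0.013, 0.987], "labels": ["Not hate Speech", "Hate speech"]}
print(label_above_threshold(example, "Hate speech", 0.9, "Not hate Speech"))
# Hate speech
```

Raising the threshold trades recall for precision on the target label. A suitable value depends on the application and should be chosen on held-out data.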

📖 Documentation

🔧 Installation: Installation instructions for spacy-wrap.
📰 News and changelog: New additions, changes and version history.
🎛 Documentation: The reference for spacy-wrap's API.

💬 Where to ask questions

🚨 FAQ: FAQ page
🚨 Bug Reports: GitHub Issue Tracker
🎁 Feature Requests & Ideas: GitHub Issue Tracker
👩‍💻 Usage Questions: GitHub Discussions
🗯 General Discussion: GitHub Discussions
Comments
  • Custom model for NER

    Hello, thank you for setting this up.

    The examples are amazing.

    Is it possible to use this wrapper with a Named Entity Recognition model?

    If so, would it be possible to add an example with an NER model from Hugging Face?

    Following the example, I am trying to add this, but it is not working. I do not know why; maybe I should go back and learn how spaCy works.

    import spacy
    import spacy_wrap
    
    nlp = spacy.blank("fr")
    
    config = {
        "model": {
            "@architectures": "spacy-transformers.TransformerModel.v3",
            "name": "Jean-Baptiste/camembert-ner-with-dates",
            "tokenizer_config": {"use_fast": False},
            "get_spans": {"@span_getters": "spacy-transformers.doc_spans.v1"},
        }
    }
    
    transformer = nlp.add_pipe("ner", config=config)
    
    enhancement good first issue 
    opened by espoirMur 6
  • :arrow_up: Bump sphinx-notes/pages from 2 to 3

    Bumps sphinx-notes/pages from 2 to 3.

    dependencies github_actions 
    opened by dependabot[bot] 3
  • Updated in accordance with spacy-transformers update

    Update following this PR, which allows for alternate model loaders. This should make the codebase easier to maintain and removes the existing monkey patch from the code.

    opened by KennethEnevoldsen 3
  • Poetry dependency resolution fails. PyPi 1.0.0 version requires spaCy < 3.3.0

    How to reproduce the behaviour

    pyproject.toml contains

    spacy = "^3.3.0"
    

    Add spacy-wrap with

    poetry add spacy_wrap
    

    Info about spaCy

    • spaCy version: 3.3.0
    • Platform: Windows-10-10.0.19043-SP0
    • Python version: 3.9.10
    • Pipelines: en_core_web_sm (3.3.0), en_core_web_trf (3.3.0)

    Issue

    Dependency resolution fails with poetry

    SolverProblemError
    
      Because spacy-wrap (1.0.0) depends on spacy (>=3.2.1,<3.3.0)
       and no versions of spacy-wrap match >1.0.0,<2.0.0, spacy-wrap (>=1.0.0,<2.0.0) requires spacy (>=3.2.1,<3.3.0).
      So, because spacytest depends on both spacy (^3.3.0) and spacy-wrap (^1.0.0), version solving failed.
    

    setup.cfg still has

    install_requires = 
    	spacy_transformers>=1.1.4,<1.2.0
    	spacy>=3.2.1,<3.3.0
    	thinc>=8.0.13,<8.1.0
    
    bug 
    opened by Dris101 3
  • :arrow_up: Bump ruff from 0.0.191 to 0.0.211

    Bumps ruff from 0.0.191 to 0.0.211.

    dependencies 
    opened by dependabot[bot] 2
  • :arrow_up: Bump MishaKav/pytest-coverage-comment from 1.1.25 to 1.1.26

    Bumps MishaKav/pytest-coverage-comment from 1.1.25 to 1.1.26.

    dependencies github_actions 
    opened by dependabot[bot] 2
  • :arrow_up: Update sphinxext-opengraph requirement from <0.7.4,>=0.7.3 to >=0.7.3,<0.7.6

    Updates the requirements on sphinxext-opengraph to permit the latest version.

    dependencies 
    opened by dependabot[bot] 1
  • :arrow_up: Update sphinx requirement from <5.4.0,>=5.3.0 to >=5.3.0,<6.2.0

    Updates the requirements on sphinx to permit the latest version.

    dependencies 
    opened by dependabot[bot] 1
  • :arrow_up: Update pytest-cov requirement from <3.0.1,>=3.0.0 to >=3.0.0,<4.0.1

    Updates the requirements on pytest-cov to permit the latest version.

    dependencies 
    opened by dependabot[bot] 1
  • :arrow_up: Update sphinx requirement from <5.0.0,>=4.5.0 to >=4.5.0,<7.0.0

    Updates the requirements on sphinx to permit the latest version.

    dependencies python 
    opened by dependabot[bot] 1
  • :arrow_up: Bump furo from 2022.9.29 to 2022.12.7

    Bumps furo from 2022.9.29 to 2022.12.7.

    dependencies python 
    opened by dependabot[bot] 1
  • IndexError when emojis in input

    How to reproduce the behaviour

    import spacy
    import spacy_wrap
    nlp = spacy.blank("en")
    
    # specify model from the hub
    config = {"model": {"name": "dslim/bert-base-NER"}}
    # add it to the pipe
    nlp.add_pipe("token_classification_transformer", config=config)
    
    doc = nlp("My name is Wolfgang 🚀 and I live in Berlin.")
    

    Your Environment

    • spacy-wrap Version Used: spacy_wrap-1.2.0-py2.py3-none-any.whl
    • Operating System: MacOS, Apple M1, Ventura 13.0.1
    • Python Version Used: 3.10.6
    • spaCy Version Used: 3.4.3
    • Environment Information: poetry
    • Torch version: 1.13.0-cp310 torch = {url = "https://files.pythonhosted.org/packages/79/b3/eaea3fc35d0466b9dae1e3f9db08467939347b3aaa53c0fd81953032db33/torch-1.13.0-cp310-none-macosx_11_0_arm64.whl"}

    (Had to set torch manually because spacy-wrap fails to install on Mac with the default torch version. Specifically, the dependency nvidia-cublas-cu11 returns a RuntimeError with "Unable to find installation candidates for nvidia-cublas-cu11 (11.10.3.66)". No distributions available for Mac, see https://pypi.org/project/nvidia-cublas-cu11/11.10.3.66/#files )

    The above example yields an IndexError in TokenClassificationTransformer.convert_to_token_predictions(data, aggregation_strategy, labels):

          305 logits = data.model_output.logits[0]
          306 for align in data.align:
          307     # aggregate the logits for each token
    --> 308     agg_token_logits = agg(logits[align.data[:, 0]])
          309     token_probabilities_ = {
          310         "prob": softmax(agg_token_logits).round(decimals=3),
          311         "label": labels,
          312     }
          313     token_probabilities.append(token_probabilities_)
    
    IndexError: index 0 is out of bounds for axis 1 with size 0
    

    The issue is the same with other models; I also tried saattrupdan/nbailab-base-ner-scandi.

    Tests with the transformers pipeline work with the same models.

    from transformers import pipeline
    pipe = pipeline(model="dslim/bert-base-NER")
    res = pipe("My name is Wolfgang 🚀 and I live in Berlin.")
    [e["word"] for e in res]
    # ['Wolfgang', '🚀', 'Berlin']
    
    bug 
    opened by nthomsencph 6
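
The traceback in the last issue above points at the step that aggregates wordpiece logits per token. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the emoji token has no wordpieces aligned to it, so there is nothing to aggregate. The sketch below uses plain Python lists instead of the library's arrays to show that situation and one possible guard. The function names, the mean aggregation, and the uniform fallback are illustrative and are not spacy-wrap's actual implementation.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_probabilities(
    logits: List[List[float]], alignment: List[List[int]]
) -> List[List[float]]:
    """Average wordpiece logits per token, guarding against empty alignments.

    logits[i] holds the label scores of wordpiece i; alignment[t] lists the
    wordpiece indices that belong to token t.
    """
    n_labels = len(logits[0])
    result = []
    for piece_ids in alignment:
        if not piece_ids:
            # No wordpiece maps to this token: fall back to a uniform
            # distribution instead of indexing into an empty selection.
            result.append([1.0 / n_labels] * n_labels)
            continue
        rows = [logits[i] for i in piece_ids]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        result.append(softmax(mean))
    return result


# Three wordpieces with two labels; the middle token has no wordpieces.
logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
alignment = [[0], [], [1, 2]]
for probs in token_probabilities(logits, alignment):
    print([round(p, 3) for p in probs])
```

Whether a uniform fallback, a dedicated "no prediction" label, or skipping the token is the right choice depends on how downstream components consume the predictions.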
Releases(v1.3.0)
Owner
Kenneth Enevoldsen
Interdisciplinary PhD Student on representation learning in Clinical NLP and Genetics at Aarhus University and Interacting Minds Centre

This repository is the official Pytorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.

vanint 101 Dec 30, 2022
CodeBERT: A Pre-Trained Model for Programming and Natural Languages.

CodeBERT This repo provides the code for reproducing the experiments in CodeBERT: A Pre-Trained Model for Programming and Natural Languages. CodeBERT

Microsoft 1k Jan 03, 2023
A Python script that compares files in directories

compare-files A Python script that compares files in different directories, this is similar to the command filecmp.cmp(f1, f2). I made this script in

Colvin 1 Oct 15, 2021
simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.

Quickly train T5 models in just 3 lines of code + ONNX support simpleT5 is built on top of PyTorch-lightning ⚡️ and Transformers 🤗 that lets you quic

Shivanand Roy 220 Dec 30, 2022
Official implementation of Meta-StyleSpeech and StyleSpeech

Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dongchan Min, Dong Bok Lee, Eunho Yang, and Sung Ju Hwang This is an official code

min95 169 Jan 05, 2023
ChatterBot is a machine learning, conversational dialog engine for creating chat bots

ChatterBot ChatterBot is a machine-learning based conversational dialog engine build in Python which makes it possible to generate responses based on

Gunther Cox 12.8k Jan 03, 2023
An automated program that helps customers of Pizza Palour place their pizza orders

PIzza_Order_Assistant Introduction An automated program that helps customers of Pizza Palour place their pizza orders. The program uses voice commands

Tindi Sommers 1 Dec 26, 2021
Python library for Serbian Natural language processing (NLP)

SrbAI - Python biblioteka za procesiranje srpskog jezika SrbAI je projekat prikupljanja algoritama i modela za procesiranje srpskog jezika u jedinstve

Serbian AI Society 3 Nov 22, 2022