NLPretext bundles all the text preprocessing functions you need for your NLP project into a single library.

Last update: Dec 15, 2022

NLPretext

Working on an NLP project and tired of always looking for the same silly preprocessing functions on the web? 😫

Need to efficiently extract email addresses from a document? Hashtags from tweets? Remove accents from a French post? 😥

NLPretext has you covered! 🚀

🔍 Quickly explore our preprocessing pipelines and the reference of individual functions below.

Default preprocessing pipeline
Custom preprocessing pipeline
Replacing phone numbers
Removing hashtags
Extracting emojis
Data augmentation

Cannot find what you were looking for? Feel free to open an issue.

Installation

This package has been tested on Python 3.6, 3.7 and 3.8.

We strongly advise you to do the remaining steps in a virtual environment.

To install this library you just have to run the following command:

pip install nlpretext

This library uses spaCy as its tokenizer. The models currently supported are en_core_web_sm and fr_core_news_sm. If they are not installed, run the following commands:

pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz

pip install https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-2.3.0/fr_core_news_sm-2.3.0.tar.gz
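
Before running a pipeline, it can help to check whether the two models are present. The snippet below is a minimal sketch using only the standard library; the helper names is_installed and missing_models are hypothetical and are not part of NLPretext or spaCy. It only checks that a package with the model's name can be found, not that the model version matches.

```python
import importlib.util

# Model package names taken from the install commands above.
SUPPORTED_MODELS = ("en_core_web_sm", "fr_core_news_sm")


def is_installed(package_name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


def missing_models(models=SUPPORTED_MODELS):
    """Return the names from `models` that cannot be located."""
    return [name for name in models if not is_installed(name)]


for name in missing_models():
    print(f"{name} is not installed; run the matching pip command.")
```

If nothing is printed, both model packages were found in the current environment.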

Preprocessing pipeline

Default pipeline

Need to preprocess your text data but have no clue which functions to use and in what order? The default preprocessing pipeline has you covered:

from nlpretext import Preprocessor
text = "I just got the best dinner in my life @latourdargent !!! I  recommend 😀 #food #paris \n"
preprocessor = Preprocessor()
text = preprocessor.run(text)
print(text)
# "I just got the best dinner in my life !!! I recommend"

Create your custom pipeline

If you know exactly which functions to apply to your data, you can create your own custom pipeline instead. Here is an example:

from nlpretext import Preprocessor
from nlpretext.basic.preprocess import (
    normalize_whitespace, remove_punct, remove_eol_characters,
    remove_stopwords, lower_text)
from nlpretext.social.preprocess import remove_mentions, remove_hashtag, remove_emoji
text = "I just got the best dinner in my life @latourdargent !!! I  recommend 😀 #food #paris \n"
preprocessor = Preprocessor()
preprocessor.pipe(lower_text)
preprocessor.pipe(remove_mentions)
preprocessor.pipe(remove_hashtag)
preprocessor.pipe(remove_emoji)
preprocessor.pipe(remove_eol_characters)
preprocessor.pipe(remove_stopwords, args={'lang': 'en'})
preprocessor.pipe(remove_punct)
preprocessor.pipe(normalize_whitespace)
text = preprocessor.run(text)
print(text)
# "dinner life recommend"

All available functions can be found in the preprocess.py scripts of the basic, social and token folders.
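
To make the pipe/run pattern concrete, here is a minimal sketch of how such a pipeline could be structured. MiniPreprocessor and the three helper functions are hypothetical stand-ins written for illustration; this is not the library's implementation, only an assumption about the general idea of chaining functions in registration order.

```python
import re


class MiniPreprocessor:
    """Hypothetical stand-in for Preprocessor (illustration only)."""

    def __init__(self):
        self._steps = []

    def pipe(self, func, args=None):
        # Register a function with its keyword arguments; order is preserved.
        self._steps.append((func, args or {}))

    def run(self, text):
        # Feed the output of each step into the next one.
        for func, kwargs in self._steps:
            text = func(text, **kwargs)
        return text


def lower(text):
    return text.lower()


def drop_tokens(text, prefix):
    # Remove whitespace-delimited tokens that start with `prefix`.
    return " ".join(tok for tok in text.split() if not tok.startswith(prefix))


def squeeze_spaces(text):
    return re.sub(r"\s+", " ", text).strip()


pre = MiniPreprocessor()
pre.pipe(lower)
pre.pipe(drop_tokens, args={"prefix": "@"})
pre.pipe(drop_tokens, args={"prefix": "#"})
pre.pipe(squeeze_spaces)
result = pre.run("Best dinner EVER @latourdargent  #food #paris \n")
print(result)
# "best dinner ever"
```

Because each step only takes text and returns text, the order of the pipe calls fully determines the result, which is why the order matters in the example above.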

Individual Functions

Replacing emails

from nlpretext.basic.preprocess import replace_emails
example = "I have forwarded this email to jane.doe@example.com"
example = replace_emails(example, replace_with="*EMAIL*")
print(example)
# "I have forwarded this email to *EMAIL*"
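
For intuition, email masking can be approximated with a regular expression. The sketch below uses a deliberately simplified pattern and a hypothetical mask_emails helper; it is an assumption for illustration and not the pattern that replace_emails uses internally.

```python
import re

# Simplified pattern (assumption): local part, "@", then dot-separated domain labels.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def mask_emails(text, replace_with="*EMAIL*"):
    """Replace every substring matching EMAIL_RE with `replace_with`."""
    return EMAIL_RE.sub(replace_with, text)


masked = mask_emails("Write to jane.doe@example.com or bob@mail.example.org today")
print(masked)
# "Write to *EMAIL* or *EMAIL* today"
```

A pattern this short will miss some valid addresses and accept some invalid ones, which is a good reason to prefer the library function on real data.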

Replacing phone numbers

from nlpretext.basic.preprocess import replace_phone_numbers
example = "My phone number is 0606060606"
example = replace_phone_numbers(example, country_to_detect=["FR"], replace_with="*PHONE*")
print(example)
# "My phone number is *PHONE*"
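
The same idea can be sketched for phone numbers. The pattern below only targets ten-digit French national numbers (a leading 0 followed by four more pairs of digits, optionally separated by spaces, dots or hyphens). It is a hypothetical simplification and does not reproduce what replace_phone_numbers does for the countries it supports.

```python
import re

# Assumption: French national format only, e.g. 0606060606 or 06 06 06 06 06.
FR_PHONE_RE = re.compile(r"(?<!\d)0[1-9](?:[ .-]?\d{2}){4}(?!\d)")


def mask_fr_phone_numbers(text, replace_with="*PHONE*"):
    """Replace French-looking national phone numbers with `replace_with`."""
    return FR_PHONE_RE.sub(replace_with, text)


masked_phone = mask_fr_phone_numbers("My phone number is 0606060606")
print(masked_phone)
# "My phone number is *PHONE*"
```

The lookarounds prevent a match inside a longer run of digits, such as an order number.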

Removing hashtags

from nlpretext.social.preprocess import remove_hashtag
example = "This restaurant was amazing #food #foodie #foodstagram #dinner"
example = remove_hashtag(example)
print(example)
# "This restaurant was amazing"
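
As a rough illustration of the technique, hashtag removal can be sketched with a regular expression followed by whitespace cleanup. The strip_hashtags helper is hypothetical, and its definition of a hashtag (a # not preceded by a word character, followed by word characters) is an assumption that may differ from remove_hashtag.

```python
import re

# Assumption: a hashtag is "#" + word characters, not glued to a preceding word.
HASHTAG_RE = re.compile(r"(?<!\w)#\w+")


def strip_hashtags(text):
    """Drop hashtags, then collapse the whitespace they leave behind."""
    without_tags = HASHTAG_RE.sub("", text)
    return re.sub(r"\s+", " ", without_tags).strip()


cleaned = strip_hashtags("This restaurant was amazing #food #foodie #dinner")
print(cleaned)
# "This restaurant was amazing"
```

The lookbehind keeps strings such as C#sharp intact, since the # there follows a word character.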

Extracting emojis

from nlpretext.social.preprocess import extract_emojis
example = "I take care of my skin 😀"
example = extract_emojis(example)
print(example)
# [':grinning_face:']
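
To show where a name like :grinning_face: can come from, the sketch below derives it from the Unicode character name using only the standard library. The emoji_names helper and the two code point ranges are assumptions made for illustration; they cover only part of the emoji set, ignore multi-character sequences, and are not how extract_emojis is implemented.

```python
import unicodedata

# Assumption: two common blocks only (pictographs/emoticons and misc symbols/dingbats).
EMOJI_RANGES = ((0x1F300, 0x1FAFF), (0x2600, 0x27BF))


def is_emoji(ch):
    return any(low <= ord(ch) <= high for low, high in EMOJI_RANGES)


def emoji_names(text):
    """Return ':name:' style labels for single-code-point emojis found in text."""
    names = []
    for ch in text:
        if is_emoji(ch):
            name = unicodedata.name(ch, "")  # empty string for unassigned code points
            if name:
                names.append(":" + name.lower().replace(" ", "_") + ":")
    return names


found = emoji_names("I take care of my skin 😀")
print(found)
# [':grinning_face:']
```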

Data augmentation

The augmentation module generates new texts from your examples by modifying some of their words. For NER tasks, any associated entities are kept unchanged. If you want words other than entities to remain unchanged, you can specify them in the stopwords argument. The modifications depend on the chosen method. The methods currently supported are substitutions with synonyms using WordNet or BERT from the nlpaug library.

from nlpretext.augmentation.text_augmentation import augment_text
example = "I want to buy a small black handbag please."
entities = [{'entity': 'Color', 'word': 'black', 'startCharIndex': 22, 'endCharIndex': 27}]
example = augment_text(example, method="wordnet_synonym", entities=entities)
print(example)
# "I need to buy a small black pocketbook please."
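
The entity-preserving behaviour described above can be sketched without any NLP dependency. In the snippet below, the SYNONYMS table and the augment_keep_entities helper are hypothetical: a toy dictionary replaces WordNet or BERT, and a word is protected only when its character span exactly matches an entity span. It illustrates the idea and is not the module's implementation.

```python
import re

# Toy synonym table (assumption); the real module relies on WordNet or BERT via nlpaug.
SYNONYMS = {"want": "need", "handbag": "pocketbook", "black": "dark"}


def augment_keep_entities(text, entities, stopwords=()):
    """Substitute synonyms, leaving entity spans and stopwords untouched."""
    protected = {(e["startCharIndex"], e["endCharIndex"]) for e in entities}
    frozen = {word.lower() for word in stopwords}

    def substitute(match):
        word = match.group(0)
        # Match positions refer to the original text, so they line up with entity spans.
        if (match.start(), match.end()) in protected or word.lower() in frozen:
            return word
        return SYNONYMS.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", substitute, text)


example = "I want to buy a small black handbag please."
entities = [{"entity": "Color", "word": "black", "startCharIndex": 22, "endCharIndex": 27}]
augmented = augment_keep_entities(example, entities)
print(augmented)
# "I need to buy a small black pocketbook please."
```

Passing stopwords=("want",) to the same call would additionally keep "want" unchanged while still replacing "handbag".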

Make HTML documentation

To build the Sphinx HTML documentation, run the following command from the nlpretext root path:

sphinx-apidoc -f nlpretext -o docs/

This generates the .rst files. You can then build the documentation with:

cd docs && make html

You can now open the file index.html located in the build folder.

Project Organization

├── LICENSE
├── VERSION
├── CONTRIBUTING.md     <- Contribution guidelines
├── README.md           <- The top-level README for developers using this project.
├── .github/workflows   <- Where the CI lives
├── datasets/external   <- Bash scripts to download external datasets
├── docs                <- Sphinx HTML documentation
├── nlpretext           <- Main Package. This is where the code lives
│   ├── preprocessor.py <- Main preprocessing script
│   ├── augmentation    <- Text augmentation script
│   ├── basic           <- Basic text preprocessing 
│   ├── social          <- Social text preprocessing
│   ├── token           <- Token text preprocessing
│   ├── _config         <- Where the configuration and constants live
│   └── _utils          <- Where preprocessing utils scripts live
├── tests               <- Where the tests live
├── setup.py            <- makes project pip installable (pip install -e .) so the package can be imported
├── requirements.txt    <- The requirements file for reproducing the analysis environment, e.g.
│                          generated with `pip freeze > requirements.txt`
└── pylintrc            <- The linting configuration file

Comments

The current release is not functional because the emoji library has changed

🐛 Bug Report

🔬 How To Reproduce

Steps to reproduce the behavior:

Install nlpretext from pip (1.1.0)

Run from nlpretext._config import constants

Environment

OS: macOS Silicon

Python version: 3.7, 3.8, 3.9

📈 Observed behavior

The import fails on this line:

EMOJI_PATTERN = _emoji.get_emoji_regexp()

AttributeError: module 'emoji' has no attribute 'get_emoji_regexp'

Label: bug. Opened by Guillaume6606.

Releases

1.1.0 (Sep 16, 2021)

What's Changed

[FIX] Removed direct dependency and changed docker registry (#163) @Cedric-Magnan

[DOC] Updated method for spacy tokenizer installation (#159) @Cedric-Magnan

Feature/ignore stopwords (#157) @Guillaume6606

fix: display explicit error message when model not downloaded (#156) @benoitgoujon

Feature/dataloader (#152) @sachalasry-artefact

Hotfix/pylint (#151) @amaleelhamri

Fix/credits (#150) @rafaelleaygalenq

👥 List of contributors

@Cedric-Magnan, @Guillaume6606, @amaleelhamri, @benoitgoujon, @hugovasselin, @rafaelleaygalenq and @sachalasry-artefact

1.0.3 (Feb 18, 2021)

Update license MIT to Apache in PyPI

1.0.1 (Feb 18, 2021)

Readme fix

Long description add

Augmentation sphinx documentation fix

1.0.0 (Feb 18, 2021)

First release

Easy pipelines to clean text efficiently

Catalogue of preprocessing functions for different needs

Owner

Artefact

GitHub Repository https://nlpretext.readthedocs.io/en/latest/
