A framework for detecting, highlighting and correcting grammatical errors in natural language text.

Overview

Gramformer

Human- and machine-generated text often suffers from grammatical and/or typographical errors: spelling, punctuation, grammar, or word-choice errors. Gramformer is a library that exposes 3 separate interfaces to a family of algorithms to detect, highlight and correct grammar errors. To make sure the recommended corrections and highlights are of high quality, it comes with a quality estimator. You can use Gramformer in one or more of the areas mentioned under the "Usecases" section below, or for any other use case you see fit. Gramformer stands on the shoulders of giants: it combines some of the top-notch research in grammar correction. Note: It works at the sentence level and has been trained on sentences of length 128, so it is not (yet) suitable for long prose or paragraphs (stay tuned for upcoming releases).

Usecases for Gramformer

Area 1: Post-processing machine generated text

Machine-language generation is becoming mainstream, and so will post-processing of machine-generated text.

  • Conditioned text generation output (Text2Text generation).
    • NMT: Machine-translated output.
    • ASR or STT: Speech-to-text output.
    • HTR: Handwritten text recognition output.
    • Paraphrase generation output.
  • Controlled text generation output (text generation with PPLM) [TBD].
  • Free-form text generation output (text generation) [TBD].

Area 2: Human-In-The-Loop (HITL) text

  • Most supervised NLU (chatbot and conversational) systems need humans/experts to enter or edit text that must be grammatically correct; otherwise the quality of the HITL data can degrade the model over time.

Area 3: Assisted writing for humans

  • Integrating into the custom text editors of your apps (a poor man's Grammarly, if you will).

Area 4: Custom platform integration

As of today, grammatical safety nets for authoring social content (posts or comments) or text in messaging platforms are either minimal (word-level correction) or non-existent. The onus is on the author to install tools like Grammarly to proofread.

  • Messaging platforms and social platforms can highlight/correct grammatical errors automatically without altering the meaning or intent.

Installation

pip install git+https://github.com/PrithivirajDamodaran/Gramformer.git@v0.1

Quick Start

Correcter - [Available now]

from gramformer import Gramformer
import torch

def set_seed(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

set_seed(1212)


gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 

influent_sentences = [
    "Matt like fish",
    "the collection of letters was original used by the ancient Romans",
    "We enjoys horror movies",
    "Anna and Mike is going skiing",
    "I walk to the store and I bought milk",
    "We all eat the fish and then made dessert",
    "I will eat fish for dinner and drank milk",
    "what be the reason for everyone leave the company",
]   

for influent_sentence in influent_sentences:
    corrected_sentence = gf.correct(influent_sentence)
    print("[Input] ", influent_sentence)
    print("[Correction] ",corrected_sentence[0])
    print("-" *100)
[Input]  Matt like fish
[Correction]  Matt likes fish
----------------------------------------------------------------------------------------------------
[Input]  the collection of letters was original used by the ancient Romans
[Correction]  The collection of letters was originally used by the ancient Romans.
----------------------------------------------------------------------------------------------------
[Input]  We enjoys horror movies
[Correction]  We enjoy horror movies
----------------------------------------------------------------------------------------------------
[Input]  Anna and Mike is going skiing
[Correction]  Anna and Mike are going skiing
----------------------------------------------------------------------------------------------------
[Input]  I walk to the store and I bought milk
[Correction]  I walked to the store and bought milk.
----------------------------------------------------------------------------------------------------
[Input]  We all eat the fish and then made dessert
[Correction]  We all ate the fish and then made dessert
----------------------------------------------------------------------------------------------------
[Input]  I will eat fish for dinner and drank milk
[Correction]  I'll eat fish for dinner and drink milk.
----------------------------------------------------------------------------------------------------
[Input]  what be the reason for everyone leave the company
[Correction]  what can be the reason for everyone to leave the company.
----------------------------------------------------------------------------------------------------
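
Because the model works at the sentence level, longer text could be split into sentences before correction. The following is a minimal sketch under stated assumptions, not part of Gramformer: `split_sentences` and `correct_paragraph` are hypothetical helpers, the regex splitter is deliberately naive, and `correct_fn` stands for any callable that maps one sentence to one corrected sentence (for example, a small wrapper around `gf.correct`).

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def correct_paragraph(text: str, correct_fn: Callable[[str], str]) -> str:
    """Apply a sentence-level corrector to each sentence and rejoin the results."""
    return " ".join(correct_fn(sentence) for sentence in split_sentences(text))


# Stand-in corrector for illustration only; replace with a real one.
sentences = split_sentences("Matt like fish. We enjoys horror movies!")
joined = correct_paragraph("a. b.", str.upper)
```

A naive splitter like this will break on abbreviations such as "e.g."; a proper sentence tokenizer would be a better choice for real text.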

Challenge with generative models

While Gramformer aims to post-process the outputs of generative models, Gramformer itself is a generative model. So the question arises: who will post-process the Gramformer outputs? (I know, very meta :-)). In general, all generative models tend to generate spurious text sometimes, which we cannot control. So, to make sure Gramformer's grammar corrections (and highlights) are as accurate as possible, a quality estimator (QE) will be added. It can estimate an error-correction quality score and use that as a filter on the Top-N candidates to return only the best ones based on the score.
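
The Top-N filtering idea described above can be sketched as follows. This is only an illustration and not Gramformer's implementation: `top_candidates` and `toy_score` are hypothetical names, and the scorer is a toy stand-in for a real quality estimator.

```python
from typing import Callable, List


def top_candidates(
    candidates: List[str],
    score_fn: Callable[[str], float],
    max_candidates: int = 1,
) -> List[str]:
    """Rank candidate corrections by score (best first) and keep the top ones."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:max_candidates]


def toy_score(sentence: str) -> float:
    """Toy scorer (assumption): prefer candidates that end with a period."""
    return 1.0 if sentence.endswith(".") else 0.0


best = top_candidates(["Matt likes fish", "Matt likes fish."], toy_score)
```

Keeping the scorer as a plain callable means any quality model can be plugged in without changing the ranking logic.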

Correcter with quality estimator (QE) - [Coming soon!]

from gramformer import Gramformer
gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
corrected_sentence = gf.correct(<your input sentence>, filter_by_quality=True, max_candidates=3)

Highlighter - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 1, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
highlighted_sentence = gf.highlight(<your input sentence>)
[Input]  Matt like fish
[Highlight]  Matt <e> like </e> fish
----------------------------------------------------------------------------------------------------
[Input]  the collection of letters was original used by the ancient Romans
[Highlight]  the collection of letters was <e> original used </e> by the ancient Romans
----------------------------------------------------------------------------------------------------
[Input]  We enjoys horror movies
[Highlight]  We <e> enjoys horror </e> movies
----------------------------------------------------------------------------------------------------
[Input]  Anna and Mike is going skiing
[Highlight]  Anna and Mike <e> is going </e> skiing
----------------------------------------------------------------------------------------------------
[Input]  I walk to the store and I bought milk
[Highlight]  I <e> walk to </e> the store and I bought milk
----------------------------------------------------------------------------------------------------
[Input]  We all eat the fish and then made dessert
[Highlight]  We all <e> eat the </e> fish and then made dessert
----------------------------------------------------------------------------------------------------
[Input]  I will eat fish for dinner and drank milk
[Highlight]  I will eat fish for dinner and <e> drank milk </e> 
----------------------------------------------------------------------------------------------------
[Input]  what be the reason for everyone leave the company
[Highlight]  <e> what be </e> the reason <e> for everyone </e> <e> leave the </e> company
----------------------------------------------------------------------------------------------------
[Input]  One of the most important issue is the lack of parking spaces at the local mall.
[Highlight]  One of the most important <e> issue is </e> the lack of parking spaces at the local mall.
----------------------------------------------------------------------------------------------------
[Input]  The survey we performed recently showed that most of customers are satisfied.
[Highlight]  The survey we performed recently showed that most <e> of customers </e> are satisfied.
----------------------------------------------------------------------------------------------------
[Input]  I’ve loved classical music ever since I was child.
[Highlight]  I’ve loved classical music ever since I <e> was child </e>.
----------------------------------------------------------------------------------------------------
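
The `<e>` and `</e>` markers in the highlighter output shown above can be consumed with a small amount of post-processing. The sketch below assumes the output format matches the samples above exactly; `extract_error_spans` and `strip_tags` are hypothetical helper names, not part of Gramformer.

```python
import re
from typing import List

# Matches one highlighted span and captures its text without padding spaces.
ERROR_SPAN = re.compile(r"<e>\s*(.*?)\s*</e>")


def extract_error_spans(highlighted: str) -> List[str]:
    """Return the text of every span wrapped in <e> ... </e>."""
    return ERROR_SPAN.findall(highlighted)


def strip_tags(highlighted: str) -> str:
    """Remove the markers and collapse repeated whitespace."""
    plain = ERROR_SPAN.sub(r"\1", highlighted)
    return re.sub(r"\s+", " ", plain).strip()


sample = "<e> what be </e> the reason <e> for everyone </e> <e> leave the </e> company"
spans = extract_error_spans(sample)
plain = strip_tags(sample)
```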

Detector - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 0, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
grammar_fluency_score = gf.detect(<your input sentence>)

Models

| Model | Type | Return | Status |
| --- | --- | --- | --- |
| prithivida/grammar_error_detector | Classifier | Label | TBD (prithivida/parrot_fluency_on_BERT can be repurposed here, but I would recommend you wait :-)) |
| prithivida/grammar_error_highlighter | Seq2Seq | Grammar errors enclosed in <e> and </e> | Beta |
| prithivida/grammar_error_correcter | Seq2Seq | The corrected sentence | Beta |

Dataset

  • The first idea is to generate the dataset using the techniques mentioned in the first paper highlighted in the references section. You can use the technique on any one of the publicly available Wikipedia edits datasets. Write some rules to filter only the grammatical edits, do some cleanup, and that's it, Bob's your uncle :-).
  • The second, and possibly very complicated and $$$, way is to get some 200M synthetic sentences. This is based on the last paper under the references section. Not recommended, but by all means knock yourself out if you are interested :-)
  • The third source is to repurpose the GEC task data.
  • I combined sources 1 and 3 to get my training data (still working on source 2, will keep you posted).
  • I ended up with ~1M records, which after some heuristics-based filtering came down to ~1/2M records.
  • It took ~12 hours to train each of the above models.
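
The rule-based filtering mentioned above is not spelled out, so the following is only a guess at what such heuristics might look like. `keep_pair` is a hypothetical helper, the thresholds are arbitrary assumptions, and the 128 limit is interpreted here as whitespace-separated tokens, which the text above does not confirm.

```python
from difflib import SequenceMatcher


def keep_pair(
    source: str,
    target: str,
    max_tokens: int = 128,        # assumed length limit, in whitespace tokens
    min_similarity: float = 0.6,  # assumed threshold for "small" edits
) -> bool:
    """Keep (source, target) pairs that look like small grammatical edits."""
    src, tgt = source.split(), target.split()
    if not src or not tgt or src == tgt:
        return False  # empty or unchanged pairs carry no correction signal
    if len(src) > max_tokens or len(tgt) > max_tokens:
        return False  # too long for a sentence-level model
    # Token-level similarity: large rewrites are unlikely to be grammar fixes.
    return SequenceMatcher(None, src, tgt).ratio() >= min_similarity


pairs = [
    ("Matt like fish", "Matt likes fish"),
    ("Matt likes fish", "Matt likes fish"),
    ("Matt like fish", "The weather is nice today"),
]
kept = [pair for pair in pairs if keep_pair(*pair)]
```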

Benchmark

TBD (I will shortly benchmark Gramformer models against the following publicly available models: salesken/grammar_correction and flexudy/t5-small-wav2vec2-grammar-fixer.)

References

Citation

TBD

Comments
  • [Spacy error] Can't find model 'en'

    Hello, I have successfully installed Gramformer on my Windows PC, but when I run it, it gives the following error.

    Traceback (most recent call last):
      File "main.py", line 27, in <module>
        grammar_correction = Gramformer(models = 1, use_gpu=True)
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\gramformer\gramformer.py", line 8, in __init__
        self.annotator = errant.load('en')
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\errant\__init__.py", line 16, in load
        nlp = nlp or spacy.load(lang, disable=["ner"])
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\spacy\__init__.py", line 30, in load
        return util.load_model(name, **overrides)
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\spacy\util.py", line 175, in load_model
        raise IOError(Errors.E050.format(name=name))
    OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
    
    opened by muzamil47 3
  • Commercial use issue

    Hey @PrithivirajDamodaran

    The readme states that Gramformer versions above 1.0 are allowed for commercial use - however, this is not currently the case as the grammar_error_correcter_v1 model has been trained using the non-commercial WI&Locness data, even though the documentation states otherwise:

    The grammar_error_correcter_v1 model is actually identical to the previous grammar_error_correcter model which is trained using the non-commercial WI&Locness data – they have identical weights, which you can verify with this script

    As the models are the same, this means that both models have been trained using the non-commercial WI&Locness data, and the grammar_error_correcter_v1 model along with Gramformer v1.1 and v1.2 should not be allowed for commercial use.

    Could you please update the readme to clarify this, or upload a new model that has not been trained using WI&Locness?

    Thanks

    question 
    opened by SimonHFL 2
  • Use corrector for highlighter

    Hi @PrithivirajDamodaran

    This is a great framework. Is it possible (for now) to use the corrector model (model=2) in place of the highlighter (model=1)? That is, after getting a correction, match it against the input and add a prefix and suffix () around each mismatch?

    Thanks

    question 
    opened by ilhamsyahids 2
  • Error loading the tokenizer in transformers==4.4.2

    I'm getting an error when initializing the class object, specifically at tokenizer loading:

    In [6]: correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-6-d34dd9c5fe99> in <module>
    ----> 1 correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        414             tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
        415             if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
    --> 416                 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
        417             else:
        418                 if tokenizer_class_py is not None:
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
       1703
       1704         return cls._from_pretrained(
    -> 1705             resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
       1706         )
       1707
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_base.py in _from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs)
       1774         # Instantiate tokenizer.
       1775         try:
    -> 1776             tokenizer = cls(*init_inputs, **init_kwargs)
       1777         except OSError:
       1778             raise OSError(
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/models/t5/tokenization_t5_fast.py in __init__(self, vocab_file, tokenizer_file, eos_token, unk_token, pad_token, extra_ids, additional_special_tokens, **kwargs)
        134             extra_ids=extra_ids,
        135             additional_special_tokens=additional_special_tokens,
    --> 136             **kwargs,
        137         )
        138
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_fast.py in __init__(self, *args, **kwargs)
         85         if fast_tokenizer_file is not None and not from_slow:
         86             # We have a serialization from tokenizers which let us directly build the backend
    ---> 87             fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
         88         elif slow_tokenizer is not None:
         89             # We need to convert a slow tokenizer to build the backend
    
    Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 1 column 329667
    

    transformers==4.4.2.

    The installation package didn't specify the transformers version that this library uses. What is the correct version? Or is it version-independent and the problem is something else?

    opened by zhangyilun 2
  • Figma Gramformer Plugin

    Figma is used in creating a lot of digital interfaces today, a Gramformer Figma plugin would go a long way. I'll be willing to design the interface for the plugin but I don't know how to make the plugin itself. I hope someone takes this up. This is a link to get started https://www.figma.com/plugin-docs/setup/

    enhancement 
    opened by ayoolafelix 2
  • README.md get_edits and get_highlight example small fixes

    Hi there, when I copied and pasted the examples in the README locally, I noticed they were bugging out for the edits and highlights (they were only pulling the first character of the sentence for errant). Providing the full sentence seemed to give the desired output.

    opened by parisac 1
  • Training dataset

    Hi Prithiviraj,

    Is there any chance you'd be able to release the training dataset you used to train the Gramformer huggingface model? I see that there are some details on the slices of data that you brought together in the Readme, but it would be useful to be able to use the same data that you used.

    The main reason I'm asking is I'd like to create a model that can take correct text and add grammatical errors to it. So I was thinking I could take the dataset you used to train Gramformer and use the inverse to train a model that does the inverse. I can go through the data prep process as you did, but it would definitely be easier if I were able to reuse yours, and it might be useful for reproducibility for others as well.

    invalid question 
    opened by d4buss 1
  • OSError: Can't load config for 'prithivida/grammar_error_correcter'

    Hi, I have been using your code for the last few days. Suddenly, it started to crash.

    Have a look at the code and error given below:

    Code (Link: https://huggingface.co/prithivida/grammar_error_correcter_v1):

    from gramformer import Gramformer
    import torch
    
    def set_seed(seed):
      torch.manual_seed(seed)
      if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    
    set_seed(1212)
    
    
    gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
    
    influent_sentences = [
        "Matt like fish",
        "the collection of letters was original used by the ancient Romans",
        "We enjoys horror movies",
        "Anna and Mike is going skiing",
        "I walk to the store and I bought milk",
        "We all eat the fish and then made dessert",
        "I will eat fish for dinner and drank milk",
        "what be the reason for everyone leave the company",
    ]   
    
    for influent_sentence in influent_sentences:
        corrected_sentence = gf.correct(influent_sentence)
        print("[Input] ", influent_sentence)
        print("[Correction] ",corrected_sentence[0])
        print("-" *100)
    

    Error

    404 Client Error: Not Found for url: https://huggingface.co/prithivida/grammar_error_correcter/resolve/main/config.json
    ---------------------------------------------------------------------------
    HTTPError                                 Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        491                 use_auth_token=use_auth_token,
    --> 492                 user_agent=user_agent,
        493             )
    
    7 frames
    /usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, use_auth_token, local_files_only)
       1278             use_auth_token=use_auth_token,
    -> 1279             local_files_only=local_files_only,
       1280         )
    
    /usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, use_auth_token, local_files_only)
       1441             r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=etag_timeout)
    -> 1442             r.raise_for_status()
       1443             etag = r.headers.get("X-Linked-Etag") or r.headers.get("ETag")
    
    /usr/local/lib/python3.7/dist-packages/requests/models.py in raise_for_status(self)
        942         if http_error_msg:
    --> 943             raise HTTPError(http_error_msg, response=self)
        944 
    
    HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/prithivida/grammar_error_correcter/resolve/main/config.json
    
    During handling of the above exception, another exception occurred:
    
    OSError                                   Traceback (most recent call last)
    <ipython-input-10-0f43e537fe87> in <module>
         10 
         11 
    ---> 12 gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all
         13 
         14 influent_sentences = [
    
    /usr/local/lib/python3.7/dist-packages/gramformer/gramformer.py in __init__(self, models, use_gpu)
         14 
         15     if models == 2:
    ---> 16         self.correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
         17         self.correction_model     = AutoModelForSeq2SeqLM.from_pretrained(correction_model_tag)
         18         self.correction_model     = self.correction_model.to(device)
    
    /usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        400         kwargs["_from_auto"] = True
        401         if not isinstance(config, PretrainedConfig):
    --> 402             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
        403 
        404         use_fast = kwargs.pop("use_fast", True)
    
    /usr/local/lib/python3.7/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
        428         """
        429         kwargs["_from_auto"] = True
    --> 430         config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        431         if "model_type" in config_dict:
        432             config_class = CONFIG_MAPPING[config_dict["model_type"]]
    
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        502                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
        503             )
    --> 504             raise EnvironmentError(msg)
        505 
        506         except json.JSONDecodeError:
    
    OSError: Can't load config for 'prithivida/grammar_error_correcter'. Make sure that:
    
    - 'prithivida/grammar_error_correcter' is a correct model identifier listed on 'https://huggingface.co/models'
    
    - or 'prithivida/grammar_error_correcter' is the correct path to a directory containing a config.json file
    ![Screenshot from 2021-07-01 18-36-07](https://user-images.githubusercontent.com/4704211/124133526-5a9da900-da9b-11eb-9733-61df46ab01e1.png)
    
    

    Possible Solution:

    Rename this link from: https://huggingface.co/prithivida/grammar_error_correcter/ to: https://huggingface.co/prithivida/grammar_error_correcter_v1/

    Please help me fix this. thank you

    opened by Nomiluks 1
  • Inference Issue !!!

    OSError                                   Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        241             if resolved_config_file is None:
    --> 242                 raise EnvironmentError
        243             config_dict = cls._dict_from_json_file(resolved_config_file)

    OSError:

    During handling of the above exception, another exception occurred:

    OSError                                   Traceback (most recent call last)

    3 frames

    in ()
    ----> 1 correction_tokenizer = AutoTokenizer.from_pretrained("prithivida/grammar_error_correcter")
          2 correction_model = AutoModelForSeq2SeqLM.from_pretrained("prithivida/grammar_error_correcter")
          3 print("[Gramformer] Grammar error correction model loaded..")
          4
          5

    /usr/local/lib/python3.7/dist-packages/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        204         config = kwargs.pop("config", None)
        205         if not isinstance(config, PretrainedConfig):
    --> 206             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
        207
        208         if "bert-base-japanese" in str(pretrained_model_name_or_path):

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
        201
        202         """
    --> 203         config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        204
        205         if "model_type" in config_dict:

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        249                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
        250             )
    --> 251             raise EnvironmentError(msg)
        252
        253         except json.JSONDecodeError:

    OSError: Can't load config for 'prithivida/grammar_error_correcter'. Make sure that:

    • 'prithivida/grammar_error_correcter' is a correct model identifier listed on 'https://huggingface.co/models'

    • or 'prithivida/grammar_error_correcter' is the correct path to a directory containing a config.json file

    Solutions for this issue????

    invalid 
    opened by sabhi27 1
  • How to train Gramformer on non-English languages.

    Hey @PrithivirajDamodaran, great work on building Gramformer. I've played with it and the results are amazing.

    I work on pushing NLP forward in underrepresented languages, and hence I humbly request that you tell me how to train Gramformer on non-English sentences.

    I checked out your HuggingFace page 'https://huggingface.co/prithivida/grammar_error_correcter' but couldn't find any resources on how to train Gramformer from scratch. If you could help me train Gramformer on non-English languages, it would really mean a lot to me. Do let me know.

    Thanks

    question 
    opened by StephennFernandes 1
  • pip install is erroring out,

    I am unable to do pip install of the package, here is the error:

    Collecting git+https://github.com/PrithivirajDamodaran/Gramformer.git@v0.1
      Cloning https://github.com/PrithivirajDamodaran/Gramformer.git (to revision v0.1) to c:\users\sumit\appdata\local\temp\pip-req-build-sw54k_0h
      ERROR: Error [WinError 2] The system cannot find the file specified while executing command git clone -q https://github.com/PrithivirajDamodaran/Gramformer.git 'C:\Users\Sumit\AppData\Local\Temp\pip-req-build-sw54k_0h'
    ERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH?

    I also tried downloading the repo directly and running the package, but the model is not present at the location (correction_model_tag = "prithivida/grammar_error_correcter"). Is there any way to download the pretrained model?

    opened by ranjan-sumit 1
  • OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

    OSError                                   Traceback (most recent call last)
    ~\AppData\Local\Temp\ipykernel_9376\2706950954.py in
         25
         26
    ---> 27 gf = Gramformer(models = 1, use_gpu=False) # 1=corrector, 2=detector
         28
         29 influent_sentences = [

    ~\anaconda3_9\envs\python37\lib\site-packages\gramformer\gramformer.py in __init__(self, models, use_gpu)
          7         import errant
          8         #self.annotator = errant.load('en_core_web_sm')
    ----> 9         self.annotator = errant.load('en') # en is deprecated from spacy 3.0 onwards
         10
         11         if use_gpu:

    ~\anaconda3_9\envs\python37\lib\site-packages\errant\__init__.py in load(lang, nlp)
         17
         18     # Load spacy
    ---> 19     nlp = nlp or spacy.load(lang, disable=["ner"])
         20
         21     # Load language edit merger

    ~\anaconda3_9\envs\python37\lib\site-packages\spacy\__init__.py in load(name, **overrides)
         28     if depr_path not in (True, False, None):
         29         warnings.warn(Warnings.W001.format(path=depr_path), DeprecationWarning)
    ---> 30     return util.load_model(name, **overrides)
         31
         32

    ~\anaconda3_9\envs\python37\lib\site-packages\spacy\util.py in load_model(name, **overrides)
        173     elif hasattr(name, "exists"): # Path or Path-like to model data
        174         return load_model_from_path(name, **overrides)
    --> 175     raise IOError(Errors.E050.format(name=name))
        176
        177

    OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

    opened by vky2998 2
  • Word limit

    The model has trouble with long sentences, especially if the words in the sentence are in upper case. It outputs only part of the sentence, and the neglected remainder is shown as an error.

    opened by Talib6509 0
  • Gramformer Highlight function not working

    Hello... I'm trying to get the edits between two sentences, but the highlight function is not working. Has anybody faced the same issue? Many thanks in advance

    opened by NourAlMerey 0
  • Suggestions to improve the grammar results for short sentences

    Hello..!

    I have used Gramformer model and I think this could be quite useful for checking and correcting some grammar points, especially for correcting singular/plural, verb forms and tenses, and spelling. However, some other grammar points (like correcting sentence structure, comparative/superlative forms, pronoun cases, etc.) seem to be still tricky.

    Note: I need to use the model on short sentences.

    The biggest challenges I faced are the following (please suggest how to avoid or improve them, for example by changing some parameters):

    1 - Since it corrects grammar by generating text, most of the time it completely changes the sentence and rephrases it. How can we avoid this?

    whose bags you can bring? --> Which bags you can bring? (Just a sample; sometimes it generates a totally changed, verbose sentence.)

    2 - Every time I give the same sentence as input, it generates different outputs:

    I go can there: three outputs in three different runs ("I go, there"., "can I go there?", "I go back there.")

    Thanks!

    opened by muzamil47 0
Releases(v1.4)
  • v1.4(Aug 10, 2021)

    ⚡️ Features added/changed

    ✅ Correct API uses a ranker to sort good quality corrections.
    ✅ Highlight API returns sents w/errors marked up as readable tags.
    ✅ Edit API returns error types, positions, and respective corrections.
    ✅ The latest model checkpoint has been refreshed w/more data.

    License update to MIT.

Owner
Prithivida
Applied NLP, XAI for NLP and Data Engineering

Python 14.4k Jan 08, 2023
Pymxs, the 3DsMax bindings of Maxscript to Python doesn't come with any stubs

PyMXS Stubs generator What Pymxs, the 3DsMax bindings of Maxscript to Python doe

Frieder Erdmann 19 Dec 27, 2022
coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.

"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." ― John F. Woods coala provides a

coala development group 3.4k Dec 29, 2022
An extension for flake8 that forbids some imports statements in some modules.

flake8-obey-import-goat An extension for flake8 that forbids some imports statements in some modules. Important: this project is developed using DDD,

Ilya Lebedev 10 Nov 09, 2022
Tools for improving Python imports

imptools Tools for improving Python imports. Installation pip3 install imptools Overview Detailed docs import_path Import a module from any path on th

Danijar Hafner 7 Aug 07, 2022
Flake8 plugin for managing type-checking imports & forward references

flake8-type-checking Lets you know which imports to put in type-checking blocks. For the imports you've already defined inside type-checking blocks, i

snok 67 Dec 16, 2022
Collection of awesome Python types, stubs, plugins, and tools to work with them.

Awesome Python Typing Collection of awesome Python types, stubs, plugins, and tools to work with them. Contents Static type checkers Dynamic type chec

TypedDjango 1.2k Jan 04, 2023
Pylint plugin to enforce some secure coding standards for Python.

Pylint Secure Coding Standard Plugin pylint plugin that enforces some secure coding standards. Installation pip install pylint-secure-coding-standard

Nguyen Damien 2 Jan 04, 2022
MyPy types for WSGI applications

WSGI Types for Python This is an attempt to bring some type safety to WSGI applications using Python's new typing features (TypedDicts, Protocols). It

Blake Williams 2 Aug 18, 2021
Easy saving and switching between multiple KDE configurations.

Konfsave Konfsave is a config manager. That is, it allows you to save, back up, and easily switch between different (per-user) system configurations.

42 Sep 25, 2022