Black for Python docstrings and reStructuredText (rst).

Overview

Style-Doc

Apache 2.0 License Contributor Covenant v2.0 Python Version pypi
Static Code Checks GitHub issues

Style-Doc is Black for Python docstrings and reStructuredText (rst). It can be used to format docstrings (Google docstring format) in Python files or reStructuredText.

One Conversation
This project is maintained by the One Conversation team of Deutsche Telekom AG.
It is based on the style_doc.py script from the HuggingFace Inc. team.

Installation

Style-Doc is available at the Python Package Index (PyPI). It can be installed with pip:

$ pip install style-doc

Usage

$ style-doc --help
usage: style-doc [-h] [--max_len MAX_LEN] [--check_only] [--py_only]
                 [--rst_only]
                 files [files ...]

positional arguments:
  files              The file(s) or folder(s) to restyle.

optional arguments:
  -h, --help         show this help message and exit
  --max_len MAX_LEN  The maximum length of lines.
  --check_only       Whether to only check and not fix styling issues.
  --py_only          Whether to only check py files.
  --rst_only         Whether to only check rst files.

Examples

  • format all docstrings (.py files) and rst files in the src and docs folder with line length of 99:
    style-doc --max_len 99 src docs
  • check all docstrings (.py files) and rst files in the src and docs folder with line length of 99:
    style-doc --max_len 99 --check_only src docs
  • format all docstrings (.py files only) in the src folder with line length of 99:
    style-doc --max_len 99 --py_only src
  • check all docstrings (.py files only) in the src folder with line length of 99:
    style-doc --max_len 99 --check_only --py_only src
  • format all rst files only in the docs folder with line length of 99:
    style-doc --max_len 99 --rst_only docs
  • check all rst files only in the docs folder with line length of 99:
    style-doc --max_len 99 --check_only --rst_only docs

To integrate Style-Doc (and more checks) into your GitHub Actions see our static_checks.yml example and our configuration in setup.py.

Support and Feedback

The following channels are available for discussions, feedback, and support requests:

Contribution

Our commitment to open source means that we are enabling -in fact encouraging- all interested parties to contribute and become part of our developer community.

Contribution and feedback is encouraged and always welcome. For more information about how to contribute, as well as additional contribution information, see our Contribution Guidelines. By participating in this project, you agree to abide by its Code of Conduct at all times.

Code of Conduct

This project has adopted the Contributor Covenant in version 2.0 as our code of conduct. Please see the details in our CODE_OF_CONDUCT.md. All contributors must abide by the code of conduct.

Working Language

We decided to apply English as the primary project language.

Consequently, all content will be made available primarily in English. We also ask all interested people to use English as language to create issues, in their code (comments, documentation etc.) and when you send requests to us. The application itself and all end-user facing content will be made available in other languages as needed.

Licensing

Copyright (c) 2020 The HuggingFace Inc. team
Copyright (c) 2021 Philip May, Deutsche Telekom AG

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Comments
  • --max-len seems mandatory, not optional parameter

    --max-len seems mandatory, not optional parameter

    I run style-doc . --check and get an error while ```style-doc . --check `--max-len 80`` works.

    The error message is:

      File "c:\users\epogr\anaconda3\lib\site-packages\style_doc\style_doc.py", line 460, in style_docstring 
        if len(docstring) < max_len and "\n" not in docstring:
    TypeError: '<' not supported between instances of 'int' and 'NoneType'
    
    opened by epogrebnyak 2
  • How should we

    How should we "communicate" an error?

    "You must not set --py_only and --rst_only at the same time." with sys.exit(1) or -1 or raise ValueError(...

    raise ValueError(f"{len(changed)} files should be restyled!") or use ``sys.exit...`

    enhancement help wanted 
    opened by PhilipMay 2
  • Ignore commented-out classes/functions/etc.

    Ignore commented-out classes/functions/etc.

    Currently the search for """ isn't respecting commented out code:

        # For future implementation
        # def base_url(self) -> str:
        #     """
        #     Generate SCIM base url
        #     """
        #     return "https://app.asana.com/api/1.0/scim/"
    

    becomes:

        # For future implementation
        # def base_url(self) -> str:
        #     """
        # Generate SCIM base url #
        """
        #     return "https://app.asana.com/api/1.0/scim/"
    

    Which is a syntax error, since it is uncommenting one of the """.

    opened by dragonpaw 2
  • Create a git pre-commit hook for style-doc

    Create a git pre-commit hook for style-doc

    Have you considered packaging style-doc for use as a git pre-commit hook, and listing it with the pre-commit project? It seems like it would be a great addition, and make it very easy for people to integrate the docstring formatter into their existing workflows and get automatic updates when new releases happen.

    opened by zaneselvans 1
  • Fix issues when code has `

    Fix issues when code has `"""` but is not a docstring

    We had to apply this workaround:

    https://github.com/telekom/style-doc/blob/db352ed72ae4473a805d485692df58ec4511a673/style_doc/style_doc.py#L495-L497

    # fmt: off and # fmt: on is needed so black does not convert it back to '"""'.

    bug 
    opened by PhilipMay 0
  • Add option to use config file

    Add option to use config file

    Use pyproject.toml

    see black

    • https://github.com/psf/black/blob/7567cdf3b4f32d4fb12bd5ca0da838f7ff252cfc/src/black/files.py#L69
    • https://github.com/psf/black/blob/017aafea992ca1c6d7af45d3013af7ddb7fda12a/src/black/init.py#L44
    enhancement good first issue low priority 
    opened by PhilipMay 0
Releases(0.2.0)
Owner
Telekom Open Source Software
published by Deutsche Telekom AG and partner companies
Telekom Open Source Software
Simple text to phones converter for multiple languages

Phonemizer -- foʊnmaɪzɚ The phonemizer allows simple phonemization of words and texts in many languages. Provides both the phonemize command-line tool

CoML 762 Dec 29, 2022
C.J. Hutto 3.8k Dec 30, 2022
BERN2: an advanced neural biomedical namedentity recognition and normalization tool

BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by

DMIS Laboratory - Korea University 99 Jan 06, 2023
Code for Text Prior Guided Scene Text Image Super-Resolution

Code for Text Prior Guided Scene Text Image Super-Resolution

82 Dec 26, 2022
Lingtrain Aligner — ML powered library for the accurate texts alignment.

Lingtrain Aligner ML powered library for the accurate texts alignment in different languages. Purpose Main purpose of this alignment tool is to build

Sergei Averkiev 76 Dec 14, 2022
Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classifi

186 Dec 24, 2022
Simple NLP based project without any use of AI

Simple NLP based project without any use of AI

Shripad Rao 1 Apr 26, 2022
pytorch implementation of Attention is all you need

A Pytorch Implementation of the Transformer: Attention Is All You Need Our implementation is largely based on Tensorflow implementation Requirements N

230 Dec 07, 2022
Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

NeuroNER NeuroNER is a program that performs named-entity recognition (NER). Website: neuroner.com. This page gives step-by-step instructions to insta

Franck Dernoncourt 1.6k Dec 27, 2022
Tools for curating biomedical training data for large-scale language modeling

Tools for curating biomedical training data for large-scale language modeling

BigScience Workshop 242 Dec 25, 2022
LightSeq: A High-Performance Inference Library for Sequence Processing and Generation

LightSeq is a high performance inference library for sequence processing and generation implemented in CUDA. It enables highly efficient computation of modern NLP models such as BERT, GPT2, Transform

Bytedance Inc. 2.5k Jan 03, 2023
Textpipe: clean and extract metadata from text

textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata

Textpipe 298 Nov 21, 2022
Implementation of TTS with combination of Tacotron2 and HiFi-GAN

Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t

SunLu Z 7 Nov 11, 2022
Unsupervised Language Modeling at scale for robust sentiment classification

** DEPRECATED ** This repo has been deprecated. Please visit Megatron-LM for our up to date Large-scale unsupervised pretraining and finetuning code.

NVIDIA Corporation 1k Nov 17, 2022
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

VAENAR-TTS - PyTorch Implementation PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

Keon Lee 67 Nov 14, 2022
Pretrain CPM - 大规模预训练语言模型的预训练代码

CPM-Pretrain 版本更新记录 为了促进中文自然语言处理研究的发展,本项目提供了大规模预训练语言模型的预训练代码。项目主要基于DeepSpeed、Megatron实现,可以支持数据并行、模型加速、流水并行的代码。 安装 1、首先安装pytorch等基础依赖,再安装APEX以支持fp16。 p

Tsinghua AI 37 Dec 06, 2022
Implementation of Natural Language Code Search in the project CodeBERT: A Pre-Trained Model for Programming and Natural Languages.

CodeBERT-Implementation In this repo we have replicated the paper CodeBERT: A Pre-Trained Model for Programming and Natural Languages. We are interest

Tanuj Sur 4 Jul 01, 2022
Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021.

capbot-siic Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021. Problem Inspiration A plethora

Aryan Kargwal 19 Feb 17, 2022
An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.

Welcome to AdaptNLP A high level framework and library for running, training, and deploying state-of-the-art Natural Language Processing (NLP) models

Novetta 407 Jan 03, 2023
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc

GUOKUN LAI 197 Dec 11, 2022