Paranoid text spacing in Python

Overview

pangu.py

https://img.shields.io/travis/vinta/pangu.py/master.svg?style=flat-square https://img.shields.io/codecov/c/github/vinta/pangu.py/master.svg?style=flat-square https://img.shields.io/pypi/v/pangu.svg?style=flat-square https://img.shields.io/pypi/pyversions/pangu.svg?style=flat-square https://img.shields.io/badge/made%20with-%e2%9d%a4-ff69b4.svg?style=flat-square

Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width characters (alphabetical letters, numerical digits and symbols).

Installation

$ pip install -U pangu

Usage

In Python

import pangu

new_text = pangu.spacing_text('當你凝視著bug,bug也凝視著你')
# new_text = '當你凝視著 bug,bug 也凝視著你'

nwe_content = pangu.spacing_file('path/to/file.txt')
# nwe_content = '與 PM 戰鬥的人,應當小心自己不要成為 PM'

In CLI

$ pangu "請使用uname -m指令來檢查你的Linux作業系統是32位元或是[敏感词已被屏蔽]位元"
請使用 uname -m 指令來檢查你的 Linux 作業系統是 32 位元或是 [敏感词已被屏蔽] 位元

$ python -m pangu "為什麼小明有問題都不Google?因為他有Bing"
為什麼小明有問題都不 Google?因為他有 Bing

$ echo "未來的某一天,Gmail配備的AI可能會得出一個結論:想要消滅垃圾郵件最好的辦法就是消滅人類" >> path/to/file.txt
$ pangu -f path/to/file.txt >> pangu_file.txt
$ cat pangu_file.txt
未來的某一天,Gmail 配備的 AI 可能會得出一個結論:想要消滅垃圾郵件最好的辦法就是消滅人類

$ echo "心裡想的是Microservice,手裡做的是Distributed Monolith" | pangu
心裡想的是 Microservice,手裡做的是 Distributed Monolith

$ echo "你從什麼時候開始產生了我沒使用Monkey Patch的錯覺?" | python -m pangu
你從什麼時候開始產生了我沒使用 Monkey Patch 的錯覺?
Owner
Vinta Chen
I failed the Turing Test.
Vinta Chen
Utility for Text Normalisation or Inverse Normalisation

Text Processor Text Normalisation or Inverse Normalisation for Indonesian, e.g. measurements "123 kg" - "seratus dua puluh tiga kilogram" Currency/Mo

Cahya Wirawan 2 Aug 11, 2022
Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

SeatGeek 1.2k Jan 01, 2023
Word-Generator - Generates meaningful words from dictionary with given no. of letters and words.

Meaningful Word Generator Generates meaningful words from dictionary with given no. of letters and words. This might be useful for generating short li

Mohammed Rabil 1 Jan 01, 2022
A Python app which can convert normal text to Handwritten text.

Text to HandWritten Text ✍️ Converter Watch Tutorial for this project Usage:- Clone my repository. Open CMD in working directory. Run following comman

Kushal Bhavsar 5 Dec 11, 2022
Vastasanuli - Vastasanuli pelaa Sanuli-peliä.

Vastasanuli Vastasanuli pelaa SANULI -peliä. Se ei aina voita. Käyttö Tarttet Pythonin (3.6+). Aja make (tai lataa words.txt muualta) Asentele vaaditt

Aarni Koskela 1 Jan 06, 2022
Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files

cdvpp Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files Reads a Digital Vaccination Pass PDF file as in

Esteban Borai 1 Nov 17, 2021
Code Jam for creating a text-based adventure game engine and custom worlds

Text Based Adventure Jam Author: Devin McIntyre Our goal is two-fold: Create a text based adventure game engine that can parse a standard file format

HTTPChat 4 Dec 26, 2021
Little python script + dictionary to help solve Wordle puzzles

Wordle Solver Little python script + dictionary to help solve Wordle puzzles Usage Usage: ./wordlesolver.py [letters in word] [letters not in word] [p

Luke Stephens (hakluke) 4 Jul 24, 2022
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

Niklas 29 Dec 20, 2022
Text to ASCII and ASCII to text

Text2ASCII Description This python script (converter.py) contains two functions: encode() is used to return a list of Integer, one item per character

4 Jan 22, 2022
A username generator made from French Canadian most common names.

This script is used to generate a username list using the most common first and last names in Quebec in different formats. It can generate some passwords using specific patterns such as Tremblay2020.

5 Nov 26, 2022
Widevine KEY Extractor in Python

Widevine Client 3 This was originally written by T3rry7f. This repo is slightly modified version of his repo. This only works on standard Windows! Usa

Vank0n (SJJeon) 68 Dec 29, 2022
Convert ebooks with few clicks on Telegram!

E-Book Converter Bot A bot that converts e-books to various formats, powered by calibre! It currently supports 34 input formats and 19 output formats.

Youssif Shaaban Alsager 45 Jan 05, 2023
Wordle strategy: Find frequency of letters appearing in 5-letter words in the English language

Find frequency of letters appearing in 5-letter words in the English language In

Gabriel Apolinário 1 Jan 17, 2022
🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

Brandon 5.6k Jan 03, 2023
WorldCloud Orçamento de Estado 2022

World Cloud Orçamento de Estado 2022 What it does This script creates a worldcloud, masked on a image, from a txt file How to run it? Install all libr

Jorge Gomes 2 Oct 12, 2021
box is a text-based visual programming language inspired by Unreal Engine Blueprint function graphs.

Box is a text-based visual programming language inspired by Unreal Engine blueprint function graphs. $ cat factorial.box ┌─ƒ(Factorial)───┐

Pranav 104 Dec 24, 2022
Repository containing the code for An-Gocair text normaliser

Scottish Gaelic Text Normaliser The following project contains the code and resources for the Scottish Gaelic text normalisation project. The repo can

3 Jun 28, 2022
Aml - anti-money laundering

Anti-money laundering Dedect relationship between A and E by tracing through payments with similar amounts and identifying payment chains. For example

3 Nov 21, 2022
An online markdown resume template project, based on pywebio

An online markdown resume template project, based on pywebio

极简XksA 5 Nov 10, 2022