Python flexible slugify function

Last update: Dec 20, 2022

Overview

awesome-slugify

PyPI: https://pypi.python.org/pypi/awesome-slugify

GitHub: https://github.com/dimka665/awesome-slugify

Install

pip install awesome-slugify

Usage

from slugify import slugify

slugify('Any text')  # 'Any-text'

Custom slugify

from slugify import slugify, Slugify, UniqueSlugify

slugify('Any text', to_lower=True)  # 'any-text'

custom_slugify = Slugify(to_lower=True)
custom_slugify('Any text')          # 'any-text'

custom_slugify.separator = '_'
custom_slugify('Any text')          # 'any_text'

custom_slugify = UniqueSlugify()
custom_slugify('Any text')          # 'Any-text'
custom_slugify('Any text')          # 'Any-text-1'

slugify function optional args

to_lower              # if True, convert text to lowercase
max_length            # output string max length
separator             # separator string
capitalize            # if True, uppercase the first letter

Slugify class args

pretranslate = None               # function or dict applied before translation
translate = unidecode.unidecode   # function for slugifying or None
safe_chars = ''                   # additional safe chars
stop_words = ()                   # remove these words from slug

to_lower = False                  # default to_lower value
max_length = None                 # default max_length value
separator = '-'                   # default separator value
capitalize = False                # default capitalize value

UniqueSlugify class args

# all Slugify class args +
uids = []                         # initial unique ids

Predefined slugify functions

Some slugify functions are predefined this way:

from slugify import Slugify, CYRILLIC, GERMAN, GREEK

slugify = Slugify()
slugify_unicode = Slugify(translate=None)

slugify_url = Slugify()
slugify_url.to_lower = True
slugify_url.stop_words = ('a', 'an', 'the')
slugify_url.max_length = 200

slugify_filename = Slugify()
slugify_filename.separator = '_'
slugify_filename.safe_chars = '-.'
slugify_filename.max_length = 255

slugify_ru = Slugify(pretranslate=CYRILLIC)
slugify_de = Slugify(pretranslate=GERMAN)
slugify_el = Slugify(pretranslate=GREEK)

Examples

from slugify import Slugify, UniqueSlugify, slugify, slugify_unicode
from slugify import slugify_url, slugify_filename
from slugify import slugify_ru, slugify_de

slugify('one kožušček')                       # one-kozuscek
slugify('one two three', separator='.')       # one.two.three
slugify('one two three four', max_length=12)  # one-two-four   (12 chars)
slugify('one TWO', to_lower=True)             # one-two
slugify('one TWO', capitalize=True)           # One-TWO

slugify_filename(u'Дrаft №2.txt')             # Draft_2.txt
slugify_url(u'Дrаft №2.txt')                  # draft-2-txt

my_slugify = Slugify()
my_slugify.separator = '.'
my_slugify.pretranslate = {'я': 'i', '♥': 'love'}
my_slugify('Я ♥ борщ')                        # I.love.borshch  (custom translate)

slugify('Я ♥ борщ')                           # Ia-borshch  (standard translation)
slugify_ru('Я ♥ борщ')                        # Ya-borsch   (alternative russian translation)
slugify_unicode('Я ♥ борщ')                   # Я-борщ      (sanitize only)

slugify_de('ÜBER Über slugify')               # UEBER-Ueber-slugify

slugify_unique = UniqueSlugify(separator='_')
slugify_unique('one TWO')                     # one_TWO
slugify_unique('one TWO')                     # one_TWO_1

slugify_unique = UniqueSlugify(uids=['cellar-door'])
slugify_unique('cellar door')                 # cellar-door-1

Custom Unique Slugify Checker

from slugify import UniqueSlugify

def my_unique_check(text, uids):
    if text in uids:
        return False
    return not SomeDBClass.objects.filter(slug_field=text).exists()

custom_slugify_unique = UniqueSlugify(unique_check=my_unique_check)

# Checks the database for a matching document
custom_slugify_unique('te occidere possunt')
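
The checker contract used above is small: it receives the candidate slug and the in-memory `uids`, and returns True only when the slug is still free. As a minimal sketch of that contract, the block below swaps `SomeDBClass` for an in-memory SQLite table. The `posts` table and its `slug` column are assumptions made up for this example, not part of awesome-slugify.

```python
import sqlite3

# Hypothetical stand-in for SomeDBClass: an in-memory table of taken slugs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (slug TEXT PRIMARY KEY)")
conn.execute("INSERT INTO posts (slug) VALUES ('te-occidere-possunt')")


def my_unique_check(text, uids):
    """Return True if `text` is free, False if it is already taken."""
    if text in uids:
        return False
    # Lightweight existence query: fetch at most one matching row.
    row = conn.execute(
        "SELECT 1 FROM posts WHERE slug = ? LIMIT 1", (text,)
    ).fetchone()
    return row is None


print(my_unique_check("te-occidere-possunt", set()))
print(my_unique_check("brand-new-slug", set()))
```

A function like this could then be passed as `UniqueSlugify(unique_check=my_unique_check)`, as in the example above.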

Running UnitTests

$ virtualenv venv
$ venv/bin/pip install -r requirements.txt
$ venv/bin/nosetests slugify

Comments

Update minimum unidecode version

Unidecode has moved to semantic versioning. I don't believe there are any major changes that awesome-slugify requires.

Unidecode release change:

2018-01-05	unidecode 1.0.22
	* Move to semantic version numbering, no longer following version
	  numbers from the original Perl module. This fixes an issue with
	  setuptools (>= 8) and others expecting major.minor.patch format.
	  (https://github.com/avian2/unidecode/issues/13)
	* Add transliterations for currency signs U+20B0 through U+20BF
	  (thanks to Mike Swanson)
	* Surround transliterations of vulgar fractions with spaces to avoid
	  incorrect combinations with adjacent numerals
	  (thanks to Jeffrey Gerard)

opened by jwbixby 5

Fix for import on python 2.7.7 (windows)

Error with awesome-slugify 1.6.4:

python -c "from slugify import UniqueSlugify"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "myvirtualenv\lib\site-packages\slugify\__init__.py", line 2, in <module>
    from slugify.main import Slugify, UniqueSlugify
ImportError: No module named main

opened by srault95 2

UniqueSlugify Improvements

Hello, I'm using "awesome-slugify" for my work, and there are a few minor improvements I felt I could contribute which we were looking for on my team.

Thanks! Greg

UniqueSlugify should use sets for self.uids instead of a list, for performance reasons.

Running the following code on the base branch (with self.uids as a list) yields:

>>> import time
>>> import uuid
>>> from slugify import UniqueSlugify
>>>
>>> def test_time():
...     start = time.time()
...     slugify = UniqueSlugify()
...     for i in xrange(100000):
...         _ = slugify(str(uuid.uuid4()))
...     return time.time() - start
...
>>> test_time()
212.86210703849792

Running the same code with sets:

>>> test_time()
10.824954986572266
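
The gap between the two timings is consistent with how membership tests work: `x in some_list` scans the list element by element, while `x in some_set` is a hash lookup. The sketch below isolates that one operation using only the standard library. The collection size and the probe value are arbitrary choices for illustration, and absolute timings will differ by machine.

```python
import timeit
import uuid

# 20,000 random ids, stored once as a list and once as a set.
ids = [str(uuid.uuid4()) for _ in range(20000)]
uids_list = list(ids)
uids_set = set(ids)

# Look up an id that is absent: the worst case for the list scan.
probe = "not-a-real-uid"

list_time = timeit.timeit(lambda: probe in uids_list, number=200)
set_time = timeit.timeit(lambda: probe in uids_set, number=200)

print("list: %.6fs  set: %.6fs" % (list_time, set_time))
```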

Often, uids are stored in a database or external key/value system. It's helpful to have an option to override the uniqueness check without having to load all the uids into memory. For example, supposing I have a Django project with a slug field, instead of having to do:

from django_blog.models import BlogPost
from slugify import UniqueSlugify

slugify = UniqueSlugify(uids=BlogPost.objects.values_list('url_slug', flat=True))

I can check with a lightweight existence call to the DB by overriding the check:

from slugify import UniqueSlugify

def my_unique_check(text, uids):
    if text in uids:
        return False
    return not BlogPost.objects.filter(url_slug=text).exists()

custom_slugify_unique = UniqueSlugify(unique_check=my_unique_check)
custom_slugify_unique('te occidere possunt')

Also added some documentation on running the unit tests.

opened by gthole 2

not working in python 3

Please make this library compatible with Python 3. Right now it doesn't work because of at least several syntax errors in some string literals, and an iteration over a changing dict issue.
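
The issue does not say which code iterates over a changing dict, so the following is only a generic illustration of that class of bug, using a made-up translation table rather than the library's source. In Python 2, `items()` returned a list copy, so adding keys inside the loop worked; in Python 3 it returns a live view and the same loop raises `RuntimeError`. Looping over a snapshot avoids the error.

```python
# Made-up translation table for illustration; not awesome-slugify's source.
table = {"я": "ya", "щ": "sch"}


def add_capitalized_unsafe(d):
    # Adds keys while iterating the live view: RuntimeError on Python 3.
    for key, value in d.items():
        d[key.upper()] = value.capitalize()


def add_capitalized_safe(d):
    # Iterates over a snapshot, so the dict can grow during the loop.
    for key, value in list(d.items()):
        d[key.upper()] = value.capitalize()


try:
    add_capitalized_unsafe(dict(table))
except RuntimeError as exc:
    print("Python 3 raises:", exc)

safe = dict(table)
add_capitalized_safe(safe)
print(safe)
```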

opened by irmen 2

Please remove capitalizing of first character in get_pretranslate function

Great project! Thank you so much!

However, when using the pretranslations, it forces capitalization. Is there any reason for this (something unicode related) or can I submit a pull request to maintain the case of the entire word?

opened by pydanny 2

Fix the import path for get_slugify

At the moment, I can't do:

from slugify import get_slugify

I get an import error because of how slugify.__init__.py is defined. Instead I have to do the following:

from slugify.main import get_slugify

I would like to either submit a patch to slugify.__init__.py or correct the documentation. @dimka665, what approach would you prefer?

opened by pydanny 2

Add test cases with numbers in them.

We use awesome-slugify for Wok and it's mostly great, but I've found that it errors on numeric slugs (in our case, 404).

These test cases demonstrate what I'd naively assume to be the correct slugification behavior for alphanumerics, but I'm PR-ing to verify that before spending a lot of time modifying Slugify's behavior to meet these expectations in case I'm wrong.

opened by edunham 1

Add simple implementation with Travis CI

Check out passing tests of my fork on Travis https://travis-ci.org/jpadilla/awesome-slugify

If you accept this, you just have to sign up for Travis and set up your account at https://travis-ci.org/profile. Then add a badge to the README.md.

opened by jpadilla 1

Fix to_lower with unicode text

I noticed that when trying to do something like:

slugify('自転車', to_lower=True)

I got Zi-Zhuan-Che instead of the expected zi-zhuan-che. Effectively I introduced a test which failed, implemented a fix, and all tests kept passing.
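
One plausible reading of this bug (an assumption, since the PR text does not show the fix) is that lowercasing ran before transliteration, where it has no effect on CJK characters. The sketch below uses a toy lookup table as a stand-in for unidecode to show why the order matters. `TOY_TABLE`, `slug_lower_first` and `slug_lower_last` are hypothetical names.

```python
# Toy stand-in for unidecode: maps each character to capitalized Latin text.
# The table is illustrative only and covers just this example.
TOY_TABLE = {"自": "Zi ", "転": "Zhuan ", "車": "Che "}


def translate(text):
    return "".join(TOY_TABLE.get(ch, ch) for ch in text)


def slug_lower_first(text):
    # Lowercasing before transliteration has no effect on CJK characters.
    return "-".join(translate(text.lower()).split())


def slug_lower_last(text):
    # Lowercasing after transliteration catches the generated capitals.
    return "-".join(translate(text).lower().split())


print(slug_lower_first("自転車"))  # Zi-Zhuan-Che
print(slug_lower_last("自転車"))   # zi-zhuan-che
```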

opened by jpadilla 1

allow Slugify to take callable pretranslate

Slugify::set_pretranslate should accept a dictionary, a callable, or None, but as currently implemented, it only accepts a dict or None. This PR allows set_pretranslate to take a callable as well. All tests pass.
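
A common way to support all three input kinds is to normalize them into one callable up front. The following is a sketch of that idea, not the code from this PR. `make_pretranslate` is a hypothetical helper, and the dict branch uses naive sequential `str.replace` calls, so one replacement can feed into a later one.

```python
def make_pretranslate(pretranslate):
    """Normalize a dict, a callable, or None into a str -> str function."""
    if pretranslate is None:
        # Nothing to do: return the text unchanged.
        return lambda text: text
    if callable(pretranslate):
        return pretranslate
    if isinstance(pretranslate, dict):
        def replace_from_dict(text):
            for source, target in pretranslate.items():
                text = text.replace(source, target)
            return text
        return replace_from_dict
    raise ValueError("pretranslate must be a dict, a callable, or None")


print(make_pretranslate({"♥": "love"})("I ♥ you"))  # I love you
print(make_pretranslate(str.upper)("abc"))          # ABC
print(make_pretranslate(None)("abc"))               # abc
```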

opened by jmcarp 1

Exception StopIteration on empty strings '' with max_length or separator

>>> from slugify import slugify
>>> slugify('', max_length=40, separator='_')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/slugify/main.py", line 108, in slugify
    text = join_words(words, separator, max_length)
  File "/usr/lib/python2.7/dist-packages/slugify/main.py", line 85, in join_words
    text = next(words)    # text = words.pop(0)
StopIteration
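
The traceback points at a bare `next(words)`, which raises `StopIteration` when no words are left after sanitizing. One possible guard is to pass a default to `next()` so that an empty input yields an empty slug. The function below is a hypothetical re-implementation for illustration, assuming `words` is an iterable of strings; only the `next(words)` line is known from the traceback.

```python
def join_words(words, separator="-", max_length=None):
    """Join words with `separator`, skipping words that would not fit.

    Hypothetical sketch; not the library's actual join_words.
    """
    words = iter(words)
    text = next(words, "")  # the default avoids StopIteration on no words
    for word in words:
        candidate = text + separator + word
        if max_length is not None and len(candidate) > max_length:
            continue  # this word does not fit; try the next one
        text = candidate
    if max_length is not None:
        text = text[:max_length]
    return text


print(repr(join_words([], separator="_", max_length=40)))  # ''
print(join_words(["one", "two", "three", "four"], max_length=12))  # one-two-four
```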

opened by sebest 1

Incorrect Japanese transliteration for っ

っ is U+3063 HIRAGANA LETTER SMALL TU, which is different than つ U+3064 HIRAGANA LETTER TU in that the phonetic transliteration of it is a glottal stop; the English equivalent is doubling the consonant-sound of the next mora.

For example, ほっこり should be transliterated as 'hokkori', but awesome-slugify incorrectly renders it 'hotsukori' (as if the っ were a つ):

>>> slugify('ほっこり')
'hotsukori'

See also https://translate.google.com/?sl=ja&tl=en&text=%E3%81%BB%E3%81%A3%E3%81%93%E3%82%8A&op=translate for the use of this character (and https://translate.google.com/?sl=ja&tl=en&text=%E3%81%BB%E3%81%A4%E3%81%93%E3%82%8A%0A%0A&op=translate to see what the large tsu does instead).

opened by fluffy-critter 0

Clash with zacharyvoase/slugify

The project at https://github.com/zacharyvoase/slugify named "slugify" also has the module name "slugify".

pip install slugify

import slugify
slugify.slugify(u"Héllø Wörld")

If you add both packages to your requirements.txt, what happens when you import "slugify"?
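
Python resolves `import slugify` to a single location, so only one of the two packages can be the one that actually gets imported. A small standard-library sketch for checking which file wins in a given environment follows; `module_origin` is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level `import name` would load, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Prints a path ending in slugify/__init__.py or slugify.py if some
# "slugify" module is installed, or None if there is none.
print(module_origin("slugify"))
```

Comparing the printed path with the files each distribution installs shows which project currently provides the module.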

opened by Chris2048 4

Fix DeprecationWarnings in newer Pythons (3.6+)

I see these warnings each time I run:

...python3.6/site-packages/slugify/main.py:65
  ...python3.6/site-packages/slugify/main.py:65: DeprecationWarning: invalid escape sequence \p
    '''

...python3.6/site-packages/slugify/main.py:98
  ...python3.6/site-packages/slugify/main.py:98: DeprecationWarning: invalid escape sequence \L
    PRETRANSLATE = re.compile(u'(\L<options>)', options=convert_dict)

...python3.6/site-packages/slugify/main.py:140
  ...python3.6/site-packages/slugify/main.py:140: DeprecationWarning: invalid escape sequence \p
    unwanted_chars_re = u'[^\p{{AlNum}}{safe_chars}]+'.format(safe_chars=re.escape(self._safe_chars or ''))

...python3.6/site-packages/slugify/main.py:144
  ...python3.6/site-packages/slugify/main.py:144: DeprecationWarning: invalid escape sequence \p
    unwanted_chars_and_words_re = unwanted_chars_re + u'|(?<!\p{AlNum})(?:\L<stop_words>)(?!\p{AlNum})'

Perhaps this is the problem: https://stackoverflow.com/questions/50504500/deprecationwarning-invalid-escape-sequence-what-to-use-instead-of-d
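
The usual remedy for this warning is a raw string literal, which keeps backslashes as they are written. A brief sketch, assuming the patterns are otherwise unchanged: the `\p{...}` property syntax belongs to the third-party `regex` module, so the example only builds the pattern strings and does not compile them. The variable names and the `'-.'` safe characters are illustrative.

```python
import re

# Raw literal: the backslash is kept as written, so no warning is triggered.
raw_pattern = r'[^\p{AlNum}]+'

# Equivalent non-raw spelling with the backslash doubled.
escaped_pattern = '[^\\p{AlNum}]+'
print(raw_pattern == escaped_pattern)  # True

# Raw strings combine with str.format(); doubled braces stay literal braces.
safe_chars = '-.'
unwanted_chars_re = r'[^\p{{AlNum}}{safe_chars}]+'.format(
    safe_chars=re.escape(safe_chars)
)
print(unwanted_chars_re)
```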

opened by mcarans 0

Update requirements.txt

regex==2018.11.6 breaks this package:

$ pip install regex==2018.11.6
[...]
$ python
Python 2.7.10 (default, Oct  6 2017, 22:29:07)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import slugify
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/slugify/__init__.py", line 2, in <module>
    from slugify.main import Slugify, UniqueSlugify
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/slugify/main.py", line 68, in <module>
    class Slugify(object):
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/slugify/main.py", line 70, in Slugify
    upper_to_upper_letters_re = re.compile(UPPER_TO_UPPER_LETTERS_RE, re.VERBOSE | re.VERSION1)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/regex.py", line 345, in compile
    return _compile(pattern, flags, kwargs)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/regex.py", line 486, in _compile
    parsed = _parse_pattern(source, info)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 388, in _parse_pattern
    branches = [parse_sequence(source, info)]
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 413, in parse_sequence
    element = parse_paren(source, info)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 823, in parse_paren
    subpattern = _parse_pattern(source, info)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 388, in _parse_pattern
    branches = [parse_sequence(source, info)]
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 410, in parse_sequence
    sequence.append(parse_escape(source, info, False))
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 1186, in parse_escape
    return parse_property(source, info, ch == "p", in_set)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 1341, in parse_property
    prop = lookup_property(prop_name, name, positive != negate, source)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 1603, in lookup_property
    value = standardise_name(value)
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 1593, in standardise_name
    return ascii_upper("".join(ch for ch in name if ch not in "_- "))
  File "/Users/artur/venv/qq/lib/python2.7/site-packages/_regex_core.py", line 1586, in ascii_upper
    return s.translate(upper_trans)
TypeError: character mapping must return integer, None or unicode

opened by arturh 0

UniqueSlugify will exceed max_length if it adds digits to make slug unique

Python 3.6.4 (default, Mar  9 2018, 23:15:03)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from slugify import UniqueSlugify
>>> s = UniqueSlugify(to_lower=True, max_length=3)
>>> s("Hello World")
'hel'
>>> s("Hello World")
'hel-1'

I would expect something like hel, h-1, etc.
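
One way to get the behaviour described here is to shorten the base slug so that the numeric suffix still fits. The following is a sketch under that assumption, not UniqueSlugify's implementation; `unique_with_max_length` and its `taken` set are hypothetical.

```python
def unique_with_max_length(base, taken, max_length, separator="-"):
    """Return a slug not in `taken` that never exceeds `max_length`."""
    candidate = base[:max_length]
    count = 0
    while candidate in taken:
        count += 1
        suffix = separator + str(count)
        if len(suffix) >= max_length:
            raise ValueError("max_length too small for a unique suffix")
        # Trim the base so that base + suffix still fits in max_length.
        candidate = base[:max_length - len(suffix)] + suffix
    taken.add(candidate)
    return candidate


taken = set()
print(unique_with_max_length("hello-world", taken, 3))  # hel
print(unique_with_max_length("hello-world", taken, 3))  # h-1
```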

opened by iandees 0

Releases (v1.4)

v1.4 (Apr 8, 2014)

Owner

Dmitry Voronin

GitHub Repository https://github.com/dimka665/awesome-slugify
