How to make Django slugify work properly with Unicode strings?
There is a Python package called unidecode that I've adopted for the askbot Q&A forum. It works well for Latin-based alphabets and even looks reasonable for Greek:
>>> from unidecode import unidecode
>>> unidecode(u'διακριτικός')
'diakritikos'
It does something weird with Asian languages:
>>> unidecode(u'影師嗎')
'Ying Shi Ma '
Does this make sense?
In askbot we compute slugs like so:
from unidecode import unidecode
from django.template import defaultfilters
slug = defaultfilters.slugify(unidecode(input_text))
The Mozilla website team has been working on an implementation: https://github.com/mozilla/unicode-slugify
Sample code is at http://davedash.com/2011/03/24/how-we-slug-at-mozilla/
With Django >= 1.9, django.utils.text.slugify has an allow_unicode parameter:
>>> from django.utils.text import slugify
>>> slugify("你好 World", allow_unicode=True)
'你好-world'
If you are on Django <= 1.8 (which you should not be, since it has been unsupported since April 2018), you can pick up the code from Django 1.9.
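For anyone who cannot upgrade, here is a minimal standalone sketch of what such a backport might look like. It is modeled on the general approach of Django's slugify (Unicode normalization, then stripping and collapsing characters with regular expressions), but it is not a copy of the Django source. The name slugify_sketch is hypothetical, and the Django-specific string handling is left out so that it runs on the standard library alone.

```python
import re
import unicodedata


def slugify_sketch(value, allow_unicode=False):
    """Hypothetical helper: build a URL slug from arbitrary text.

    With allow_unicode=True, letters from any script are kept.
    Otherwise the text is reduced to ASCII, which drops characters
    that have no ASCII decomposition.
    """
    value = str(value)
    if allow_unicode:
        # NFKC folds compatibility forms but keeps non-ASCII letters.
        value = unicodedata.normalize("NFKC", value)
    else:
        # NFKD splits accented letters into base letter + combining mark;
        # encoding to ASCII with "ignore" then discards the marks.
        value = (
            unicodedata.normalize("NFKD", value)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
    # Drop everything except word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(slugify_sketch("你好 World", allow_unicode=True))
print(slugify_sketch("Crème brûlée"))
```

Note that without allow_unicode the Chinese characters are simply dropped rather than transliterated, which is the gap that unidecode fills in the askbot approach above.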