PostgreSQL improperly sorts Unicode chars with Czech collation

postgresql locale

It is correct. The accents on á, ď, é, ě, í, ň, ó, ť, ú, ů, ý should be ignored when sorting (see article).

Czech sort rules are a little bit complex :)

PostgreSQL does not have its own sort rules, it uses the rules provided by the operating system. If you try with /usr/bin/sort with the same locale, you'll get the same sort order.
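The same operating-system rules can also be checked outside the database. The following is only a minimal sketch in Python, assuming the cs_CZ.UTF-8 locale is installed on the machine; the helper name sort_with_locale is made up for this illustration and is not part of PostgreSQL or of any library.

```python
import locale


def sort_with_locale(words, locale_name):
    """Sort words with the collation rules the OS provides for locale_name.

    Hypothetical helper for illustration. Returns None when the operating
    system does not provide the requested locale.
    """
    # Remember the current collation setting so it can be restored afterwards.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm turns each string into a key that compares the way
        # the C library's strcoll would compare the original strings.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


if __name__ == "__main__":
    # The result depends entirely on the locale data shipped by the OS.
    print(sort_with_locale(["Ca", "Čb", "Cc"], "cs_CZ.UTF-8"))
```

If the printed order differs from what PostgreSQL returns on the same machine with the same locale, that would be surprising, since both are expected to rely on the same locale definition.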

Here's the result with your sample data when tried with Ubuntu 12.04, PostgreSQL 9.1:

CREATE COLLATION cs_CZ (locale = 'cs_CZ.UTF-8');
SELECT * FROM (VALUES ('Ca'),('Čb'),('Cc')) AS l(a) ORDER BY a COLLATE cs_CZ;

Result:

 a
----
 Ca
 Cc
 Čb
(3 rows)

Notice that it's sorted as you say it should.

If your operating system sorts differently and you're sure that it's wrong according to the official Czech rules, then it's a bug in its Czech locale implementation.

UPDATE following comment:

SELECT * FROM (VALUES ('A'),('Da'),('Ďb'),('Dc'),('E')) AS l(a)
  ORDER BY a COLLATE cs_CZ;

results in:

 a
----
 A
 Da
 Ďb
 Dc
 E

Sorting with the Czech collation is correct according to Czech grammar rules!

Characters like á, ď, é, ě, í, ň, ó, ť, ú, ů, ý are sorted as if they had no diacritics, so the result:

A, Da, Ďb, Dc, E is correct by Czech grammar.
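The rule described here can be sketched as a sort key. This is a simplified illustration in Python, not the algorithm used by PostgreSQL or by the operating system: the names ALPHABET, czech_key and strip_ignored_accent are made up for this example, the input is assumed to be in composed (NFC) form, and it is assumed that č, ř, š, ž and the digraph ch count as separate letters while the accents listed above only break ties.

```python
import unicodedata

# Assumption: these letters sort as distinct letters, after c, r, s, z.
DISTINCT = {"č", "ř", "š", "ž"}

# Assumed primary alphabet; note "ch" placed after "h".
ALPHABET = [
    "a", "b", "c", "č", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
    "l", "m", "n", "o", "p", "q", "r", "ř", "s", "š", "t", "u", "v",
    "w", "x", "y", "z", "ž",
]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def strip_ignored_accent(char):
    """Drop the accent from a letter unless it is one of the distinct letters."""
    if char in DISTINCT:
        return char
    # NFD splits "ď" into "d" plus a combining mark; keep only the base letter.
    return unicodedata.normalize("NFD", char)[0]


def czech_key(word):
    """Return a simplified sort key: primary letter ranks, then a tiebreak."""
    lowered = word.lower()
    primary = []
    i = 0
    while i < len(lowered):
        if lowered[i:i + 2] == "ch":
            letter = "ch"
            i += 2
        else:
            letter = strip_ignored_accent(lowered[i])
            i += 1
        # Characters outside the alphabet sort after all letters.
        primary.append(RANK.get(letter, len(ALPHABET)))
    # The lowercased word only decides between words with equal primary ranks.
    return (primary, lowered)


if __name__ == "__main__":
    print(sorted(["E", "Dc", "Ďb", "Da", "A"], key=czech_key))
    print(sorted(["Cc", "Čb", "Ca"], key=czech_key))
```

With this key, Ď gets the same primary rank as D, so Ďb lands between Da and Dc, while Č ranks after C, so Čb comes after Cc. Case handling and other details of the real collation are deliberately left out.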

To Slovaks and Czechs it can sound crazy, but "rules are rules".

Different rules apply to the Slovak language (collation sk_SK), where the pairs d-ď, t-ť, n-ň, l-ľ are kept in alphabetical order, which is how the Czech Ď was expected to sort in this case.
