How do I know if my PostgreSQL server is using the "C" locale?

How do I know if my PostgreSQL server is using the "C" locale?


The locale settings relevant to _pattern_ops, LC_COLLATE and LC_CTYPE, are fixed when a database is created (initdb sets the cluster-wide default) and cannot be changed with SET at runtime. To see the current values you can use the SHOW command.

For example:

SHOW LC_COLLATE;
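Because the collation is a per-database property, it can also be read from the pg_database catalog. This is a minimal sketch, not the only way to do it. As far as I know, PostgreSQL 16 removed the read-only lc_collate and lc_ctype parameters, so on newer servers the catalog query is the one to rely on.

```sql
-- Collation and character classification of the database you are connected to.
-- datcollate / datctype are columns of the pg_database system catalog.
SELECT datname, datcollate, datctype
FROM pg_database
WHERE datname = current_database();
```

If datcollate comes back as C or POSIX, the database uses the standard C locale for sorting.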

_pattern_ops indexes are useful on columns that are searched with pattern matching constructs such as LIKE or regular expressions. Equality searches can use a _pattern_ops index too, but ordinary <, <=, >, or >= comparisons still need a regular index (without _pattern_ops). So you have to take all this into consideration to see if you need such indexes on your tables.

As for what a locale is: it's a set of rules about character ordering, formatting and similar things that vary from one language/country to another. For instance, the locale fr_CA (French in Canada) might have different sorting rules (or ways of displaying numbers and so on) than en_CA (English in Canada). The standard "C" locale is the POSIX standards-compliant default locale. Under it, only the ASCII letters A-Z and a-z are treated as letters, and strings are sorted strictly by their byte values rather than by the rules of any particular language.
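To make the byte-value ordering concrete, here is a small sketch that forces the "C" collation on a few literal values (the values are made up for illustration, and per-expression COLLATE needs PostgreSQL 9.1 or later):

```sql
-- Sort some sample strings by raw byte value, regardless of the database locale.
SELECT name
FROM (VALUES ('banana'), ('Apple'), ('apple'), ('Banana')) AS t(name)
ORDER BY name COLLATE "C";
-- Returns: Apple, Banana, apple, banana
-- (uppercase ASCII letters have lower byte values than lowercase ones)
```

Running the same query without the COLLATE clause sorts by the database's own collation, so the order may differ on a server that uses something like en_US.UTF-8.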

In computing, locale is a set of parameters that defines the user's language, country and any special variant preferences that the user wants to see in their user interface. Usually a locale identifier consists of at least a language identifier and a region identifier.


psql -l

According to the handbook, this lists every database together with its Collate and Ctype settings.

example output:

                               List of databases
    Name     | Owner  | Encoding |   Collate   |    Ctype    | Access privileges
-------------+--------+----------+-------------+-------------+-------------------
 packrd      | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres    | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0   | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/packrd        +
             |        |          |             |             | packrd=CTc/packrd
 template1   | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/packrd        +
             |        |          |             |             | packrd=CTc/packrd
(5 rows)


OK, from my perusings, it appears that this initial setting

initdb --locale=xxx

 --locale=locale       Specifies the locale to be used in this database. This is equivalent to specifying both --lc-collate and --lc-ctype.

basically specifies the "default" locale for all databases that you create after that (i.e. it specifies the settings for template1, which is the default template). Locale is separate from encoding, and you can create a new database with a different locale and/or encoding by specifying them explicitly:

 CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE='ko_KR.euckr' TEMPLATE=template0;

Basically, if you don't specify a locale, initdb takes the system default from its environment, which is almost never "C".

So if SHOW LC_COLLATE returns anything other than "C" or "POSIX", you are not using the standard C locale, and you will need to specify xxx_pattern_ops on any index that should support LIKE. Note also the caveat that if you want to use the <, <=, >, or >= operators you need to create a second index without the xxx_pattern_ops flag (unless you are using the standard C locale on your database, which is rare...). For just = and LIKE (etc.) you don't need a second index. And if you don't need LIKE at all, you probably don't need the xxx_pattern_ops index either.
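The two-index setup described above might look like the following sketch. The table and index names are hypothetical and only there for illustration; text_pattern_ops is the operator class for a text column (varchar_pattern_ops is the varchar counterpart).

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE people (name text);

-- Pattern index: can serve LIKE 'Ab%' and = in a non-C locale.
CREATE INDEX people_name_pattern_idx ON people (name text_pattern_ops);

-- Regular index: serves <, <=, >, >= and ORDER BY under the database collation.
CREATE INDEX people_name_idx ON people (name);

-- Check which index the planner picks; on a tiny or empty table it may
-- still prefer a sequential scan, so test with realistic data.
EXPLAIN SELECT * FROM people WHERE name LIKE 'Ab%';
```

Only left-anchored patterns such as 'Ab%' can use the btree index this way; a pattern starting with a wildcard, like '%Ab', cannot.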

Even if your indexes are defined to collate with the "default" collation, like this:

CREATE INDEX my_index_name  ON table_name  USING btree  (identifier COLLATE pg_catalog."default");

This is not enough. Unless the default is the "C" (or POSIX, same thing) collation, the index can't be used for patterns like LIKE 'ABC%'. You need something like this:

CREATE INDEX my_index_name  ON table_name  USING btree  (identifier COLLATE pg_catalog."default" varchar_pattern_ops);