How do I know if my PostgreSQL server is using the "C" locale?
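Some locale [docs] settings can only be chosen at initdb time. The one relevant to _pattern_ops is LC_COLLATE, and it cannot be changed with SET at runtime: it is fixed when the database is created. To see the current value you can use the SHOW command. (On PostgreSQL 16 and later, where the read-only lc_collate parameter was removed, read the datcollate column of pg_database instead.)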
For example:
SHOW LC_COLLATE;
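As a cross-check, the same information can be read from the pg_database catalog, which stores the collation of every database. This is a minimal sketch, assuming PostgreSQL 8.4 or later (where the datcollate and datctype columns exist):

```sql
-- Collation and character classification of the database you are connected to.
SELECT datname, datcollate, datctype
FROM pg_database
WHERE datname = current_database();
```

A datcollate value of C or POSIX means the database uses the C locale.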
_pattern_ops indexes are useful on columns that are searched with pattern-matching constructs such as LIKE or regular expressions. An index built with _pattern_ops still serves equality searches, but ordinary <, <=, >, and >= comparisons need a second, regular index (without _pattern_ops). Take this into account when deciding whether you need such indexes on your tables.
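To make the two kinds of index concrete, here is a small sketch. The table customers and the index names are hypothetical, and text_pattern_ops is chosen on the assumption that the column is of type text (use varchar_pattern_ops for varchar):

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE customers (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

-- Supports left-anchored patterns such as LIKE 'Smi%' in a non-C locale.
CREATE INDEX customers_name_pattern_idx ON customers (name text_pattern_ops);

-- Regular index: supports ordinary <, <=, >, >= comparisons.
CREATE INDEX customers_name_idx ON customers (name);

-- Inspect which index, if any, the planner picks for each query.
EXPLAIN SELECT * FROM customers WHERE name LIKE 'Smi%';
EXPLAIN SELECT * FROM customers WHERE name >= 'Smith';
```

On a very small table the planner may prefer a sequential scan regardless, so check the plans against realistically sized data.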
As for what a locale is: it is a set of rules about character ordering, formatting and similar things that vary from one language/country to another. For instance, the locale fr_CA (French in Canada) might have different sorting rules (or ways of displaying numbers and so on) than en_CA (English in Canada). The standard "C" locale is the POSIX standards-compliant default locale. It is built around plain ASCII: strings are ordered by byte value rather than by dictionary rules, and number and date formatting roughly follows en_US (US English) conventions.
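The effect of the C locale on ordering can be seen directly with a per-expression COLLATE clause. This is a small sketch, assuming PostgreSQL 9.1 or later (where COLLATE is available); the values are made up for illustration:

```sql
-- Byte-value ordering: uppercase ASCII letters sort before lowercase ones.
SELECT name
FROM (VALUES ('a'), ('B'), ('c')) AS t(name)
ORDER BY name COLLATE "C";
-- Returns B, a, c
```

Under a linguistic collation such as en_US, the same rows would typically come back as a, B, c, though the exact result depends on the collations installed on the system.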
In computing, locale is a set of parameters that defines the user's language, country and any special variant preferences that the user wants to see in their user interface. Usually a locale identifier consists of at least a language identifier and a region identifier.
According to the handbook, this command lists every database together with its collation:
psql -l
Example output:
                              List of databases
   Name    | Owner  | Encoding |   Collate   |    Ctype    | Access privileges
-----------+--------+----------+-------------+-------------+-------------------
 packrd    | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres  | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/packrd        +
           |        |          |             |             | packrd=CTc/packrd
 template1 | packrd | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/packrd        +
           |        |          |             |             | packrd=CTc/packrd
(5 rows)
OK, from my perusings, it appears that this initial setting
initdb --locale=xxx
--locale=locale Specifies the locale to be used in this database. This is equivalent to specifying both --lc-collate and --lc-ctype.
basically specifies the "default" locale for all databases that you create after that (i.e. it specifies the settings for template1, which is the default template). Locale is different from encoding. You can create a new database with a different locale, a different encoding, or both, like this:
CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE='ko_KR.euckr' TEMPLATE=template0;
That is, if you want to call it out manually. If you don't specify it, the new database takes the locale of its template (template1 by default), which initdb took from the system environment and which is almost never "C".
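If you do want the C locale for a single database, it can be requested explicitly in the same way. This is a sketch only; the database name app_c is hypothetical, and the UTF8 encoding is an assumption:

```sql
-- "app_c" is a hypothetical database name.
-- TEMPLATE = template0 is needed because the locale differs from template1's.
CREATE DATABASE app_c
    WITH ENCODING 'UTF8'
         LC_COLLATE = 'C'
         LC_CTYPE = 'C'
         TEMPLATE = template0;
```

Keep in mind that in such a database ORDER BY on text columns sorts by byte value, which may not be the order users expect.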
So if your SHOW LC_COLLATE
returns anything other than "C" or "POSIX", then you are not using the standard C locale
and you will need to specify xxx_pattern_ops on your indexes for them to support LIKE. Note also the caveat that if you want to use the <, <=, >, or >= operators, you need to create a second index without the xxx_pattern_ops operator class (unless your database uses the standard C locale, which is rare). For just = and LIKE
(etc.) you don't need a second index. And if you never use LIKE,
you probably don't need the index with xxx_pattern_ops at all.
Your indexes may be defined to collate with the "default", like this:
CREATE INDEX my_index_name ON table_name USING btree (identifier COLLATE pg_catalog."default");
This is not enough: unless the default is the "C" (or POSIX, same thing) collation, the index can't be used for patterns like LIKE 'ABC%'. You need something like this:
CREATE INDEX my_index_name ON table_name USING btree (identifier COLLATE pg_catalog."default" varchar_pattern_ops);
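To check which operator class an existing index was built with, you can look at its stored definition. This is a minimal sketch using the pg_indexes view; table_name is the same placeholder as in the statements above:

```sql
-- 'table_name' is the placeholder table from the CREATE INDEX examples.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'table_name';
```

An index created with the operator class shows varchar_pattern_ops (or text_pattern_ops) in its indexdef, while a regular index shows only the column name.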