Best practise for storing multilingual strings
First make sure that the database locale can deal with different languages. Use an UTF-8 server-encoding. Optionally set LC_COLLATE = 'C'
to be on neutral ground or use a collation for your first language to have a default sort order. Start by reading the chapter Collation Support in the manual.
I would strongly suggest that you use the latest version of PostgreSQL (9.1 at time of writing) because it has superior collation support.
As for the table structure: keep it simple. It sounds like there is a low, fixed number of languages to deal with. You could just have a column for each language then:
CREATE TABLE txt ( txt_id serial PRIMARY KEY ,txt text NOT NULL -- master language NOT NULL? ,txt_fr text -- others can be NULL? ,txt_es text ,txt_de text);
This is pretty effective, even with many languages. NULL storage is very cheap.
If you have a varying number of languages to deal with, a separate table might be the better solution. This solution assumes that you have a "master language", where the string is always present:
CREATE TABLE txt ( txt_id serial PRIMARY KEY ,txt text NOT NULL -- master language NOT NULL?);CREATE TABLE lang ( lang_abbr text PRIMARY KEY -- de, es, fr, ... ,lang text NOT NULL ,note text);
Or, if a (two-letter) abbreviation is enough, just create a enum
type to identify the language.
CREATE TABLE txt_trans ( txt_id int REFERENCES txt(txt_id) ON UPDATE CASCADE ON DELETE CASCADE ,lang_abbr text REFERENCES lang(lang_abbr) ON UPDATE CASCADE ,txt text NOT NULL -- master language NOT NULL? ,CONSTRAINT txt_trans_pkey PRIMARY KEY (txt_id, lang_abbr));
Not treating the master language special and keeping all language variants in the same table might make handling in your app simpler. But it really depends on your requirements.