Postgresql: In-Row vs Out-of-Row for text/varchar Postgresql: In-Row vs Out-of-Row for text/varchar database database

Postgresql: In-Row vs Out-of-Row for text/varchar


Edited Answer

For first part of the question: you are correct (see for instance this).

For the second part of the question: the standard way of storing columns is to compress variable length text fields if their size is over 2KB, and eventually store them into a separate area, called “TOAST table”.

You can give a “hint” to the system on how to store a field by using the following command for your columns:

ALTER TABLE YourTable  ALTER COLUMN YourColumn SET STORAGE (PLAIN | EXTENDED | EXTERNAL | MAIN)

From the manual:

SET STORAGE

This form sets the storage mode for a column. This controls whether this column is held inline or in a secondary TOAST table, and whether the data should be compressed or not. PLAIN must be used for fixed-length values such as integer and is inline, uncompressed. MAIN is for inline, compressible data. EXTERNAL is for external, uncompressed data, and EXTENDED is for external, compressed data. EXTENDED is the default for most data types that support non-PLAIN storage. Use of EXTERNAL will make substring operations on very large text and bytea values run faster, at the penalty of increased storage space. Note that SET STORAGE doesn't itself change anything in the table, it just sets the strategy to be pursued during future table updates. See Section 59.2 for more information.

Since the manual is not completely explicit on this point, this is my interpretation: the final decision about how to store the field is left in any case to the system, given the following constraints:

  1. No field can be stored such that the total size of a row is over8KB
  2. No field is stored out-of-row if its size is less then theTOAST_TUPLE_THRESHOLD
  3. After satisfying the previousconstraints, the system tries to satisfy the SET STORAGE strategyspecified by the user. If no storage strategy is specified, each TOAST-ablefield is automatically declared EXTENDED.

Under these assumption, the only way to be sure that all the values of a column are stored out-of-row is to recompile the system with a value of TOAST_TUPLE_THRESHOLD less then the minumum size of any value of the column.