Fastest way to PostgreSQL Distinct and Format Fastest way to PostgreSQL Distinct and Format sql sql

Fastest way to PostgreSQL Distinct and Format


I think you shouldn't select distinct from this huge table. Instead try to generate a short years sequence (say from 1900 to 2100) and select from this sequence only years that exist in acs_objects table. Result set will be the same but I think it will be faster. EXISTS subquery have to run fast on an indexed field creation_date.

SELECT y FROM(   select generate_series(1900,2100) as y) as tWHERE EXISTS (SELECT 1 FROM acs_objects                     WHERE creation_date >= DATE (''||t.y||'-01-01')                                AND  creation_date < DATE (''||t.y + 1||'-01-01'))

SQLFiddle demo


In your second attempt you get distinct dates from the sub-query which you then all convert to a string representation and then you select the distinct ones. That is rather inefficient. Better is it to first extract the distinct years from the creation_date in a sub-query and simply cast those to text in the main query:

SELECT year::textFROM (  SELECT DISTINCT extract(year FROM creation_date) AS year FROM acs_objects) AS distinct_years;

If you create an INDEX on the table, the query should run much faster still:

CREATE INDEX really_fast ON acs_objects((extract(year FROM creation_date)));

However, this may impact other uses of your table, in particular if you have many modifying statements (insert, update, delete). And this will only work if creation_date has a data type of date or timestamp (specifically not timestamp with timezone).

The below option looked promising because it does not use a sub-query, but it is in fact much slower (see comments below), probably because the DISTINCT clause is applied on a string:

SELECT DISTINCT extract(year FROM creation_date)::textFROM acs_objects;


I'm no sure what you use it for. I probably would consider using a using a materialized view.

Now you can refresh the view when needed and have a very fast way to retrieve the (distinct) year list (since the data is basically static stored).

Have a look here: