Efficient latest record query with Postgresql


If you don't want to change your data model, you can use DISTINCT ON to fetch the newest record from table "b" for each entry in "a":

    SELECT DISTINCT ON (a.id) *
    FROM a
    INNER JOIN b ON a.id = b.id
    ORDER BY a.id, b.date DESC;

If you want to avoid a "sort" in the query, adding an index like this might help you, but I am not sure:

    CREATE INDEX b_id_date ON b (id, date DESC);

    SELECT DISTINCT ON (b.id) *
    FROM a
    INNER JOIN b ON a.id = b.id
    ORDER BY b.id, b.date DESC;
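One way to find out whether the index really removes the sort is to look at the query plan. This is only a sketch, assuming the b_id_date index above has been created and the tables hold a realistic amount of data (the planner may choose differently on tiny tables):

```sql
-- Show the actual execution plan, including timing and buffer usage.
-- Note that ANALYZE really executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (b.id) *
FROM a
INNER JOIN b ON a.id = b.id
ORDER BY b.id, b.date DESC;
```

If the plan shows an index scan on b_id_date feeding the Unique node without a separate Sort node, the index is doing its job. If a Sort node is still there, the planner decided that sorting was cheaper, or it could not use the index for this query.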

Alternatively, if you want to sort the records from table "a" in some way:

    SELECT DISTINCT ON (sort_column, a.id) *
    FROM a
    INNER JOIN b ON a.id = b.id
    ORDER BY sort_column, a.id, b.date DESC;

Alternative approaches

However, all of the above queries still need to read all referenced rows from table "b", so if you have lots of data, it might still just be too slow.

You could create a new table, which only holds the newest "b" record for each a.id -- or even move those columns into the "a" table itself.
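A rough sketch of the separate-table idea, in case it helps. The table name b_latest, the payload column and the column types are all made up for illustration; your real "b" columns go there instead. It relies on INSERT ... ON CONFLICT, so it needs PostgreSQL 9.5 or later:

```sql
-- Hypothetical summary table: one row per a.id, holding only the newest "b" row.
CREATE TABLE b_latest (
    id      integer PRIMARY KEY,   -- same key as a.id / b.id (type is an assumption)
    date    timestamptz NOT NULL,  -- type is an assumption
    payload text                   -- stand-in for the remaining "b" columns
);

-- Run this for every row written to "b" (example values shown).
-- The WHERE clause keeps the stored row unless the incoming one is newer.
INSERT INTO b_latest (id, date, payload)
VALUES (1, '2024-01-01 00:00:00+00', 'example')
ON CONFLICT (id) DO UPDATE
SET date    = EXCLUDED.date,
    payload = EXCLUDED.payload
WHERE b_latest.date < EXCLUDED.date;

-- Fetching the latest record is then a plain primary-key join.
SELECT *
FROM a
INNER JOIN b_latest ON a.id = b_latest.id;
```

Whether the upsert is issued from application code or from a trigger on "b" is a design choice. Either way, every write to "b" becomes a bit more expensive in exchange for much cheaper reads.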


This could be more efficient. The difference: the query for table b is executed only once, whereas your correlated subquery is executed for every row:

    SELECT *
    FROM table a
    JOIN (
        SELECT ID, max(date) AS maxDate
        FROM table
        GROUP BY ID
    ) b ON a.ID = b.ID AND a.date = b.maxDate
    WHERE a.ID IN $LIST;


What do you think about this?

    SELECT *
    FROM (
        SELECT a.*, row_number() OVER (PARTITION BY a.id ORDER BY date DESC) AS r
        FROM table a
        WHERE ID IN $LIST
    ) t
    WHERE r = 1;

I have used it a lot in the past.