Query executor - Start of upper step does not overlap with end of lower step

postgresql



You might be interested in this series of blog posts on understanding query plans.

In your case, you are misinterpreting what the two numbers in each cost/timing represent. They are not the start and end of the operation, but (roughly) the cost/time before the first row, and the cost/time including all rows.

Depesz gives the example of a sort operation having "cost=22.88..23.61" - the cost to prepare the data is high, because you have to sort everything before you can return any of it; the cost of actually returning it is much lower, because it's just spooling through your sorted list.
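The difference between a blocking node (like Sort) and a streaming node can be sketched with two generators; this is an illustrative analogy, not PostgreSQL code:

```python
def source():
    # Stand-in for a child node producing rows.
    for row in [3, 1, 2]:
        yield row

def streaming_filter(rows):
    # A streaming node (like a Filter or Index Scan) can emit its
    # first row immediately: low startup cost.
    for r in rows:
        if r > 0:
            yield r

def sort_node(rows):
    # A Sort must consume *all* of its input before it can emit
    # anything: high startup cost, cheap to return rows afterwards.
    yield from sorted(rows)

# The filter returns a row as soon as the source produces one...
print(next(streaming_filter(source())))  # 3
# ...while the sort only returns rows after reading everything.
print(list(sort_node(source())))         # [1, 2, 3]
```

That is why a Sort's first cost number is close to its second: nearly all the work happens before the first row comes out.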

So in your example the 19199.316 doesn't mean that the HashAggregate doesn't start running until t=19199.316, it means that until t=19199.316 no data will come out of the HashAggregate because it is still preparing stuff.

In fact, the HashAggregate will start processing data as soon as the Index Scan starts returning it, which is at t=0.033; by t=5730.776, the Index Scan has found all the relevant rows, but the HashAggregate is still processing them. At t=19199.316, the HashAggregate is ready to start returning data to its parent (which in this case is the final result), and at t=25518.672, it's finished returning them.

Depesz also has a tool which renders query plans as a table; here's your plan. Note that the HashAggregate shows an "exclusive time" of 19787.896 - that's how long it took to do the hashing, ignoring where the input data came from.
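You can recover that exclusive time yourself from the actual times in the plan: subtract the child's total time from the node's total time. Using the numbers from your plan:

```python
# Actual times taken from the plan, in milliseconds.
index_scan_total = 5730.776    # Index Scan: done returning all rows
hash_agg_total = 25518.672     # HashAggregate: done returning all rows

# Exclusive time: work done by the HashAggregate itself,
# excluding the time its child (the Index Scan) spent producing input.
exclusive = hash_agg_total - index_scan_total
print(round(exclusive, 3))  # 19787.896
```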


The reason for the observed behaviour: your statistics are wrong:

    HashAggregate  (cost=412378.59..418074.28 rows=569569 width=70)
                   (actual time=19199.316..25518.672 rows=4935988 loops=1)

The planner expected 569569 rows, but 4935988 rows were actually found.

This is off by a factor of nine, and causes the planner to choose a hash-table based aggregation, since it thinks that the result will fit into work_mem. The hash table is initially undersized, needs to be resized a few times, and if it does not fit into work_mem it even has to be spilled to disk.
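A sketch of the usual remedies, assuming the misestimated table is the one behind the Index Scan ("mytable" is a placeholder; substitute your actual table name):

```sql
-- Refresh the planner statistics so the row estimate improves.
ANALYZE mytable;

-- Optionally give the hash aggregation more memory, for this session only.
-- The value is an example; size it to your workload and available RAM.
SET work_mem = '256MB';
```

With accurate statistics the planner may still pick a HashAggregate, but it will size the hash table correctly from the start instead of resizing and spilling.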

BTW: I was unable to recreate this kind of plan. I keep getting bitmap index scans.