Postgres 10.3: SELECT queries hang for hours
generic troubleshooting suggestions:
start gathering runtime statistics of the server - there's wide choice of tools - https://munin-monitoring.org/, https://grafana.com/ + influx db + telegraf, many more. regardless of the solution you should keep historical statistics of:
- amount of disk operations done per second
- latency of the disk storage [ regardless if it's spinning rust, ssd, nvme or network-attached ]
- server CPU usage, load, memory usage
get also statistics about postgresql - https://www.percona.com/downloads/pmm2 might be helpful here
based on those stats - see if there's any build-up before problematic query happens.
occasional slow down might be caused by:
- uneven performance of the storage subsystem [ ssd at the end of its life, patrol-read on the RAID array, hdd reallocating data due to bad sectors ]
- incorrect index statistics leading to suboptimal query plan
- overload of the system by incoming queries
- overload of the system by other workloads running on the same hardware, noisy neighbors if you're running in a virtualized environment