
SQL Server slow select from large table

You'll likely want to create a composite index for this type of query. When the query runs slowly, it is most likely choosing to scan down an index on the CreatedAt column and apply a residual filter on the SourceUserId value. What you actually want is to jump directly to all records for a given SourceUserId, already in the right order. To achieve this, create a composite index primarily on SourceUserId (for the equality check) and secondarily on CreatedAt (to preserve the order within a given SourceUserId value). You may want to try adding TypeId as well, depending on the selectivity of this column.

So, the two indexes most likely to give the best repeatable performance (try them out and compare) would be:

  1. Index on (SourceUserId, CreatedAt)
  2. Index on (SourceUserId, TypeId, CreatedAt)
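As a rough sketch, the two candidates could be created as follows. The table name `dbo.Activity` and the index names are hypothetical (the question does not give them); only the column names come from the discussion above.

```sql
-- Hypothetical table name dbo.Activity; substitute the real table.

-- Candidate 1: seek on SourceUserId (equality), rows come back ordered by CreatedAt.
CREATE NONCLUSTERED INDEX IX_Activity_SourceUserId_CreatedAt
    ON dbo.Activity (SourceUserId, CreatedAt);

-- Candidate 2: adds TypeId between the two. The CreatedAt order is only
-- preserved when the query also filters TypeId with an equality predicate.
CREATE NONCLUSTERED INDEX IX_Activity_SourceUserId_TypeId_CreatedAt
    ON dbo.Activity (SourceUserId, TypeId, CreatedAt);

-- To compare them, report logical reads and elapsed time for each run
-- of the slow query in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

Create one candidate at a time, run the query, and compare the logical reads reported in the Messages output before deciding which index to keep.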

As always, there are many other considerations to take into account when determining how, what, and where to index. As Remus discusses in a separate answer, one big consideration is covering the query vs. accepting lookups. Additionally, you'll need to consider write volumes, possible fragmentation impact (if any), singleton lookups vs. large sequential scans, and so on.

I have indexes on each column except MetaId

Non-covering indexes will likely hit the 'tipping point', and the query would revert to a table scan. Just adding an index on every column because it is used in a WHERE clause does not equate to good index design. To take your query as an example, a good 100% covering index would be:

INDEX ON (SourceUserId , CreatedAt) INCLUDE (TypeId, SrcMemberId, DstMemberId)

The following index is also useful, although it is still going to cause lookups:

INDEX ON (SourceUserId , CreatedAt) INCLUDE (TypeId)

And finally, an index without any included columns may help, but is just as likely to be ignored (it depends on the column statistics and cardinality estimates):

INDEX ON (SourceUserId , CreatedAt)

But a separate index on SourceUserId and another on CreatedAt are basically useless for your query.
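In T-SQL, the covering variant might look like the sketch below. The table name `dbo.Activity`, the index name, the `uniqueidentifier` type of `@SourceUserId`, and the shape of the query are all assumptions for illustration, since the original query is not shown here.

```sql
-- Hypothetical table and index names; column names are from the answer above.
CREATE NONCLUSTERED INDEX IX_Activity_SourceUserId_CreatedAt_Covering
    ON dbo.Activity (SourceUserId, CreatedAt)
    INCLUDE (TypeId, SrcMemberId, DstMemberId);

-- Assumed query shape: one user's rows, newest first.
-- Placeholder value; the GUID type is an assumption.
DECLARE @SourceUserId uniqueidentifier = '00000000-0000-0000-0000-000000000000';

SELECT TOP (50) TypeId, SrcMemberId, DstMemberId, CreatedAt
FROM dbo.Activity
WHERE SourceUserId = @SourceUserId
ORDER BY CreatedAt DESC;
```

If every column the query references is either a key column or an included column, as in this sketch, the plan can be satisfied from the index alone, without key lookups into the base table.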

See Index Design Basics.

The fact that the table has indexes built on GUID values indicates a possible series of problems that would affect performance:

  • High index fragmentation: since new GUIDs are generated randomly, the index cannot organize them in a sequential order and the nodes are spread unevenly.
  • High number of page splits: the size of a GUID (16 bytes) means fewer keys fit on each page, so there's a greater chance that a new value won't fit in the remaining space available in a page.
  • Slow value comparison: comparing two GUIDs is a relatively slow operation because up to 16 bytes must be matched, compared with 4 bytes for an INT.
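One way to check whether fragmentation is actually a problem is to query the `sys.dm_db_index_physical_stats` dynamic management function. This is a minimal sketch; the table name `dbo.Activity` is hypothetical.

```sql
-- Fragmentation and size per index on the (hypothetical) dbo.Activity table.
-- 'LIMITED' is the cheapest scan mode.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Activity'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- If fragmentation turns out to be high, the indexes can be reorganized
-- or rebuilt (uncomment one):
-- ALTER INDEX ALL ON dbo.Activity REORGANIZE;
-- ALTER INDEX ALL ON dbo.Activity REBUILD;
```

Fragmentation percentages on very small indexes (low `page_count`) are usually not worth acting on.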

Here are a couple of resources on how to investigate and resolve these problems: