Primary Key Sorting

In SQL Server, a table's data is physically stored in the order of its clustered index, which is usually the primary key but doesn't have to be.

Data in SQL is not guaranteed to come back in any order without an ORDER BY clause. You should always specify an ORDER BY clause when you need the data in a particular order. If an index already provides that order, the optimizer can usually skip the sort, so there's little harm in having it there.
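A minimal sketch of the difference, assuming a hypothetical table `dbo.Readings` with columns `Id`, `RecordedAt`, and `Measurement` (none of these names come from the question):

```sql
-- Hypothetical table and column names, for illustration only.

-- No ORDER BY: the engine is free to return the rows in any order,
-- even if the table has a clustered index on Id.
SELECT Id, RecordedAt, Measurement
FROM dbo.Readings;

-- Explicit ORDER BY: the only way to guarantee the order of the result.
SELECT Id, RecordedAt, Measurement
FROM dbo.Readings
ORDER BY Id;
```

The first query will often appear to come back in `Id` order on a small table, but that is an accident of the chosen plan and not something to rely on.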

Without an ORDER BY clause, the RDBMS might return cached pages matching your query while it waits for other records to be read in from disk. In that case, even if there is an index on the table, the data might not arrive in the index's order. (Note that this is just an example. I don't know, or even think, that a real-world RDBMS does this, but it's acceptable behaviour for an SQL implementation.)


If you see a performance impact when sorting versus not sorting, you're probably sorting on a column (or set of columns) that doesn't have an index (clustered or otherwise). Given that it's a time series, you might be sorting by time while the clustered index is on the bigint primary key. SQL Server doesn't know that both increase the same way, so it has to re-sort everything.

If the time column and the primary key column are related by order (one increases if and only if the other increases or stays the same), sort by the primary key instead. If they aren't related this way, move the clustered index from the primary key to whatever column(s) you're sorting by.
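Both options can be sketched in T-SQL. The table `dbo.Readings`, its columns `Id` and `RecordedAt`, and the constraint and index names are hypothetical; substitute your own:

```sql
-- Hypothetical names throughout: dbo.Readings, Id, RecordedAt,
-- PK_Readings, IX_Readings_RecordedAt.

-- Option 1: Id and RecordedAt always increase together, so sort by the
-- clustered primary key instead of the time column.
SELECT Id, RecordedAt, Measurement
FROM dbo.Readings
ORDER BY Id;

-- Option 2: the columns are not related by order, so move the clustered
-- index to the column you sort by. The primary key is kept, but is
-- enforced by a nonclustered index instead.
ALTER TABLE dbo.Readings DROP CONSTRAINT PK_Readings;

CREATE CLUSTERED INDEX IX_Readings_RecordedAt
    ON dbo.Readings (RecordedAt);

ALTER TABLE dbo.Readings
    ADD CONSTRAINT PK_Readings PRIMARY KEY NONCLUSTERED (Id);
```

Option 2 rewrites the whole table, so it can take a long time on a large one, and the DROP CONSTRAINT step fails while any foreign key still references the primary key.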

This is a very common question, so there is a canned answer:

Without ORDER BY, there is no default sort order.

Can you elaborate why "The performance difference is significant."?

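A table is not necessarily 'clustered', i.e. organized by PK. A table with no clustered index is a "HEAP" (in no particular order), and the option you are looking for is "CLUSTERED". In SQL Server, declaring a PRIMARY KEY creates a clustered index by default, unless you specify NONCLUSTERED or the table already has a clustered index. In Oracle the equivalent is called an index-organized table (IOT).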

  • A table can have only one clustered index (makes sense: the rows can be stored in only one order)
  • Use the PRIMARY KEY CLUSTERED syntax in the DDL
  • You still need to issue ORDER BY on the PK in your SELECTs. Because the table is clustered on that key, the optimizer can read the rows in index order and skip the sort, so the query runs faster
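As a sketch of the points above in T-SQL, with a hypothetical table `dbo.Readings` and hypothetical column and constraint names:

```sql
-- Hypothetical table, column, and constraint names, for illustration only.
CREATE TABLE dbo.Readings (
    Id          bigint    NOT NULL,
    RecordedAt  datetime2 NOT NULL,
    Measurement float     NOT NULL,
    -- Store the rows in Id order: this is the table's one clustered index.
    CONSTRAINT PK_Readings PRIMARY KEY CLUSTERED (Id)
);

-- ORDER BY is still required to guarantee the order of the result.
-- With the clustered index on Id, the plan can typically read the rows
-- in index order instead of adding a separate sort step.
SELECT Id, RecordedAt, Measurement
FROM dbo.Readings
ORDER BY Id;
```

Comparing the execution plan of this query with one that orders by an unindexed column should show whether a Sort operator is being added.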

The earlier poster is correct: SQL, and the relational theory behind it, specifically defines the result of a SELECT as an unordered set of rows (tuples).

SQL usually tries to stay in the logical realm and not make assumptions about the physical organization or location of the data. The CLUSTERED option lets us control that physical organization for practical, real-life situations.