SQL Server heap vs. clustered index

sql-server sql-server-2008 heap clustered-index heap-table

Heap storage in SQL Server has nothing to do with the heap data structure.

Heap just means that the records themselves are not ordered (i.e. not linked to one another).

When you insert a record, it just gets inserted into the free space the database finds.

Updating a row in a heap-based table does not affect other records (though it affects secondary indexes).

If you create a secondary index on a heap table, the RID (a kind of physical pointer to the row's storage location) is used as a row pointer.
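As a rough illustration of the heap case (the table and index names below are made up for this example), a table created without a clustered index is stored as a heap, and an index added to it is a secondary index whose entries locate rows by RID:

```sql
-- Hypothetical table: no clustered index, so it is stored as a heap.
CREATE TABLE dbo.heap_demo
(
    id   INT         NOT NULL,
    name VARCHAR(50) NOT NULL
);

-- Secondary (nonclustered) index; its entries point at heap rows by RID.
CREATE NONCLUSTERED INDEX IX_heap_demo_name
    ON dbo.heap_demo (name);

-- The heap itself is listed with index_id = 0 and type_desc = 'HEAP';
-- nonclustered indexes get index_id values of 2 and above.
SELECT index_id, name, type_desc
FROM   sys.indexes
WHERE  object_id = OBJECT_ID('dbo.heap_demo');
```

Checking `sys.indexes` like this is a quick way to see whether a given table is a heap or has a clustered index.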

Clustered index means that the records are part of a B-Tree. When you insert a record, the B-Tree needs to be relinked.

Updating a row in a clustered table can cause relinking of the B-Tree, i.e. updating internal pointers in other records (for example, when the clustered key value changes or the row no longer fits on its page).

If you create a secondary index on a clustered table, the value of the clustered index key is used as a row pointer.

This means a clustered index should be unique. If a clustered index is not unique, a special hidden column called a uniquifier is appended to the index key, which makes it unique (and larger in size).
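A minimal sketch of the two cases, using hypothetical tables and assuming `order_id` never repeats:

```sql
-- Hypothetical table: many rows can share the same customer_id.
CREATE TABLE dbo.orders_nonunique
(
    order_id    INT      NOT NULL,
    customer_id INT      NOT NULL,
    order_date  DATETIME NOT NULL
);

-- Non-unique clustered index: rows with the same customer_id need the
-- hidden uniquifier so that each row can still be addressed individually.
CREATE CLUSTERED INDEX CX_orders_nonunique
    ON dbo.orders_nonunique (customer_id);

-- Same columns, but the clustered key is declared unique.
CREATE TABLE dbo.orders_unique
(
    order_id    INT      NOT NULL,
    customer_id INT      NOT NULL,
    order_date  DATETIME NOT NULL
);

-- Unique clustered index: (customer_id, order_id) identifies a row
-- on its own, so no uniquifier is needed.
CREATE UNIQUE CLUSTERED INDEX CX_orders_unique
    ON dbo.orders_unique (customer_id, order_id);
```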

It is also worth noting that creating a secondary index on a column makes the values of the clustered index's key part of the secondary index's key.

By creating an index on a clustered table, you in fact always get a composite index:

CREATE UNIQUE CLUSTERED INDEX CX_mytable_1234 ON mytable (col1, col2, col3, col4);
CREATE INDEX IX_mytable_5678 ON mytable (col5, col6, col7, col8);

Index IX_mytable_5678 is in fact an index on the following columns:

col5, col6, col7, col8, col1, col2, col3, col4
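One practical consequence can be sketched as follows. The column types and the `payload` column are assumptions added for the example; only the index definitions follow the ones discussed here. A query that touches nothing but `col1` to `col8` can in principle be answered from the secondary index alone, because the clustered key columns travel with it:

```sql
-- Hypothetical table definition; the INT types and payload column are assumed.
CREATE TABLE dbo.mytable
(
    col1 INT NOT NULL, col2 INT NOT NULL, col3 INT NOT NULL, col4 INT NOT NULL,
    col5 INT NOT NULL, col6 INT NOT NULL, col7 INT NOT NULL, col8 INT NOT NULL,
    payload VARCHAR(100) NULL
);

CREATE UNIQUE CLUSTERED INDEX CX_mytable_1234
    ON dbo.mytable (col1, col2, col3, col4);

CREATE INDEX IX_mytable_5678
    ON dbo.mytable (col5, col6, col7, col8);

-- Every referenced column is stored in IX_mytable_5678 (its own key plus
-- the clustered key), so no lookup into the clustered index is required.
SELECT col5, col6, col1, col2
FROM   dbo.mytable
WHERE  col5 = 42;

-- payload is stored only in the clustered index, so this query needs
-- either a key lookup or a scan of the clustered index.
SELECT col5, payload
FROM   dbo.mytable
WHERE  col5 = 42;
```

Which plan the optimizer actually picks depends on the data and statistics, so it is worth confirming in the execution plan.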

This has one more side effect:

A `DESC` condition in a single-column index on a clustered table makes sense in `SQL Server`

This index:

CREATE INDEX IX_mytable ON mytable (col1)

can be used in a query like this:

SELECT TOP 100 *
FROM mytable
ORDER BY col1, id

, while this one:

CREATE INDEX IX_mytable ON mytable (col1 DESC)

can be used in a query like this:

SELECT TOP 100 *
FROM mytable
ORDER BY col1, id DESC
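To try this out, here is a self-contained sketch. It assumes `id` is the clustered key, which the queries above imply but do not state, and the table and index names are hypothetical:

```sql
-- Hypothetical table, clustered on id.
CREATE TABLE dbo.desc_demo
(
    id   INT NOT NULL,
    col1 INT NOT NULL,
    CONSTRAINT PK_desc_demo PRIMARY KEY CLUSTERED (id)
);

-- Declared on col1 DESC only, but the clustered key is appended,
-- so the effective key order is (col1 DESC, id ASC).
CREATE INDEX IX_desc_demo_col1
    ON dbo.desc_demo (col1 DESC);

-- Exact reverse of the index order: a backward scan of the index
-- can return the rows already sorted.
SELECT TOP 100 col1, id
FROM   dbo.desc_demo
ORDER BY col1, id DESC;

-- Neither the index order nor its reverse: expect an explicit sort.
SELECT TOP 100 col1, id
FROM   dbo.desc_demo
ORDER BY col1, id;
```

Comparing the two execution plans should show a Sort operator only for the second query, though the optimizer's choice is worth verifying on your own data.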

Heaps are just tables without a clustering key - without a key that enforces a certain physical order.

I would not really recommend having heaps at any time - except maybe if you use a table temporarily to bulk-load an external file, and then distribute those rows to other tables.

In every other case, I would strongly recommend using a clustering key. SQL Server will use the Primary Key as the clustering key by default - which is a good choice, in most cases. UNLESS you use a GUID (UNIQUEIDENTIFIER) as your primary key, in which case using that as your clustering key is a horrible idea.

See Kimberly Tripp's blog posts "GUIDs as Primary and/or the clustering key" and "The Clustered Index Debate Continues" for excellent explanations of why you should always have a clustering key, and why a GUID is a horrible clustering key.

My recommendation would be:

- In 99% of all cases, try to use an INT IDENTITY as your primary key and let SQL Server make that the clustering key as well.
- Exception #1: if you're bulk loading huge amounts of data, you might be fine without a primary / clustering key for your temporary table.
- Exception #2: if you must use a GUID as your primary key, then set your clustering key to a different column, preferably an INT IDENTITY. I would even create a separate INT column just for that purpose if no other column can be used.
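The default case and exception #2 might look roughly like this. The table, column, and constraint names are hypothetical, chosen only for the sketch:

```sql
-- Default case: INT IDENTITY primary key, which SQL Server makes
-- the clustering key unless told otherwise.
CREATE TABLE dbo.products_demo
(
    product_id INT IDENTITY(1, 1) NOT NULL,
    name       NVARCHAR(100)      NOT NULL,
    CONSTRAINT PK_products_demo PRIMARY KEY (product_id)
);

-- Exception #2: the GUID stays the primary key but is declared
-- NONCLUSTERED, and a separate INT IDENTITY column carries the
-- clustered index instead.
CREATE TABLE dbo.customers_demo
(
    customer_guid UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_customers_demo_guid DEFAULT NEWID(),
    cluster_id    INT IDENTITY(1, 1) NOT NULL,
    name          NVARCHAR(100)      NOT NULL,
    CONSTRAINT PK_customers_demo PRIMARY KEY NONCLUSTERED (customer_guid)
);

CREATE UNIQUE CLUSTERED INDEX CX_customers_demo_cluster_id
    ON dbo.customers_demo (cluster_id);
```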

Marc

Books Online is the best source!

The whole section Database Engine - Planning and Architecture - Tables and Index Data Structures Architecture is a very good introduction to the internals.

From this link you can download a local copy of Books Online (it is free). It is the best (and official) reference for all SQL Server 2008 questions.

CodeHunter
