Pros & Cons of Date Column as Part of Primary Key

sql sql-server database primary-key composite-primary-key

In general, I think the best primary keys are synthetic auto-incremented keys. These have certain advantages:

The key value records the insertion order.
The keys are fixed length (typically 4 bytes).
Single keys are much simpler for foreign key references.
In databases (such as SQL Server by default) that cluster the data based on the primary key, inserts go "at the end".
They are relatively easy to type and compare (my eyes just don't work well for comparing UUIDs).

The fourth of these is a really big concern in a database that has lots of inserts, as suggested by your data.

There is nothing a priori wrong with composite primary keys. They are sometimes useful. But that is not a direction I would go in.

sql sql-server database primary-key composite-primary-key

I agree that it is better to keep the identity column/uniqueidentifier as primary key in this scenario, Also if you make partid and date as composite primary key, it is going to fail in a case when two concurrent users try to update the part price at same time.The primary key is going to fail in that case.So the better approach will be to have an identity column as primary key and keep on dumping the changes in log table.In case you hit some performance barriers later on you can partition your table year wise and can overcome that performance challenge.

sql sql-server database primary-key composite-primary-key

Pros and cons will vary depending on the performance requirements and how often you will query this table.

As a first example think about the following:

CREATE TABLE Part_Price_Log (    ModifiedDate DATE,    PartID INT,    PRIMARY KEY (ModifiedDate, PartID))

If the ModifiedDate is first and this is an logging table with insert-only rows, then every new row will be placed at the end, which is good (reduces fragmentation). This approach is also good when you want to filter directly by ModifiedDate, or by ModifiedDate + PartID, as ModifiedDate is the first column in the primary key. A con here would be searching by PartID, as the clustered index of the primary key won't be able to seek directly the PartID.

A second example would be the same but inverted primary key ordering:

CREATE TABLE Part_Price_Log (    ModifiedDate DATE,    PartID INT,    PRIMARY KEY (PartID, ModifiedDate))

This is good for queries by PartID, but not much for queries directly by ModifiedDate. Also having PartID first would make inserts displace data pages as inserted PartIDis lower than the max PartID (which increases fragmentation).

The last example would be using a surrogate primary key like an IDENTITY.

CREATE TABLE Part_Price_Log (    LogID BIGINT IDENTITY PRIMARY KEY,    ModifiedDate DATE,    PartID INT)

This will make all inserts go last and reduce fragmentation but you will need an additional index to query your data, such as:

CREATE NONCLUSTERED INDEX NCI_Part_Price_Log_Date_PartID ON Part_Price_Log (ModifiedDate, PartID)CREATE NONCLUSTERED INDEX NCI_Part_Price_Log_PartID_Date ON Part_Price_Log (PartID, ModifiedDate)

The con about this last one is that insert operations will take longer (as the index also has to be updated) and the size of the table will increase due to indexes.

Also keep in mind that if your data allows for multiple updates of the same part for the same day, then using compound PRIMARY KEY would make the 2nd update fail. Your choices here are to use a surrogate key, use a DATETIME instead of DATE (will give you more margin for updates), or use a CLUSTERED INDEX with no PRIMARY KEY or UNIQUE constraint.

I would suggest doing the following. You only keep one index (the actual table, as it is clustered), the order is always insert, you don't need to worry about repeated ModifiedDate with same PartID and your queries by date will be fast.

CREATE TABLE Part_Price_Log (    LogID INT IDENTITY PRIMARY KEY NONCLUSTERED,    ModifiedDate DATE,    PartID INT)CREATE CLUSTERED INDEX NCI_Part_Price_Log_Date_PartID ON Part_Price_Log (ModifiedDate, PartID)

CodeHunter

Pros & Cons of Date Column as Part of Primary Key

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last