Why use an auto-incrementing primary key when other unique fields exist? Why use an auto-incrementing primary key when other unique fields exist? database database

Why use an auto-incrementing primary key when other unique fields exist?


Auto-incrementing primary keys are useful for several reasons:

  • They allow duplicate user names as on Stack Overflow
  • They allow the user name (or email address, if that's used to login) to be changed (easily)
  • Selects, joins and inserts are faster than varchar primary keys as its much faster to maintain a numeric index
  • As you mentioned, validation becomes very simple: if ((int)$id > 0) { ... }
  • Sanitation of input is trivial: $id = (int)$_GET['id']
  • There is far less overhead as foreign keys don't have to duplicate potentially large string values

I would say trying to use any piece of string information as a unique identifier for a record is a bad idea when an auto-incrementing numeric key is so readily available.

Systems with unique user names are fine for very small numbers of users, but the Internet has rendered them fundamentally broken. When you consider the sheer number of people named "john" that might have to interact with a website, it's ridiculous to require each of them to use a unique display name. It leads to the awful system we see so frequently with random digits and letters decorating a username.

However, even in a system where you enforced unique usernames, it's still a poor choice for a primary key. Imagine a user with 500 posts: The foreign key in the posts table is going to contain the username, duplicated 500 times. The overhead is prohibitive even before you consider that somebody might eventually need to change their username.


If the username is the primary key and a user changes his/her username, you will need to update all the tables which have foreign key references to the users table.


If you have demonstrated to your professor that assigning a unique arbitrary integer to each user is of value to your application then of course he would be wrong to say that it is "completely useless and unnecessary".

However, maybe you missed his point. If he told you that the requirement is that "no two users can have the same username" then you haven't met that requirement.

Sincere thanks for posting your SQL DDL, it is very useful but most don't bother on SO.

Using your table, I can do this:

INSERT INTO users (username) VALUES (NULL);INSERT INTO users (username) VALUES (NULL);INSERT INTO users (username) VALUES (NULL);INSERT INTO users (username) VALUES (NULL);INSERT INTO users (username) VALUES (NULL);

Which results in this:

SELECT uid, username, passhash, email, rdate FROM users;uid   username   passhash   email   rdate1     <NULL>     <NULL>     <NULL>  <NULL>2     <NULL>     <NULL>     <NULL>  <NULL>3     <NULL>     <NULL>     <NULL>  <NULL>4     <NULL>     <NULL>     <NULL>  <NULL>

I think is the point your professor was trying to make: without enforcing the natural key on username you don't really have any data integrity at all.

If I was the prof, I'd also urge you to remove nullable columns from your design.