Why does Postgres handle NULLs inconsistently where unique constraints are involved?
When dealing with NULL, it is almost always a mistake to reason:
"NULLs behave like so-and-so here, so they should behave like such-and-such here."
Here is an excellent essay on the subject from a Postgres perspective. Briefly summed up: NULLs are treated differently depending on the context, so don't make the mistake of assuming anything about them.
The bottom line is, PostgreSQL does what it does with nulls because the SQL standard says so.
Nulls are obviously tricky and can be interpreted in multiple ways (unknown value, absent value, etc.), and so when the SQL standard was initially written, the authors had to make some calls at certain places. I'd say time has proved them more or less right, but that doesn't mean that there couldn't be another database language that handles unknown and absent values slightly (or wildly) differently. But PostgreSQL implements SQL, so that's that.
As was already mentioned in a different answer, Jeff Davis has written some good articles and presentations on dealing with nulls.
NULL is considered to be unique because NULL doesn't represent the absence of a value. A NULL in a column is an unknown value. When you compare two unknowns, you don't know whether they are equal, because you don't know what either of them is.
Imagine that you have two boxes marked A and B. If you don't open the boxes and you can't see inside, you never know what the contents are. If you're asked "Are the contents of these two boxes the same?" you can only answer "I don't know".
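The box analogy above maps directly onto SQL's three-valued logic: comparing two unknowns yields "unknown" (NULL), not true or false. A minimal sketch, using Python's sqlite3 module as a stand-in, since SQLite follows the same SQL-standard comparison semantics as PostgreSQL on this point:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL does not evaluate to TRUE; it evaluates to NULL ("unknown"),
# which the sqlite3 driver surfaces as Python's None.
print(conn.execute("SELECT NULL = NULL").fetchone()[0])   # None, not 1

# The IS predicate, by contrast, is defined to treat two NULLs as the same
# (PostgreSQL spells the general form IS NOT DISTINCT FROM).
print(conn.execute("SELECT NULL IS NULL").fetchone()[0])  # 1 (true)
```

This is why `WHERE x = NULL` matches nothing: the comparison never yields true, so you must write `WHERE x IS NULL` instead.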
In this case, PostgreSQL does the same thing. When asked to compare two NULLs, it says "I don't know." This has a lot to do with the tricky semantics around NULL in SQL databases. The article linked in the other answer is an excellent starting point for understanding how NULLs behave. Just beware: it varies by vendor.
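This "I don't know" answer is exactly why a unique constraint admits multiple NULLs: the constraint only rejects a row when it is *known* to duplicate an existing value. A small demonstration, again using sqlite3 as a stand-in (SQLite shares PostgreSQL's default behavior of treating NULLs as distinct in unique columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER UNIQUE)")

# Two NULLs are never known to be equal, so both inserts succeed.
conn.execute("INSERT INTO t (x) VALUES (NULL)")
conn.execute("INSERT INTO t (x) VALUES (NULL)")

# A duplicate non-NULL value, by contrast, violates the constraint.
conn.execute("INSERT INTO t (x) VALUES (1)")
try:
    conn.execute("INSERT INTO t (x) VALUES (1)")
except sqlite3.IntegrityError as e:
    print(e)  # UNIQUE constraint failed: t.x

print(conn.execute("SELECT COUNT(*) FROM t WHERE x IS NULL").fetchone()[0])  # 2
```

For what it's worth, PostgreSQL 15 added `UNIQUE NULLS NOT DISTINCT` for cases where you do want NULLs to collide, which underlines that the default distinct-NULLs behavior is a deliberate standard-driven choice, not an accident.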