Database Design: replace a boolean column with a timestamp column? Database Design: replace a boolean column with a timestamp column? oracle oracle

Database Design: replace a boolean column with a timestamp column?


Is it a good or bad idea?

Good idea.

You've eliminated space taken by a redundant column; the DATE column serves double duty--you know the work was finished, and when.

I've heard like you can't have an index on a column with NULL values, so "where data_finished is not null" will be very slow on big tables.

That's incorrect. Oracle indexes ignore NULL values.

You can create a function based index in order to get around the NULL values not being indexed, but most DBAs I've encountered really don't like them so be prepared for a fight.


There is a right way to index null values, and it doesn't use a FBI. Oracle will index null values, but it will NOT index null LEAF values in the tree. So, you could eliminate the column is_finished and create the index like this.

CREATE INDEX ON workflow (date_finished, 1);

Then, if you check the explain plan on this query:

SELECT count(*) FROM workflow WHERE date_finished is null;

You might see the index being used (if the optimizer is happy).

Back to the original question: looking at the variety of answers here, I think there is no right answer. I may have a personal preference to eliminate a column if it is unnecessary, but I also don't like overloading the meaning of columns either. There are two concepts here:

  1. The record has finished. is_finished
  2. The record finished on a particular date. date_finished

Maybe you need to keep these separate, maybe you don't. When I think about eliminating the is_finished column, it bothers me. Down the road, the situation may arise where the record finished, but you don't know precisely when. Perhaps you have to import data from another source and the date is unknown. Sure, that's not in the business requirements now, but things change. What do you do then? Well, you have to put some dummy value in the date_finished column, and now you've compromised the data a bit. Not horribly, but there is a rub there. The little voice in my head is shouting YOU'RE DOING IT WRONG when I do things like that.

My advice, keep it separate. You're talking about a tiny column and a very skinny index. Storage should not be an issue here.

Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.

-Eric S. Raymond


Is it a good or bad idea? I've heard like you can't have an index on a column with NULL values, so "where data_finished is not null" will be very slow on big tables.

Oracle does index nullable fields, but does not index NULL values

This means that you can create an index on a field marked NULL, but the records holding NULL in this field won't make it into the index.

This, on its turn, means that if you make date_finished NULL, the index will be less in size, as the NULL values won't be stored in the index.

So the queries involving equality of range searches on date_finished will in fact perform better.

The downside of this solution, of course, is that the queries involving the NULL values of date_finished will have to revert to full table scan.

You can work around this by creating two indexes:

CREATE INDEX ON mytable (date_finished)CREATE INDEX ON mytable (DECODE(date_finished, NULL, 1))

and use this query to find unfinished work:

SELECT  *FROM    mytableWHERE   DECODE(date_finished, NULL, 1) = 1

This will behave like partitioned index: the complete works will be indexed by the first index; the incomplete ones will be indexed by the second.

If you don't need to search for complete or incomplete works, you can always get rid of the appropriate indexes.