Database design? Database design? database database

Database design?


There is an excellent study on that subject here, by Allen Browne.


In general, it is best to have a normalized database. If you have data duplication and the software one day doesn't quite work as it should, then you could end up with inconsistent data, which is no good to anyone.

For most purposes, calculations can be made efficient enough that you can do them on the raw data, as you suggest. If this does present a problem in scalability for your particular project, perhaps it would be best to have a table with the calculated values in. This would never be treated as original data, and could be recalculated at any point. Then your data is safe, and your calculations are fast.

The other alternative, depending on your DBMS, is to consider a view.


Generally speaking, you want more normalization to maintain data integrity, and this also tends to optimize for data updating (you don't need to update the same piece of information in more than one place). The only real exception to this is if your database is going to be primarily used for reporting, and only occasionally updating. Then, the cost of updating in several different places (because you are denormalized) is paid for by the fact that you don't have to fetch information from many different places when reporting.

If your database has to be fast for transactions (updating), and only occasionally report, then normalize as much as possible. It's also easier to keep all the data consistent this way. However, if you are only occasionally updating, and mostly reporting (i.e. reading, or just pulling data out of the database), then your boss might be right and it may be a better choice to denormalize. It's more complicated to handle error conditions and maintain data integrity, though (you have to do more thinking about what is a "logical transaction", and when to back out all of the updates made so far when something goes wrong).