How to best handle the storage of historical data? How to best handle the storage of historical data? database database

How to best handle the storage of historical data?


If the requirement is solely for reporting, consider building a separate data warehouse. This lets you use data structures like slowly changing dimensions that are much better for historical reporting but don't work well in a transactional system. The resulting combination also moves the historical reporting off your production database which will be a performance and maintenance win.

If you need this history to be available within the application then you should implement some sort of versioning or logical deletion feature or make everything fully contra and restate (i.e. transactions never get deleted, just reversed out and restated). Think very carefully about whether you really need this as it will add a lot of complexity. Making a transactional application that can reconstruct historical state correctly is considerably harder than it looks. Financial software (e.g. insurance underwriting sytems) fails to do this a lot more than you might think.

If you need the history solely for audit logging, make shadow tables and audit logging triggers. This is much simpler and more robust than trying to correctly and comprehensively implement audit logging within the application. The triggers will also pick up changes to the database from sources outside the application.


This question goes along the line of Business Logic. Know your business requirements first then start from there. A Data Warehouse is a nice solution for this kind of situation. ETL will give you lots of options in dealing with data flows. Your basic concept of 'History' vs 'Active' is quite correct. Your history data will be more efficient and flexible if kept in a data warehouse with all their dimension and fact tables.