Best solution for reporting database


I am thinking this is a classic textbook case of separating the frontend and backend databases.

For the projects and people I have worked with, there was a strong agreement that the two should be separated.

In one case, there were three tiers of databases:

  1. Frontend transaction databases
  2. Middle summary repository for reference by frontend transactions
  3. Backend information repository

The frontend transaction speed was so critical that even that layer was dissected into multiple databases, one database per manufacturing area. The transactions were performed by equipment requiring very fast response.

Data from the frontend databases was used, together with customer- and management-oriented databases, to construct records for the backend reporting repository at an hourly frequency, because management needed short information latency for their operational and engineering decisions. If we could have performed the information compilation at 15-minute intervals, we would have done it. Depending on the project, that backend repository could be either Oracle or Sybase IQ.

However, the frontend transactions performed by equipment needed to refer to some meta information. The response time required by the equipment could not be put at risk of being held up by someone running a huge ad hoc query on the backend repository, which happened frequently.

So, a middle-layer bridging database was created, consisting of nightly abstracts of information from the backend repository.
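Conceptually, the nightly refresh of such a bridging layer is just a scheduled extract. The following is only a sketch in SQL Server syntax with hypothetical database and table names (the real backend was Oracle or Sybase IQ, so the actual job was a cross-platform extract rather than a single statement):

    -- Hypothetical nightly job: rebuild the bridging abstract that the
    -- frontend transaction databases consult, from the backend repository.
    TRUNCATE TABLE Bridge.dbo.EquipmentMeta;

    INSERT INTO Bridge.dbo.EquipmentMeta (EquipmentId, RecipeCode, ToleranceSpec)
    SELECT EquipmentId, RecipeCode, ToleranceSpec
    FROM Backend.dbo.EquipmentMaster;   -- heavy ad hoc queries stay on Backend

The point of the layer is exactly that separation: the equipment reads the small, stable abstract, while the big queries stay on the backend repository.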

Schema design with commonality-keys

Schema design is very important for optimising the response and performance of all the databases. You have to ensure your database records are commonality-key-indexed and discrete-time-indexed.
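To make "commonality-key-indexed and discrete-time-indexed" concrete, here is a minimal sketch of what an area's event table could look like, in SQL Server syntax with hypothetical column names:

    -- Hypothetical per-area event table. Every state-of-operation transaction
    -- carries the commonality-key it was handed at beginOp time.
    CREATE TABLE dbo.OpEvent
    (
        EventId        BIGINT IDENTITY(1,1) PRIMARY KEY,
        CommonalityKey BIGINT       NOT NULL,  -- ties the event to its operation
        EquipmentId    INT          NOT NULL,
        EventCode      VARCHAR(32)  NOT NULL,
        EventTime      DATETIME2(3) NOT NULL DEFAULT SYSDATETIME()
    );

    -- commonality-key-indexed: pull every event of one operation quickly
    CREATE INDEX IX_OpEvent_Commonality ON dbo.OpEvent (CommonalityKey, EventTime);

    -- discrete-time-indexed: pull every event in a compilation window quickly
    CREATE INDEX IX_OpEvent_Time ON dbo.OpEvent (EventTime) INCLUDE (CommonalityKey);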

For a manufacturing plant filled with robots and equipment, divided into manufacturing areas, each area has a frontend transaction database, and each area database needs to have a commonality-key dispatcher. When a piece of equipment needs to perform a batch of operations, the beginOp event requests a commonality-key from the dispatcher. An operation cycle may take seconds, days, or weeks. Every time a piece of equipment needs to perform a transaction on its state of operation, it includes that commonality-key. An operation could have sub-operations and sub-sub-operations, etc., but each such operation is required to obtain its own commonality-key from its area dispatcher.

The commonality-key dispatcher is simply the beginOp table in the database, with an auto-increment key. Any equipment sharing the same begun operation is able to infer or obtain that commonality-key from the table, thanks to a meticulous process-sequencing strategy.
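As a rough sketch of what such a dispatcher table might look like (SQL Server flavour, hypothetical names), the identity value handed back by the beginOp insert is the commonality-key:

    -- Hypothetical dispatcher table for one manufacturing area.
    CREATE TABLE dbo.BeginOp
    (
        CommonalityKey BIGINT IDENTITY(1,1) PRIMARY KEY,
        EquipmentId    INT          NOT NULL,   -- equipment that began the operation
        OperationCode  VARCHAR(32)  NOT NULL,   -- what kind of operation was begun
        StartedAt      DATETIME2(3) NOT NULL DEFAULT SYSDATETIME()
    );

    -- A piece of equipment registering a beginOp event and collecting its key:
    DECLARE @EquipmentId INT = 42;                      -- illustrative values
    DECLARE @OperationCode VARCHAR(32) = 'ETCH-BATCH';

    INSERT INTO dbo.BeginOp (EquipmentId, OperationCode)
    VALUES (@EquipmentId, @OperationCode);

    SELECT SCOPE_IDENTITY() AS CommonalityKey;          -- the key the equipment holds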

For areas where we could ensure that no two operations on the whole floor could start within the same 100 milliseconds, there was no need for a dispatcher, because we could simply use the date-time of the beginOp event; the datetime function of the database server is the natural/spontaneous key dispatcher.
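A minimal sketch of that "spontaneous dispatcher" idea, assuming SQL Server and hypothetical names: the beginOp timestamp, kept at 100 ms resolution, is itself the key.

    -- Hypothetical: the server clock acts as the key dispatcher.
    -- DATETIME2(1) keeps one decimal of a second, i.e. 100 ms resolution,
    -- which is unique as long as no two operations begin in the same 100 ms.
    CREATE TABLE dbo.BeginOp_AreaX
    (
        StartedAt   DATETIME2(1) NOT NULL DEFAULT SYSDATETIME() PRIMARY KEY,
        EquipmentId INT NOT NULL
    );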

The reason for this discussion of the commonality-key is that the required transaction response is so quick that you do not want pieces of equipment having to communicate with each other unnecessarily just to tell each other they are recording events of the same operation. The robots and equipment simply perform their transactions with the commonality-key they are holding.
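In other words, once a piece of equipment holds a key, each transaction is a single local insert, with no cross-equipment chatter. Using the hypothetical OpEvent table sketched above:

    -- The robot/equipment simply stamps its events with the key it is holding.
    DECLARE @MyKey BIGINT = 180042;   -- commonality-key obtained at beginOp time

    INSERT INTO dbo.OpEvent (CommonalityKey, EquipmentId, EventCode)
    VALUES (@MyKey, 17, 'SUBOP-START'),
           (@MyKey, 17, 'SUBOP-END');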

The hourly compilation of information for insertion into the backend repository conveniently uses the composite-key of commonality+area, to construct the hierarchy of events.
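A sketch of what that hourly compilation could look like, again with hypothetical names; the area identifier plus the commonality-key forms the composite key that groups events back into their operation hierarchy downstream:

    -- Hypothetical hourly roll-up feeding the backend reporting repository.
    DECLARE @AreaId      INT          = 3;   -- area of this frontend database
    DECLARE @WindowStart DATETIME2(3) = DATEADD(HOUR, -1, SYSDATETIME());
    DECLARE @WindowEnd   DATETIME2(3) = SYSDATETIME();

    SELECT
        @AreaId          AS AreaId,          -- area + commonality-key = composite key
        e.CommonalityKey,
        MIN(e.EventTime) AS FirstEvent,
        MAX(e.EventTime) AS LastEvent,
        COUNT(*)         AS EventCount
    FROM dbo.OpEvent AS e
    WHERE e.EventTime >= @WindowStart
      AND e.EventTime <  @WindowEnd
    GROUP BY e.CommonalityKey;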

Frontend piping database

OK, this is really extreme. In some areas the transactions were so frequent that we needed a FIFO piping arrangement, so we introduced a fourth tier of databases. For optimal transaction response we had to keep a database's size below 1 GB, and a transaction-piping process existed to empty old transactions into the fourth-tier databases. I found that it was easier (and gave better response) to create a pool of new databases, so that every time the active database reached 1 GB it was moved out and immediately replaced with a new database from the pool, leaving the machines performing the hourly compilation to join up the retired databases. That left us depending on an existing metadata database to house the commonality-key dispatcher table along with some other metadata tables.
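A very rough sketch of the kind of swap logic that rotation implies (illustrative only; the database names are hypothetical, and the real mechanics of file placement and pre-creating the pool were handled by an external, automated process):

    -- Hypothetical rotation check, run periodically by an agent job.
    DECLARE @SizeMB INT;

    -- size is stored in 8 KB pages, so pages * 8 / 1024 = MB
    SELECT @SizeMB = SUM(size) * 8 / 1024
    FROM Piping_Active.sys.database_files;

    IF @SizeMB >= 1024   -- roughly the 1 GB threshold
    BEGIN
        -- retire the full database for the hourly compilation to pick up,
        -- and promote a pre-created spare from the pool in its place
        ALTER DATABASE Piping_Active  MODIFY NAME = Piping_Retired_001;
        ALTER DATABASE Piping_Spare01 MODIFY NAME = Piping_Active;
    END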

In retrospect, one might think the commonality-key dispatcher table and metadata tables could have been housed in the middle-tier bridging database, but because the database management processes were automated and cookie-cutter, it was cleaner to create a new process than to modify the process managing the mid-tier bridging database. Those management routines were used across the world, so you cannot change them willy-nilly without causing havoc to the financial performance of the company, or without offending the respective data-layer architects maintaining them.

It took a lot of organisational skill for the managers to pull all of this together. So transactional data design is not simply a technical skill; it is also a process-planning skill, involving a whole lot of people head-butting each other until you get it right.


What you ask for is totally standard - OLAP and OLTP do not mix in heavy load scenarios.

You use SQL Server, so look into SSAS (SQL Server Analysis Services) for building cubes (a different approach than plain SQL) that you can then report against.

If you do not want that, then mirroring is the next best solution - you can put a mirror online in read-only mode for reporting, and it also gives you a backup to activate if the main server fails ;) Always good.
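One wrinkle worth noting: with classic SQL Server database mirroring the mirror itself is not directly readable, so reporting against it typically goes through a database snapshot created on the mirror server. A sketch with hypothetical database and file names:

    -- Run on the mirror server: expose a read-only, point-in-time view
    -- of the mirrored database for the reporting workload.
    CREATE DATABASE Sales_Reporting_Snap
    ON ( NAME = Sales_Data,                               -- logical file name of the source DB
         FILENAME = 'D:\Snapshots\Sales_Reporting_Snap.ss' )
    AS SNAPSHOT OF Sales;

Reports then query Sales_Reporting_Snap, and a scheduled job can drop and recreate the snapshot to refresh the reporting view.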

Clustering is a non-issue - it will allow you to move the database to another node, but it does not solve the performance issue at all. Log shipping and replication are good too, though I would go with mirroring: a read-only copy for reporting, and loading the data into SSAS.


We have a read/write cluster which replicates (using transactional replication) to "read only" servers (not physically read-only; the web app just performs reads on them). We do the same for reporting, and this scales pretty well.

We have multiple sites, 32+ servers and a couple of reporting servers in this configuration, with a very high volume of inserts, updates and reads.

We primarily use Reporting Services for internal reporting. Reporting doesn't affect our core business, which I guess is your main concern.