Database triggers / referential integrity and in-memory caching Database triggers / referential integrity and in-memory caching database database

Database triggers / referential integrity and in-memory caching


Simple Answer:

  • Referential integrity is a must have
  • Caching is a qualified must have
  • Triggers are a nice to have

Longer Answer

I've been developing applications on relational databases since 1993 (Dec RDB since you ask, and on flat file systems before then) and triggers have never been popular with many developers because they can 'delete stuff that you don't want deleting'. Referential integrity is also often frowned on by developers because a database in third normal form with proper referential integrity is difficult to bodge together in a few minutes.

Caching is also often considered quite 'hard' to do right, although I'm not sure why.

Whilst many systems can live without triggers, I'd say that no application database can comfortably survive without referential integrity. Look at the tags on this question, the database behind this site will have a table for the tags (probably called 'Tag') and questions (probably called 'Question'). 'Question' will have a foreign key to the primary key on the Tag table, but as questions can have many tags and tags can have many questions I'd guess that the relationship is like this:

   Question   (TagId)         1 | Database triggers / referential integrity and in-memory caching      |      -----    | | |  QuestionTag (QuestionId)       1 | 1  ... 1 | 2  ... 1 | 3 ...    (TagId)    | | |    -----      |     Tag            1 | database ... 2 | referential-integrity ... 3 | triggers ...   (TagId)

This kind of referential integrity is the bedrock of any reliable application and isn't negotiable. You can see how it adds credibility to the application design and confidence in its longevity.

The caching on SO may be turned on for such a thing as tags (although it isn't guaranteed) so assume that the tags are cached in memory and that you have enough reputation to be allowed to add a tag to SO. You add your tag and it may well be persisted to the database instantly - but is the cache then updated?

What you have is a trade-off. Can the site survive without knowing about your new tag? And if so for how long? Fundamentally what is the lifecycle of a tag, as it progresses from being added by a user to being in the database, available to other users, used by other users? The cache will be rebuilt according to the rules set down by the development team - and that rule will be a trade-off essentially so that any new tag is available quickly enough without slowing the application down.

Triggers can enforce referential integrity, say the tag you add is 'rubbish', but by the time the admins see it three questions are tagged 'rubbish'. The admins then decide to delete the 'rubbish' tag - but what about the questions that are tagged with it? If there is a trigger on the 'tag' table that is fired on the delete, it could then run round the 'question' table and remove all references to 'rubbish'. There are lots of alternatives to this approach - many of which are programmatic workrounds - but is there a cleaner alternative?

I've worked on lots of sites in the last 20 years, the good ones use referential integrity and increasingly caching. Triggers that change data anonymously (all they fundamentally are are event driven stored procedures) are not popular and increasingly misunderstood but still have a role.

Caching and referential integrity cannot be considered an either-or - development teams must design applications so that both can be incorporated.