Best practice for a mysql data versioning system
My choice would be a variation of approach 2. Bold indicates fields in the primary key.
- You insert every article in a table
articles_versioned
(id, timestamp, name, text) - Your second table is
articles
(id, timestamp, [name, text]). Note how timestamp is not primary; name and text may be replicated, or you may use a join witharticles_versioned
(which will be fast since id and timestamp are thearticles_versioned
primary key) articles_versioned
has a trigger on insert that takes the just inserted row and replicates it onarticles
- To restore a specific version of an article you modify the
articles
table.
The advantages of this approach are:
- You get for free another information (the date and time of the article) in your table, that you may need anyway
- You do not need to query the database to get the current date. If you use version, you have to.
- Your code doesn't have to insert the article in two tables. You simply insert in
articles_versioned
and read fromarticles
, the db takes care of migrating data as you insert it via the trigger, avoiding any consistency problems.
Con's
- In an heavily concurrent environment, two versions may be inserted at the very same time, so one of them may fail. This shouldn't be a problem when inserting user-written articles (it is highly unlikely given precision of timestamps these days). If you don't specify the timestamp in your
INSERT
statement, but instead you set the datetime field to have the current time as a default value, you may avoid this problem entirely.
To answer the rest of your question. Approach 1 will not lead to longer queries as long as you add an index on status. This makes sense only if you tend to have many different versions of each article; as long as you have 2 versions per article on average or less, the index will only slow you down, and approach 2 would not be sensibly faster anyway (altough I'd still recommend my approach because it simplyfies code, since restoring a version does not require switching status for two rows).
Related resources, like images, should follow a similar versioning. I assume you are saving them on the filesystem; instead of saving them with their real name, use a table (id, image_name) to give to each image an id, then save the image as -id-.jpg
. The image_name field will make you able to know what the original file name was (if you care about that). This way you can version images the same way as you version articles, and in articles you would use something like <img src="-id-.jpg">
, that you know will remain available forever.