
Scalable Database Tagging Schema


Here is how I'd do it:

    posts:          [postId], content, ownerId, date, noteId, noteType='post'
    tag_assoc:      [postId, tagName], ownerId, date, noteId, noteType='tagAssoc'
    tags:           [tagName], ownerId, date, noteId, noteType='tag'
    notes:          [noteId, noteType], ownerId, date, content

The fields in square brackets are the primary key of the respective table.

Define a constraint on noteType in each table: posts, tag_assoc, and tags. This prevents a given note from applying to both a post and a tag, for example.
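To make the constraint idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column types, the DEFAULT values, and the composite foreign key from (noteId, noteType) to the notes table are my assumptions about how the outline above would be implemented; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
conn.executescript("""
CREATE TABLE notes (
    noteId   INTEGER NOT NULL,
    noteType TEXT    NOT NULL,
    ownerId  INTEGER NOT NULL,
    date     TEXT    NOT NULL,
    content  TEXT,
    PRIMARY KEY (noteId, noteType)
);
CREATE TABLE posts (
    postId   INTEGER PRIMARY KEY,
    content  TEXT,
    ownerId  INTEGER NOT NULL,
    date     TEXT    NOT NULL,
    noteId   INTEGER,
    -- noteType is pinned to 'post', so only a 'post' note can be referenced
    noteType TEXT NOT NULL DEFAULT 'post' CHECK (noteType = 'post'),
    FOREIGN KEY (noteId, noteType) REFERENCES notes (noteId, noteType)
);
CREATE TABLE tags (
    tagName  TEXT PRIMARY KEY,
    ownerId  INTEGER NOT NULL,
    date     TEXT    NOT NULL,
    noteId   INTEGER,
    noteType TEXT NOT NULL DEFAULT 'tag' CHECK (noteType = 'tag'),
    FOREIGN KEY (noteId, noteType) REFERENCES notes (noteId, noteType)
);
CREATE TABLE tag_assoc (
    postId   INTEGER NOT NULL REFERENCES posts (postId),
    tagName  TEXT    NOT NULL REFERENCES tags (tagName),
    ownerId  INTEGER NOT NULL,
    date     TEXT    NOT NULL,
    noteId   INTEGER,
    noteType TEXT NOT NULL DEFAULT 'tagAssoc' CHECK (noteType = 'tagAssoc'),
    PRIMARY KEY (postId, tagName),
    FOREIGN KEY (noteId, noteType) REFERENCES notes (noteId, noteType)
);
""")

# A note of type 'post', attached to a post: accepted.
conn.execute(
    "INSERT INTO notes VALUES (1, 'post', 42, '2024-01-01', 'a note about a post')"
)
conn.execute(
    "INSERT INTO posts (postId, content, ownerId, date, noteId) "
    "VALUES (10, 'hello', 42, '2024-01-01', 1)"
)

# Attaching the same note to a tag fails: the tag row carries noteType='tag',
# and there is no notes row with key (1, 'tag').
try:
    conn.execute(
        "INSERT INTO tags (tagName, ownerId, date, noteId) "
        "VALUES ('database', 42, '2024-01-01', 1)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("tag reusing the post's note was rejected:", rejected)
```

The CHECK alone only fixes the value of noteType in each table; it is the combination with the two-column foreign key that stops one note from being shared between a post and a tag.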

Store tag names as a short string, not an integer id. That way you can use the covering index [postId, tagName] in the tag_assoc table.
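As a rough illustration of why the string key helps, the sketch below (Python with sqlite3, trimmed to the two key columns, with made-up rows) lists the tags of a post from tag_assoc alone. No join to tags is needed, and the primary key index already contains every column the query touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE tag_assoc (
    postId  INTEGER NOT NULL,
    tagName TEXT    NOT NULL,
    PRIMARY KEY (postId, tagName)
)
""")
conn.executemany(
    "INSERT INTO tag_assoc VALUES (?, ?)",
    [(1, "database"), (1, "schema"), (2, "database")],
)

def tags_for_post(post_id):
    """Return the tag names of one post, read from tag_assoc only."""
    rows = conn.execute(
        "SELECT tagName FROM tag_assoc WHERE postId = ? ORDER BY tagName",
        (post_id,),
    )
    return [name for (name,) in rows]

print(tags_for_post(1))  # ['database', 'schema']

# Ask the engine how it plans to run the query. The exact wording of the
# plan differs between SQLite versions, so it is printed rather than assumed.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT tagName FROM tag_assoc WHERE postId = ?", (1,)
):
    print(row)
```

With an integer tag id in tag_assoc, the same lookup would need a second read against tags to turn each id back into a name.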

Tag completion is done with an AJAX call. If the user types "datab" for a tag, your web page makes an AJAX call, and on the server side the app queries: SELECT tagName FROM tags WHERE tagName LIKE ?||'%'.
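The server side of that call could look roughly like the following. This is only a sketch in Python with sqlite3: the function name complete_tag, the limit parameter, and the sample tags are invented for illustration, and the wildcard escaping is an addition of mine so that a user typing % or _ gets a literal match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (tagName TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO tags VALUES (?)",
    [("database",), ("databinding",), ("data_model",), ("django",)],
)

def complete_tag(prefix, limit=10):
    """Return up to `limit` tag names that start with `prefix`."""
    # Escape LIKE wildcards in the user's input so they match literally.
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT tagName FROM tags "
        "WHERE tagName LIKE ? || '%' ESCAPE '\\' "
        "ORDER BY tagName LIMIT ?",
        (escaped, limit),
    )
    return [name for (name,) in rows]

print(complete_tag("datab"))  # ['database', 'databinding']
print(complete_tag("data_"))  # ['data_model']
```

The prefix is always passed as a bound parameter, never concatenated into the SQL text, which is what the ? placeholder in the query above implies.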


"A tag is almost like a post itself, because people can post notes about the tag." - this phrase makes me think you really just want one table for POST, with a primary key and a foreign key that references the POST table. Now you can have as many tags for each post as your disk space will allow.

I'm assuming there's no need for a many-to-many relationship between POST and tags, because a tag isn't shared across posts, based on this:

"Users can create tags which have notes, date created, owner, etc."

If creation date and owner are shared, those would be two additional foreign key relationships, IMO.


A linked list is almost certainly the wrong approach. It means that your queries will be either complex or sub-optimal, which is ironic, since the most likely reason for using a linked list is to keep the data in the correct sorted order. I don't see an easy way to avoid iteratively fetching a row and then using the flink (forward link) value retrieved to condition the select operation for the next row.
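To show what that iteration looks like in practice, here is a small sketch in Python with sqlite3. The table tag_list, its columns, and the rows are hypothetical; only the flink idea comes from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag_list (id INTEGER PRIMARY KEY, tagName TEXT NOT NULL, flink INTEGER)"
)
# Each row points at the next one through flink; NULL marks the end of the list.
conn.executemany(
    "INSERT INTO tag_list VALUES (?, ?, ?)",
    [(1, "database", 3), (3, "schema", 2), (2, "tagging", None)],
)

def walk(head_id):
    """Follow flink pointers from head_id; return (names, number of queries)."""
    names, queries = [], 0
    current = head_id
    while current is not None:
        row = conn.execute(
            "SELECT tagName, flink FROM tag_list WHERE id = ?", (current,)
        ).fetchone()
        queries += 1
        if row is None:  # dangling pointer: stop rather than loop forever
            break
        names.append(row[0])
        current = row[1]
    return names, queries

print(walk(1))  # (['database', 'schema', 'tagging'], 3)
```

Reading n items costs n round trips here, whereas a plain table with an ordering column would return the same rows from a single SELECT with ORDER BY.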

So, use a table-based approach with normal foreign key to primary key references. The one outlined by Bill Karwin looks similar to what I'd outline.