Schema to support dynamic properties Schema to support dynamic properties database database

Schema to support dynamic properties


You are implementing something called Entity-Attribute-Value model http://en.wikipedia.org/wiki/Entity-attribute-value_model.

Lots of folks will say it's a bad idea (usually I am one of those) because the answer to your last question, "What would the SQL for fetching..." tends to be "thick hairy and nasty, and gettting worse."

These criticisms tend to hold once you allow users to start nesting objects inside of other objects, if you do not allow that, the situation will remain manageable.

For your first question, "what would the "type" column of the prop_defs table hold", everything will be simpler if you have a table of types and descriptions that holds {"numeric","Any Number"}, {"textual","String"}, etc. The first value is the primary key. Then in prop_defs your column "type" is a foreign key to that table and holds values "numeric", "textual", etc. Some will tell you incorrectly to always use integer keys because they JOIN faster, but if you use the values "numeric", "textual" etc. you don't have to JOIN and the fastest JOIN is the one you don't do.

The query to grab a single value will have a CASE statement:

SELECT case when pd.type = "numeric" then pv.numeric            when pd.type = "textual" then pv.textual            when pd.type = "boolean" then pv.boolean  from prov_vals pv   JOIN prop_defs pd ON pv.prop_def_id = pv.id WHERE pv.object_id = 2   AND pd.name = "Name"


You must accept that relational databases are not good at providing this kind of functionality. They CAN provide it, but are not good at it. (I hope I'm wrong). Relational databases lend themselves better to defined interfaces, not changing interfaces.

--EAV tables give dynamic fields but suck on performance. Sucks on indexing. And it is complex to query. It gets the job done in many situations, but can fall apart on big tables with lots of users hitting the system.

--"Regular" tables with several place holder columns are OK for performance, but you get non-descriptive column names and are limited in the number of columns you can "add". Also it does not support sub-type separation.

--Typically you create/modify tables at development time, not run time. Should we really discriminate against modifying the database at run time? maybe, maybe not. Creating new tables, foreign keys, and columns at run-time can achieve true dynamic objects, while giving the performance benefits of "regular" tables. But you would have to query the schema of the database, then dynamically generate all of your queries. That would suck. It would totally break the concept of tables as an interface.