SQL Database storing different types of values (in or simulated as a single field) SQL Database storing different types of values (in or simulated as a single field) database database

SQL Database storing different types of values (in or simulated as a single field)


Quick point you can simplify your design in this way -- Just have each of the Values tables point back at the variables table. There is no need for the linking table. The only reason for a linking table I can think of is if you want an "easier" way to have sequence accross all variable types. If this is not needed then use this design below:

Variable    ID    NameValuesInt    IDvariable    IntValValuesFloat    IDvariable    FloatValValuesDouble    IDvariable    DoubleValetc...etc...etc...

How your sql is easy:

select v.name as Variable,       coalesce(cast(vi.IntVal as varchar(max)),                cast(vf.FoatVal as varchar(max)),                cast(vd.DoubleVal as varchar(max)),                '') as ValueFrom Variable VJOIN ValuesInt vi on V.ID = vi.IDvariableJOIN ValuesFloat vf on V.ID = vf.IDvariableJOIN ValuesDouble vd on V.ID = vd.IDvariable


A couple of points:

  • In your example, the variable ezas123 has three values with different data types, meaning that the variable itself doesn't actually have a defined data type. This will probably cause problems downstream and is likely to indicate that the data is pretty poorly defined. I'd look at including a restriction that all of the values for a given variable must have the same data type.

  • Hogan's SQL query makes the point that whenever you list the values in the way you requested (i.e. across variables with different data types) you'll be having to cast the result to varchar or similar to display it (since you can't have values with different data types in the same output column). With that in mind, do you really need different data types, or would a varchar type work well for all the data you're dealing with?

If the different types are needed, I'd look at putting all of the different IntVal, FloatVal, DoubleVal, ... columns into one table. Your table definitions could then look something like:

Variables      ID          NOT NULL     ,Name        NOT NULL     ,DataType    NOT NULL CHECK (DataType IN ('INT','FLOAT','DOUBLE','TEXT'))     ,CONSTRAINT PK_Variables PRIMARY KEY (ID)   ,CONSTRAINT UQ_Variables_1 UNIQUE (Name)   ,CONSTRAINT UQ_Variables_2 UNIQUE  (ID,DataType)    Values      IDvals      NOT NULL     ,ID          NOT NULL     ,DataType    NOT NULL CHECK (DataType IN ('INT','FLOAT','DOUBLE','TEXT'))     ,IntVal      NULL     ,FloatVal    NULL     ,DoubleVal   NULL     ,TextVal     NULL   ,CONSTRAINT PK_Values PRIMARY KEY (IDvals)   ,CONSTRAINT FK_Values_Variable FOREIGN KEY (ID,DataType) REFERENCES Variables(ID,DataType)   ,CONSTRAINT CH_Values CHECK ( NOT(DataType <> 'INT'    AND IntVal     IS NOT NULL)  AND                                 NOT(DataType <> 'FLOAT'  AND FloatVal   IS NOT NULL)  AND                                 NOT(DataType <> 'DOUBLE' AND DoubleVal  IS NOT NULL)  AND                                 NOT(DataType <> 'TEXT'   AND TextVal    IS NOT NULL)                                )
  • The UNIQUE constraint on Variables(ID,DataType) will probably be required (DBMS?) to allow you to make it the subject of a FK;
  • The CHECK constraints ensure that only valid data types are being used and that the only the correct value columns can be populated;
  • Having DataType in Values as well as Variables means that a combination of FK and CHECK can be used to ensure that all values for a given variable have the same data type, rather than having to use triggers or application logic.

A query against the tables then looks something like:

SELECT v.name as Variable,       COALESCE(cast(a.IntVal       as varchar(max)),                cast(a.FloatVal     as varchar(max)),                cast(a.DoubleVal    as varchar(max)),                cast(a.TextVal      as varchar(max)),                '') as ValueFROM Variables VJOIN Values a on V.ID = a.ID AND v.DataType = a.DataType

This could also be written (probably more correctly) with a CASE based on Variable.DataType being used to choose the relevant column.

Having all of the values in one table means less tables/constraints/indexes in the database and means that extending the solution to hold new data types just means adding new columns to the Values table (and modifying the constraints) rather than adding new tables.