Diamond schema: how (de)normalized is that? Diamond schema: how (de)normalized is that? database database

Diamond schema: how (de)normalized is that?


The naive way would be to make your journalists and camera_operators dependent entities, dependent upon the studio for which they work. That means the production studio foreign key becomes part of their primary key. Your news_footage table then has a primary key consisting of 4 components:

  • production_studio_id
  • journalist_id
  • camera_operator_id
  • footage_id

and two foreign keys:

  • journalist_id,production_studio_id, pointing to the journalist table, and
  • camera_operator,production_studio_id, pointing to the camera operator table

Easy.

Or Not. Now you have defined in your E-R model the notion that the very existence of a camera operator or a journalist is dependent upon the studio for which they work. This does not reflect the real work very well: in this model, people can't change their employer.

Let's not do that.

In your original model, you confusing a person with a _role they play (journalist or camera operator), and you're missing a somewhat transient entity that is actually responsible for the production of your news footage: the [studio-specific] production team.

My E-R model would look something like this:

create table studio(  id int not null primary key ,  title varchar(200) not null ,)create table person(  id int not null primary key ,  title varchar(200) not null ,)create table team(  studio_id          int not null ,  journalist_id      int not null ,  camera_operator_id int not null ,  primary key ( studio_id , journalist_id , camera_operator ) ,  foreign key ( studio_id          ) references studio ( id ) ,  foreign key ( journalist_id      ) references person ( id ) ,  foreign key ( camera_operator_id ) references person ( id ) ,)create table footage(  studio_id          int not null ,  journalist_id      int not null ,  camera_operator_id int not null ,  id                 int not null ,  description        varchar(200) not null ,  primary key ( studio_id , journalist_id , camera_operator_id , id ) ,  foreign key     ( studio_id , journalist_id , camera_operator_id )  references team ( studio_id , journalist_id , camera_operator_id ) ,)

Now you have a world in which people can work in different roles: the same person might be a camera operator in some contexts and a journalist in others. People can change employers. Studio-specific teams are composed, consisting of a journalist and a camera operator. In some contexts, the same person might play both roles on a team. And, finally, a piece of news footage is produced by one and only one studio-specific team.

This reflects the real world much better, and it is much more flexible.

Edited to add sample query:

To find the journalists working for a particular studio:

select p.*from studio sjoin team   t on t.studio_id = s.idjoin person p on p.id        = t.journalist_idwhere s.title = 'my desired studio name'

This would give you the set of people who are (or have) been associated with a studio in the role of journalist. One should note though, that in the real world, people work for employers for a period of time: to model it properly you need a start/end date and you need to qualify the query with a relative notion of now.