Steps to design a well organized and normalized Relational Database
I would recommend you this videos (about 9) about E/R modeling
http://www.youtube.com/watch?v=q1GaaGHHAqM
EDIT:
"how extensive must the diagrams for this model be ? must they include all the entities and attributes?? "
Yes, actually you have ER modeling and extend ER modeling,
The idea is to make the Extended ER modeling, because there you not only specify the entities, you also specify the PK and FK and the cardinality. Take a look to this link (see the graphics and the difference between both models).
there are two ways of modeling, one is the real scenario and the other one is the real structure of the DB, I.E:
When you create a E-ER Modeling you create even the relationship and cardinality for ALL entities, but when you are going to create the DB is not necessary to create relations with cardinality 1:N(The table with cardinality N create a FK from table with card. 1, and you don't need to create the relation Table into the DB) or when you have a 1:1 cardinality you know that one of your entities can absorb the other entity.
look this Graphic , only the N:M relations entities were create (when you see 2 or more FK, that's a relation table)
But remember those are just "rules" and you can break it if your design need to, for performance, security, etc.
about tools, there are a lot of them, But I recommended workbench, because you can use it to connect to your DBs (if you are in mysql) and create designs E/R modeling, with attributes, and he will auto-create the relations tables N:M.
EDIT 2:
here I put some links that can explain that a little bit better, it will take a lot of lines and will be harder to explain here and by myself, please review this links and let me know if you have questions:
type and subtype:
business rules (integrity constrain)
http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx (please take a look specially to this one, I think it will help you with all this info)
I have reread the book and some articles online and have created a short list of steps in order to design a decent database (of course you need to understand the basics of database design first) Steps are described in greater detail below:
(A lot of steps are described in the book: Database Systems - Design, Implementation and Management (9th Edition)
and thats what the page numbers are refering too but i will try to describe as much as I can here and will edit this answer in the following days to make it more complete)
- Create a detailed narrative of the organization’s description of operations.
- Identify the business rules based from the description of operations.
- Identify the main entities and relationships from the business rules.
- Translate entities/relationships to EER model
- Check naming conventions
- Map ERR model to logical model (pg 400)*
- Normalize logical model (pg 179)
- Improve DB design (pg 187)
- Validate Logical Model Integrity Constraints (pg 402) (like length etc.)
- Validate the Logical Model against User Requirements
- Translate tables to mySQL code (in workbench translate EER to SQL file using export function then to mySQL)
*you can possibly skip this step if you using workbench and work of the ER model that you design there.
1. Describe the workings company in great detail. If you are creating personal project describe it in detail if you are working with a company ask for documents describing their company as well as interviewing the employees for information (interviews might generate inconsistent information make sure to check with supervisers which information is more important for design)
2. Look at the gathered information and start generating rules from them make sure to fill in any information gaps in your knowledge. Confirm with supervisers in the company before moving on.
3. Identify the main entities and relationships from the business rules. Keep in mind that during the design process, the database designer does not depend simply on interviews to help define entities, attributes, and relationships. A surprising amount of information can be gathered by examining the business forms and reports that an organization uses in its daily operations. (pg 123)
4. If the database is complex you can break down the ERD design into followig substeps
i) Create External Models (pg 46)
ii) Combine External Models to form Conceptual Model (pg 48)
Follow the following recursive steps for the design (or for each substep) I. Develop the initial ERD. II. Identify the attributes and primary keys that adequately describe the entities. III. Revise and review the ERD. IV. Repeat steps until satisfactory outputYou may also use entity clustering to further simplify your design process.
Describing database through ERD:Use solid lines to connect Weak Entities (Weak entities are those which cannot exist without parent entity and contain parents PK in their PK).Use dashed lines to connect Strong Entities (Strong entities are those which can exist independently of any other entity)
5. Check if your names follow your naming conventions. I used to have suggestions for naming conventions here but people didn't really like them. I suggest following your own standards or looking up some naming conventions online. Please post a comment if you found some naming conventions that are very useful.
6. Logical design generally involves translating the ER model into a set of relations (tables), columns, and constraints definitions.
Translate the ER to logical model using these steps:
- Map strong entities (entities that dont need other entities to exist)
- Map supertype/subtype relationships
- Map weak entities
- Map binary relationships
- Map higher degree relationships
7. Normalize the Logical Model. You may also denormalize the logical model in order to gain some desired characteristics. (like improved performance)
8.
Refine Attribute Atomicity -It is generally good practice to pay attention to the atomicity requirement. An atomic attribute is one that cannotbe further subdivided. Such an attribute is said to display atomicity. By improving the degree of atomicity, you also gain querying flexibility.
Refine Primary Keys as Required for Data Granularity - Granularity refers to the level of detail represented by the values stored in a table’s row. Data stored at their lowestlevel of granularity are said to be atomic data, as explained earlier. For example imagine ASSIGN_HOURS attribute to represent the hours worked by a given employee on a given project. However, arethose values recorded at their lowest level of granularity? In other words, does ASSIGN_HOURS represent the hourlytotal, daily total, weekly total, monthly total, or yearly total? Clearly, ASSIGN_HOURS requires more careful definition. In this case, the relevant question would be as follows: For what time frame—hour, day, week, month, andso on—do you want to record the ASSIGN_HOURS data?For example, assume that the combination of EMP_NUM and PROJ_NUM is an acceptable (composite) primary keyin the ASSIGNMENT table. That primary key is useful in representing only the total number of hours an employeeworked on a project since its start. Using a surrogate primary key such as ASSIGN_NUM provides lower granularityand yields greater flexibility. For example, assume that the EMP_NUM and PROJ_NUM combination is used as theprimary key, and then an employee makes two “hours worked” entries in the ASSIGNMENT table. That action violatesthe entity integrity requirement. Even if you add the ASSIGN_DATE as part of a composite PK, an entity integrityviolation is still generated if any employee makes two or more entries for the same project on the same day. (Theemployee might have worked on the project a few hours in the morning and then worked on it again later in the day.)The same data entry yields no problems when ASSIGN_NUM is used as the primary key.
Try to answer the questions: "Who will be allowed to use the tables and what portion(s) of the table(s) will be available to which users?" ETC.
Please feel free to leave suggestions or links to better descriptions in the comments below i will add it to my answer
One aspect of your question touched on representing subclass-superclass relationships in SQL tables. Martin Fowler discusses three ways to design this, of which my favorite is Class Table Inheritance. The tricky part is arranging for the Id field to propagate from superclasses to subclasses. Once you get that done, the joins you will typically want to do are slick, easy, and fast.