Hierarchical Data in MySQL


You want the items under a given parent category, so assume you are given the parent ID:

    set @parentId = 1; /* Toys */

    select *
    from Items i
    inner join Categories c on c.id = i.categoryId
    where c.parentId = @parentId;

This returns the items whose category sits directly under the given parent, with one major design flaw: it doesn't handle multiple levels of hierarchical categories.

Let's say you had this Categories table:

    *Categories table*
    id | name     | parentId
    1  | Toys     | 0
    2  | Dolls    | 1
    3  | Bikes    | 1
    4  | Models   | 2
    5  | Act.Fig. | 2
    6  | Mountain | 3
    7  | BMX      | 3

And Items:

    *Items table*
    item    | categoryId
    Barbie  | 4
    GIJoe   | 5
    Schwinn | 6
    Huffy   | 7

Every item here sits two levels below Toys, so the only way to get all the relevant Items is to add a self join:

    select *
    from Items i
    inner join Categories c on c.id = i.categoryId
    inner join Categories c2 on c.parentId = c2.id
    where c2.parentId = @parentId;

This pattern doesn't scale: every extra level of hierarchy needs another join, and you can have MULTIPLE levels of hierarchy.
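To see the limitation concretely, here is a minimal Python sketch that loads the sample tables into an in-memory SQLite database and runs both queries for Toys. SQLite is only a stand-in chosen so the example runs anywhere (an assumption, not part of the original answer); the SQL itself is plain enough that it should behave the same in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    create table Categories (id integer primary key, name text, parentId integer);
    create table Items (item text, categoryId integer);
    insert into Categories values
        (1, 'Toys', 0), (2, 'Dolls', 1), (3, 'Bikes', 1), (4, 'Models', 2),
        (5, 'Act.Fig.', 2), (6, 'Mountain', 3), (7, 'BMX', 3);
    insert into Items values
        ('Barbie', 4), ('GIJoe', 5), ('Schwinn', 6), ('Huffy', 7);
""")

# One join: only items whose category is a direct child of the parent.
one_level = """
    select i.item
    from Items i
    inner join Categories c on c.id = i.categoryId
    where c.parentId = ?
    order by i.item
"""

# Two joins: only items whose category is a grandchild of the parent.
two_levels = """
    select i.item
    from Items i
    inner join Categories c on c.id = i.categoryId
    inner join Categories c2 on c.parentId = c2.id
    where c2.parentId = ?
    order by i.item
"""

parent_id = 1  # Toys
direct = [row[0] for row in conn.execute(one_level, (parent_id,))]
grand = [row[0] for row in conn.execute(two_levels, (parent_id,))]

print(direct)  # [] because no item is filed directly under Dolls or Bikes
print(grand)   # ['Barbie', 'GIJoe', 'Huffy', 'Schwinn']
```

Each query only sees one fixed depth, so a tree with items at several depths would need one query per level.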

One common way to deal with hierarchies is to build a "flattened" table: one with a row linking each node to itself and to each of its ancestors, so that every node is linked to ALL its descendants.

In addition to a Categories table, you build a second table:

    *CategoriesFlat table*  (the name column is here only for readability)
    id | name     | parentId
    1  | Toys     | 1
    -----------------
    2  | Dolls    | 1
    2  | Dolls    | 2
    -----------------
    4  | Models   | 1
    4  | Models   | 2
    4  | Models   | 4
    5  | Act.Fig. | 1
    5  | Act.Fig. | 2
    5  | Act.Fig. | 5
    -----------------
    3  | Bikes    | 1
    3  | Bikes    | 3
    -----------------
    6  | Mountain | 1
    6  | Mountain | 3
    6  | Mountain | 6
    7  | BMX      | 1
    7  | BMX      | 3
    7  | BMX      | 7

So you can write:

    select *
    from Items i
    inner join CategoriesFlat c on c.id = i.categoryId
    where c.parentId = @parentId;

And get ALL the relevant Categories and Items.
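The answer doesn't say how CategoriesFlat gets populated. One possible way, sketched here in Python against an in-memory SQLite database (a stand-in for MySQL, chosen only so the example is runnable), is to walk from every category up to the root and emit one row per ancestor. The helper name `items_under` is hypothetical, and the sketch rebuilds the whole table in one pass; a real system would more likely maintain it whenever a category is inserted or moved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    create table Categories (id integer primary key, name text, parentId integer);
    create table CategoriesFlat (id integer, parentId integer);
    create table Items (item text, categoryId integer);
    insert into Categories values
        (1, 'Toys', 0), (2, 'Dolls', 1), (3, 'Bikes', 1), (4, 'Models', 2),
        (5, 'Act.Fig.', 2), (6, 'Mountain', 3), (7, 'BMX', 3);
    insert into Items values
        ('Barbie', 4), ('GIJoe', 5), ('Schwinn', 6), ('Huffy', 7);
""")

# Map each category id to its parent id.
parents = dict(conn.execute("select id, parentId from Categories"))

# Walk up from every category, emitting one (id, ancestor) row per step.
# The first row of each walk links the category to itself.
flat_rows = []
for cat_id in parents:
    ancestor = cat_id
    while ancestor != 0:  # 0 marks "no parent" in the sample data
        flat_rows.append((cat_id, ancestor))
        ancestor = parents[ancestor]
conn.executemany("insert into CategoriesFlat (id, parentId) values (?, ?)", flat_rows)


def items_under(parent_id):
    """Return item names anywhere below parent_id (hypothetical helper)."""
    query = """
        select i.item
        from Items i
        inner join CategoriesFlat c on c.id = i.categoryId
        where c.parentId = ?
        order by i.item
    """
    return [row[0] for row in conn.execute(query, (parent_id,))]


print(len(flat_rows))  # 17, the same number of rows as the table above
print(items_under(1))  # ['Barbie', 'GIJoe', 'Huffy', 'Schwinn']
print(items_under(3))  # ['Huffy', 'Schwinn']
```

The single join works for any depth because the depth has already been paid for when the flattened rows were written.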

Here's a great slideshow about SQL anti-patterns and solutions to them. (Hierarchical data in SQL is an anti-pattern, but don't be disheartened - we all run into this one)


But if I had like 10 categories under Toys, then I'd have to do this join and query 10 times. Is there a better way to handle this?

Yes, there's a way of storing data called "nested sets". It's a bit harder to insert the data, but simple to select an entire, multi-level branch using a single select statement.

Also, Joe Celko has written a book about this subject (Trees and Hierarchies in SQL for Smarties), with a chapter about nested sets and other chapters about other methods.
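To make the nested-set idea concrete, here is a small Python sketch using an in-memory SQLite database as a stand-in for MySQL. Each category stores two numbers taken from a depth-first walk of the tree: one assigned on the way in and one on the way out. A category's descendants are then exactly the rows whose first number falls between its two numbers. The column names `lft` and `rgt`, the sample tree, and the helper `items_in_branch` are assumptions made for illustration, not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    create table categories (id integer primary key, name text, lft integer, rgt integer);
    create table items (item text, category_id integer);

    -- lft/rgt come from a depth-first walk:
    -- Toys(1 .. 14)
    --   Dolls(2 .. 7):  Models(3, 4), Act.Fig.(5, 6)
    --   Bikes(8 .. 13): Mountain(9, 10), BMX(11, 12)
    insert into categories values
        (1, 'Toys', 1, 14),
        (2, 'Dolls', 2, 7), (4, 'Models', 3, 4), (5, 'Act.Fig.', 5, 6),
        (3, 'Bikes', 8, 13), (6, 'Mountain', 9, 10), (7, 'BMX', 11, 12);
    insert into items values
        ('Barbie', 4), ('GIJoe', 5), ('Schwinn', 6), ('Huffy', 7);
""")


def items_in_branch(category_id):
    """Items in the category or any of its descendants (hypothetical helper)."""
    query = """
        select i.item
        from items i
        inner join categories c on c.id = i.category_id
        inner join categories p on c.lft between p.lft and p.rgt
        where p.id = ?
        order by i.item
    """
    return [row[0] for row in conn.execute(query, (category_id,))]


print(items_in_branch(1))  # ['Barbie', 'GIJoe', 'Huffy', 'Schwinn']
print(items_in_branch(3))  # ['Huffy', 'Schwinn']
```

The cost shows up on writes: adding a category means shifting the `lft` and `rgt` values of every node that comes after it in the walk, which is the "harder to insert" part.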


I'm assuming you know how to get the ID number and that's not the point of the question. Also, parent_id should be a foreign key referencing id, and I would use NULL rather than 0 for the topmost level.
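A minimal sketch of that schema suggestion, run through Python against in-memory SQLite (a stand-in for MySQL; the exact column types are assumptions). With the foreign key in place, a parent_id of 0 would be rejected unless a category with id 0 existed, which is why NULL is the natural marker for the top level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE categories (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER,  -- NULL for a top-level category
        FOREIGN KEY (parent_id) REFERENCES categories (id)
    )
""")

conn.execute("INSERT INTO categories VALUES (1, 'Toys', NULL)")
conn.execute("INSERT INTO categories VALUES (2, 'Dolls', 1)")

# A parent that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO categories VALUES (3, 'Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

roots = [row[0] for row in conn.execute(
    "SELECT name FROM categories WHERE parent_id IS NULL")]

print(orphan_rejected)  # True
print(roots)            # ['Toys']
```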

If your top-most categories have at most one level of sub-category, you can use this query to get all Toys:

    SELECT *
    FROM items
    WHERE items.category_id IN (SELECT id
                                FROM categories
                                WHERE categories.parent_id = 1
                                   OR categories.id = 1);

If your categories can have nested sub-categories, you need recursion. On MySQL versions before 8.0, which have no recursive common table expressions, that means a stored procedure that calls itself. Pseudocode:

    Procedure getItemsInCategory
    Input: @category_id integer
    Output: items rows
    {
        For each item in (SELECT *
                          FROM items
                          WHERE items.category_id = @category_id):
            return the row;

        For each id in (SELECT id
                        FROM categories
                        WHERE categories.parent_id = @category_id):
            return the rows in getItemsInCategory(id);
    }
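The pseudocode can be sketched in runnable form by doing the recursion in application code instead of a stored procedure. The Python below uses in-memory SQLite as a stand-in for MySQL, and the sample rows and the function name `get_items_in_category` are assumptions for illustration. For comparison, the same block also runs a recursive common table expression, a different technique that MySQL 8.0 and later support, which fetches the whole branch in one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    CREATE TABLE items (item TEXT, category_id INTEGER);
    INSERT INTO categories VALUES
        (1, 'Toys', NULL), (2, 'Dolls', 1), (3, 'Bikes', 1), (4, 'Models', 2),
        (5, 'Act.Fig.', 2), (6, 'Mountain', 3), (7, 'BMX', 3);
    INSERT INTO items VALUES
        ('Barbie', 4), ('GIJoe', 5), ('Schwinn', 6), ('Huffy', 7);
""")


def get_items_in_category(category_id):
    """Return items in this category, then those of each sub-category, depth first."""
    items = [row[0] for row in conn.execute(
        "SELECT item FROM items WHERE category_id = ? ORDER BY item",
        (category_id,))]
    child_ids = [row[0] for row in conn.execute(
        "SELECT id FROM categories WHERE parent_id = ? ORDER BY id",
        (category_id,))]
    for child_id in child_ids:
        items.extend(get_items_in_category(child_id))
    return items


# Alternative: one statement using a recursive CTE to collect the branch ids.
recursive_cte = """
    WITH RECURSIVE subtree (id) AS (
        SELECT ?
        UNION ALL
        SELECT c.id
        FROM categories c
        INNER JOIN subtree s ON c.parent_id = s.id
    )
    SELECT item
    FROM items
    WHERE category_id IN (SELECT id FROM subtree)
    ORDER BY item
"""

by_recursion = get_items_in_category(1)
by_cte = [row[0] for row in conn.execute(recursive_cte, (1,))]

print(by_recursion)  # ['Barbie', 'GIJoe', 'Schwinn', 'Huffy'] (depth-first order)
print(by_cte)        # ['Barbie', 'GIJoe', 'Huffy', 'Schwinn'] (sorted by name)
```

The recursive function issues two queries per category visited and would loop forever on a cycle in parent_id, so it suits small, well-formed trees; the CTE keeps the traversal inside the database.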