When people talk about scaling a website with 'shards', what do they mean? When people talk about scaling a website with 'shards', what do they mean? database database

When people talk about scaling a website with 'shards', what do they mean?


Karl Seguin has a good blog post about sharding.

From the post:

Sharding is the separation of your data across multiple servers. How you separate your data is up to you, but generally it’s done on some fundamental identifier.


In brief, imagine seperating your users_tbl across several servers. So Users 1-5000 and on Server 1, Users 5000-10000 on Server 2; etc. If your data model is sufficiently abstract in code, it's often not a huge change in code.

Of course this approach becomes difficult if all your queries are similar to "SELECT COUNT(*) FROM users_tbl GROUP BY userType" but when your where is "WHERE userid = 5" then it makes more sense.


As 'sharding' is part of the architecture principles for large websites, you may be interested in listening to 'eBay's Architecture Principles with Randy Shoup' here.