Azure Cosmos DB - Understanding Partition Key

azure azure-cosmosdb

Honestly the video here* was a MAJOR help to understanding partitioning in CosmosDb.

But, in a nutshell:The PartitionKey is a property that will exist on every single object that is best used to group similar objects together.

Good examples include Location (like City), Customer Id, Team, and more. Naturally, it wildly depends on your solution; so perhaps if you were to post what your object looks like we could recommend a good partition key.

EDIT: Should be noted that PartitionKey isn't required for collections under 10GB. (thanks David Makogon)

* The video used to live on this MS docs page entitled, "Partitioning and horizontal scaling in Azure Cosmos DB", but has since been removed. A direct link has been provided, above.

azure azure-cosmosdb

Partition key acts as a logical partition.

Now, what is a logical partition, you may ask? A logical partition may vary upon your requirements; suppose you have data that can be categorized on the basis of your customers, for this customer "Id" will act as a logical partition and info for the users will be placed according to their customer Id.

What effect does this have on the query?

While querying you would put your partition key as feed options and won't include it in your filter.

e.g: If your query was

SELECT * FROM T WHERE T.CustomerId= 'CustomerId';

It will be Now

var options = new FeedOptions{ PartitionKey = new PartitionKey(CustomerId)};var query = _client.CreateDocumentQuery(CollectionUri,$"SELECT * FROM T",options).AsDocumentQuery();

azure azure-cosmosdb

CosmosDB can be used to store any limit of data. How it does in the back end is using partition key. Is it the same as Primary key? - NO

Primary Key: Uniquely identifies the dataPartition key helps in sharding of data(For example one partition for city New York when city is a partition key).

Partitions have a limit of 10GB and the better we spread the data across partitions, the more we can use it. Though it will eventually need more connections to get data from all partitions. Example: Getting data from same partition in a query will be always faster then getting data from multiple partitions.

CodeHunter

Azure Cosmos DB - Understanding Partition Key

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last