Should the event hub have same number of partitions as throughput units?


Great question.

A few basics.

1 Throughput Unit (TU) means an ingress limit of 1 MB/sec or 1000 msgs/sec - whichever happens first. You pay for TUs and you can change TUs as per your load requirements. This is your knob to control the bill. And TUs are set on a given Event Hubs Namespace!
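To make the "whichever happens first" rule concrete, here is a rough sizing sketch. It uses only the two ingress figures quoted above; the function name and constants are made up for illustration and are not part of any Event Hubs SDK.

```python
import math

# Per-TU ingress limits as quoted above (assumed fixed for this sketch).
TU_INGRESS_MB_PER_SEC = 1.0
TU_INGRESS_MSGS_PER_SEC = 1000


def throughput_units_needed(mb_per_sec: float, msgs_per_sec: float) -> int:
    """Return the smallest TU count that satisfies both ingress limits."""
    by_bandwidth = math.ceil(mb_per_sec / TU_INGRESS_MB_PER_SEC)
    by_count = math.ceil(msgs_per_sec / TU_INGRESS_MSGS_PER_SEC)
    # Whichever limit is hit first decides the TU count; never go below 1.
    return max(1, by_bandwidth, by_count)


# Small messages: only 0.5 MB/sec, but 2500 msgs/sec, so the message
# rate is the limit that bites.
print(throughput_units_needed(0.5, 2500))
```

So a workload of many tiny events can need more TUs than its bandwidth alone would suggest.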

When you buy 1 TU for an EventHubs Namespace and create a number of EventHubs in it, the limit of 1 MB/sec or 1000 msgs/sec applies cumulatively across them. The limit also applies to each partition individually, although you might occasionally get more than that in regions where load is low.

Consider these principles while deciding on the no. of partitions in an EventHub for your service:

  1. The intent of Partitions is to offer high-availability. If you are sending to EventHubs and you want the sends to succeed NO MATTER WHAT, you should create multiple partitions and send using EventHubClient.Send (which doesn't confine the send to a particular partition).
  2. The no. of partitions will determine how fat the event pipe is and how fast, and with how much parallelism, you can receive and process the events. If you have 10 partitions on your EventHub, its capacity is effectively capped at 10 TUs. You can create 10 epoch receivers in parallel to consume and process events. If you envision that the EventHub you are creating now can quickly grow 10-fold, create that many partitions up front and keep the TUs matching the current load. The analogy here is having multiple lanes on a freeway!

Another thing to note is, a TU is configured at namespace level. And, one Event Hubs namespace can have multiple EventHubs in it and each EventHub can have a different no. of partitions.

Answers:

If you select 5 or more TUs on the Namespace and have only 1 EventHub with 4 partitions you will get a max. of 4 MB/sec or 4K msgs/sec.

Egress max will be 2X of ingress (8 MB/sec or 8K msgs/sec). In other words, you could create 2 patterns of receives (e.g. slow and fast) by creating 2 consumer groups. If you need more than 2X parallel receives then you will need to buy more TUs.
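The numbers above follow from taking the smaller of the TU count and the partition count. A small sketch of that arithmetic, under this answer's model (1 MB/sec or 1000 msgs/sec ingress per usable TU, egress at 2X ingress); the function name is hypothetical:

```python
def event_hub_limits(namespace_tus: int, partitions: int) -> dict:
    """Estimate limits for a single EventHub in a namespace.

    Assumes, per the answer above, that each partition can use at most
    one TU, so TUs beyond the partition count go unused by this hub.
    """
    usable_tus = min(namespace_tus, partitions)
    return {
        "ingress_mb_per_sec": usable_tus * 1,
        "ingress_msgs_per_sec": usable_tus * 1000,
        "egress_mb_per_sec": usable_tus * 2,  # egress is 2X ingress
    }


# 5 TUs on the namespace, one EventHub with 4 partitions.
print(event_hub_limits(namespace_tus=5, partitions=4))
```

With 5 TUs and 4 partitions the fifth TU is paid for but cannot be used by this EventHub.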

Yes, ideally you will need more partitions than TUs. First model your partition count as mentioned above. Start with 1 TU while you are developing your solution. Once done, when you are doing load testing or going live, increase TUs in tune with your load. Remember, you could have multiple EventHubs in a Namespace. So, having 20 TUs at Namespace level and 10 EventHubs with 4 partitions each can deliver 20 MB/sec across the Namespace.
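The namespace-level example can be checked the same way. This is only a sketch of the reasoning in this answer (TUs are shared across the namespace, and each partition can absorb at most 1 MB/sec); the helper name is invented for illustration.

```python
def namespace_ingress_mb_per_sec(namespace_tus: int, partitions_per_hub: list) -> int:
    """Estimate total ingress for a namespace, in MB/sec.

    The namespace is capped by its TUs, and the hubs together are capped
    by their total partition count (assumed 1 MB/sec per partition).
    """
    partition_cap = sum(partitions_per_hub)
    return min(namespace_tus, partition_cap)


# 20 TUs, 10 EventHubs with 4 partitions each: 40 partitions in total,
# so the 20 TUs are the binding limit.
print(namespace_ingress_mb_per_sec(20, [4] * 10))
```

Here the partitions leave headroom: raising the namespace to 40 TUs later would not require recreating any EventHub.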

More on EventHubs


One partition goes to one TPU (throughput unit). Think of TPUs as processing engines. You can't take advantage of more TPUs than you have partitions. If you have 4 partitions, you can't use more than 4 TPUs.

It's typical to have more partitions than TPUs, for the following reasons:

  • You can scale the number of TPUs up if you have a lot of traffic, but you can't change the number of partitions
  • You can't have more concurrent readers than you have partitions. If you want to have 5 concurrent readers, you need 5 partitions.
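To see why extra readers don't help, here is a toy round-robin assignment of partitions to readers. It is purely illustrative and is not how any Event Hubs client library balances load; the function name is hypothetical.

```python
def assign_partitions(partition_count: int, reader_count: int) -> dict:
    """Spread partitions over readers round-robin (reader_count must be > 0)."""
    assignment = {reader: [] for reader in range(reader_count)}
    for partition in range(partition_count):
        assignment[partition % reader_count].append(partition)
    return assignment


# 4 partitions, 5 readers: the last reader ends up with nothing to read.
for reader, partitions in assign_partitions(4, 5).items():
    print(f"reader {reader}: partitions {partitions}")
```

With 5 partitions instead of 4, every one of the 5 readers would own exactly one partition.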

As for throughput, the limits are 1 MB/sec ingress and 2 MB/sec egress per TPU. This covers the typical scenario where each event is sent both to cold storage (e.g. a database) and to Stream Analytics or an event processor for analysis, monitoring, etc.