Event Hubs returns Server Busy errors from time to time

Question

Yuanbo Lin 40 Microsoft Employee

Eventhub name space: data-ehns-edp-prod-prna02

As you can see, when the Server Busy errors occur, the incoming bytes are not very high, so we would like to know what happens at that time, and what criteria trigger the Server Busy error.

SAI JAGADEESH KUDIPUDI 2,540 Reputation points Microsoft External Staff Moderator

2026-04-15T08:02:18.5166667+00:00

Hi Yuanbo Lin,
Could you please provide the requested details via Private message?

Answer 1

Server busy errors in Azure Event Hubs are throttling signals that are raised when the service detects that either a partition or the namespace is exceeding its provisioned throughput or certain operational limits, even if the aggregated Incoming Bytes metric does not look high.

The criteria are as follows:

Partition-level throughput hot spots (error code 50002)
- If traffic is unevenly distributed and one partition hits its local throughput-unit (TU) limit, Event Hubs returns a ServerBusyException even though the overall namespace Incoming Bytes may appear moderate.
- To resolve this, revise the partition distribution strategy, or send events without a partition key so that Event Hubs can balance the load across partitions.

Namespace-level throughput-unit limits (error code 50002)
- Each TU provides up to 1 MB/s ingress (and ≤1000 ingress requests/s) and 2 MB/s egress (and ≤4096 egress events/s) across the namespace.
- When the real-time ingress rate, egress rate, or request rate exceeds the number of TUs multiplied by the per-TU limits, Event Hubs throttles the traffic and returns server busy errors.
- The portal’s metrics are 1‑minute aggregates; instantaneous spikes above the TU limits can trigger throttling even if the 1‑minute Incoming Bytes sum looks low.
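
To illustrate the last point, here is a minimal sketch with made-up numbers. The traffic pattern, the single-TU namespace, and the choice of 1 MB = 10^6 bytes are assumptions for illustration, not values taken from the namespace in the question. It shows how a 5-second burst can exceed the per-TU ingress limit while the 1-minute average stays well below it.

```python
# Hypothetical traffic: 5 seconds at 3 MB/s, then 55 idle seconds.
MB = 1_000_000                    # assumption: 1 MB = 10**6 bytes
TU_COUNT = 1                      # assumed number of throughput units
INGRESS_LIMIT_PER_TU = 1 * MB     # 1 MB/s ingress per TU, as stated above

per_second_bytes = [3 * MB] * 5 + [0] * 55

minute_sum = sum(per_second_bytes)                # what a 1-minute sum reports
minute_avg = minute_sum / len(per_second_bytes)   # bytes/s averaged over the minute
peak = max(per_second_bytes)                      # true per-second peak
limit = TU_COUNT * INGRESS_LIMIT_PER_TU

print(f"1-minute sum: {minute_sum / MB:.1f} MB")
print(f"average rate: {minute_avg / MB:.2f} MB/s (limit {limit / MB:.1f} MB/s)")
print(f"peak rate:    {peak / MB:.1f} MB/s")
print("average over limit:", minute_avg > limit)
print("peak over limit:   ", peak > limit)
```

In this example the minute totals 15 MB, which averages to 0.25 MB/s, yet the burst itself runs at three times the limit of one TU.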

Excessive management/metadata calls (error code 50008)
- If the container running the namespace is CPU constrained, or if there are too many calls to operations like GetRuntimeInformation/GetRuntimeInfo, Event Hubs can return ServerBusyException with error code 50008.
- Event Hubs supports up to 50 GetRuntimeInfo calls per second per consumer group; exceeding this can cause short bursts of server busy errors.
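
One way to stay under the runtime-information call limit is to cache the result on the client for a few seconds instead of querying on every loop iteration. The sketch below is only an illustration: `CachedRuntimeInfo`, the `fetch` callable, and the 5-second TTL are hypothetical names and values, and `fetch` stands in for whichever SDK call your client actually uses.

```python
import time
from typing import Any, Callable, Optional


class CachedRuntimeInfo:
    """Caches the result of a runtime-information call for a short time."""

    def __init__(
        self,
        fetch: Callable[[], Any],          # hypothetical: wraps the real SDK call
        ttl_seconds: float = 5.0,          # assumed cache lifetime
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        """Return the cached value, calling fetch() only when it has expired."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

With a 5-second TTL, each client instance issues at most one call every 5 seconds no matter how often `get()` is invoked, which leaves headroom when many consumers share a consumer group.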

How to interpret and investigate the metric
- Use the Throughput and Requests tabs on the namespace Overview → Show metrics blade.
- For the time ranges where Server Busy Errors spike, check:
  - The peak 1-minute Incoming Bytes and Outgoing Bytes values; divide each by 60 to approximate bytes per second.
  - The peak 1-minute Incoming Requests value; divide by 60 and compare the result to the TU limit (1000 ingress requests/s per TU).
- If peak values exceed the TU-based limits, increase TUs on the Scale page or enable Auto-inflate so the namespace can scale up automatically.
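
The check described above can be sketched as a small calculation. The function name and the sample values are hypothetical, and 1 MB is assumed to be 10^6 bytes. The result is a lower bound, because 1-minute sums hide shorter bursts.

```python
import math

# Per-TU limits quoted in this answer (assumption: 1 MB = 10**6 bytes).
INGRESS_BYTES_PER_TU = 1_000_000
EGRESS_BYTES_PER_TU = 2_000_000
INGRESS_REQUESTS_PER_TU = 1000


def estimate_required_tus(
    incoming_bytes_per_min: float,
    outgoing_bytes_per_min: float,
    incoming_requests_per_min: float,
) -> int:
    """Estimate the TUs needed to cover the peak 1-minute metric values."""
    # Convert 1-minute sums into approximate per-second rates.
    in_bps = incoming_bytes_per_min / 60
    out_bps = outgoing_bytes_per_min / 60
    req_ps = incoming_requests_per_min / 60

    # The most constrained dimension determines the TU count.
    needed = max(
        in_bps / INGRESS_BYTES_PER_TU,
        out_bps / EGRESS_BYTES_PER_TU,
        req_ps / INGRESS_REQUESTS_PER_TU,
    )
    return max(1, math.ceil(needed))


# Hypothetical peak minute: 90 MB in, 60 MB out, 48,000 requests.
print(estimate_required_tus(90_000_000, 60_000_000, 48_000))
```

In the sample call, ingress averages 1.5 MB/s, so at least 2 TUs are needed even though egress and request rate would each fit in one.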

Client behavior and retries
- For AMQP/Event Hubs clients, ServerBusyException is a transient error that should be handled with retries and backoff. For example, the Java ServerBusyException documentation recommends waiting at least 4 seconds before retrying operations for that entity.
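
A retry loop along these lines might look like the sketch below. `ServerBusyError` and `send_with_retry` are hypothetical names rather than SDK APIs, and the doubling backoff with jitter is an assumed policy; only the 4-second starting delay comes from the recommendation above. The official SDKs generally provide their own configurable retry options, which are usually preferable to a hand-written loop.

```python
import random
import time


class ServerBusyError(Exception):
    """Hypothetical stand-in for the SDK's server-busy exception."""


def send_with_retry(send, max_attempts=5, base_delay=4.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call send(), retrying on ServerBusyError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ServerBusyError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # 4 s, 8 s, 16 s, ... capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 1 s of random jitter so clients do not retry in lockstep.
            delay += random.uniform(0, 1)
            sleep(delay)
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.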

In summary, server busy errors are triggered when either a single partition or the namespace as a whole exceeds throughput or operational limits (bytes/s, requests/s, or certain management-call limits). Short spikes and partition hot spots can cause these errors even when the aggregated Incoming Bytes metric appears low.
