Azure AI Document Understanding - Quotas and Metrics

Jack Halpern 20 Reputation points
2026-03-16T18:56:41.6833333+00:00

We have been analyzing documents with a Custom Classifier and Custom schema successfully for almost 6 months. We use the Azure.AI.ContentUnderstanding NuGet package with WaitUntil.Completed set on our AnalyzeAsync calls. Recently we've encountered situations in which AnalyzeAsync never returns. We suspect that we've somehow exceeded our quota, or that the request is being lost or dropped.

We only have one subscription and deploy our projects in the East region. We believe we're in Tier2. Using the Metrics blade (see screenshot), I've tried to figure out whether we are in fact reaching a quota limit, but I don't know where to begin.

My question: Is there a simple screen or set of screens I can use to verify if it's a quota limit problem or something else? Or some combination of Metric and Aggregation which we should focus on? Any guidance you can give to help us determine why some documents never complete would be appreciated.

Thanks

Jack

User's image

Azure AI Metrics Advisor

An Azure artificial intelligence analytics service that proactively monitors metrics and diagnoses issues.


2 answers

  1. Anshika Varshney 9,335 Reputation points Microsoft External Staff Moderator
    2026-03-16T19:41:08.24+00:00

    Hi Jack Halpern,

    This kind of behavior usually happens when the service is busy or throttled, not because the request is completely lost. For Document Understanding and Document Intelligence, quota is mainly enforced on pages processed and concurrent operations, not just on request count.

    Here are a few ways to check whether this is a quota or capacity issue.

    1. Focus on page‑based metrics, not request metrics. Document Intelligence processes documents by pages, and quotas are applied at that level. In the Azure portal Metrics blade for your Document Intelligence resource, look at metrics related to pages processed and analyze operations rather than raw API calls.
    2. Check concurrent analyze operations. If many AnalyzeAsync calls are running at the same time, new requests can wait longer or appear to hang until capacity frees up. This can look like AnalyzeAsync never returning when the system is under sustained load.
    3. Review the Document Intelligence service limits for custom models. If you are using Custom Classifier or Custom extraction models, make sure your usage stays within documented limits such as training size, number of pages, and model limits. Staying within these limits means the service should not reject requests due to quota. Reference: https://learn.microsoft.com/azure/ai-services/document-intelligence/service-limits
    4. Use Azure Monitor metrics to correlate timing. If AnalyzeAsync hangs, check whether page processing spikes or long running operations line up with the time the request was submitted. This helps distinguish between quota pressure and transient service load.
    5. Capture correlation IDs from SDK logs for the long‑running calls. Even when a request does not complete, the SDK logs usually include a correlation ID that can be used to trace the operation. This helps confirm whether the request reached the service and is still being processed. See also: How Document Intelligence works and is monitored.
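Points 2 and 5 above can be approximated on the client side. The following is a minimal sketch in Python using only the standard library, not the SDK's own API: `analyze` is a hypothetical coroutine standing in for the real analyze call, and `max_concurrency` and `timeout_s` are illustrative values, not documented limits. It caps the number of in-flight calls, puts a hard timeout on each one so a call can no longer wait forever, and logs a client-generated request ID with start and end times so a hung call can be identified afterwards.

```python
import asyncio
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("analyze")


async def analyze_with_guard(analyze, document, semaphore, timeout_s):
    """Run one analyze call under a concurrency cap and a hard timeout.

    `analyze` is a hypothetical coroutine function standing in for the real
    SDK call; it receives the document and a client-generated request ID.
    """
    request_id = str(uuid.uuid4())  # client-side ID, logged so hung calls can be traced
    async with semaphore:  # wait here if too many calls are already in flight
        started = time.monotonic()
        log.info("start request_id=%s", request_id)
        try:
            result = await asyncio.wait_for(analyze(document, request_id), timeout_s)
        except asyncio.TimeoutError:
            elapsed = time.monotonic() - started
            log.warning("timeout request_id=%s after %.1fs", request_id, elapsed)
            return request_id, None  # None marks a call that did not complete in time
        log.info("done request_id=%s in %.1fs", request_id, time.monotonic() - started)
        return request_id, result


async def analyze_all(analyze, documents, max_concurrency=4, timeout_s=300.0):
    """Analyze all documents; results come back in the same order as `documents`."""
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [analyze_with_guard(analyze, d, semaphore, timeout_s) for d in documents]
    return await asyncio.gather(*tasks)
```

In .NET the same pattern would typically be built from `SemaphoreSlim` and a `CancellationTokenSource` with a timeout, assuming the analyze method accepts a `CancellationToken` as Azure SDK methods usually do. The logged IDs and timings give you exact timestamps to line up against the Metrics blade.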

    In short, there is no single screen that says "quota exceeded". The best signal is to compare page processing metrics and concurrency against the documented limits. If those stay within limits, the behavior is more likely caused by temporary service load than by quota exhaustion.
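To compare concurrency against a limit, you first need the peak number of overlapping calls. If your application logs when each analyze call starts and ends, that peak can be computed from the logs. The sketch below is one way to do it in Python; the function name and the input shape (a list of start/end pairs) are assumptions for illustration, and calls that never returned are represented with an end of `None`.

```python
from datetime import datetime


def peak_concurrency(calls):
    """Return (peak, when) for a list of (start, end) datetime pairs.

    `end` may be None for a call that never returned; such a call is
    treated as still running for the rest of the observed period.
    """
    events = []
    for start, end in calls:
        events.append((start, 1))      # a call begins
        if end is not None:
            events.append((end, -1))   # a call finishes
    # Sort by time; at equal timestamps, process finishes (-1) before starts (+1)
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    peak_at = None
    for when, delta in events:
        current += delta
        if current > peak:
            peak, peak_at = current, when
    return peak, peak_at
```

Comparing the returned peak with the concurrency limit documented for your tier shows whether the hangs coincide with the moments when the most calls were in flight.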

    Hope this helps you narrow down where the issue is coming from.

    Thank you!

  2. Jack Halpern 20 Reputation points
    2026-03-16T19:06:02.3166667+00:00
    • Capture correlation IDs from SDK logs for failing calls.

    How do I get the correlation ID from the AnalyzeAsync method?

    This answer is for Document Intelligence, not Document Understanding.
