Hi Jack Halpern,
This kind of behavior usually happens when the service is busy or throttled, not because the request is completely lost. For Document Understanding and Document Intelligence, quota is mainly enforced on pages processed and concurrent operations, not just on request count.
Here are a few ways to check whether this is a quota or capacity issue.
- Focus on page‑based metrics, not request metrics. Document Intelligence processes documents by pages, and quotas are applied at that level. In the Azure portal Metrics blade for your Document Intelligence resource, look at metrics related to pages processed and analyze operations rather than raw API calls.
- Check concurrent analyze operations. If many AnalyzeAsync calls are running at the same time, new requests can wait longer or appear to hang until capacity frees up. This can look like AnalyzeAsync never returning when the system is under sustained load.
- Review the Document Intelligence service limits for custom models. If you are using Custom Classifier or Custom extraction models, make sure your usage stays within documented limits such as training size, number of pages, and model limits. Staying within these limits means the service should not reject requests due to quota. See https://learn.microsoft.com/azure/ai-services/document-intelligence/service-limits
- Use Azure Monitor metrics to correlate timing. If AnalyzeAsync hangs, check whether spikes in page processing or long-running operations line up with the time the request was submitted. This helps distinguish between quota pressure and transient service load.
- Capture correlation IDs from SDK logs for the long‑running calls. Even when a request does not complete, the SDK logs usually include a correlation ID that can be used to trace the operation. This helps confirm whether the request reached the service and is still being processed. See also: How Document Intelligence works and is monitored.
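To make the concurrency point concrete, here is a minimal sketch of capping in-flight analyze calls and putting a timeout on each one, so that a stalled call shows up as a timeout instead of hanging indefinitely. It is written in Python for brevity and does not call the Azure SDK: `analyze` is a hypothetical coroutine that you would replace with your own wrapper around the real analyze call, and `max_concurrent` and `timeout_s` are illustrative values, not documented service limits.

```python
import asyncio


async def run_bounded(jobs, analyze, max_concurrent=4, timeout_s=120.0):
    """Run analyze(job) for every job with at most max_concurrent in flight.

    `analyze` is a caller-supplied coroutine function (hypothetical here).
    Returns a list of (job, status, result) tuples in input order, where
    status is "ok", "timeout", or "error".
    """
    gate = asyncio.Semaphore(max_concurrent)

    async def one(job):
        # Wait for a free slot before starting the call.
        async with gate:
            try:
                result = await asyncio.wait_for(analyze(job), timeout=timeout_s)
                return job, "ok", result
            except asyncio.TimeoutError:
                # The call did not finish in time; record it rather than hang.
                return job, "timeout", None
            except Exception as exc:
                return job, "error", exc

    return await asyncio.gather(*(one(job) for job in jobs))
```

If the calls recorded as "timeout" cluster around periods of high concurrency, that points toward capacity pressure rather than a lost request. The same pattern can be applied in .NET with a semaphore and a cancellation token.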
In short, there is no single screen that says "quota exceeded". The best signal is to compare page processing metrics and concurrency against the documented limits. If both stay within those limits, the behavior is more likely due to temporary service load than to quota exhaustion.
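As a rough illustration of that comparison, the sketch below flags the time buckets where pages processed or concurrent operations came close to a limit. Everything in it is an assumption for illustration: the `Sample` fields, the `headroom` threshold, and the limit values you pass in are hypothetical, and the real limits for your tier should come from the service limits page.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    minute: str      # label for the time bucket, e.g. "10:01"
    pages: int       # pages processed in that bucket
    concurrent: int  # peak in-flight analyze operations in that bucket


def flag_pressure(samples, page_limit, concurrency_limit, headroom=0.9):
    """Return (minute, reasons) for buckets at or above headroom * limit.

    The limits are caller-supplied placeholders, not real service quotas.
    """
    flagged = []
    for s in samples:
        reasons = []
        if s.pages >= headroom * page_limit:
            reasons.append("pages")
        if s.concurrent >= headroom * concurrency_limit:
            reasons.append("concurrency")
        if reasons:
            flagged.append((s.minute, reasons))
    return flagged
```

If the minute in which AnalyzeAsync appeared to hang is not flagged, quota pressure is a less likely explanation and transient service load becomes the more plausible one.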
Hope this helps you narrow down where the issue is coming from.
Thank you!