Azure Document Intelligence - Infinite Processing, no Errors

Christoph Zeiner 0 Reputation points
2024-09-24T11:28:53.7033333+00:00

Hello Community,

I am currently using the Document Intelligence Client (both prebuilt-read and prebuilt-document). However, it often happens that my request ends in an infinite loop and the OCR recognition does not complete - no matter how long I wait. I am not getting any error messages.

I only run the following simple code in Python (“data” is a PDF file in bytes):

from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

document_analysis_client = DocumentAnalysisClient(endpoint=doc_intelligence_endpoint, credential=AzureKeyCredential(doc_intelligence_key))

poller = document_analysis_client.begin_analyze_document("prebuilt-read", data)

result = poller.result()

In my logs I keep getting the following messages:

2024-09-21 22:42:53,555:INFO - Response status: 200 Response headers: 'Content Length': '106' 'Content-Type': 'application/json; charset=utf-8' 'retry-after': '7' 'x-envoy-upstream-service-time': '15' 'apim-request-id': ' ' 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload' 'x-content-type-options': 'nosniff' 'x-ms-region': 'West Europe' 'Date': 'Sat, 21 Sep 2024 20:42:51 GMT'

This happens randomly - if I abort and re-run the same code, it works 80% of the time. How can I solve this problem? This is urgent – our software is already being used by customers. Is this potentially related to the API version? I am using 2023-07-31 (General availability).

Thanks!

Best regards Christoph

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Sina Salam 10,416 Reputation points
    2024-09-24T14:57:22.5533333+00:00

    Hello Christoph Zeiner,

    Welcome to Microsoft Q&A and thank you for posting your question here.

    I understand that your Azure Document Intelligence requests sometimes keep processing indefinitely without returning an error.

    Since your software is already in use by customers, I advise you to implement a retry mechanism that waits for the period given in the retry-after header before attempting the request again, similar to this:

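    import time
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential
    from azure.core.exceptions import HttpResponseError, ServiceRequestError, ServiceResponseError

    def analyze_document_with_retry(client, model, data, max_retries=5):
        """Run the analysis, retrying the whole call when the service raises an error."""
        for attempt in range(1, max_retries + 1):
            try:
                poller = client.begin_analyze_document(model, data)
                return poller.result()
            except (HttpResponseError, ServiceRequestError, ServiceResponseError) as e:
                if attempt == max_retries:
                    raise
                # Only HTTP errors carry a response; connection errors have none.
                response = getattr(e, "response", None)
                retry_after = response.headers.get("retry-after", "") if response is not None else ""
                # Fall back to 5 seconds when the header is missing or not a number.
                time.sleep(int(retry_after) if retry_after.isdigit() else 5)

    document_analysis_client = DocumentAnalysisClient(endpoint=doc_intelligence_endpoint, credential=AzureKeyCredential(doc_intelligence_key))
    result = analyze_document_with_retry(document_analysis_client, "prebuilt-read", data)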

    Now, to troubleshoot:

    Consider testing with the latest API version to check whether the problem is specific to the version you are using (2023-07-31): https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-read?view=doc-intel-4.0.0

    Try increasing the timeout settings in your client configuration to allow more time for the OCR process to complete. Also check whether there was a service outage or ongoing maintenance at the time of the failures. https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/use-sdk-rest-api?view=doc-intel-4.0.0

    It is also worth checking the document size and the log information: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/choose-model-feature?view=doc-intel-4.0.0 and https://learn.microsoft.com/en-us/python/api/overview/azure/ai-documentintelligence-readme?view=azure-python-preview
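
A retry loop only helps when an exception is raised. If the poller keeps receiving 200 responses and never finishes, it may help to bound the wait and resubmit. The following is a minimal sketch, not an official pattern. It assumes the poller returned by begin_analyze_document behaves like azure-core's LROPoller (wait(timeout), done(), status(), result()). The helper names wait_for_result and analyze_with_deadline and the timeout values are hypothetical.

```python
def wait_for_result(poller, timeout_seconds=120.0):
    """Return the poller's result, or raise TimeoutError if it does not finish in time.

    `poller` is assumed to expose wait(timeout), done(), status() and result(),
    as azure-core's LROPoller does. The helper name and default are illustrative.
    """
    # wait() returns once the operation finishes or the timeout elapses,
    # so done() has to be checked afterwards to tell the two cases apart.
    poller.wait(timeout_seconds)
    if not poller.done():
        raise TimeoutError(
            f"analysis still '{poller.status()}' after {timeout_seconds} seconds"
        )
    return poller.result()


def analyze_with_deadline(start_analysis, attempts=3, timeout_seconds=120.0):
    """Start the analysis up to `attempts` times, giving each run a fixed time budget.

    `start_analysis` is a zero-argument callable that returns a fresh poller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return wait_for_result(start_analysis(), timeout_seconds)
        except TimeoutError as error:
            last_error = error  # this run did not finish; submit the document again
    raise last_error
```

With the client from the question, this could be called as analyze_with_deadline(lambda: document_analysis_client.begin_analyze_document("prebuilt-read", data)). Each attempt submits the document again, so choose the time budget generously for large files.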
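
For the size check, a small guard before submitting can rule out empty or oversized input and leaves a log line to compare against the runs that hang. This is only a sketch: the helper name check_document_size and the logger name are hypothetical, and the limit is passed in as a parameter because it depends on your pricing tier. Look up the current value in the service limits documentation.

```python
import logging

logger = logging.getLogger("document_size_check")  # hypothetical logger name


def check_document_size(data: bytes, max_bytes: int) -> int:
    """Log the payload size and raise ValueError if it is empty or over max_bytes."""
    size = len(data)
    logger.info("Submitting document of %d bytes (limit %d)", size, max_bytes)
    if size == 0:
        raise ValueError("document is empty")
    if size > max_bytes:
        raise ValueError(f"document is {size} bytes, limit is {max_bytes}")
    return size
```

Calling check_document_size(data, max_bytes) right before begin_analyze_document makes it easy to see whether the failing requests share a particular file size.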

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    Please don't forget to close the thread by upvoting this answer and accepting it if it was helpful.

