
Azure OpenAI create container timeout error

Wang, Xifan 0 Reputation points
2026-03-18T16:28:14.3566667+00:00

Hello,
I am trying to use the Responses API by first creating a container and then creating the code_interpreter tool. However, the call openai_client.containers.create often raises an APITimeoutError; timeout_seconds here is 15 seconds. Roughly 70% of the time the container is created within 1 second; the rest of the time it times out.

The goal of creating a container first and then assigning its ID to the code interpreter tool is that the container can be reused within one thread, while different threads use different containers.

Would really appreciate some help here.


import os

from openai import OpenAI, APITimeoutError, APIConnectionError, OpenAIError

openai_client = OpenAI(
    base_url=f"{os.getenv('AZURE_OPENAI_ENDPOINT')}openai/v1/",
    api_key=token_provider(),
)

def _init_code_interpreter_tool(self, openai_client: OpenAI):
    try:
        container = openai_client.containers.create(
            name="test",
            timeout=self.timeout_seconds,
            extra_headers={"memory_limit": "4g"},
        )
        code_interpreter_tool = {
            "type": "code_interpreter",
            "container": container.id,
        }
        return code_interpreter_tool, container
    except APITimeoutError as e:
        raise TimeoutError(
            f"OpenAI container creation timed out after {self.timeout_seconds}s."
        ) from e
    except APIConnectionError as e:
        raise ConnectionError(
            "OpenAI container creation failed due to a connection error."
        ) from e
    except OpenAIError as e:
        raise ValueError(f"OpenAI container creation failed: {e}") from e
Azure OpenAI Service

An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. Karnam Venkata Rajeswari 565 Reputation points Microsoft External Staff Moderator
    2026-03-26T13:39:42.7966667+00:00

    Hello Wang, Xifan,

    Welcome to Microsoft Q&A and thank you for reaching out.

    In addition to the inputs provided by Anish Raj, please see if the following helps.

    The timeout is occurring during container creation for the Code Interpreter tool. Creating a container is not an instantaneous operation. It requires backend resource allocation and environment initialization. In many cases, this process completes quickly, but in some cases it takes longer than the configured client‑side timeout. When the client stops waiting before the operation completes, an APITimeoutError is raised even though the request itself is valid.

    Please check if the following helps:

    1. Consider increasing the client‑side timeout - container creation may legitimately require more time than the default timeout.
    • Increase the timeout value used by the SDK (for example, 30–60 seconds or higher where appropriate).
    • Avoid aggressive timeout values for container provisioning operations.
    2. Implement retry logic with exponential backoff - provisioning delays are often transient.
    • Wrap the container creation call in retry logic.
    • Use exponential backoff between retries to avoid repeated immediate failures.
    • Limit retries to a reasonable number (for example, 2–4 attempts).
    3. Reduce container creation frequency - creating a new container for every thread or request increases cold‑start overhead.
    • Consider reusing containers at a session or workflow level where isolation requirements allow.
    • Alternatively, maintain a small pool of pre‑created containers and assign them as needed.
    • Avoid unnecessary teardown and recreation of containers.
    4. Review quota and capacity configuration - provisioning delays can occur if resource limits are being approached.
    • Check that the Azure OpenAI resource is within configured quotas.
    • Review usage limits and ensure sufficient capacity is available for concurrent operations.
    5. Check regional conditions - provisioning behavior may vary based on regional load.
    • Review Azure Service Health for any active or recent incidents.
    • If consistent delays are observed, testing the same configuration in another supported region may help determine whether the behavior is region‑specific.
    6. Validate optional request parameters - additional parameters can affect provisioning behavior.
    • Test container creation without optional headers such as custom memory limits to rule out configuration‑related delays.
    • Confirm that request payloads follow documented formats and supported values.
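
    The pooling idea mentioned above (a small pool of pre-created containers reused across threads) can be sketched roughly as follows. This is illustrative only, not the exact Azure OpenAI SDK surface: `create_fn` stands in for your actual `openai_client.containers.create(...)` call, and the pool size and timeout values are assumptions to tune for your workload.

    ```python
    import queue

    class ContainerPool:
        """Hands out pre-created container IDs so threads avoid cold-start latency."""

        def __init__(self, create_fn, size=3):
            # create_fn is assumed to return a container ID, e.g. a wrapper
            # around openai_client.containers.create(...).id
            self._pool = queue.Queue()
            for _ in range(size):
                self._pool.put(create_fn())

        def acquire(self, timeout=30):
            # Block until one of the pre-created containers is free.
            return self._pool.get(timeout=timeout)

        def release(self, container_id):
            # Return the container to the pool for reuse instead of tearing it down.
            self._pool.put(container_id)
    ```

    A thread would call `acquire()` at the start of its work, build its code_interpreter tool from the returned ID, and `release()` the ID when done, so container creation cost is paid only once at startup.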

    References:

    Azure OpenAI in Microsoft Foundry Models Quotas and Limits - Microsoft Foundry | Microsoft Learn

    Thank you!

    Please 'Upvote' (Thumbs-up) and 'Accept' as answer if the reply was helpful. This will benefit other community members who face the same issue.


  2. Anish Raj 0 Reputation points
    2026-03-18T20:27:17.87+00:00

    Fix: Azure OpenAI Container Timeout Error (APITimeoutError)

    The issue is that container creation on Azure OpenAI is non-deterministic in latency: 70% fast, 30% slow. A hardcoded 15-second timeout is too fragile for production use. Three fixes:


    Fix 1: Increase timeout + add retry logic

    import time

    def _init_code_interpreter_tool(self, openai_client, max_retries=3):
        for attempt in range(max_retries):
            try:
                container = openai_client.containers.create(
                    name="test",
                    timeout=30,  # increase from 15 to 30 seconds
                    extra_headers={"memory_limit": "4g"},
                )
                code_interpreter_tool = {
                    "type": "code_interpreter",
                    "container": container.id,
                }
                return code_interpreter_tool, container
            except APITimeoutError:
                if attempt < max_retries - 1:
                    time.sleep(2 ** attempt)  # exponential backoff: 1s, then 2s
                    continue
                raise TimeoutError(
                    f"Container creation failed after {max_retries} attempts."
                )
    

    Fix 2: Reuse existing containers instead of creating new ones

    Creating a new container on every call is the root cause of the inconsistency. Cache the container ID and reuse it across threads:

    
    _container_cache = {}
    
    def _init_code_interpreter_tool(self, openai_client, session_id):
    
        if session_id in _container_cache:
    
            container_id = _container_cache[session_id]
    
        else:
    
            container = openai_client.containers.create(
    
                name="test",
    
                timeout=30,
    
                extra_headers={"memory_limit": "4g"},
    
            )
    
            _container_cache[session_id] = container.id
    
            container_id = container.id
    
        return {"type": "code_interpreter", "container": container_id}
    
    

    Fix 3: Check your Azure region

    Some Azure regions have higher container spin-up latency. If you are on eastus, try switching to westus2 or swedencentral; these tend to have lower cold-start times for container workloads.


    I ran into similar cold-start latency issues when deploying inference containers via Docker on AWS EC2 for an ML pipeline; retry with exponential backoff solved it completely.

    Reference: Azure OpenAI Containers documentation

    Hope this resolves it; let me know if the retry logic helps!

