Hello Wang, Xifan,
Welcome to Microsoft Q&A, and thank you for reaching out.
In addition to the inputs provided by Anish Raj, please see if the following helps.
The timeout is occurring during container creation for the Code Interpreter tool. Creating a container is not an instantaneous operation. It requires backend resource allocation and environment initialization. In many cases, this process completes quickly, but in some cases it takes longer than the configured client‑side timeout. When the client stops waiting before the operation completes, an APITimeoutError is raised even though the request itself is valid.
Please check if the following helps:
- Increase the client-side timeout. Container creation may legitimately require more time than the default timeout allows.
  - Increase the timeout value used by the SDK (for example, 30–60 seconds, or higher where appropriate).
  - Avoid aggressive timeout values for container provisioning operations.
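The failure mode can be illustrated with a pure-stdlib sketch (no Azure calls; `create_container` is a hypothetical stand-in for the SDK's container-creation request; with the openai Python SDK the timeout is typically set as a client option):

```python
import concurrent.futures
import time

def create_container():
    """Hypothetical stand-in for the SDK's container-creation call."""
    time.sleep(0.2)  # simulates backend provisioning time
    return "container-ready"

with concurrent.futures.ThreadPoolExecutor() as pool:
    # Aggressive timeout: the caller gives up before provisioning finishes,
    # even though the operation itself is valid.
    short_timeout_failed = False
    try:
        pool.submit(create_container).result(timeout=0.05)
    except concurrent.futures.TimeoutError:
        short_timeout_failed = True

    # Generous timeout: the same call succeeds.
    result = pool.submit(create_container).result(timeout=1.0)
```

The operation completes either way; only the client's willingness to wait differs, which is why raising the timeout resolves the error for valid requests.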
- Implement retry logic with exponential backoff. Provisioning delays are often transient.
  - Wrap the container creation call in retry logic.
  - Use exponential backoff between retries to avoid repeated immediate failures.
  - Limit retries to a reasonable number (for example, 2–4 attempts).
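A minimal sketch of this retry pattern, assuming `create_fn` is a hypothetical callable wrapping the SDK's container-creation request (in real code you would catch the SDK's timeout exception, such as an `APITimeoutError`, rather than the generic `TimeoutError` used here):

```python
import random
import time

def create_container_with_retry(create_fn, max_attempts=3, base_delay=1.0):
    """Retry a provisioning call with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return create_fn()
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # 1x, 2x, 4x, ... the base delay, plus a little jitter
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))
```

The jitter keeps many clients from retrying in lockstep, and re-raising on the final attempt preserves the original error for your own handling or logging.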
- Reduce container creation frequency. Creating a new container for every thread or request increases cold-start overhead.
  - Reuse containers at a session or workflow level where isolation requirements allow.
  - Alternatively, maintain a small pool of pre-created containers and assign them as needed.
  - Avoid unnecessary teardown and recreation of containers.
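The pooling idea can be sketched in a few lines; `create_fn` is a hypothetical factory wrapping the SDK's container-creation call, and real code would also need lifecycle handling (expiry, cleanup):

```python
import queue

class ContainerPool:
    """Keep a small pool of pre-created containers and hand them out as
    needed, instead of creating (and tearing down) one per request."""

    def __init__(self, create_fn, size=2):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(create_fn())  # pay the cold-start cost up front

    def acquire(self, timeout=None):
        # Blocks until a container is free, so concurrency is naturally capped.
        return self._pool.get(timeout=timeout)

    def release(self, container):
        self._pool.put(container)  # return for reuse rather than tearing down
```

Requests then borrow a ready container and return it, so the provisioning delay is paid once at startup instead of on every call.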
- Review quota and capacity configuration. Provisioning delays can occur as resource limits are approached.
  - Validate that the Azure OpenAI resource is within configured quotas.
  - Review usage limits and ensure sufficient capacity is available for concurrent operations.
- Check regional conditions. Provisioning behavior may vary with regional load.
  - Review Azure Service Health for any active or recent incidents.
  - If consistent delays are observed, testing the same configuration in another supported region can help determine whether the behavior is region-specific.
- Validate optional request parameters. Additional parameters can affect provisioning behavior.
  - Test container creation without optional headers, such as custom memory limits, to rule out configuration-related delays.
  - Confirm that request payloads follow documented formats and supported values.
References:
Azure OpenAI in Microsoft Foundry Models Quotas and Limits - Microsoft Foundry | Microsoft Learn
Thank you!
Please 'Upvote' (Thumbs-up) and 'Accept' the answer if the reply was helpful. This will benefit other community members who face the same issue.