Hello Pranit Awasthi,
Welcome to Microsoft Q&A and Thank you for reaching out.
The POST /openai/v1/realtime/client_secrets call is reaching the service successfully (authentication is valid), but the service rejects the request because the targeted deployment is not recognized as a Realtime-capable deployment for the Realtime operation. The error text - "realtime operation does not work with the specified model" - is most commonly encountered when:
- The deployment is not actually a GPT Realtime model deployment (even if the deployment name exists and looks correct). Realtime endpoints only work with the GPT Realtime model families listed as supported for Realtime.
- A non-supported region/resource is being used for Realtime, because Realtime model availability is region-dependent. The Realtime WebRTC doc lists supported regions for global deployments as East US 2 and Sweden Central.
- The request is passing a model name where the endpoint expects a deployment name. If the deployment name is different from gpt-realtime-mini, the call can be routed incorrectly and result in operation/model mismatch errors.
As asked, whether the deployment needs to be created specifically from a Realtime-capable model family (for example, gpt-realtime / gpt-realtime-mini) - yes. The Realtime API only works with specific GPT Realtime models/versions.
As asked, whether there are any additional configuration steps required to enable Realtime support on a deployment - no. There is no special "enable" toggle beyond meeting the documented prerequisites, which are:
- A resource created in a supported region, and
- A deployment of a GPT Realtime model in that supported region, and
- Using the GA endpoint format with /openai/v1 in the URL
For WebRTC ephemeral tokens specifically, use the GA client secrets endpoint (…/openai/v1/realtime/client_secrets) to obtain the ephemeral token.
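As a rough illustration of that token call, here is a minimal Python sketch. The resource endpoint, API key, and deployment name are placeholders, and the request-body shape (a "session" object carrying the deployment name in "model") is an assumption based on the GA Realtime docs - please verify it against the current API reference before use.

```python
# Sketch: minting a WebRTC ephemeral token via the GA client-secrets endpoint.
# All names here are placeholders; the body shape is an assumption to verify
# against the current Azure OpenAI Realtime API reference.
import json
import urllib.request


def build_client_secrets_url(resource_endpoint: str) -> str:
    """Build the GA client-secrets URL (note the /openai/v1 path segment)."""
    return resource_endpoint.rstrip("/") + "/openai/v1/realtime/client_secrets"


def mint_ephemeral_token(resource_endpoint: str, api_key: str, deployment: str) -> dict:
    """POST to the client-secrets endpoint; requires a live Azure resource."""
    url = build_client_secrets_url(resource_endpoint)
    # "model" carries the *deployment* name, per the note below.
    body = json.dumps({"session": {"type": "realtime", "model": deployment}}).encode()
    req = urllib.request.Request(
        url,
        data=body,
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # network call
        return json.load(resp)


# URL built for a hypothetical resource:
print(build_client_secrets_url("https://my-resource.openai.azure.com"))
```

The URL helper is the part worth checking locally: it should always yield a path ending in /openai/v1/realtime/client_secrets regardless of whether the endpoint has a trailing slash.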
As asked, whether this error indicates that the deployment is based on a model that does not support the Realtime API even if the deployment name is valid - yes. This error is consistent with the Realtime operation being invoked against a deployment that is not a supported Realtime model deployment, or one deployed in a region that does not support Realtime.
Please note that for Azure OpenAI, the "model" field should carry the deployment name chosen during deployment. If the deployment name is not exactly gpt-realtime-mini, set "model" to the deployment name instead.
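To make the deployment-name point concrete, here is a tiny sketch of the session body with a hypothetical deployment name; the surrounding "session" shape is an assumption to confirm against the docs:

```python
# Sketch: the "model" field carries the Azure *deployment* name, not the
# underlying model id. "my-realtime-deployment" is a placeholder.
def realtime_session_body(deployment_name: str) -> dict:
    # If the deployment is literally named "gpt-realtime-mini" the two
    # coincide; otherwise the deployment name is what the service routes on.
    return {"session": {"type": "realtime", "model": deployment_name}}


body = realtime_session_body("my-realtime-deployment")
print(body["session"]["model"])  # the deployment name, not the model id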
Please consider the following troubleshooting steps:
- Confirm the deployed base model is actually a GPT Realtime model
- Confirm the resource region is supported for Realtime
- Use the deployment name in the request payload
- Keep GA endpoint format exactly as documented
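The checklist above can be sketched as quick client-side sanity checks. The supported-region set below is taken from the WebRTC doc as cited above (East US 2, Sweden Central for global deployments); region availability changes over time, so treat it as an assumption and confirm the current list in the documentation:

```python
# Sketch: client-side sanity checks mirroring the troubleshooting steps.
# The region set is an assumption taken from the WebRTC doc; verify it.
SUPPORTED_REALTIME_REGIONS = {"eastus2", "swedencentral"}


def check_realtime_config(region: str, url: str,
                          model_field: str, deployment_name: str) -> list[str]:
    """Return a list of likely misconfigurations (empty list = looks OK)."""
    problems = []
    if region.lower() not in SUPPORTED_REALTIME_REGIONS:
        problems.append(f"region '{region}' may not support Realtime")
    if "/openai/v1/" not in url:
        problems.append("URL is not using the GA /openai/v1 path")
    if model_field != deployment_name:
        problems.append("'model' field should be the deployment name")
    return problems


# A config matching all the steps above produces no findings:
print(check_realtime_config(
    "eastus2",
    "https://my-resource.openai.azure.com/openai/v1/realtime/client_secrets",
    "my-realtime-deployment",
    "my-realtime-deployment",
))
```

These checks cannot confirm the first step (that the deployed base model really is a GPT Realtime model); that still needs to be verified in the Azure portal or via the deployment listing for the resource.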
References:
- Use the GPT Realtime API via WebRTC - Microsoft Foundry | Microsoft Learn
- Use the GPT Realtime API for speech and audio with Azure OpenAI - Microsoft Foundry | Microsoft Learn
- Migration from Preview to GA version of Realtime API - Microsoft Foundry | Microsoft Learn
- How to switch between OpenAI and Azure OpenAI endpoints with Python - Azure OpenAI Service | Microsoft Learn
Thank you!
Please 'Upvote' (Thumbs-up) and 'Accept as answer' if the reply was helpful. This will benefit other community members who face the same issue.