Hi Tutorvi Admin,
First, it's important to clarify that there are two distinct real-time audio APIs in Azure, which may be causing some of your confusion:
- Azure OpenAI GPT Realtime API (WebRTC/WebSocket-based)
- Azure AI Speech Voice Live API (Speech Service-based)
The documentation link you mentioned, which returns a 404 error, points to the Voice Live API, not the GPT Realtime API.
Here are the current official Microsoft documentation links for the GPT Realtime API:
- GPT Realtime API for speech and audio - Quickstart - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/realtime-audio-quickstart
- How to use the GPT Realtime API for speech and audio - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/realtime-audio

These pages contain the documentation you need and should clear up the confusion.
Note: The realtime models are available in specific regions only, so make sure to deploy in one of the regions listed in the documentation.
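To make the distinction concrete, here is a minimal Python sketch of how a client typically assembles the GPT Realtime API WebSocket endpoint and an initial session-configuration message. The resource name, deployment name, and the `api-version` string below are placeholders/assumptions - take the current values from the quickstart linked above.

```python
# Sketch: building the WebSocket URL and a session-configuration message
# for the Azure OpenAI GPT Realtime API. The resource/deployment names and
# the api-version string are placeholders -- verify the current values
# against the official quickstart before use.
import json


def realtime_ws_url(resource: str, deployment: str,
                    api_version: str = "2024-10-01-preview") -> str:
    """Assemble the wss:// endpoint for a realtime session on an
    Azure OpenAI resource (note: the *.openai.azure.com host)."""
    return (f"wss://{resource}.openai.azure.com/openai/realtime"
            f"?api-version={api_version}&deployment={deployment}")


# A session.update message enabling server-side voice activity detection
# (server VAD), typically sent as the first message after the socket opens.
session_update = {
    "type": "session.update",
    "session": {
        "voice": "alloy",
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "silence_duration_ms": 500,
        },
    },
}

if __name__ == "__main__":
    print(realtime_ws_url("my-resource", "my-realtime-deployment"))
    print(json.dumps(session_update, indent=2))
```

The key point is that the GPT Realtime API is addressed on your Azure OpenAI resource itself; authentication (API key or Entra ID token) is then attached to the WebSocket handshake as described in the how-to article.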
On the other hand, the Azure AI Speech "Voice Live" API:
- What it is: An integration that wraps the GPT model inside the Azure Speech service. It adds features like echo cancellation, custom neural voices, and advanced Voice Activity Detection (VAD).
- Best for: Telephony, complex voice agents, or when you need specific Azure Speech features.
- Correct Docs (New Link): Get started with Azure Speech Voice Live
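The difference also shows up in how a client reaches each service: Voice Live is hosted on the Speech/AI Services side rather than on an Azure OpenAI resource. The sketch below illustrates that; the `/voice-live/realtime` path and the `api-version` value are assumptions based on the preview documentation, so verify both against the Voice Live quickstart before use.

```python
# Sketch: assembling a Voice Live WebSocket URL. The /voice-live/realtime
# path and the api-version value are assumptions from preview docs --
# confirm them in the official quickstart. Unlike the GPT Realtime API,
# the host here is your AI Services / Speech resource endpoint.
def voice_live_ws_url(endpoint: str, model: str,
                      api_version: str = "2025-05-01-preview") -> str:
    """Assemble the wss:// endpoint for a Voice Live session from the
    resource endpoint (e.g. "https://my-res.cognitiveservices.azure.com")."""
    host = endpoint.removeprefix("https://").rstrip("/")
    return (f"wss://{host}/voice-live/realtime"
            f"?api-version={api_version}&model={model}")


if __name__ == "__main__":
    print(voice_live_ws_url(
        "https://my-resource.cognitiveservices.azure.com", "gpt-4o"))
```

Comparing the two URL shapes side by side is usually the quickest way to tell which API a given documentation page (or 404 link) was referring to.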
If this answers your query, kindly accept and upvote the answer so it benefits other community members.
Happy to help! 😊