Change modalities mid-session for the Azure Voice Live API

Vishal Rawat 40 Reputation points
2025-11-26T05:56:29.57+00:00

I tried to change modalities mid-session from ["text", "audio"] to ["text"] and vice versa, to implement a switch between text and voice for my voice+chat agent. I did this by sending a session.update event to the Azure Voice Live API with the updated modalities, but it does not seem to work: even the session.updated event returned by Azure does not reflect the new modalities.

Any help with this?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Answer accepted by question author
  1. Aryan Parashar 3,380 Reputation points Microsoft External Staff Moderator
    2025-11-27T07:21:24.6866667+00:00

    Hi Vishal Rawat,

    Thank you for reaching out, and I completely understand your frustration with this issue. You're not alone. Several developers have encountered the same challenge when trying to switch modalities mid-session.

    Changing modalities mid-session is problematic in both OpenAI and Azure implementations.

    When you send a session.update event to switch between ["text", "audio"] and ["text"], the modalities configuration appears to be locked at session initialization. This is because audio and text processing use fundamentally different pipelines, and the underlying connections are established when the session starts.
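
To check whether a modality change was actually applied, you can compare what you requested with what the session.updated event reports. The following is only a minimal sketch: the helper names `build_session_update` and `modalities_applied` are hypothetical, the event shape assumes the `session.modalities` field used by the Realtime-style protocol, and the sample reply is an illustration of the behavior described in the question, not captured service output.

```python
import json


def build_session_update(modalities):
    """Build a session.update event requesting the given modalities.

    Assumption: the session object carries a top-level "modalities" list.
    """
    return {"type": "session.update", "session": {"modalities": list(modalities)}}


def modalities_applied(requested, session_updated_event):
    """Return True if the session.updated event reports the requested modalities."""
    reported = session_updated_event.get("session", {}).get("modalities", [])
    return sorted(reported) == sorted(requested)


# The event the client would send over the WebSocket (sending is omitted here).
request = build_session_update(["text"])
print(json.dumps(request))

# Hypothetical reply illustrating the reported behavior: the server still
# lists both modalities after the update was requested.
reply = {"type": "session.updated", "session": {"modalities": ["text", "audio"]}}
print(modalities_applied(["text"], reply))
```

If the check returns False after a session.update, the change was not applied and the client should not assume audio output has been turned off.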

    Here are some relevant references that discuss this behavior:

    https://community.openai.com/t/realtime-api-updating-modalities/996243/3

    https://community.openai.com/t/realtime-modalities-session-config-not-disabling-local-model-audio-channel/1279443

    https://learn.microsoft.com/en-us/answers/questions/5561090/azure-openai-gpt-realtime-generating-voice-respons

    Recommended Solution:

    Keep modalities: ["text", "audio"] enabled for the whole session and control audio behavior at the application level. In other words, decide in your client-side logic when audio is actually captured and played, rather than trying to reconfigure the session.
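
One way to apply this recommendation is a small client-side gate that receives every server event and decides what reaches the user. This is a sketch under assumptions, not the service's prescribed approach: the class `OutputGate` is hypothetical, and the event names `response.audio.delta`, `response.audio_transcript.delta`, and `response.text.delta` are assumed from the Realtime-style event protocol, so verify them against the events your session actually emits.

```python
import base64


class OutputGate:
    """Routes server events while the session keeps ["text", "audio"] enabled.

    In text mode, audio chunks are dropped and only text/transcript is kept.
    In voice mode, audio chunks are decoded and queued for playback as well.
    """

    # Assumed event names; adjust to match what the service sends.
    AUDIO_DELTA = "response.audio.delta"
    TEXT_DELTAS = ("response.text.delta", "response.audio_transcript.delta")

    def __init__(self, voice_enabled=True):
        self.voice_enabled = voice_enabled
        self.text_parts = []      # text shown in the chat UI
        self.audio_chunks = []    # decoded audio bytes queued for playback

    def set_voice(self, enabled):
        """Switch between voice mode (True) and text-only mode (False)."""
        self.voice_enabled = enabled

    def handle(self, event):
        """Process one decoded JSON event from the WebSocket."""
        event_type = event.get("type")
        if event_type == self.AUDIO_DELTA:
            if self.voice_enabled:
                # Audio deltas are assumed to be base64-encoded bytes.
                self.audio_chunks.append(base64.b64decode(event["delta"]))
            return
        if event_type in self.TEXT_DELTAS:
            self.text_parts.append(event["delta"])

    @property
    def text(self):
        return "".join(self.text_parts)


gate = OutputGate(voice_enabled=False)  # user switched to text mode
gate.handle({"type": "response.audio.delta", "delta": base64.b64encode(b"abc").decode()})
gate.handle({"type": "response.audio_transcript.delta", "delta": "Hi"})
print(gate.text, len(gate.audio_chunks))
```

The same switch can also guard the input side: while voice is disabled, the client simply stops forwarding microphone audio and sends typed messages instead.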

    Feel free to accept this as an answer.

    Thank you for reaching out to the Microsoft Q&A portal!

    1 person found this answer helpful.
