Azure AI Speech

0 answers

Issue Creating Azure AI Language Resource in Custom Question Answering Lab

Hello, I am currently working on the Microsoft Applied Skills lab for Custom Question Answering. When attempting to create the Azure AI Language resource, the deployment fails with the following error: RequestDisallowedByPolicy – The resource was…

asked

Pornpra Chumnanvanichkul 0

edited the question

Pornpra Chumnanvanichkul 0

0 answers

CRITICAL ISSUE: Azure AI Speech SDK – numbers are added, deleted, and substituted, and recognition sometimes takes too long, when using the Microsoft real-time speech-to-text Cognitive Services API

We are using Azure Speech Service with the browser Speech SDK for real-time speech-to-text transcription. We are observing an issue when users speak continuous digits. The recognizer sometimes returns a significantly different number of digits than were…

asked

Aravind ks 20

edited the question

Aravind ks 20
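
For the digit-recognition question above, one way to quantify "added, deleted, and substituted" digits is an edit-distance comparison between the digits that were spoken and the digits that were returned. The following is a minimal sketch, not part of the original post: `digit_edit_counts` is a hypothetical helper name, it uses only the Python standard library, and it assumes you already have the expected and recognized text as strings.

```python
def digit_edit_counts(expected: str, recognized: str) -> dict:
    """Count inserted, deleted and substituted digits (Levenshtein alignment)."""
    # Keep digits only, so spaces and punctuation do not count as errors.
    ref = [c for c in expected if c.isdigit()]
    hyp = [c for c in recognized if c.isdigit()]
    rows, cols = len(ref) + 1, len(hyp) + 1

    # cost[i][j] = (total, inserted, deleted, substituted) for ref[:i] vs hyp[:j]
    cost = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)   # every reference digit was dropped
    for j in range(1, cols):
        cost[0][j] = (j, j, 0, 0)   # every recognized digit is extra

    for i in range(1, rows):
        for j in range(1, cols):
            t, a, d, s = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, a, d, s)          # digits match, no cost
            else:
                diagonal = (t + 1, a, d, s + 1)  # substitution
            t, a, d, s = cost[i][j - 1]
            insertion = (t + 1, a + 1, d, s)     # extra digit in the output
            t, a, d, s = cost[i - 1][j]
            deletion = (t + 1, a, d + 1, s)      # digit missing from the output
            cost[i][j] = min(diagonal, insertion, deletion)

    _, inserted, deleted, substituted = cost[-1][-1]
    return {"inserted": inserted, "deleted": deleted, "substituted": substituted}


if __name__ == "__main__":
    print(digit_edit_counts("1 2 3 4 5", "1245"))
```

Logging these counts per utterance can make it easier to show whether the errors are mostly dropped digits, extra digits, or swaps when reporting the issue.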

2 answers

Pronunciation Assessment with Language en-GB- Phoneme symbols

I am using the pronunciation assessment API for language en-GB and doing the assessment at the phoneme level. The documentation mentions this: AccuracyScore: Phoneme level, Syllable level (en-US only), Word level, Full Text level. I get a response with empty…

asked

Anju Aggarwal 0

commented

SRILAKSHMI C 14,815 Microsoft External Staff Moderator

2 answers

Please clarify the conflicting information regarding permission to use the free tier of Azure Speech for commercial purposes, such as narration of a YouTube video.

Hello everyone, I had previously asked a question on this forum regarding whether the Free Tier F0 of Azure Speech can be used for commercial purposes such as narration of a YouTube video:…

asked

KRJ14 0

commented

Manas Mohanty 14,750 Microsoft External Staff Moderator

1 answer

Can the audio generated by Azure Speech Studio's free tier (monthly limit of 500,000 characters) be used for commercial purposes, for example narration of a YouTube video?

Hello! I've searched this Q&A site extensively but found conflicting answers, so I thought I should ask directly. I have Azure Speech Studio's free tier (monthly limit of 500,000 characters) and can I use the audio generated by that for…

asked

KRJ14 0

commented

Marcin Policht 81,705 MVP Volunteer Moderator

2 answers

Custom Neural Voice (CNV Pro) model in East US and East US 2 is failing to train the model

I am training a Custom Neural Voice (CNV Pro) model in East US and East US 2, and the training consistently fails after several hours with an internal/unknown error. The dataset uploads successfully and passes validation, but the training job never completes. It fails…

asked

Ramachandran, Iaiswarya I 20 Microsoft Employee

commented

Ramachandran, Iaiswarya I 20 Microsoft Employee

1 answer

Special character ampersand (“&”) breaks word boundaries in Azure Text-to-Speech

Hello, I’m encountering an issue with word boundary events in Azure Text-to-Speech when the input text contains the ampersand character (&). Context Locale: fr-FR Neural French voice (e.g. fr-FR-Remy:DragonHDLatestNeural) Batch synthesis API …

asked

Soulaïman Marsou 0

answered

SRILAKSHMI C 14,815 Microsoft External Staff Moderator
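
For the ampersand question above, one thing that may be worth ruling out is XML escaping, since `&` is a reserved character in XML and SSML is XML. The sketch below assumes the text is being embedded in SSML (the truncated post does not say whether plain text or SSML is submitted); `build_ssml` is a hypothetical helper, and the voice name is taken from the post. It does not claim to resolve the word boundary behaviour itself.

```python
from xml.sax.saxutils import escape


def build_ssml(text: str, voice: str, locale: str = "fr-FR") -> str:
    """Wrap text in a minimal SSML document, escaping XML-reserved characters."""
    # escape() turns "&" into "&amp;", "<" into "&lt;" and ">" into "&gt;".
    safe_text = escape(text)
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{locale}">'
        f'<voice name="{voice}">{safe_text}</voice>'
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Dupont & Fils", "fr-FR-Remy:DragonHDLatestNeural"))
```

If the boundary offsets still look wrong with escaped input, comparing them against both the escaped and unescaped text lengths could help narrow down where the shift is introduced.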

2 answers

Azure AI Foundry agents intermittently failing with JSON parsing error (empty response, no schema changes)

We are experiencing intermittent but increasingly frequent failures when running agents on Azure AI Foundry. Agents that were working correctly in the same environment, with no code or schema changes, suddenly started failing with the following…

asked

Maximiliano Gutierrez 0

edited the question

Jilakara Hemalatha 10,205 Microsoft External Staff Moderator

2 answers

Python code to generate ephemeral token for gpt-4o-mini-transcribe OR gpt-4o-transcribe

Hi Team, we're unable to find a way (or Python code) to generate an ephemeral token for gpt-4o-mini-transcribe OR gpt-4o-transcribe. We searched online and there are some references for generating such tokens for the realtime API. But none for…

asked

GenixPRO 171

answered

Anshika Varshney 7,970 Microsoft External Staff Moderator

2 answers

Can someone help with how to configure the voicelive SDK to receive animation blendshapes and viseme_id?

I tried adding this, but no animation data is received. modalities: ["text", "audio", 'animation'], outputAudioTimestampYypes: ["word"], animation: { modelName: "default", outputs:…

asked

Dadong Hu 0

commented

Anshika Varshney 7,970 Microsoft External Staff Moderator

1 answer

Has MS abandoned human tech support?

Reading some of the nightmare scenarios on these forums and realizing that human tech support is a thing of the past really alarms me. It's obvious that since companies such as MS are pouring so much into AI, they've abandoned tech support from humans.…

asked

Ed Myers 0

answered

Jerald Felix 10,970

1 answer

Issues with Azure Speech Services: Incorrect transcription of "draft" as "draught" and "£" as "lbs" in UK English

I'm using Azure Speech Services with the language set to UK English, and I've noticed two recurring transcription issues: When I dictate the word "draft", it consistently transcribes as "draught", even when the context clearly favors…

asked

Niki Kariappa 0

answered

Mike Williams 0

1 answer

Azure: Deactivated Severity: 2 alert-0225185834

2 emails Azure: Deactivated Severity: 2 alert-0225185834

asked

Danny FitzGerald 0

answered

Q&A Assist

1 answer

High Initial Latency with Multi-Language Detection (3+ Languages)

Hello Azure Speech Team, We're experiencing significant initial latency when using Continuous Language Identification with 2+ languages in production. Configuration: Languages: 3 languages (en-IN, te-IN, hi-IN) Mode:…

asked

ello ai 5

commented

SRILAKSHMI C 14,815 Microsoft External Staff Moderator

2 answers

gpt-4o-transcribe for real-time speech-to-text transcription ---slow speed

When I try to use gpt-4o-transcribe for real-time speech-to-text transcription, it takes about 1.5-2 seconds for a 2s mp3 file from sending the request to receiving the first token. Are there improved methods or other model options? Additionally,…

asked

yu.lili 0

answered

Karnam Venkata Rajeswari 280 Microsoft External Staff Moderator

1 answer

Custom Avatar Model Training Showing as Processing After 16 Hours

I created an Azure AI Service resource in West US 2 (Test Avatar), then went to Speech Studio, uploaded all the required training data, and started the model training. But it has been showing an estimated 1 hr left for the last 8 hours.

asked

Trinanjan Majumder 0 Microsoft Employee

answered

SRILAKSHMI C 14,815 Microsoft External Staff Moderator

3 answers

Transcription using gpt-4o-transcribe with gpt-realtime is failing in useast2

Hello, I am trying to use gpt-4o-transcribe with gpt-realtime in useast2, and it is consistently failing. I am using gpt-realtime with websockets as per the documentation. I am seeing the following event:…

asked

PRABU WEERASINGHE 0

commented

SRILAKSHMI C 14,815 Microsoft External Staff Moderator

2 answers

Pricing for Azure Voice Live API

We are evaluating the Azure Voice Live API for our contact center use case, automating with AI. However, we could not find the latest pricing for the Azure Voice Live API. We want to use the Pro version with Azure Speech and GPT 5.2 Chat (or suitable chat models).…

asked

Sankar Ramakrishnan, Prathap 20

accepted

Sankar Ramakrishnan, Prathap 20

2 answers

Function Calling via Foundry Agent in Voice Live API

Below are the quickstarts for foundry agent with Voice Live API, function calling with Voice Live API and foundry agent with function calling, respectively: …

asked

Cem Işık Doğru 40

commented

Manoj Kumar Ragupathi 0

1 answer

Different English accent is not working

I’m running into something odd with the voice accents in my setup. Whenever I switch to different English accents—like English (US), English (India), English (Australia)—the voice doesn’t actually change. It just keeps sounding like the default English…

asked

Wie Dizon 0

commented

SRILAKSHMI C 14,815 Microsoft External Staff Moderator

2,298 questions with Azure AI Speech tags