The Ambient Audio Streaming (AAS) 2.0 gRPC API enables partners to stream ambient audio recordings in real time and submit them for downstream processing by Dragon Copilot.
The gRPC transport is one of two streaming options (alongside WebSocket). It exposes three RPCs:
| RPC | Type | Description |
|---|---|---|
| RetrieveConfiguration | Unary | Returns the service configuration for a given product, partner, and customer, including supported recording and report locales and recording duration limits. |
| RecordAmbient | Bidirectional streaming | Streams an ambient audio recording to the server. The client opens a session, streams audio data in chunks, and closes the recording. |
| StartProcessing | Unary | Signals Dragon Copilot to begin processing a previously recorded ambient session. |
Authentication
All gRPC RPCs require a bearer token in the authorization metadata key.
Supported token types:
- S2S (Server-to-Server): Machine-to-machine token issued via MISE. After authentication, the service validates the calling application's identity against a configured allowlist.
- Entra ID User Token: User-delegated token issued by Microsoft Entra ID.
- EIS Bearer Token: JWT issued by the EHR Integration Service (EIS). See Token launch integration for details.
Required metadata
| Key | Description |
|---|---|
| authorization | Bearer token (Bearer <token>) |
Conditionally required metadata
| Key | Condition | Description |
|---|---|---|
| user-guid or external-user-id | When using M2M (S2S) token | At least one must be provided. Returns 403 Forbidden if both are missing. |
Recommended metadata
| Key | Description |
|---|---|
| x-ms-request-id | Correlation identifier for tracing. |
| customer-id | Customer/environment identifier (used for logging context). |
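The metadata rules above can be sketched as a small helper. This is an illustrative sketch, not part of the service SDK: the function name `build_call_metadata` and its parameters are hypothetical, and it only assumes that the gRPC client accepts call metadata as a sequence of key/value pairs (as the Python gRPC library does).

```python
from typing import List, Optional, Tuple


def build_call_metadata(
    token: str,
    *,
    is_s2s: bool = False,
    user_guid: Optional[str] = None,
    external_user_id: Optional[str] = None,
    request_id: Optional[str] = None,
    customer_id: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return call metadata as (key, value) pairs (hypothetical helper)."""
    if not token:
        raise ValueError("a bearer token is required")
    # With an S2S token, at least one user identifier must accompany the call.
    if is_s2s and not (user_guid or external_user_id):
        raise ValueError("S2S calls need user-guid or external-user-id")

    metadata = [("authorization", f"Bearer {token}")]
    optional = {
        "user-guid": user_guid,
        "external-user-id": external_user_id,
        "x-ms-request-id": request_id,
        "customer-id": customer_id,
    }
    # Only keys that were actually supplied are added.
    metadata.extend((key, value) for key, value in optional.items() if value)
    return metadata
```

Checking the S2S condition on the client avoids a round trip that would otherwise be rejected by the service.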
Service definition
service AudioStreamingService {
  rpc RetrieveConfiguration(RetrieveConfigurationRequest) returns (RetrieveConfigurationResponse);
  rpc RecordAmbient(stream RecordAmbientRequest) returns (stream RecordAmbientResponse);
  rpc StartProcessing(StartProcessingRequest) returns (StartProcessingResponse);
}
RPCs
RetrieveConfiguration
Type: Unary RPC
Retrieves the service configuration for a given product, partner, and customer. The response includes supported audio locales and recording duration limits. Call this before starting a recording session to determine available languages and duration constraints.
Request
| Field | Type | Required | Description |
|---|---|---|---|
| product_id | string | Yes | The Microsoft unique identifier of the product. Must be a valid GUID. |
| partner_id | string | Yes | The Microsoft unique identifier for the partner. Must be a valid GUID. |
| customer_id | string | Yes | The Microsoft unique identifier of the customer. Must be a valid GUID. |
| external_identifiers | ExternalIdentifier[] | No | List of external identifiers. Known type: "userId" (the partner identifier of the user). |
Example request (C#):
var request = new RetrieveConfigurationRequest
{
    ProductId = "<product-guid>",
    PartnerId = "<partner-guid>",
    CustomerId = "<customer-guid>"
};
var response = await client.RetrieveConfigurationAsync(request);
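Because the three identifiers must be valid GUIDs, a client can check them before sending the request. The following is a minimal sketch using Python's standard `uuid` module; the helper name `invalid_guid_fields` is hypothetical, and `uuid.UUID` is more lenient than some GUID parsers (for example, it accepts values without hyphens), so the server's validation may be stricter.

```python
import uuid


def invalid_guid_fields(product_id: str, partner_id: str, customer_id: str) -> list:
    """Return the names of the fields that do not parse as a GUID."""
    fields = {
        "product_id": product_id,
        "partner_id": partner_id,
        "customer_id": customer_id,
    }
    bad = []
    for name, value in fields.items():
        try:
            uuid.UUID(value)
        except (ValueError, AttributeError, TypeError):
            # Malformed strings, non-string values, and None all land here.
            bad.append(name)
    return bad
```

An empty result means all three values parsed; otherwise the list names the fields to fix before calling RetrieveConfiguration.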
Response
| Field | Type | Description |
|---|---|---|
| encounter_warn_seconds | uint32 | Duration in seconds at which processing quality may degrade. Warn the user that the recording is approaching the maximum duration. |
| encounter_max_seconds | uint32 | Maximum duration in seconds of audio allowed. Stop recording when this limit is reached. |
| supported_recording_locales | string[] | Locales accepted for audio recording input (IETF BCP 47, for example, en-US, de-DE). |
| supported_encounter_report_locales | string[] | Locales available for encounter report output (IETF BCP 47). |
Example response:
{
  "encounterWarnSeconds": 2700,
  "encounterMaxSeconds": 4500,
  "supportedRecordingLocales": ["en-US", "fr-FR"],
  "supportedEncounterReportLocales": ["en-US", "fr-FR"]
}
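One way a client might act on these values is sketched below. The function names are hypothetical. The thresholds follow the field descriptions (warn at encounter_warn_seconds, stop at encounter_max_seconds), and the case-insensitive locale comparison is an assumption based on BCP 47 tags being case-insensitive, not on documented service behavior.

```python
def recording_status(elapsed_seconds: int, warn_seconds: int, max_seconds: int) -> str:
    """Classify the elapsed recording time against the configured limits."""
    if elapsed_seconds >= max_seconds:
        return "stop"  # limit reached: stop recording
    if elapsed_seconds >= warn_seconds:
        return "warn"  # approaching the maximum: warn the user
    return "ok"


def locale_supported(locale: str, supported: list) -> bool:
    """Compare locale tags case-insensitively (assumption, see lead-in)."""
    wanted = locale.lower()
    return any(wanted == candidate.lower() for candidate in supported)
```

With the example response above, a recording that has run for 2700 seconds would be classified as "warn" and one at 4500 seconds as "stop".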
Errors
| gRPC status code | Cause |
|---|---|
| INVALID_ARGUMENT | Request is null, or a required field contains an invalid GUID. |
| UNAUTHENTICATED | Missing or invalid bearer token. |
| INTERNAL | Unexpected server error or failure communicating with the downstream system. |
RecordAmbient
Type: Bidirectional streaming
Streams an ambient audio recording to the server in three phases:
- Open - The client sends a RecordingOpenRequest to initialize the session.
- Stream - The client sends DataChunkRequest messages containing audio data. The server periodically sends DataStorageResponse messages indicating cumulative bytes stored.
- Close - The client sends a RecordingCloseRequest to end the recording. The server responds with a RecordingCloseResponse confirming total bytes stored.
rpc RecordAmbient(stream RecordAmbientRequest) returns (stream RecordAmbientResponse);
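The open, stream, close ordering can be enforced on the client before anything is written to the stream. The class below is a hypothetical sketch, not service code; it assumes that closing immediately after opening (with no data chunks) is acceptable, which the documentation does not state.

```python
class RecordingSequence:
    """Tracks which RecordAmbientRequest variant may be sent next."""

    # Allowed variants per state; states progress new -> open -> closed.
    _ALLOWED = {
        "new": {"recording_open"},
        "open": {"data_chunk", "recording_close"},
        "closed": set(),
    }

    def __init__(self) -> None:
        self.state = "new"

    def check(self, variant: str) -> None:
        """Raise if `variant` is out of order, otherwise advance the state."""
        if variant not in self._ALLOWED[self.state]:
            raise RuntimeError(f"{variant} is not valid in state {self.state!r}")
        if variant == "recording_open":
            self.state = "open"
        elif variant == "recording_close":
            self.state = "closed"
```

Calling `check` before each send turns an out-of-order message into a local error instead of a failed stream.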
Request messages
The client sends RecordAmbientRequest messages containing one of the following:
| Variant | Type | Description |
|---|---|---|
| recording_open | RecordingOpenRequest | Starts a new recording session with session metadata. Must be sent first. |
| data_chunk | DataChunkRequest | Streams a chunk of audio data. |
| recording_close | RecordingCloseRequest | Signals the end of the recording. |
RecordingOpenRequest
| Field | Type | Required | Description |
|---|---|---|---|
| recording_id | string | Yes | A caller-defined unique identifier for the recording. |
| data_format | DataFormat | Yes | The audio encoding format. Supported: PCM (signed 16-bit LE), Ogg Opus, WebM Opus. |
| ambient_session_data | AmbientSession | Yes | Metadata for the ambient session (see AmbientSession). |
| actions | string[] | No | AI actions to perform. Use "generate-draft" for draft generation. If omitted, only a transcript is generated. |
| reason | RecordingStartReason | No | Why the recording was started. Values: RECORDING_START_REASON_UI, RECORDING_START_REASON_WAKE_WORD, RECORDING_START_REASON_SYSTEM_RESUME. |
| starting_offset | uint32 | No | Byte offset for resuming after interruption. Set to the last confirmed data_stored value. |
| previous_encounter_sessions | EncounterSession[] | No | Previous sessions that were part of the encounter. Critical for pause/resume scenarios. |
| output_form_ids | string[] | No | Template form identifiers for note generation. |
Example:
{
  "recordingOpen": {
    "recordingId": "<recording-uuid>",
    "dataFormat": {
      "opus": {
        "sampleRateHz": 16000
      }
    },
    "ambientSessionData": {
      "productId": "<product-guid>",
      "partnerId": "<partner-guid>",
      "customerId": "<customer-guid>",
      "correlationId": "<correlation-guid>",
      "practitionerInfo": {
        "externalIdentifiers": [
          { "type": "fhirId", "identifier": "<practitioner-fhir-id>" }
        ],
        "name": {
          "givenName": "Jane",
          "familyName": "Smith",
          "suffix": "MD"
        }
      },
      "externalIdentifiers": [
        { "type": "userId", "identifier": "<external-user-id>" }
      ],
      "creationDate": "2026-02-20T14:30:00.000Z",
      "localeInfo": {
        "recordingLocales": ["en-US"],
        "encounterReportLocale": "en-US",
        "encounterUxLocale": "en-US"
      }
    },
    "actions": ["generate-draft"],
    "reason": "RECORDING_START_REASON_UI",
    "startingOffset": 0
  }
}
DataChunkRequest
| Field | Type | Required | Description |
|---|---|---|---|
| data_start | uint32 | Yes | Byte offset where this chunk begins within the overall recording. |
| data | bytes | Yes | Raw audio bytes for this chunk. |
Example:
{
  "dataChunk": {
    "dataStart": 0,
    "data": "<base64-encoded-audio-bytes>"
  }
}
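To illustrate how data_start relates to the audio bytes, the sketch below slices a buffer into chunks and records each chunk's byte offset within the overall recording. The plain dictionaries stand in for generated DataChunkRequest messages, and `iter_data_chunks` is a hypothetical name. The chunk size is left to the caller because the documentation does not prescribe one.

```python
from typing import Iterator


def iter_data_chunks(audio: bytes, chunk_size: int, starting_offset: int = 0) -> Iterator[dict]:
    """Yield chunk payloads with offsets measured from the start of the recording."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # range() stops before len(audio), so an empty chunk is never produced.
    for start in range(starting_offset, len(audio), chunk_size):
        yield {"data_start": start, "data": audio[start:start + chunk_size]}
```

Passing a non-zero `starting_offset` skips bytes the server has already confirmed, which is the case when a stream is resumed.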
RecordingCloseRequest
| Field | Type | Required | Description |
|---|---|---|---|
| recording_id | string | Yes | Must match the recording_id from RecordingOpenRequest. |
| recording_length_seconds | uint32 | Yes | Total recording duration in seconds. |
| reason | RecordingStopReason | No | Why the recording was stopped. Values: RECORDING_STOP_REASON_UI, RECORDING_STOP_REASON_VOICE_COMMAND, RECORDING_STOP_REASON_BT_DISCONNECTED, RECORDING_STOP_REASON_EXTERNAL_INTERRUPTION, RECORDING_STOP_REASON_UNEXPECTED_ERROR, RECORDING_STOP_REASON_MAX_DURATION_EXCEEDED. |
Example:
{
  "recordingClose": {
    "recordingId": "<recording-uuid>",
    "recordingLengthSeconds": 120,
    "reason": "RECORDING_STOP_REASON_UI"
  }
}
Response messages
The server sends RecordAmbientResponse messages containing one of the following:
| Variant | Type | Description |
|---|---|---|
| data_stored | DataStorageResponse | Acknowledges cumulative bytes stored. Sent approximately every 10 KB. |
| recording_closes | RecordingCloseResponse | Confirms the recording was closed and the total bytes stored. |
DataStorageResponse example:
{
  "dataStored": {
    "dataStored": 32768
  }
}
RecordingCloseResponse example:
{
  "recordingCloses": {
    "dataStored": 65536
  }
}
Note
A DataStorageResponse is not returned for every DataChunkRequest. Data is stored in server-side chunks that are independent of the client-side chunk size. When streaming is resumed after an interruption, the first DataStorageResponse contains the byte position stored prior to the interruption.
Usage notes
- Send RecordingOpenRequest as the first message. Sending data before opening a session results in INVALID_ARGUMENT.
- data_start should reflect the byte offset from the beginning of the recording. Duplicate bytes (from reconnection) are silently discarded.
- Set starting_offset when resuming after interruption to the last confirmed data_stored value.
- previous_encounter_sessions is critical for pause/resume scenarios to connect sessions in the backend.
- Audio formats supported: PCM (signed 16-bit LE), Ogg Opus, and WebM Opus.
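The resume rules above can be sketched with a small tracker that remembers the highest confirmed data_stored value and works out what still needs to be sent after a reconnect. `ResumeTracker` is a hypothetical helper. For simplicity it keeps the whole recording in memory; a real client would more likely buffer only the unconfirmed tail.

```python
class ResumeTracker:
    """Remembers captured audio and the last byte count the server confirmed."""

    def __init__(self) -> None:
        self.buffer = bytearray()
        self.confirmed = 0

    def append(self, data: bytes) -> int:
        """Buffer captured audio and return the data_start for this chunk."""
        start = len(self.buffer)
        self.buffer.extend(data)
        return start

    def on_data_stored(self, data_stored: int) -> None:
        """Record a cumulative acknowledgement, ignoring any stale value."""
        self.confirmed = max(self.confirmed, data_stored)

    def resume_plan(self) -> tuple:
        """Return (starting_offset, bytes that are not yet confirmed)."""
        return self.confirmed, bytes(self.buffer[self.confirmed:])
```

After an interruption, the first element of `resume_plan()` is the value to place in starting_offset, and the second is the audio to stream again beginning at that offset.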
Errors
| gRPC status code | Cause |
|---|---|
| INVALID_ARGUMENT | Null request, empty data chunk, invalid format, or data sent before RecordingOpenRequest. |
| FAILED_PRECONDITION | Messages sent in invalid state order. |
| RESOURCE_EXHAUSTED | Unable to write to the internal data service. Retry the request. |
| UNAUTHENTICATED | Missing or invalid bearer token. |
| INTERNAL | Unexpected server error. |
StartProcessing
Type: Unary RPC
Signals Dragon Copilot to begin processing a previously recorded ambient session. Processing happens asynchronously; the response confirms the request was accepted.
rpc StartProcessing(StartProcessingRequest) returns (StartProcessingResponse);
Request
| Field | Type | Required | Description |
|---|---|---|---|
| ambient_session_data | AmbientSession | Yes | Full ambient session metadata. Must include product_id, partner_id, customer_id, and correlation_id. |
| actions | string[] | Yes | AI actions to perform. Must not be empty. Use "generate-draft" for draft generation. |
| request_time | google.protobuf.Timestamp | No | Time the user initiated the processing request. |
| recordings_to_process | string[] | No | Recording IDs to include in processing. If omitted, all recordings for the session are processed. |
Example request:
{
  "ambientSessionData": {
    "productId": "<product-guid>",
    "partnerId": "<partner-guid>",
    "customerId": "<customer-guid>",
    "correlationId": "<correlation-guid>",
    "practitionerInfo": {
      "externalIdentifiers": [
        { "type": "fhirId", "identifier": "<practitioner-fhir-id>" }
      ],
      "name": {
        "givenName": "Jane",
        "familyName": "Smith",
        "suffix": "MD"
      }
    },
    "externalIdentifiers": [
      { "type": "userId", "identifier": "<external-user-id>" }
    ],
    "localeInfo": {
      "recordingLocales": ["en-US"],
      "encounterReportLocale": "en-US",
      "encounterUxLocale": "en-US"
    }
  },
  "requestTime": "2026-02-20T14:35:00.000Z",
  "actions": ["generate-draft"],
  "recordingsToProcess": ["<recording-uuid-1>", "<recording-uuid-2>"]
}
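The request constraints (four required session fields and a non-empty actions list) can be checked before the call is made. The sketch below uses plain dictionaries as stand-ins for the generated StartProcessingRequest and AmbientSession messages; the function and constant names are hypothetical.

```python
REQUIRED_SESSION_FIELDS = ("product_id", "partner_id", "customer_id", "correlation_id")


def build_start_processing_request(session: dict, actions: list, recordings_to_process=None) -> dict:
    """Validate the documented constraints and assemble the request payload."""
    missing = [name for name in REQUIRED_SESSION_FIELDS if not session.get(name)]
    if missing:
        raise ValueError("ambient_session_data is missing: " + ", ".join(missing))
    if not actions:
        raise ValueError("actions must contain at least one action")

    request = {"ambient_session_data": dict(session), "actions": list(actions)}
    # Leaving recordings_to_process out means all recordings for the session.
    if recordings_to_process:
        request["recordings_to_process"] = list(recordings_to_process)
    return request
```

Validating locally gives a clearer error than waiting for the service to reject an empty actions list.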
Response
| Field | Type | Description |
|---|---|---|
| streaming_response.error_code | uint32 | 0 indicates success. Non-zero indicates an error. |
| streaming_response.error_message | string | Human-readable message. |
| streaming_response.detailed_error_information | string | Additional diagnostic information. |
Success response:
{
  "streamingResponse": {
    "errorCode": 0,
    "errorMessage": "SUCCESS",
    "detailedErrorInformation": "SUCCESS"
  }
}
Error response:
{
  "streamingResponse": {
    "errorCode": 1,
    "errorMessage": "Error processing StartProcessing request",
    "detailedErrorInformation": ""
  }
}
Usage notes
- actions is required and must contain at least one action. An empty or null value results in INVALID_ARGUMENT.
- recordings_to_process is optional. When omitted, all recordings for the session (identified by correlation_id) are processed.
- Processing is asynchronous: a successful response confirms the request was accepted, not that processing is complete.
- Downstream failures are returned in the response body as a non-zero error_code, not as gRPC status exceptions.
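Because downstream failures arrive in the response body rather than as gRPC status errors, a client needs an explicit check after a call that otherwise looks successful. The sketch below operates on the JSON form of the response shown earlier; `StartProcessingError` and `ensure_accepted` are hypothetical names. Treating a missing errorCode as 0 is an assumption based on JSON serializers commonly omitting default-valued fields.

```python
class StartProcessingError(Exception):
    """Raised when the response body reports a non-zero error code."""

    def __init__(self, code: int, message: str, details: str) -> None:
        super().__init__(f"StartProcessing failed ({code}): {message}")
        self.code = code
        self.details = details


def ensure_accepted(response: dict) -> None:
    """Raise if the body-level status indicates a downstream failure."""
    status = response.get("streamingResponse", {})
    code = status.get("errorCode", 0)  # assumption: absent means 0 (success)
    if code != 0:
        raise StartProcessingError(
            code,
            status.get("errorMessage", ""),
            status.get("detailedErrorInformation", ""),
        )
```

A return without an exception only means the request was accepted; processing itself still completes asynchronously.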
Errors
| gRPC status code | Cause |
|---|---|
| INVALID_ARGUMENT | actions field is null or empty. |
| UNAUTHENTICATED | Missing or invalid bearer token. |
Common types
AmbientSession
Session metadata provided when opening a recording or starting processing.
| Field | Type | Required | Description |
|---|---|---|---|
| product_id | string | Yes | Microsoft unique identifier of the product. |
| partner_id | string | Yes | Microsoft unique identifier for the partner. |
| customer_id | string | Yes | Microsoft unique identifier of the customer. |
| correlation_id | string | Yes | Partner-assigned unique identifier of the session (GUID). Used to correlate results. |
| practitioner_info | PractitionerInfo | No | Practitioner metadata (identifiers, name, specialty). |
| ehr_instance_id | string | No | EHR instance identifier. |
| external_identifiers | ExternalIdentifier[] | No | External identifiers. Use type "userId" for the partner's user identifier. |
| creation_date | string (ISO 8601) | No | Session creation timestamp. |
| dst_offset_seconds | int32 | No | DST offset in seconds. |
| client_info | ClientInfo | No | Client application metadata (app ID, version, SDK version, device info). |
| locale_info | LocaleInfo | No | Locale preferences for recording and report generation. |
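Two of these fields are usually generated by the client: correlation_id (a GUID) and creation_date (ISO 8601). The sketch below shows one way to produce them with the Python standard library. The helper names are hypothetical, and the millisecond precision with a trailing Z is chosen to match the examples in this document rather than a stated requirement.

```python
import uuid
from datetime import datetime, timezone


def new_correlation_id() -> str:
    """Generate a random GUID string for a new session."""
    return str(uuid.uuid4())


def iso8601_utc(moment: datetime) -> str:
    """Format an aware datetime like 2026-02-20T14:30:00.000Z."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    # isoformat() renders UTC as +00:00; swap in the Z suffix used above.
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")
```

The same correlation_id should be reused for every recording and for the later StartProcessing call that belongs to the session, since it is the value used to correlate results.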
DataFormat
Audio encoding format. Specify exactly one of the following:
| Variant | Fields | Description |
|---|---|---|
| pcm | sample_rate_hz, bitcount, channels | Signed 16-bit little-endian PCM. |
| opus | sample_rate_hz | Ogg Opus encoding. |
| webm_opus | sample_rate_hz | WebM Opus encoding. |
| byte_stream | format_specifier | Opaque byte stream with custom format specifier. |
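For the pcm variant, the three fields determine the byte rate, which ties byte offsets to recording time. The arithmetic below is a worked sketch with hypothetical helper names. It applies to uncompressed PCM only, because Opus is compressed and its byte counts do not map to duration this way.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bitcount: int = 16, channels: int = 1) -> int:
    """Bytes produced per second of uncompressed PCM audio."""
    return sample_rate_hz * (bitcount // 8) * channels


def pcm_duration_seconds(total_bytes: int, sample_rate_hz: int, bitcount: int = 16, channels: int = 1) -> float:
    """Duration represented by `total_bytes` of PCM audio."""
    return total_bytes / pcm_bytes_per_second(sample_rate_hz, bitcount, channels)
```

For example, 16 kHz mono 16-bit PCM produces 32,000 bytes per second, so a 120-second recording is 3,840,000 bytes. Under the same assumptions, an acknowledgement roughly every 10 KB corresponds to about a third of a second of audio.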
ExternalIdentifier
| Field | Type | Description |
|---|---|---|
| type | string | Identifier type (for example, "userId", "fhirId", "npi", "encounterId"). |
| identifier | string | The identifier value. |
Best practices
- Call RetrieveConfiguration before recording to confirm supported locales and duration limits.
- Implement reconnection logic with exponential backoff for RecordAmbient streaming.
- Track data_stored values to set starting_offset correctly when resuming after interruption.
- Include previous_encounter_sessions when splitting recordings across multiple sessions.
- Call StartProcessing after closing the recording stream. It is a separate unary call.
- Monitor gRPC status codes to distinguish transient errors (RESOURCE_EXHAUSTED) from permanent failures (INVALID_ARGUMENT).
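The reconnection and error-classification advice can be sketched as follows. Only RESOURCE_EXHAUSTED is treated as retryable here because it is the only code this document marks as transient. How other codes should be handled, along with the base delay, cap, and attempt count, are assumptions to adjust. The function names are hypothetical.

```python
from typing import Iterator

# Status names this document describes as transient (retry is suggested).
RETRYABLE_STATUS = {"RESOURCE_EXHAUSTED"}


def should_retry(status_name: str) -> bool:
    """Return True only for status codes documented as transient."""
    return status_name in RETRYABLE_STATUS


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> Iterator[float]:
    """Yield exponentially growing delays in seconds, limited to `cap`."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))
```

A reconnect loop would sleep for each yielded delay before reopening the stream, and stop retrying as soon as `should_retry` returns False. Adding random jitter to each delay is a common refinement that is left out here for clarity.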