Ambient audio streaming service

The ambient audio streaming (AAS) service provides gRPC, WebSocket, and REST APIs that you can use to upload audio data and notify Dragon Copilot when audio recordings are available for processing:

  • gRPC API - Bidirectional streaming via gRPC. Supports retrieving configuration, streaming audio recordings in real time, and triggering processing. Best suited for server-side and native client integrations.
  • WebSocket API - Bidirectional streaming via WebSocket. Supports the same operations as gRPC. Well suited for browser-based and web client integrations.
  • REST API - Upload audio chunks as multipart form data and finalize recordings. Best suited for batch or non-real-time upload workflows.

All three APIs support the same core operations:

  • Retrieve configuration - Get supported audio formats, locale settings, and recording duration limits for a given partner and customer.
  • Send audio - Stream audio in real time (gRPC/WebSocket) or upload audio chunks (REST).
  • Trigger processing - Signal Dragon Copilot to begin processing a recorded ambient session.

For details, see the Ambient audio streaming API reference in the Azure REST API documentation.

Ambient audio upload workflows

Using REST

To upload audio for Dragon Copilot processing using the REST API, make the following calls:

  1. Create an ambient session - Call PUT /ambient-sessions to establish the session context, including the correlation ID, partner, customer, product, and EHR metadata.

  2. Upload audio chunks - Call PUT /audio/storeChunk for each audio chunk, referencing the same correlationId that you used to create the session in step 1.

  3. Finalize the upload - Call POST /audio/finalizeUpload to declare the total chunk count and trigger processing.
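The three REST steps above can be sketched as a request plan. This is a minimal illustration only: it splits a recording into chunks and lists the calls in order without sending anything. The names PlannedRequest, split_chunks, and plan_rest_upload are hypothetical helpers, and the chunk size is an arbitrary assumption. The actual request fields are defined in the API reference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlannedRequest:
    """One planned REST call (hypothetical helper type, not part of the API)."""
    method: str
    path: str
    correlation_id: str
    note: str
    body_size: int = 0


def split_chunks(audio: bytes, chunk_size: int) -> List[bytes]:
    """Split raw audio bytes into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [audio[i:i + chunk_size] for i in range(0, len(audio), chunk_size)]


def plan_rest_upload(audio: bytes, chunk_size: int, correlation_id: str) -> List[PlannedRequest]:
    """List the calls for steps 1 to 3 in order; nothing is sent over the network."""
    chunks = split_chunks(audio, chunk_size)
    # Step 1: create the ambient session.
    plan = [PlannedRequest("PUT", "/ambient-sessions", correlation_id, "create session")]
    # Step 2: one storeChunk call per chunk, all with the same correlation ID.
    for index, chunk in enumerate(chunks):
        plan.append(
            PlannedRequest("PUT", "/audio/storeChunk", correlation_id, f"chunk {index}", len(chunk))
        )
    # Step 3: finalize and declare the total chunk count.
    plan.append(
        PlannedRequest("POST", "/audio/finalizeUpload", correlation_id, f"declare {len(chunks)} chunks")
    )
    return plan
```

For example, plan_rest_upload(audio_bytes, 1000, "example-correlation-id") on a 2,500-byte recording yields one session call, three chunk calls, and one finalize call.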

Note

You can perform steps 2 and 3 in any order. The service reconciles chunks as they arrive and completes processing once it receives all declared chunks.
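To see why the order of steps 2 and 3 doesn't matter, consider the following toy model of order-independent reconciliation. It is not the service's implementation. The UploadTracker class and the zero-based chunk index are assumptions made for illustration.

```python
from typing import Optional, Set


class UploadTracker:
    """Toy model: an upload is complete once every declared chunk has arrived."""

    def __init__(self) -> None:
        self.received: Set[int] = set()
        self.declared_total: Optional[int] = None

    def store_chunk(self, index: int) -> bool:
        """Record a chunk (assumed zero-based index); return True if the upload is now complete."""
        self.received.add(index)
        return self.is_complete()

    def finalize(self, total_chunks: int) -> bool:
        """Declare the total chunk count; return True if the upload is now complete."""
        self.declared_total = total_chunks
        return self.is_complete()

    def is_complete(self) -> bool:
        # Completion needs both the declared count and every chunk below that count.
        if self.declared_total is None:
            return False
        return all(i in self.received for i in range(self.declared_total))
```

In this model, calling finalize(3) before any chunk arrives returns False, and the upload becomes complete only when the last of the three chunks is stored, whatever the arrival order.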

Using gRPC or WebSocket

For gRPC and WebSocket, session metadata is provided as part of the streaming handshake. The service creates the session internally when the client opens a recording, so you don't need a separate PUT /ambient-sessions call.

  1. Retrieve configuration (optional but recommended) - Call RetrieveConfiguration (gRPC) or connect to GET /ws/retrieveConfiguration (WebSocket) to get supported audio formats, locale settings, and recording duration limits.

  2. Stream audio - Open a recording session with full session metadata (partner, customer, product, correlation ID, and practitioner information), then stream audio data:

    • gRPC: Open a RecordAmbient bidirectional stream. Send a RecordingOpenRequest, followed by DataChunkRequest messages, and close with a RecordingCloseRequest.

    • WebSocket: Connect to GET /ws. Send a RecordingOpen text message, followed by binary DataChunk messages, and close with a RecordingClose text message.

  3. Trigger processing - Signal Dragon Copilot to begin processing:

    • gRPC: Call the StartProcessing unary RPC.

    • WebSocket: Connect to GET /ws/startProcessing and send a StartProcessing message.
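The WebSocket message order in step 2 (a text open message, binary data chunks, then a text close message) can be sketched as a frame generator. This is an illustration under assumptions: the JSON envelope with a "type" field, the metadata keys, and the recording_frames helper are all hypothetical. The real message schemas are defined in the API reference. The sketch only produces frames and doesn't open a connection.

```python
import json
from typing import Iterator, Tuple, Union

# A frame is ("text", str) or ("binary", bytes), mirroring WebSocket message kinds.
Frame = Tuple[str, Union[str, bytes]]


def recording_frames(audio: bytes, chunk_size: int, metadata: dict) -> Iterator[Frame]:
    """Yield frames in the documented order: RecordingOpen, DataChunk(s), RecordingClose."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Text message that opens the recording and carries the session metadata.
    # The JSON shape here is an assumption, not the documented schema.
    yield ("text", json.dumps({"type": "RecordingOpen", **metadata}))
    # Binary messages that carry the audio data.
    for offset in range(0, len(audio), chunk_size):
        yield ("binary", audio[offset:offset + chunk_size])
    # Text message that closes the recording.
    yield ("text", json.dumps({"type": "RecordingClose"}))
```

A client would send each frame over its connection to GET /ws using whichever WebSocket library it prefers, and then trigger processing as described in step 3.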

See also