Tune accuracy

In the navigation pane of the AI customization manager, select Tune accuracy.

Here you test the flowsheet templates enabled for ambient recording. You can create, edit, and execute test cases made up of a transcript and an expected output. The transcript is a phrase you expect a Dragon Copilot user to say when working with patients, and the expected output is the flowsheet template and rows that are populated for that transcript.

These pairings of transcripts and expected outputs make up the test case library for an environment. You can execute the test cases and evaluate the results to assess the current accuracy of templates enabled for ambient recording. Tune accuracy performs an extraction based on the transcript and returns the actual output, simulating what Dragon Copilot does when it converts an ambient recording to a transcript and populates the flowsheet template rows. You can use these results to validate the flowsheet templates before they're used in Dragon Copilot.

If the actual output is as expected, the test case has a status of Passing and you can assume that Dragon Copilot will process the ambient recording correctly for this flowsheet template.

If the actual output isn't as expected, the test case has a status of Needs Review and you can address the issues. For example, the expected output could be incorrect based on the flowsheet templates, or the flowsheet templates in the EHR could be misconfigured. You can make updates in either the test case or the source EHR, reimport flowsheet templates, and re-run the test cases. This is an iterative process, both during implementation and in regular system use, to validate that accuracy remains consistent and aligned with the latest configurations.

Benefits of using the Tune accuracy tool

Accessible testing

No test EHR device required; run tests directly on the schema from a test or production environment.
Any user with access to the AI customization manager can create and run tests.

Repeatable testing

Create a test once and re‑run it at any time.
Tests use consistent expected outputs tailored to your organization's flowsheets.
Run one, multiple, or all test cases on demand; no manual output review is required.

Measure accuracy over time

Re‑run tests after template changes to assess impact.
Validate accuracy against the latest Microsoft updates.

Test Overview tab

The Test Overview tab provides a high-level dashboard of test case results across your environment. Use this tab to monitor accuracy at a glance without navigating into individual test cases.

You can toggle between the following views: Template profiles and Templates.

Template profiles

Lists all template profiles configured in the environment. Use this view to quickly identify which template profiles have accuracy issues that need attention.

The table displays the following columns:

Column	Description
Template Profile	The service line and role combination for the profile.
Templates	Comma-separated list of template names assigned to the profile.
Templates Without Tests	Number of templates in the profile that don't have any test cases.
Total Tests	Total number of test cases associated with the profile.
Passed	Number of test cases with a Passing status.
Needs Review	Number of test cases with the status Needs Review.
Pass Rate	Percentage of test cases that passed out of the total tests run.
Last Tested	Date of the most recent test run for the profile.
View	Link to view the template profile details.

You can update the service line and role assignments of a flowsheet template in the Flowsheet manager.

Templates

Lists all individual flowsheet templates enabled for ambient recording. Use this view to drill down and identify which specific templates are contributing to test cases that need review.

The table displays the following columns:

Column	Description
Template	The template ID and name (for example, "31020 / Complex Assessment").
Template Type	Category of the template (for example, Intake & Output, Basic Assessment, or Activities of Daily Living).
Template Profile	The service line and role combination the template is assigned to.
Total Tests	Total number of test cases for the template.
Passed	Number of test cases with the status Passing.
Needs Review	Number of test cases with the status Needs Review.
Pass Rate	Percentage of test cases that passed out of the total tests run.
Last Tested	Date of the most recent test run for the template.
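As a worked example of the Pass Rate column, the following minimal sketch computes the percentage from the Passed and Needs Review counts. It assumes that test cases with a status of New are excluded because they haven't been run, and that the result is rounded to one decimal place; neither detail is confirmed here, and the function name `pass_rate` is hypothetical.

```python
def pass_rate(passed: int, needs_review: int) -> float:
    """Percentage of run test cases that passed.

    Assumption: only Passing and Needs Review test cases count as "run";
    test cases with a status of New are left out of the denominator.
    """
    tests_run = passed + needs_review
    if tests_run == 0:
        return 0.0  # nothing has been run yet, so there is no rate to report
    return round(100 * passed / tests_run, 1)


# 18 Passing and 2 Needs Review test cases give a pass rate of 90.0.
rate = pass_rate(passed=18, needs_review=2)
```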

Test Manager tab

Displays the list of test cases in your environment. When you enable flowsheet templates for ambient recording in the Flowsheet manager, a set of default test cases is deployed in the environment.

Important

Test cases are scoped to a single environment and are not shared across environments unless manually exported and imported. Because flowsheets can differ between your Test and Production environments, test cases created or annotated in one environment are only available in that environment. Default onboarding test cases are automatically loaded for each environment when you enable templates for ambient recording.


Test Manager columns

Column	Description
Transcript	The transcript content for the test case.
Expected	Expected outcome of the extraction request.
Actual	Actual outcome of the extraction request.
Test status	Status indicator: New, Passing, Needs Review.
Created	Date the test case was created.
Template Profile	The template profile assigned to the test case.
Enabled	Indicates the state of the test case.

For the default test cases, only the Transcript and Test status columns display values.

Test Manager toolbar

The toolbar at the top of the Test Manager tab includes the following controls:

Search: Filter test cases by keyword.
Filter by tags: Filter the list by tags assigned to test cases.
Filter by profile: Filter the list by template profile.
Edit columns: Customize which columns are visible. The current row count is displayed next to this button.

When you select one or more test cases, the following action buttons are enabled:

Delete: Delete the selected test cases.
Assign profile: Assign or change the template profile for the selected test cases.
Trigger test run: Run the selected test cases.
Annotate with AI: Automatically generate expected output annotations using AI for the selected test cases.
Save all changes: Save any pending changes to the selected test cases.

The top-right corner of the tab includes:

Import: Import test cases from a file.
Export: Export test cases to a file.
New test case: Create a new test case.

Create a test case

You can create new test cases for your environment from scratch or use a sample transcript.

Note

This example assumes you have imported a standard template (such as Basic Vitals, Vital Signs, or Vitals) and it has been enabled for ambient recording.

Enter the transcript

Open the Test Manager tab and select New test case. The Enter transcript for your test case screen opens.
In the transcript field, enter or paste the transcript text. Select Enter to create a new speaker turn. Each speaker turn can be assigned to either Clinician or Other; select the corresponding toggle. Use the + button to add speaker turns and the trash icon to remove them.

Alternatively, select Sample transcripts to select from a prebuilt transcript, such as Nurse-patient interaction.

For example, enter the following as a Clinician speaker turn:

Patient blood pressure is 120 over 80. Blood pressure location is left arm. Blood Pressure method is manual. Pulse 68. Respiration 22. Height 71 inches. Weight 210 pounds.
Select Next.

Select a template profile

On the Select profile and confirm transcript screen, select the Template profile. The template profile defines the service line and role for the test case (for example, Registered Nurse (RN) - Med Surg (Medical-Surgical)).

If there are no options available in the Template profile menu, template profiles haven't been configured for this flowsheet schema. Configure the service lines and roles for your flowsheet templates in the Flowsheet manager.
Select Create Test. The test case overview page opens. See Test case overview page for details on the page layout.
Define the expected output and save the test case. See Define the expected output for the full annotation workflow.

Test case overview page

When you select a test case from the Test Manager list or create a new test case, the test case overview page opens. This page is the same whether you're creating a new test case or editing an existing one.

The following tabs are available:

Test Case Overview
Reference Info

The top-right corner contains the following buttons:

Delete test case
Save all changes

Test Case Overview tab

The Test Case Overview tab displays the following sections:

Template Profile: Shows the template profile assigned to the test case. If no template profiles are configured for the flowsheet schema, a warning is displayed: No template profiles are configured for this flowsheet schema. Would you like to continue with the static schema?
Tags: Select Add tags to add tags for categorizing and filtering test cases.
Transcript: Displays the transcript for the test case, with each speaker turn labeled (for example, [Clinician]). Select the Edit (pencil) icon to modify the transcript.
Expected output: Initially displays a prompt to Define expected output with the description: Expected output defines what you expect to be extracted from the transcript. This is used to show if the test is passing or not when run. Select + Add expected output to open the annotation window. After annotations are saved, the expected results display here. Select Edit expected output to modify existing annotations.

Test result: Before a test run, this section shows Run test with a prerequisites checklist (Transcript is provided and Changes are saved). After a test run is executed, this section displays a comparison table with the following columns:

Column	Description
Template	The template ID and name from the actual output.
Group	The group ID and name from the actual output.
Row	The row ID and name from the actual output.
Expected	The expected value defined in the test case annotations.
Actual	The actual value extracted by the system. A warning icon indicates a mismatch between expected and actual values.

Select Re-run Test to run this individual test case again.

Reference Info tab

Displays the Test Case Details section with the following fields:

Field	Description
Test Case ID	Read-only unique identifier for the test case.
Last Edited By	Read-only field showing the name of the user who last edited the test case.
Session ID	Optional field to enter a session ID for reference.
Request ID	Optional field to enter a request ID for reference.
Ticket number	Optional field to enter an ADO ticket number for tracking.

Define the expected output

The expected output defines what you expect the system to extract from the transcript. You define expected output through the Edit Expected Output full-screen window, which is the same whether you're creating a new test case or editing an existing one.

From the test case overview page, select Add expected output (for a new test case) or Edit expected output (for an existing test case). The Edit Expected Output screen opens.

The header displays the total number of annotations (for example, 0 annotations). The top-right corner contains Cancel and Save annotations buttons.
The Transcript section at the top of the window, which you can collapse or expand, shows the transcript content with the number of speaker turns (for example, 1 turn).
If a template profile is assigned, the templates available for annotation are filtered by that profile. If no template profiles are configured, you can select Continue with static schema to access all templates.

After the templates load, the Edit Expected Output window displays the following tabs: Annotate output and Expected output.

Annotate output tab

Define expected row-value pairs:

Templates sidebar (left): Lists the available templates for the assigned template profile. Each template name shows a count of annotations applied to it (for example, Basic Assessment (4)). Select a template to view its rows.
Filter bar (top): Includes an All menu to filter row types and a Search field to find specific rows.
Row and Value columns: The main area displays the template's rows organized in a hierarchical tree structure by group (for example, NEUROLOGICAL (4)). Groups can be expanded or collapsed. Each row shows its ID and name (for example, 301860 - Level of Consciousness - Updated).
Value selectors: Each row has an input field for its expected value. To annotate a row, select a value from the dropdown or enter a numeric value. For more complex values, use the popup modal that appears when you select the field.
Annotated rows: Rows with assigned values are highlighted. Each annotated row shows edit (pencil) and delete (trash) icons. Rows that are inferred cascading parents of annotated rows are also highlighted and marked with an inferred indicator icon.

For each row you expect the system to extract from the transcript, select the row's value field and choose the expected value. For example, for a neurological assessment you might annotate Level of Consciousness with New agitation and R Pupil Size (mm) with 2.

Note

When you annotate a child row, its cascading parent rows are automatically inferred and highlighted. Ensure that the cascading parents are also included in your expected output, as the actual output will include these inferred parents.
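The inference described in this note can be pictured as walking up a row hierarchy. The following is a minimal sketch, assuming each row has at most one cascading parent; the row names, the `parent_of` mapping, and the function name are hypothetical and don't reflect how the product stores its schema.

```python
def with_inferred_parents(annotated: set[str], parent_of: dict[str, str]) -> dict[str, bool]:
    """Map each row in the expected output to an "is inferred" flag.

    Directly annotated rows map to False; cascading parents added by
    walking up the (hypothetical) parent_of mapping map to True.
    """
    result = {row: False for row in annotated}
    for row in annotated:
        parent = parent_of.get(row)
        # Stop at the top of the hierarchy or at a row that is already included.
        while parent is not None and parent not in result:
            result[parent] = True
            parent = parent_of.get(parent)
    return result


# Hypothetical hierarchy: child row -> cascading parent row.
parent_of = {
    "R Pupil Size (mm)": "Pupil Assessment",
    "Pupil Assessment": "Neurological Assessment Performed",
}
rows = with_inferred_parents({"R Pupil Size (mm)"}, parent_of)
```

In this sketch, `rows` contains the annotated row plus both ancestors, which mirrors the Is inferred column on the Expected output tab.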

Expected output tab

Review all annotated rows in a summary table. The tab label shows the count of annotations (for example, Expected output (4)).

The summary table includes the following columns:

Column	Description
Template	The template ID and name where the annotated row resides.
Group	The group ID and name containing the row.
Row	The row ID and name of the annotated observation.
Value	The expected value assigned to the row.
Is inferred	Indicates whether the row is an inferred cascading parent (Yes) or a directly annotated row (No).
Actions	Delete (trash) icon to remove the annotation.

The Template Profile indicator in the top-right corner shows which profile is in use (for example, Using static schema — all templates are available.).

Use this tab to verify your annotations are complete and correct before saving.

Save annotations

In the Edit Expected Output screen, select Save annotations (top-right) to save your expected output and return to the test case overview page.
On the test case overview page, select Save all changes to persist the test case.
To run the test immediately, select Re-run Test from the Test result card. Alternatively, return to the Test Manager tab and continue annotating other test cases before triggering a batch test run.

Note

Test cases that have not yet been configured with expected output won't be tagged with a status of Passing. You can continue to edit and update these remaining test cases to ensure proper coverage.

Edit a test case

Open the Test Manager tab and select a test case in the Transcript column. The test case overview page opens.
From the Test Case Overview tab, you can:
- Edit the transcript by selecting the Edit (pencil) icon on the Transcript card.
- Add or update tags in the Tags section.
- Change the template profile in the Template Profile section.
To update the expected output, select Edit expected output (or Add expected output if no expected output has been defined). Follow the steps in Define the expected output.
Select Save annotations on the annotation screen, then Save all changes on the test case overview page.
To run the test, select Re-run Test from the Test result card, or return to the Test Manager tab to run tests in batch.

Trigger a test run

After you annotate your test cases with their expected output, you can trigger a test run. The test run extracts the AI output for all selected test cases and compares it to the expected output, evaluating whether the AI output is correct.

Only selected test cases are included in the test run. To run all tests, select the option at the top left of the test case table to select all. To evaluate only a subset, select the option to the left of each test case you want to include.
Select Trigger test run. The test runs in the background and can take several minutes to complete.
Periodically check the progress indicator for the test run.
After the tests complete, sort the test cases by their Test status column to review results.

Test cases are assigned one of the following statuses:

Status	Description
New	The test case hasn't been run yet.
Passing	The actual AI output matches the expected output for this transcript.
Needs Review	The actual AI output doesn't match the expected output for this transcript.
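The three statuses can be summarized in a few lines of code. This is a minimal sketch under stated assumptions: an observation is modeled as a (template, row, value) triple, a test case that hasn't been run has no actual output, and a match means the two sets are identical. The class and function names are hypothetical, and the real comparison logic isn't documented here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical model: one observation is a (template, row, value) triple.
Observation = tuple[str, str, str]


@dataclass
class FlowsheetTestCase:
    transcript: str
    expected: frozenset[Observation]
    actual: Optional[frozenset[Observation]] = None  # None until a test run completes


def status(case: FlowsheetTestCase) -> str:
    """Return New, Passing, or Needs Review for a test case."""
    if case.actual is None:
        return "New"  # the test case hasn't been run yet
    # Assumption: Passing requires the actual output to equal the expected output exactly.
    return "Passing" if case.actual == case.expected else "Needs Review"
```

Because the comparison is on whole triples, a correct row reported under a different template counts as a mismatch in this sketch.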

Resolve test cases that need review

After triggering a test run, review the test cases by their status. Test cases marked Passing don't require further action — the system extracted exactly what you defined as the expected output. Ensure all test cases have been run and no tests remain with a status of New.

For each test case with the status Needs Review, work through the following checklist:

Verify that the test case is relevant to your organization: The default test cases are designed to be broad, and not all of them might apply to your organization, unit, or workflow. If a test case isn't relevant, delete it and move to the next one.
Confirm the expected output is correctly annotated: Verify that all observations referenced in the transcript are annotated in the expected output. It's common to miss a row, enter the wrong value for a row, or annotate the wrong row.
Confirm that cascading parents are annotated: The actual output always includes cascading parents inferred from the spoken observations. If any cascading parents are missing from the expected output, the system marks the test case as needing review.
Confirm the correct template is selected: Rows can be shared across multiple templates. If the actual output has the correct row but a different template than the expected output, the test case is marked as needing review.
Check synonyms for missing rows: If an expected output row is missing from the actual output, check the synonyms for that row in the Flowsheet viewer. If no synonyms exist or none match how nurses speak, consider adding synonyms to the flowsheet row in your EHR. Updated synonyms also improve search functionality in the EHR for non-ambient users.
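For the synonym check in the last item, a quick first pass is to see whether any of a row's synonyms literally appears in the transcript. The sketch below assumes you have copied the synonyms out of the Flowsheet viewer by hand; the function name is hypothetical, and the literal, case-insensitive match is only a triage aid, not how the system itself matches speech to rows.

```python
def spoken_synonyms(transcript: str, synonyms: list[str]) -> list[str]:
    """Return the synonyms that appear verbatim (ignoring case) in the transcript."""
    text = transcript.lower()
    return [s for s in synonyms if s.lower() in text]


transcript = "Pulse 68. Respiration 22. Height 71 inches."
# Hypothetical synonym lists for two rows.
pulse_hits = spoken_synonyms(transcript, ["Heart rate", "Pulse"])
resp_hits = spoken_synonyms(transcript, ["Resp rate", "RR"])
```

An empty result, as for the second row in this sketch, suggests that the row's synonyms don't cover how the observation is spoken in the test transcript.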

Resolution

After identifying the issue, take one or more of the following actions:

Delete the test case if it isn't relevant to your organization.
Update the test case expected output with missing or corrected values.
Update the transcript content to better match users' speaking patterns.
In the EHR, update the synonyms associated with the flowsheet row for better ambient support.
In the EHR, disable the row that isn't relevant.

After completing these actions, re-run the test cases to validate your changes.

Common causes for test cases that need review

The following scenarios can cause a test case to have a status of Needs Review:

The actual output contains an observation (row and value) that isn't in the expected output

Common reasons:

Cascading parents weren't annotated in the expected output, or the incorrect row was annotated.
The actual output row is semantically similar to the expected row, leading the system to select the wrong row.
The actual output row is shared across multiple templates, and the system selected the wrong template.

The actual output contains the wrong value for a row in the expected output

Common reasons:

The expected output annotation was incorrect, or the incorrect row was annotated.
The actual output row is semantically similar to the expected row, leading the system to select the wrong row.

The actual output is missing a row defined in the expected output

Common reasons:

The expected output annotation is incorrect, or a cascading parent row was annotated but isn't inferable by the system (for example, it shares multiple parents).
There's nothing semantically relevant in the target row compared to the test transcript.
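The three scenarios above correspond to three kinds of differences between the expected and actual output. The following minimal sketch sorts the differences into those buckets, assuming each output is a mapping from a (template, row) pair to a value; the function name, bucket names, and sample rows are hypothetical.

```python
Key = tuple[str, str]  # hypothetical key: (template, row)


def classify(expected: dict[Key, str], actual: dict[Key, str]) -> dict[str, list[Key]]:
    """Group mismatches between expected and actual output by scenario."""
    return {
        # In the actual output but never annotated in the expected output.
        "unexpected": sorted(k for k in actual if k not in expected),
        # Present in both, but with different values.
        "wrong_value": sorted(k for k in expected if k in actual and actual[k] != expected[k]),
        # Annotated in the expected output but not extracted.
        "missing": sorted(k for k in expected if k not in actual),
    }


expected = {("Vitals", "Pulse"): "68", ("Vitals", "Respiration"): "22", ("Vitals", "Height"): "71"}
actual = {("Vitals", "Pulse"): "86", ("Vitals", "Respiration"): "22", ("Basic Vitals", "Height"): "71"}
diff = classify(expected, actual)
```

In this sketch, the Height row extracted under a different template appears as both a missing row and an unexpected one, which matches the shared-row case described earlier.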

Tip

Use the Flowsheet analyzer as a companion tool to Tune accuracy. The Flowsheet analyzer scans templates enabled for ambient recording and evaluates the schema against a set of rules, which can help identify the cause of failing test cases.

Frequently asked questions

Where do the default test cases come from?

The default test cases are developed by Microsoft AI Researchers and Clinical Integrity Nurses. They're designed to be a broad, organization-agnostic set of verbalizations commonly observed from nurses. The test set focuses on Med Surg RN verbalization patterns and doesn't include separate Med Surg CNA test cases because the RN verbalization patterns encompass CNA patterns.

What does it mean for a test case to pass vs. need review?

A test case with the status Passing means the system extracted exactly what you defined as the expected output — no further action is needed.

The status Needs Review means the actual extracted data doesn't match the expected output, and you should investigate using the resolution actions.

What can I do with the Tune accuracy tool in addition to running test cases?

Tune accuracy gives you visibility into the accuracy of the system by enabling you to test multiple scenarios at scale without needing access to a test EHR device. Current functionality includes creating new test cases, running tests, analyzing your flowsheet build, and reviewing changes in the import history. Informatics leads and EHR analysts can also use insights from Tune accuracy to adapt flowsheets for better ambient support. More functionality to natively complete tuning work within the AI customization manager is planned for future releases.

Last updated on 2026-02-19
