In the navigation pane of the AI customization manager, select Tune accuracy.
Here you test the flowsheet templates enabled for ambient recording. You can create, edit, and execute test cases made up of a transcript and an expected output. The transcript is a phrase you expect a Dragon Copilot user to say when working with patients, and the expected output is the flowsheet template and rows that are populated for that transcript.
This pairing of transcript phrases to expected outcomes makes up a test case library for an environment. You can execute the test cases and evaluate the results to assess the current accuracy of templates enabled for ambient recording. Tune accuracy performs an extraction based on the transcript and returns the actual output. It simulates actions taken by Dragon Copilot when the ambient recording is converted to a transcript and populates the flowsheet template rows. You can use this data to validate the flowsheet templates before they're used in Dragon Copilot.
If the actual output is as expected, the test case has a status of Passing and you can assume that Dragon Copilot will process the ambient recording correctly for this flowsheet template.
If the actual output isn't as expected, the test case has a status of Needs Review and you can address the issues. For example, the expected outcome could be incorrect based on the flowsheet templates, or the flowsheet templates in the EHR could be misconfigured. You can make updates in either the test case or in the source EHR, reimport flowsheet templates, and re-run these test cases. This is an iterative process, both during implementation and regular system use, to validate that accuracy remains consistent and is aligned with the latest configurations.
Benefits of using the Tune accuracy tool
Accessible testing
- No test EHR device required; run tests directly on the schema from a test or production environment.
- Any user with access to the AI customization manager can create and run tests.
Repeatable testing
- Create a test once and re‑run it at any time.
- Tests use consistent expected outputs tailored to your organization's flowsheets.
- Run one, multiple, or all test cases on demand; no manual output review is required.
Measure accuracy over time
- Re‑run tests after template changes to assess impact.
- Validate accuracy against the latest Microsoft updates.
Test Overview tab
The Test Overview tab provides a high-level dashboard of test case results across your environment. Use this tab to monitor accuracy at a glance without navigating into individual test cases.
You can toggle between the following views: Template profiles and Templates.
Template profiles
Lists all template profiles configured in the environment. Use this view to quickly identify which template profiles have accuracy issues that need attention.
The table displays the following columns:
| Column | Description |
|---|---|
| Template Profile | The service line and role combination for the profile. |
| Templates | Comma-separated list of template names assigned to the profile. |
| Templates Without Tests | Number of templates in the profile that don't have any test cases. |
| Total Tests | Total number of test cases associated with the profile. |
| Passed | Number of test cases with a Passing status. |
| Needs Review | Number of test cases with the status Needs Review. |
| Pass Rate | Percentage of test cases that passed out of the total tests run. |
| Last Tested | Date of the most recent test run for the profile. |
| View | Link to view the template profile details. |
You can update the service line and role assignments of a flowsheet template in the Flowsheet manager.
Templates
Lists all individual flowsheet templates enabled for ambient recording. Use this view to drill down and identify which specific templates are contributing to test cases that need review.
The table displays the following columns:
| Column | Description |
|---|---|
| Template | The template ID and name (for example, "31020 / Complex Assessment"). |
| Template Type | Category of the template (for example, Intake & Output, Basic Assessment, or Activities of Daily Living). |
| Template Profile | The service line and role combination the template is assigned to. |
| Total Tests | Total number of test cases for the template. |
| Passed | Number of test cases with the status Passing. |
| Needs Review | Number of test cases with the status Needs Review. |
| Pass Rate | Percentage of test cases that passed out of the total tests run. |
| Last Tested | Date of the most recent test run for the template. |
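The Pass Rate column can be illustrated with a small calculation. This is a minimal sketch, not the product's implementation: it assumes that "total tests run" means test cases with a status of Passing or Needs Review, and that test cases still marked New are left out of the denominator. The `TemplateSummary` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemplateSummary:
    """Hypothetical per-template counts, loosely mirroring the table columns."""
    template: str
    passed: int
    needs_review: int
    new: int = 0  # test cases that have not been run yet


def pass_rate(summary: TemplateSummary) -> Optional[float]:
    """Return the percentage of run tests that passed, or None if none ran.

    Assumption: tests with the status New are excluded from the denominator.
    """
    tests_run = summary.passed + summary.needs_review
    if tests_run == 0:
        return None
    return 100.0 * summary.passed / tests_run


summary = TemplateSummary("31020 / Complex Assessment",
                          passed=9, needs_review=3, new=2)
print(pass_rate(summary))  # 75.0
```

Returning `None` when nothing has run keeps an untested template distinct from one with a 0% pass rate.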
Test Manager tab
Displays the list of test cases in your environment. When you enable flowsheet templates for ambient recording in the Flowsheet manager, a set of default test cases is deployed in the environment.
Important
Test cases are scoped to a single environment and are not shared across environments unless manually exported and imported. Because flowsheets can differ between your Test and Production environments, test cases created or annotated in one environment are only available in that environment. Default onboarding test cases are automatically loaded for each environment when you enable templates for ambient recording.
Test Manager columns
| Column | Description |
|---|---|
| Transcript | The transcript content for the test case. |
| Expected | Expected outcome of the extraction request. |
| Actual | Actual outcome of the extraction request. |
| Test status | Status indicator: New, Passing, Needs Review. |
| Created | Date the test case was created. |
| Template Profile | The template profile assigned to the test case. |
| Enabled | Indicates the state of the test case. |
For the default test cases, only the Transcript and Test status columns display values.
Test Manager toolbar
This toolbar is at the top of the Test Manager tab and includes the following controls:
- Search: Filter test cases by keyword.
- Filter by tags: Filter the list by tags assigned to test cases.
- Filter by profile: Filter the list by template profile.
- Edit columns: Customize which columns are visible. The current row count is displayed next to this button.
When you select one or more test cases, the following action buttons are enabled:
- Delete: Delete the selected test cases.
- Assign profile: Assign or change the template profile for the selected test cases.
- Trigger test run: Run the selected test cases.
- Annotate with AI: Automatically generate expected output annotations using AI for the selected test cases.
- Save all changes: Save any pending changes to the selected test cases.
The top-right corner of the tab includes:
- Import: Import test cases from a file.
- Export: Export test cases to a file.
- New test case: Create a new test case.
Create a test case
You can create new test cases for your environment from scratch or use a sample transcript.
Note
This example assumes you have imported a standard template (such as Basic Vitals, Vital Signs, or Vitals) and it has been enabled for ambient recording.
Enter the transcript
Open the Test Manager tab and select New test case. The Enter transcript for your test case screen opens.
In the transcript field, enter or paste the transcript text. Select Enter to create a new speaker turn. Each speaker turn can be assigned to either Clinician or Other; select the corresponding toggle. Use the + button to add speaker turns and the trash icon to remove them.
Alternatively, select Sample transcripts to select from a prebuilt transcript, such as Nurse-patient interaction.
For example, enter the following as a Clinician speaker turn:
Patient blood pressure is 120 over 80. Blood pressure location is left arm. Blood Pressure method is manual. Pulse 68. Respiration 22. Height 71 inches. Weight 210 pounds.
Select Next.
Select a template profile
On the Select profile and confirm transcript screen, select the Template profile. The template profile defines the service line and role for the test case (for example, Registered Nurse (RN) - Med Surg (Medical-Surgical)).
If there are no options available in the Template profile menu, template profiles haven't been configured for this flowsheet schema. Configure the service lines and roles for your flowsheet templates in the Flowsheet manager.
Select Create Test. The test case overview page opens. See Test case overview page for details on the page layout.
Define the expected output and save the test case. See Define the expected output for the full annotation workflow.
Test case overview page
When you select a test case from the Test Manager list or create a new test case, the test case overview page opens. This page is the same whether you're creating a new test case or editing an existing one.
The following tabs are available:
- Test Case Overview
- Reference Info
The top-right corner contains the following buttons:
- Delete test case
- Save all changes
Test Case Overview tab
The Test Case Overview tab displays the following sections:
Template Profile: Shows the template profile assigned to the test case. If no template profiles are configured for the flowsheet schema, a warning is displayed: No template profiles are configured for this flowsheet schema. Would you like to continue with the static schema?
Tags: Select Add tags to add tags for categorizing and filtering test cases.
Transcript: Displays the transcript for the test case, with each speaker turn labeled (for example, [Clinician]). Select the Edit (pencil) icon to modify the transcript.
Expected output: Initially displays a prompt to Define expected output with the description: Expected output defines what you expect to be extracted from the transcript. This is used to show if the test is passing or not when run. Select + Add expected output to open the annotation window. After annotations are saved, the expected results display here. Select Edit expected output to modify existing annotations.
Test result: Before a test run, this section shows Run test with a prerequisites checklist (Transcript is provided and Changes are saved). After a test run is executed, this section displays a comparison table with the following columns:
| Column | Description |
|---|---|
| Template | The template ID and name from the actual output. |
| Group | The group ID and name from the actual output. |
| Row | The row ID and name from the actual output. |
| Expected | The expected value defined in the test case annotations. |
| Actual | The actual value extracted by the system. |
A warning icon indicates a mismatch between expected and actual values. Select Re-run Test to run this individual test case again.
Reference Info tab
Displays the Test Case Details section with the following fields:
| Field | Description |
|---|---|
| Test Case ID | Read-only unique identifier for the test case. |
| Last Edited By | Read-only field showing the name of the user who last edited the test case. |
| Session ID | Optional field to enter a session ID for reference. |
| Request ID | Optional field to enter a request ID for reference. |
| Ticket number | Optional field to enter an ADO ticket number for tracking. |
Define the expected output
The expected output defines what you expect the system to extract from the transcript. You define expected output through the Edit Expected Output full-screen window, which is the same whether you're creating a new test case or editing an existing one.
From the test case overview page, select Add expected output (for a new test case) or Edit expected output (for an existing test case). The Edit Expected Output screen opens.
The header displays the total number of annotations (for example, 0 annotations). The top-right corner contains Cancel and Save annotations buttons.
The Transcript section at the top of the window, which you can collapse or expand, shows the transcript content with the number of speaker turns (for example, 1 turn).
If a template profile is assigned, the templates available for annotation are filtered by that profile. If no template profiles are configured, you can select Continue with static schema to access all templates.
After the templates load, the Edit Expected Output window displays the following tabs: Annotate output and Expected output.
Annotate output tab
Define expected row-value pairs:
Templates sidebar (left): Lists the available templates for the assigned template profile. Each template name shows a count of annotations applied to it (for example, Basic Assessment (4)). Select a template to view its rows.
Filter bar (top): Includes an All menu to filter row types and a Search field to find specific rows.
Row and Value columns: The main area displays the template's rows organized in a hierarchical tree structure by group (for example, NEUROLOGICAL (4)). Groups can be expanded or collapsed. Each row shows its ID and name (for example, 301860 - Level of Consciousness - Updated).
Value selectors: Each row has a field input for its expected value. Select a value from the dropdown to annotate that row, enter a numeric value, or for more complex values use the popup modal that appears on selecting the field.
Annotated rows: Rows with assigned values are highlighted. Each annotated row shows edit (pencil) and delete (trash) icons. Rows that are inferred cascading parents of annotated rows are also highlighted and marked with an inferred indicator icon.
For each row you expect the system to extract from the transcript, select the row's value field and then choose or enter the expected value. For example, for a neurological assessment you might annotate Level of Consciousness with New agitation and R Pupil Size (mm) with 2.
Note
When you annotate a child row, its cascading parent rows are automatically inferred and highlighted. Ensure that the cascading parents are also included in your expected output, as the actual output will include these inferred parents.
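The idea of inferred cascading parents can be sketched as a walk up a parent map. This is an illustration only: the row names, the `parent_of` mapping, and the assumption that each row has at most one cascading parent are hypothetical, and the actual inference logic in Tune accuracy isn't described here.

```python
def infer_cascading_parents(annotated_rows, parent_of):
    """Return the ancestor rows implied by the annotated rows.

    annotated_rows: iterable of row names annotated directly.
    parent_of: dict mapping a child row to its single cascading parent
               (an assumption; rows with several parents are not modeled).
    """
    annotated = set(annotated_rows)
    visited = set()
    for row in annotated:
        current = parent_of.get(row)
        # Walk upward until the top of the chain or an already-seen ancestor.
        while current is not None and current not in visited:
            visited.add(current)
            current = parent_of.get(current)
    # Rows the user annotated directly are not reported as inferred.
    return visited - annotated


# Hypothetical hierarchy: a pupil-size row nested under two parent rows.
parent_of = {
    "R Pupil Size (mm)": "Pupil Assessment",
    "Pupil Assessment": "Neuro Assessment",
}

inferred = infer_cascading_parents({"R Pupil Size (mm)"}, parent_of)
print(sorted(inferred))  # ['Neuro Assessment', 'Pupil Assessment']
```

In this sketch, both ancestor rows would need to appear in the expected output for the test case to match the actual output.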
Expected output tab
Review all annotated rows in a summary table. The tab label shows the count of annotations (for example, Expected output (4)).
The summary table includes the following columns:
| Column | Description |
|---|---|
| Template | The template ID and name where the annotated row resides. |
| Group | The group ID and name containing the row. |
| Row | The row ID and name of the annotated observation. |
| Value | The expected value assigned to the row. |
| Is inferred | Indicates whether the row is an inferred cascading parent (Yes) or a directly annotated row (No). |
| Actions | Delete (trash) icon to remove the annotation. |
The Template Profile indicator in the top-right corner shows which profile is in use (for example, Using static schema — all templates are available.).
Use this tab to verify your annotations are complete and correct before saving.
Save annotations
In the Edit Expected Output screen, select Save annotations (top-right) to save your expected output and return to the test case overview page.
On the test case overview page, select Save all changes to persist the test case.
To run the test immediately, select Re-run Test from the Test result card. Alternatively, return to the Test Manager tab and continue annotating other test cases before triggering a batch test run.
Note
Test cases that have not yet been configured with expected output won't be tagged with a status of Passing. You can continue to edit and update these remaining test cases to ensure proper coverage.
Edit a test case
Open the Test Manager tab and select a test case in the Transcript column. The test case overview page opens.
From the Test Case Overview tab, you can:
- Edit the transcript by selecting the Edit (pencil) icon on the Transcript card.
- Add or update tags in the Tags section.
- Change the template profile in the Template Profile section.
To update the expected output, select Edit expected output (or Add expected output if no expected output has been defined). Follow the steps in Define the expected output.
Select Save annotations on the annotation screen, then Save all changes on the test case overview page.
To run the test, select Re-run Test from the Test result card, or return to the Test Manager tab to run tests in batch.
Trigger a test run
After you annotate your test cases with their expected output, you can trigger a test run. The test run extracts the AI output for all selected test cases and compares it to the expected output, evaluating whether the AI output is correct.
- Only selected test cases are included in the test run. To run all tests, select the option at the top left of the test case table to select all. To evaluate only a subset, select the option to the left of each test case you want to include.
- Select Trigger test run. The test runs in the background and can take several minutes to complete.
- Periodically check the progress indicator for the test run.
- After the tests complete, sort the test cases by their Test status column to review results.
Test cases are assigned one of the following statuses:
| Status | Description |
|---|---|
| New | The test case hasn't been run yet. |
| Passing | The actual AI output matches the expected output for this transcript. |
| Needs Review | The actual AI output doesn't match the expected output for this transcript. |
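The three statuses can be summarized as a simple comparison. The following is a minimal sketch under stated assumptions: observations are modeled as hypothetical `(template, row, value)` tuples, a test case that hasn't been run has no actual output (`None`), and a match means the two sets are exactly equal. How the product compares outputs internally isn't documented here.

```python
from typing import Optional, Set, Tuple

# Hypothetical representation of one observation: (template, row, value).
Observation = Tuple[str, str, str]


def evaluate_status(expected: Set[Observation],
                    actual: Optional[Set[Observation]]) -> str:
    """Map an expected/actual pair to New, Passing, or Needs Review."""
    if not expected:
        # Assumption: a test case needs expected output before it can pass.
        raise ValueError("Define the expected output before evaluating.")
    if actual is None:
        return "New"  # the test case hasn't been run yet
    return "Passing" if actual == expected else "Needs Review"


expected = {
    ("Basic Vitals", "Pulse", "68"),
    ("Basic Vitals", "Respiration", "22"),
}

print(evaluate_status(expected, None))                                # New
print(evaluate_status(expected, set(expected)))                       # Passing
print(evaluate_status(expected, {("Basic Vitals", "Pulse", "68")}))   # Needs Review
```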
Resolve test cases that need review
After triggering a test run, review the test cases by their status. Test cases marked Passing don't require further action — the system extracted exactly what you defined as the expected output. Ensure all test cases have been run and no tests remain with a status of New.
For each test case with the status Needs Review, work through the following checklist:
Verify that the test case is relevant to your organization: The default test cases are designed to be broad, and not all of them might apply to your organization, unit, or workflow. If a test case isn't relevant, delete it and move to the next one.
Confirm the expected output is correctly annotated: Verify that all observations referenced in the transcript are annotated in the expected output. It's common to miss a row, enter the wrong value for a row, or annotate the wrong row.
Confirm that cascading parents are annotated: The actual output always includes cascading parents inferred from the spoken observations. If any cascading parents are missing from the expected output, the system marks the test case as needing review.
Confirm the correct template is selected: Rows can be shared across multiple templates. If the actual output has the correct row but a different template than the expected output, the test case is marked as needing review.
Check synonyms for missing rows: If an expected output row is missing from the actual output, check the synonyms for that row in the Flowsheet viewer. If no synonyms exist or none match how nurses speak, consider adding synonyms to the flowsheet row in your EHR. Updated synonyms also improve search functionality in the EHR for non-ambient users.
Resolution
After identifying the issue, take one or more of the following actions:
- Delete the test case if it isn't relevant to your organization.
- Update the test case expected output with missing or corrected values.
- Update the transcript content to better match users' speaking patterns.
- In the EHR, update the synonyms associated with the flowsheet row for better ambient support.
- In the EHR, disable the row that isn't relevant.
After completing these actions, re-run the test cases to validate your changes.
Common causes for test cases that need review
The following scenarios can cause a test case to have a status of Needs Review:
The actual output contains an observation (row and value) that isn't in the expected output
Common reasons:
- Cascading parents weren't annotated in the expected output, or the incorrect row was annotated.
- The actual output row is semantically similar to the expected row, leading the system to select the wrong row.
- The actual output row is shared across multiple templates, and the system selected the wrong template.
The actual output contains the wrong value for a row in the expected output
Common reasons:
- The expected output annotation was incorrect, or the incorrect row was annotated.
- The actual output row is semantically similar to the expected row, leading the system to select the wrong row.
The actual output is missing a row defined in the expected output
Common reasons:
- The expected output annotation is incorrect, or a cascading parent row was annotated but isn't inferable by the system (for example, because the row has multiple parents).
- Nothing in the target row is semantically related to the content of the test transcript.
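The three scenarios above can be expressed as a comparison of two mappings. This is a sketch for illustration, assuming each output is a dictionary keyed by a hypothetical `(template, row)` pair; the template and row names below are invented and don't come from a real flowsheet schema.

```python
def classify_mismatches(expected, actual):
    """Group differences between expected and actual output.

    Both arguments map a (template, row) key to a value.
    """
    unexpected = {k: v for k, v in actual.items() if k not in expected}
    missing = {k: v for k, v in expected.items() if k not in actual}
    wrong_value = {
        k: (expected[k], actual[k])
        for k in expected.keys() & actual.keys()
        if expected[k] != actual[k]
    }
    return {"unexpected": unexpected,
            "wrong_value": wrong_value,
            "missing": missing}


expected = {
    ("Basic Vitals", "Pulse"): "68",
    ("Basic Vitals", "Respiration"): "22",
    ("Basic Vitals", "Weight"): "210",
}
actual = {
    ("Basic Vitals", "Pulse"): "86",        # wrong value
    ("Vital Signs", "Respiration"): "22",   # right row, different template
    ("Basic Vitals", "Weight"): "210",      # match
}

result = classify_mismatches(expected, actual)
print(result["wrong_value"])  # {('Basic Vitals', 'Pulse'): ('68', '86')}
print(result["unexpected"])   # {('Vital Signs', 'Respiration'): '22'}
print(result["missing"])      # {('Basic Vitals', 'Respiration'): '22'}
```

Because the key includes the template, a row that is extracted under a different template shows up as both a missing and an unexpected observation, which matches the shared-row case described above.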
Tip
Use the Flowsheet analyzer as a companion tool to Tune accuracy. The Flowsheet analyzer scans templates enabled for ambient recording and evaluates the schema against a set of rules, which can help identify the cause of failing test cases.
Frequently asked questions
Where do the default test cases come from?
The default test cases are developed by Microsoft AI Researchers and Clinical Integrity Nurses. They're designed to be a broad, organization-agnostic set of verbalizations commonly observed from nurses. The test set focuses on Med Surg RN verbalization patterns and doesn't include separate Med Surg CNA test cases because the RN verbalization patterns encompass CNA patterns.
What does it mean for a test case to pass vs. need review?
A test case with the status Passing means the system extracted exactly what you defined as the expected output — no further action is needed.
The status Needs Review means the actual extracted data doesn't match the expected output, and you should investigate using the resolution actions.
What can I do with the Tune accuracy tool in addition to running test cases?
Tune accuracy gives you visibility into the accuracy of the system by enabling you to test multiple scenarios at scale without needing access to a test EHR device. Current functionality includes creating new test cases, running tests, analyzing your flowsheet build, and reviewing changes in the import history. Informatics leads and EHR analysts can also use insights from Tune accuracy to adapt flowsheets for better ambient support. More functionality to natively complete tuning work within the AI customization manager is planned for future releases.