
12.2604.1003.209 update failed at the CAU attempt step and will not retry

Richard Beyer 61 Reputation points
2026-05-23T08:10:53.96+00:00

I have a two-node Dell Azure Local HCI system. I have successfully updated it 6 times since it was installed in September 2025. Last month, the update from 12.2603.1002.15 to 12.2604.1003.209 failed at the CAU attempt step. Clicking the "Try again" button produces the notification "Updates successfully started", but no additional attempt appears in the history. I waited until this month's updates should have been available, assuming the new updates would resolve the problem, but no cluster updates are showing as available.

The update readiness currently shows as "Healthy". The following pending updates are showing in the Azure Update Manager:

  • 2026-05 .NET Framework Security Update (KB5087051) - 2 machines
  • 2026-05 Security Update (KB5087539) (26100.32860) - 2 machines
  • Azure Connected Machine Agent (Arc-enabled servers agent) Version 1.64.03414.2961 - May 2026 Update - 2 machines
  • Security Intelligence Update for Microsoft Defender Antivirus - KB2267602 (Version 1.451.45.0) - Current Channel (Broad) - 1 machine
  • Security Intelligence Update for Microsoft Defender Antivirus - KB2267602 (Version 1.451.42.0) - Current Channel (Broad) - 1 machine

Are Cluster Aware updates going to continue to be unavailable due to the failed update last month? If so, how do I resolve that failure?

Azure Virtual Machines

An Azure service that is used to provision Windows and Linux virtual machines.


2 answers

  1. Jerald Felix 11,720 Reputation points Volunteer Moderator
    2026-05-24T14:32:59.3533333+00:00

    Hello Richard Beyer,

    Greetings! Thanks for raising this question in the Q&A forum.

    No worries: this is a well-known stuck CAU state issue on Azure Local HCI systems, and with the additional details you've shared we can see what's happening and how to fix it.

    Looking at your Get-ClusterResource output, the root cause is clear: the CAU cluster role resources CAUali-aqtw and CAUali-aqtwResource are both stuck in an Offline state from the failed April 27, 2026 update attempt. Because Azure Local's Update Service sees this CAU role as still "in progress" (but offline/stuck), it refuses to offer or start any new cluster updates until the stuck role is cleaned up. The Get-ClusterResource -Name CAUClusterRole error you got earlier occurred simply because the role name on your system is CAUali-aqtw, not the generic CAUClusterRole. That's perfectly normal for Dell Azure Local systems.
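
    If you want to confirm that state yourself before changing anything, a read-only check along these lines may help. It is only a sketch: it assumes the FailoverClusters module is available on the node and that the CAU entries on your cluster start with "CAU", as the names mentioned above do.

    ```powershell
    # Read-only: list CAU-related cluster resources and groups with their current state.
    # Assumption: the names begin with "CAU", as with CAUali-aqtw on this system.
    Import-Module FailoverClusters

    Get-ClusterResource |
        Where-Object { $_.Name -like 'CAU*' } |
        Format-Table Name, State, OwnerGroup, ResourceType -AutoSize

    Get-ClusterGroup |
        Where-Object { $_.Name -like 'CAU*' } |
        Format-Table Name, State, OwnerNode -AutoSize
    ```

    Nothing here modifies the cluster, so it is safe to run more than once, for example before and after the cleanup steps.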

    Here's your step-by-step fix:

    Open PowerShell as Administrator on one of your cluster nodes (ALI-AZNode01 is fine since it's the current orchestrator).

    First, let's take the CAU cluster role offline cleanly and remove it. Run these commands one at a time:

    Stop-ClusterResource -Name "CAUali-aqtw"
    

    Then remove the entire CAU cluster role group:

    Remove-CauClusterRole -ClusterName ALI-AZNODE01 -Force
    

    If the above remove command errors, try removing the cluster group directly:

    Remove-ClusterGroup -Name "CAUali-aqtw" -RemoveResources -Force
    

    Once removed, confirm the old CAU resources are fully gone by running:

    Get-ClusterResource | Where-Object {$_.Name -like "CAU*"}
    

    This should return nothing. If any CAU entries still appear, remove them individually using:

    Remove-ClusterResource -Name "CAUali-aqtwResource" -Force
    

    Now locate and rename (don't delete) the stuck CAU run attempt tracking file. Based on your system, the correct path is:

    C:\ClusterStorage\Infrastructure_1\Shares\SU1_Infrastructure_1\CloudMedia\SBE\Installed\metadata\
    

    Find the CAURunAttempts.json file in that path and rename it to CAURunAttempts.json.bak. This forces CAU to treat the next run as a completely fresh attempt.
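
    If you prefer to do the rename from PowerShell, here is a minimal sketch. Rename-ToBackup is a hypothetical helper written for this answer, not part of any Microsoft module. It only renames a file that actually exists and warns otherwise.

    ```powershell
    # Hypothetical helper (not from any Microsoft module): renames a file to
    # "<name>.bak" in the same folder and returns the new full path,
    # or $null (with a warning) if the file is not there.
    function Rename-ToBackup {
        param(
            [Parameter(Mandatory)]
            [string]$Path
        )

        if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) {
            Write-Warning "File not found, nothing renamed: $Path"
            return $null
        }

        $backupName = (Split-Path -Path $Path -Leaf) + '.bak'
        # -PassThru returns the renamed item so the caller can confirm the new path.
        $renamed = Rename-Item -LiteralPath $Path -NewName $backupName -PassThru
        return $renamed.FullName
    }
    ```

    You would call it as Rename-ToBackup -Path (Join-Path $metadataDir 'CAURunAttempts.json'), where $metadataDir is a variable you set to the folder shown above. If a CAURunAttempts.json.bak already exists in that folder, Rename-Item will report an error instead of overwriting it.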

    Now restart the Azure Stack HCI Update Service and Orchestrator Service to clear any cached state:

    Stop-ClusterResource -Name "Azure Stack HCI Update Service"
    Stop-ClusterResource -Name "Azure Stack HCI Orchestrator Service"
    Start-ClusterResource -Name "Azure Stack HCI Orchestrator Service"
    Start-ClusterResource -Name "Azure Stack HCI Update Service"
    

    Wait about 5 minutes for the services to come back online, then check the cluster update status in the Azure portal under Azure Update Manager or in Windows Admin Center. The new cluster updates, including 12.2604.1003.209, should now appear as available.
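
    You can also check from PowerShell on a node. This sketch assumes the Azure Local solution update cmdlets Get-SolutionUpdateEnvironment and Get-SolutionUpdate are present on your build. The property names in the table are the ones I would expect, so adjust them if your version reports different ones.

    ```powershell
    # Assumption: run on a cluster node where the Azure Local solution update cmdlets are installed.

    # Overall state of the update environment; Format-List * avoids guessing property names.
    Get-SolutionUpdateEnvironment | Format-List *

    # Updates the system currently knows about, with their state.
    Get-SolutionUpdate | Format-Table DisplayName, Version, State -AutoSize
    ```

    If 12.2604.1003.209 is listed here but not in the portal, the portal view may simply need more time to refresh.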

    Before triggering the update again, run a quick health check to make sure everything is clean:

    Test-Cluster
    Get-ClusterResource
    

    All critical resources should show as Online and the old CAUali-aqtw entries should be gone.
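
    Because the full Get-ClusterResource list is long, a filtered view can make problems easier to spot. This is a sketch under the same assumption as before, that leftover CAU entries start with "CAU". Note that some resources can be legitimately offline, so treat the first table as a list to review rather than a list of errors.

    ```powershell
    # Read-only: show resources that are not Online, then any leftover CAU entries.
    $resources = Get-ClusterResource

    $resources |
        Where-Object { $_.State -ne 'Online' } |
        Format-Table Name, State, OwnerGroup -AutoSize

    # Expected to print nothing once the cleanup has worked.
    $resources |
        Where-Object { $_.Name -like 'CAU*' } |
        Format-Table Name, State -AutoSize
    ```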

    Once the health check looks clean, go ahead and trigger the cluster update from the Azure portal → Azure Update Manager or from Windows Admin Center → Updates. This time it should proceed normally and log new attempts properly.

    As a helpful tip, since you mentioned you're new to the Azure Local platform (welcome from VMware!), it's worth bookmarking the Cluster-Aware Updating event logs for future troubleshooting. You can find them in Event Viewer under Applications and Services Logs → Microsoft → Windows → ClusterAwareUpdating. Those logs show the step at which a CAU run fails, which makes diagnosing issues much faster.
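
    The same logs can be read from PowerShell. This sketch discovers the CAU-related channels with a wildcard instead of hard-coding a channel name, since the exact names on your nodes are an assumption on my part.

    ```powershell
    # Discover CAU-related event logs rather than hard-coding a channel name.
    $logs = Get-WinEvent -ListLog '*ClusterAwareUpdating*' -ErrorAction SilentlyContinue

    foreach ($log in $logs) {
        if ($log.RecordCount -gt 0) {
            # Up to 20 entries per log; Get-WinEvent returns newest first by default.
            Get-WinEvent -LogName $log.LogName -MaxEvents 20 |
                Format-Table TimeCreated, LevelDisplayName, Id, Message -Wrap
        }
    }
    ```

    Run it on the node that orchestrated the failed run, since each node keeps its own event logs.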

    If this answer helps, please accept it so that others with similar questions can find it.

    Best Regards,

    Jerald Felix.



  2. Richard Beyer 61 Reputation points
    2026-05-23T08:56:11.9066667+00:00

    Additional info: All tests passed for AKS Arc Support Tool command Test-SupportAksArcKnownIssues
