An Azure service that is used to provision Windows and Linux virtual machines.
Hello Richard Beyer,
Greetings! Thanks for raising this question in the Q&A forum.
No worries this is actually a very well-known stuck CAU state issue on Azure Local HCI systems, and based on the additional details you've shared, we can now clearly see exactly what's happening and how to fix it!
Looking at your Get-ClusterResource output, the root cause is clear the CAU cluster role resources CAUali-aqtw and CAUali-aqtwResource are both stuck in an Offline state from the failed April 27, 2026 update attempt. Because Azure Local's Update Service sees this CAU role as still "in progress" (but offline/stuck), it refuses to offer or start any new cluster updates until that stuck role is cleaned up. The Get-ClusterResource –Name CAUClusterRole error you got earlier was simply because the role name on your system is CAUali-aqtw, not the generic CAUClusterRole that's perfectly normal for Dell Azure Local systems.
Here's your step-by-step fix:
Open PowerShell as Administrator on one of your cluster nodes (ALI-AZNode01 is fine since it's the current orchestrator).
First, let's take the CAU cluster role offline cleanly and remove it. Run these commands one at a time:
Stop-ClusterResource -Name "CAUali-aqtw"
Then remove the entire CAU cluster role group:
Remove-CauClusterRole -ClusterName ALI-AZNODE01 -Force
If the above remove command errors, try removing the cluster group directly:
Remove-ClusterGroup -Name "CAUali-aqtw" -RemoveResources -Force
Once removed, confirm the old CAU resources are fully gone by running:
Get-ClusterResource | Where-Object {$_.Name -like "CAU*"}
This should return nothing. If any CAU entries still appear, remove them individually using:
Remove-ClusterResource -Name "CAUali-aqtwResource" -Force
Now locate and rename (don't delete) the stuck CAU run attempt tracking file. Based on your system, the correct path is:
C:\ClusterStorage\Infrastructure_1\Shares\SU1_Infrastructure_1\CloudMedia\SBE\Installed\metadata\
Find the CAURunAttempts.json file in that path and rename it to CAURunAttempts.json.bak — this forces CAU to treat the next run as a completely fresh attempt.
Now restart the Azure Stack HCI Update Service and Orchestrator Service to clear any cached state:
Stop-ClusterResource -Name "Azure Stack HCI Update Service"
Stop-ClusterResource -Name "Azure Stack HCI Orchestrator Service"
Start-ClusterResource -Name "Azure Stack HCI Orchestrator Service"
Start-ClusterResource -Name "Azure Stack HCI Update Service"
Wait about 5 minutes for the services to come back online, then check the cluster update status from the Azure portal under Azure Update Manager or from Windows Admin Center — the new cluster updates including 12.2604.1003.209 should now appear as available.
Before triggering the update again, run a quick health check to make sure everything is clean:
Test-Cluster
Get-ClusterResource
All critical resources should show as Online and the old CAUali-aqtw entries should be gone.
Once the health check looks clean, go ahead and trigger the cluster update from the Azure portal → Azure Update Manager or from Windows Admin Center → Updates. This time it should proceed normally and log new attempts properly.
As a helpful tip since you mentioned you're new to the Azure Local platform (welcome from VMware!), it's worth bookmarking the Azure Stack HCI event logs for future troubleshooting. You can find them in Event Viewer under Applications and Services Logs → Microsoft → Windows → ClusterAwareUpdating. Those logs will show you the exact step where any future CAU run fails, which makes diagnosing issues much faster.
If this answer helps you kindly accept the answer which will help others who have similar questions.
Best Regards,
Jerald Felix.