An Azure service that is used for adding real-time communications to web applications.
Hi @pipi
Thank you for the clarification and apologies again for any confusion.
The likely cause is a mismatch in your scale-in rule: Time grain statistic = Maximum but Time aggregation = Average. This evaluates the average of per-minute maximums over 5 minutes, so even minor spikes can keep the value >930, blocking scale-in despite low averages.
Quick fix:
- Change Time grain statistic to Average (recommended for consistent low-load detection).
- Optionally set Time aggregation to Average too, and increase Duration to 10 minutes.
Microsoft recommends using Connection Quota Utilization (%) for rules instead of raw Connection Count (e.g., scale in <20–30%) for better reliability and headroom.
Save, then monitor Autoscale history under low load. If it still doesn't trigger, check Metrics (Max aggregation, 1-min grain) and recent scale events.
Kindly let us know If you have any additional details or concerns in the meantime.
Please "upvote" if the information helped you. This will help us and others in the community as well.