
Possible Causes of 502 bad gateway: upstream connect error.. reason: protocol error

Samuel Tomko 0 Reputation points
2026-03-30T07:50:46.98+00:00

I have a Function Application that my customers use to provision their users into my primary SaaS via SCIM from Entra ID, and some have recently reported:

"502 bad gateway: upstream connect error or disconnect/reset before headers. retried and the latest reset reason: protocol error"

I have tested the whole provisioning flow with my own Enterprise Application, and don't receive the error.

Are there any known causes, or troubleshooting steps to take?

Azure Functions

An Azure service that provides an event-driven serverless compute platform.


2 answers

  1. Pravallika KV 12,575 Reputation points Microsoft External Staff Moderator
    2026-03-30T18:25:52.2166667+00:00

    Hi @Samuel Tomko ,

    Thanks for reaching out to Microsoft Q&A.

    "502 bad gateway: upstream connect error or disconnect/reset before headers. retried and the latest reset reason: protocol error"

    This error means the gateway could not establish or maintain a healthy TCP/TLS session with your backend, or the backend reset the connection before returning response headers.

    Here are some troubleshooting steps you can try:

    1. Check Backend Health in App Gateway
      • In the Azure Portal, go to your Application Gateway → Backend health.
      • If the probes are failing, the gateway has no healthy instances to route to and returns 502.
    2. Validate Health Probe Configuration
      • By default App Gateway probes the root (“/”) with GET. If your SCIM endpoint is at, say, /scim/Health or requires a specific header, configure a custom probe to hit that path and expect a 200.
      • Ensure the probe uses the correct protocol (HTTP vs HTTPS), port, and host name.
    3. Review NSG / UDR / DNS Settings
      • If your AGW sits in a VNet, make sure any NSGs or User-Defined Routes aren’t blocking outbound to your Function App (especially if you’re using VNet Integration).
      • If you use a custom DNS in the VNet, confirm it correctly resolves your Function’s FQDN.
    4. Check TLS / Protocol Compatibility
      • A “protocol error” reset often indicates a TLS mismatch between the gateway and the backend.
      • Ensure your Function App accepts TLS 1.2, and confirm which TLS versions your App Gateway’s SSL policy allows.
      • If you require client certificates or unusual cipher suites, verify that App Gateway supports them.
    5. Increase Backend HTTP Settings Timeout
      • The default backend request timeout is 20 seconds; on v2 the gateway also retries once against another backend instance. If your SCIM calls occasionally take longer, raise the timeout:
          New-AzApplicationGatewayBackendHttpSettings … -RequestTimeout 60
      
    6. Confirm Backend Address Pool Configuration
      • Make sure your Function’s host/IP or FQDN is correctly listed in the backend pool and that the pool isn’t empty.
    7. Scale Considerations
      • If you’re seeing these errors under load, your AGW tier/instance count might be insufficient. Consider scaling up/out or adding alerts for high request counts.
    8. Enable and Analyze Logs
      • Turn on Application Gateway Access logs and Firewall logs (if WAF is enabled).
      • Look for “502” entries and see if the reset reason shows up in the logs. This tells you whether it’s a probe failure, a network block, or a protocol drop.
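For steps 1 and 2, the backend-health check and custom probe can also be scripted from the Azure CLI. A sketch, assuming placeholder names (my-rg, my-appgw, appGatewayBackendHttpSettings) and the /scim/Health path from the example above; substitute your own:

```shell
# Step 1: show per-server backend health for the gateway.
az network application-gateway show-backend-health \
  --resource-group my-rg --name my-appgw --output table

# Step 2: custom probe that hits the SCIM health path and expects a 200.
az network application-gateway probe create \
  --resource-group my-rg --gateway-name my-appgw \
  --name scim-health-probe \
  --protocol Https --path /scim/Health \
  --host-name-from-http-settings true \
  --interval 30 --timeout 30 --threshold 3 \
  --match-status-codes 200

# Attach the probe to the backend HTTP settings in use.
az network application-gateway http-settings update \
  --resource-group my-rg --gateway-name my-appgw \
  --name appGatewayBackendHttpSettings \
  --probe scim-health-probe
```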
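For step 4, you can pin the Function App's minimum inbound TLS version and verify the handshake independently (resource group, app name, and hostname below are placeholders):

```shell
# Pin the Function App's minimum inbound TLS version to 1.2.
az functionapp config set \
  --resource-group my-rg --name my-func-app --min-tls-version 1.2

# Verify that a TLS 1.2 handshake to the backend succeeds from outside Azure.
openssl s_client -connect my-func-app.azurewebsites.net:443 -tls1_2 </dev/null
```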
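For step 8, if the access logs are routed to a Log Analytics workspace, the 502 entries can be pulled with a query along these lines. A sketch: the workspace ID is a placeholder, the field names follow the AzureDiagnostics table for Application Gateway access logs, and the command may require the log-analytics CLI extension:

```shell
# Pull recent 502s from the Application Gateway access log.
az monitor log-analytics query \
  --workspace "<workspace-id>" \
  --analytics-query "AzureDiagnostics
    | where Category == 'ApplicationGatewayAccessLog' and httpStatus_d == 502
    | project TimeGenerated, requestUri_s, serverStatus_s
    | order by TimeGenerated desc" \
  --output table
```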

    Hope this helps!


    If the resolution was helpful, kindly take a moment to click Yes for “Was this answer helpful”. And, if you have any further query, do let us know.


  2. Q&A Assist
    2026-03-30T07:51:03.85+00:00
    AI generated content. This answer was created with AI from Q&A Assist.

    The error message indicates intermittent 5xx/“Bad Gateway” behavior between Entra ID’s SCIM provisioning service (client) and the Azure Function (backend). From the available information, only general 5xx/502 guidance for Azure App Service–based workloads (including Functions on App Service plans) is documented.

    Based on that guidance, likely causes and concrete troubleshooting steps are:

    1. Check for app-level performance issues
      1. In the Function App’s blade in the Azure portal, go to Monitoring > Metrics.
      2. Add metrics such as CPU Time, Average memory working set, Memory working set, and Requests.
      3. Correlate spikes in CPU/memory or request count with the times customers report 502 errors.
      4. If high CPU/memory or long-running executions coincide with the errors, the Function host may be slow to respond or recycling, causing upstream 502s.
    2. Look for intermittent outbound connection problems
      1. If the Function calls external services (databases, other APIs, storage, etc.) as part of SCIM provisioning, intermittent outbound failures or timeouts can surface as 5xx/Bad Gateway to the caller.
      2. Symptoms documented for such issues include:
        • Slow response times on some or all instances.
        • Intermittent 5xx or Bad Gateway errors.
        • Time-out error messages.
        • Inability to connect to external endpoints.
      3. Review logs/telemetry for failed outbound calls or timeouts during provisioning requests.
    3. Mitigate by scaling or stabilizing the Function App. If metrics or logs show resource pressure or intermittent failures:
      1. Scale the app
        • Scale up the App Service plan to a higher pricing tier to get more CPU/memory and better throughput.
        • Or scale out to more instances so that load is distributed and a single slow instance is less likely to cause 502s.
        • Configure manual or automatic scaling as appropriate. See Scale an app in Azure App Service.
      2. Use auto-healing (if running on App Service plan)
        • Configure auto-healing rules in the root web.config of the Function App to recycle the worker process when certain conditions are met (for example, high memory, long request duration, or specific status codes).
        • This helps the app recover automatically from transient issues that might otherwise cause repeated 502s.
      3. Restart the app
        • For one-off or transient issues, restart the Function App from the Azure portal. This can clear stuck states or transient connectivity problems.
    4. Compare failing vs. non-failing tenants. Since testing with your own Enterprise Application does not reproduce the error, compare:
      • Request volume and concurrency from affected tenants vs. the test tenant.
      • Any differences in SCIM payload size or complexity that might increase execution time.
      • Time windows when affected tenants provision (for example, scheduled bulk syncs) and whether those align with resource spikes or outbound connection issues.
    5. Collect more data for precise diagnosis
      • Enable detailed application logging and request tracing for the Function App.
      • Capture timestamps and correlation IDs from Entra ID provisioning logs for failing calls and match them to Function logs.
      • Use this to determine whether the Function is timing out, throwing exceptions, or failing outbound calls when the 502 is returned upstream.
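For step 1, the same metrics can be fetched in 5-minute buckets around a reported failure window from the CLI (the subscription, resource group, app name, and times are placeholders):

```shell
# Inspect request volume, 5xx count, and memory in 5-minute buckets.
az monitor metrics list \
  --resource "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.Web/sites/my-func-app" \
  --metric Requests Http5xx MemoryWorkingSet \
  --interval PT5M \
  --start-time 2026-03-30T07:00:00Z \
  --end-time 2026-03-30T08:00:00Z \
  --output table
```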
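For step 3, scaling up, scaling out, and restarting can each be done with one CLI command (plan name, SKU, and app name are placeholders; this assumes the Function runs on an App Service plan):

```shell
# Scale up the plan (more CPU/memory per instance).
az appservice plan update --resource-group my-rg --name my-plan --sku P1V3

# Scale out so load is spread across three instances.
az appservice plan update --resource-group my-rg --name my-plan --number-of-workers 3

# Restart the app to clear any stuck state.
az functionapp restart --resource-group my-rg --name my-func-app
```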
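For step 5, matching Entra provisioning failures to Function log entries by correlation ID can be sketched with standard tools; the two log formats below are invented for illustration:

```shell
# Entra provisioning failures (correlation ID, timestamp) - invented sample data.
cat > entra_failures.txt <<'EOF'
corr-001 2026-03-29T10:01:09Z
corr-002 2026-03-29T10:03:10Z
EOF

# Function invocation log (correlation ID, outcome) - invented sample data.
cat > function_log.txt <<'EOF'
corr-003 success
corr-001 exception:SqlTimeout
corr-002 exception:HttpTimeout
EOF

# join(1) needs sorted input; join on the correlation ID (field 1) to see
# what the Function actually did for each call Entra reported as failed.
sort -o entra_failures.txt entra_failures.txt
sort -o function_log.txt function_log.txt
join entra_failures.txt function_log.txt | tee matched.txt
```

Each output line pairs a failing provisioning call with the Function-side outcome, which quickly shows whether the 502s line up with backend exceptions or outbound timeouts.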

    If, after these steps, intermittent 502s persist and metrics/logs indicate SNAT or outbound connection exhaustion, follow the documented troubleshooting for intermittent outbound connection errors in App Service and consider opening a support ticket with the collected evidence.


