Based on the provided information, the failures are occurring before Traefik (no ingress logs), so focus on Azure Front Door → Private Link → internal Load Balancer → AKS as the likely path. The context material provides general Azure Front Door troubleshooting and performance guidance that can be applied to narrow down the source and add diagnostics.
Recommended steps:
- Confirm whether the issue is at the origin vs. Front Door
- Use response headers from failing and successful requests to see how Front Door is interacting with the origin.
- Specifically, inspect the x-cache header on responses, as described in the performance troubleshooting guidance:
- TCP_MISS/CONFIG_NOCACHE indicates the request is being forwarded to the origin.
- TCP_HIT indicates the response is served from Front Door cache.
- If the 502s occur only on TCP_MISS (forwarded) requests and not on TCP_HIT, this strongly points to the origin path (Private Link → internal LB → AKS) rather than Front Door itself.
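The x-cache comparison above can be tallied with a few lines of Python. This is a minimal sketch, not part of the referenced guidance: the function names `cache_state` and `tally_502s` are hypothetical, and only the x-cache values named above are classified. Any other value is grouped under "other".

```python
from collections import Counter


def cache_state(x_cache):
    """Map an x-cache header value to 'forwarded', 'cached', or 'other'."""
    value = (x_cache or "").strip().upper()
    if value in {"TCP_MISS", "CONFIG_NOCACHE"}:
        return "forwarded"  # request was sent on to the origin
    if value == "TCP_HIT":
        return "cached"  # response was served from Front Door cache
    return "other"  # missing header or a value not covered here


def tally_502s(samples):
    """Count 502 responses per cache state.

    `samples` is an iterable of (status_code, x_cache_value) pairs
    collected from failing and successful requests.
    """
    counts = Counter()
    for status, x_cache in samples:
        if status == 502:
            counts[cache_state(x_cache)] += 1
    return dict(counts)
```

If nearly all 502s fall under "forwarded", that is consistent with a problem on the origin path rather than in Front Door itself.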
- Check origin configuration and host/SNI behavior
- Ensure the origin in Front Door is configured with the correct FQDN rather than just an IP address. Front Door uses the origin host name as the SNI header during TLS handshake.
- If the origin is configured as an IP address, origin-side certificate logic can reject requests that do not have a host header matching the certificate, which can manifest as intermittent 502/connection aborts.
- If applicable, change the origin from an IP address to an FQDN that is covered by a valid certificate presented by the origin.
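As a quick check for the IP-versus-FQDN point above, the configured origin host value can be tested with the Python standard library. This is a sketch under assumptions: `is_ip_literal` is a hypothetical helper, and the origin host string is assumed to have been copied by hand from the Front Door origin configuration.

```python
import ipaddress


def is_ip_literal(origin_host):
    """Return True if the origin host is an IPv4/IPv6 literal, not a name."""
    try:
        # Strip brackets so bracketed IPv6 literals such as "[fd00::1]" parse.
        ipaddress.ip_address(origin_host.strip().strip("[]"))
        return True
    except ValueError:
        return False
```

A result of True suggests the origin is addressed by IP and is a candidate for switching to an FQDN that the origin certificate covers.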
- Validate timeouts and long‑running request behavior end to end
- Front Door can close connections if the origin takes longer than the configured timeout to respond. Even though the observed timeTaken is short, verify that:
- The forwarding timeout on Front Door is set appropriately for the AKS/Traefik response times.
- There are no long‑running requests that might intermittently hit a timeout.
- Consider explicitly setting and tuning the origin timeout on Front Door according to the application’s needs, as recommended in the reliability guidance. Timeouts that are too aggressive can cause sporadic failures.
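One way to check for requests that approach the timeout is to compare logged durations with the configured value. The sketch below assumes durations in seconds taken from timeTaken in the logs. The helper `near_timeout` and the 90% margin are illustrative choices, not Front Door settings.

```python
def near_timeout(durations_s, timeout_s, margin=0.9):
    """Return the durations at or above `margin` times the configured timeout."""
    threshold = timeout_s * margin
    return [d for d in durations_s if d >= threshold]
```

An empty result for the failing requests would suggest the timeout is not the trigger, which matches the short timeTaken observed here.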
- Check host header handling between Front Door and Traefik
- Front Door can rewrite the host header when forwarding to the origin. If the host header expected by Traefik or the AKS services does not match what Front Door sends, this can cause issues with routing, cookies, or redirects.
- Use the same host name on Front Door and the origin where possible, or configure host header preservation so that the origin (Traefik/AKS) sees the expected host.
- Investigate the origin (AKS/Traefik) for performance or connectivity issues
- Follow the “Investigate the origin” scenario:
- Collect environment information: Front Door endpoint name, endpoint host name/custom domain, and origin host name.
- For affected URLs, repeatedly request them while observing x-cache and status codes.
- If performance or failures improve once responses are cached (TCP_HIT), the problem is likely on the origin path.
- If the problem persists even when responses are cached or appears unrelated to cache state, the issue may be in the network path (Private Link, internal LB, node health) rather than Traefik itself.
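The repeated-request step can be scripted with the Python standard library. This is a rough sketch, assuming the affected URL is reachable with a plain GET. `probe_once` and `probe` are hypothetical names, and the attempt count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request


def probe_once(url, timeout=10.0):
    """Issue one GET and return (status_code, x-cache header value)."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("x-cache")
    except urllib.error.HTTPError as err:
        # Non-2xx responses such as 502 are raised as HTTPError;
        # the status code and headers are still available on the exception.
        return err.code, err.headers.get("x-cache")
    except urllib.error.URLError:
        # Connection-level failure: no HTTP response was received.
        return None, None


def probe(url, attempts=20, delay_s=1.0, fetch=probe_once, sleep=time.sleep):
    """Request `url` repeatedly and collect (status, x-cache) pairs."""
    results = []
    for _ in range(attempts):
        results.append(fetch(url))
        sleep(delay_s)
    return results
```

Calling `probe("https://<your-endpoint>/<affected-path>")` returns a list of pairs that can be compared by cache state. The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.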
- Add and review diagnostics/telemetry
- On Front Door:
- Enable and review Front Door access logs and diagnostic logs to correlate 502s with origin status, x-cache, and connection details.
- Look for patterns in origin response codes, connection reset/abort flags, or specific POPs.
- On the origin side:
- Check AKS node and pod logs around the timestamps of 502s to see if there are any restarts, readiness probe failures, or resource pressure that could cause the internal LB to drop or reset connections before they reach Traefik.
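To look for the POP and error patterns mentioned above, exported access log records can be grouped with a short script. This sketch assumes one JSON object per line and assumes the field names `httpStatusCode`, `pop`, and `errorInfo`, optionally nested under `properties`. Verify these against the schema of your own exported logs. `summarize_502s` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_502s(log_lines):
    """Count 502 records per (pop, errorInfo) pair, most frequent first."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumed layout: fields sit under "properties", else at top level.
        props = record.get("properties", record)
        if str(props.get("httpStatusCode")) != "502":
            continue
        counts[(props.get("pop"), props.get("errorInfo"))] += 1
    return counts.most_common()
```

A single POP or a single error value such as OriginConnectionAborted dominating the output narrows down where to look next.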
- Consider rate limiting and platform limits
- If any 429s or throttling are observed in addition to 502s, verify that Front Door platform rate limits are not being exceeded. If legitimate traffic is being rate‑limited, open an Azure support request to discuss quota increases.
- If the issue remains unresolved
- Collect the following:
- Front Door logs (including x-cache, origin status, and timing)
- AKS/Traefik logs and node health around failure times
- Confirmation of origin FQDN/SNI and host header configuration
- Then use this data to open an Azure support request so the platform team can inspect backend telemetry for the Private Link and internal LB path.
These steps help determine whether the source of OriginConnectionAborted is Front Door timing out or aborting, TLS/SNI or host header mismatch to the origin, or instability in the Private Link/internal Load Balancer/AKS path.