Hello Vijay Sutaria,
Thanks for confirming the details. Given there’s no VNET, custom DNS, or recent configuration changes, this behavior most likely points to short, transient connectivity issues between APIM and the Function App’s public endpoint that usually resolve on their own.
Analysis and Recommendations:
- Transient Backend Connection Drops: Azure Function Apps, when accessed over public endpoints, can experience short-lived connection drops due to load balancer changes, scaling operations at the platform layer, or DNS propagation delays. These are typically self-healing and may appear intermittently in APIM traces.
  - You can validate backend health using the “Availability and Performance” metrics under Function App → Monitoring → Metrics → Availability.
  - Check the APIM diagnostic logs to confirm whether the failures are caused by DNS resolution errors or timeout exceptions.
  - References: Monitor Azure Functions - Metrics, Azure Status
- Review APIM Timeout and Retries
  - Ensure your APIM backend request timeout is long enough to cover the Function App’s response time.
  - Consider adding a retry policy to your API Management policies to handle transient failures gracefully.
  - Reference: Set retry policies in Azure API Management
- Monitor Network Path and Latency
  - Use Network Watcher Connection Monitor to periodically test connectivity between APIM and the Function App endpoint.
  - This will help confirm whether packet loss or transient failures occur at the network layer.
  - Reference: Diagnose connectivity using Connection Monitor
- Enable Application Insights Correlation
  - Enable Application Insights for both APIM and the Function App to track end-to-end dependency calls.
  - Correlated logs can help identify whether a failure originates in APIM, the network, or the Function App itself.
  - Reference: Monitor and troubleshoot API Management
- Platform Maintenance or Transient Faults: Even without any configuration changes on your side, transient connectivity issues like these can occur during underlying platform maintenance or brief backend reallocation. They usually recover automatically within a short time.
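While you gather the APIM diagnostics, a small external probe can help separate DNS failures from timeouts and connection resets. The following is only a minimal sketch in Python using the standard library. The function names (`classify_failure`, `probe`, `monitor`), the categories, and the intervals are assumptions made for illustration, and the probe runs from whatever machine executes it, so it does not reproduce the exact path APIM takes to the backend.

```python
import socket
import time
import urllib.error
import urllib.request


def classify_failure(exc: BaseException) -> str:
    """Map an exception raised by urlopen to a coarse failure category."""
    # HTTPError means the backend answered, just with an error status.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http_{exc.code}"
    # URLError wraps the low-level cause in .reason; unwrap it if present.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, socket.gaierror):
        return "dns"          # host name could not be resolved
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return "timeout"      # connect or read exceeded the timeout
    if isinstance(reason, ConnectionError):
        return "connection"   # reset, refused, or aborted connection
    return "other"


def probe(url: str, timeout_s: float = 5.0) -> str:
    """Send one GET request and return the status or a failure category."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return f"http_{resp.status}"
    except Exception as exc:  # broad on purpose: every failure gets classified
        return classify_failure(exc)


def monitor(url: str, attempts: int = 10, interval_s: float = 30.0) -> list:
    """Probe the URL repeatedly and print a timestamped result per attempt."""
    results = []
    for i in range(attempts):
        result = probe(url)
        print(time.strftime("%H:%M:%S"), result)
        results.append(result)
        if i < attempts - 1:
            time.sleep(interval_s)
    return results
```

To use it, call `monitor()` with the public URL of one of your Function App endpoints and compare the timestamps of any `dns`, `timeout`, or `connection` results with the failures recorded in the APIM logs. If the probe stays healthy while APIM reports errors, the issue is more likely on the APIM side (for example SNAT or timeout settings) than in the Function App.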
Additional Considerations:
- Function Apps on Consumption plans share outbound IPs that can change due to platform events, which adds to potential instability. If feasible, consider moving to a Premium or App Service plan and integrating both APIM and the Function App within a shared VNET to gain stable IP addresses and improved routing.
- If you have high request volumes or spikes, SNAT port exhaustion can occur on APIM's public IPs, causing intermittent failures. Reducing connection churn via HTTP keep-alives, or scaling the APIM tier and the backend, may help mitigate this.
- References: Deploy Azure API Management Instance to External or Internal VNET, For Azure API Management SNAT port limitations and mitigation
Hope this helps clarify things. Let us know if you have further questions. Thank you!