Intermittent "Can't reach database server" on Azure Cosmos DB for PostgreSQL (port 6432) from Azure Functions

Pavle Zikic 10 Reputation points
2026-07-02T09:07:20.3933333+00:00

We're running an Azure Functions app (Node.js, timer-triggered) that uses Prisma to connect to an Azure Cosmos DB for PostgreSQL cluster. Intermittently, our function fails with a connection error indicating the database server is unreachable. The database appears to go offline briefly and then recovers on its own, but this has been happening more frequently lately.

Environment

  • Azure Functions (Node.js, timer trigger)
  • Prisma Client (@prisma/client)
  • Azure Cosmos DB for PostgreSQL (coordinator connection on port 6432 / connection pooling endpoint)
  • Region: [add your region]

Error (recurring)

Invalid prisma.queueJob.findMany() invocation

Can't reach database server at c-<cluster>.<id>.postgres.cosmos.azure.com:6432
Please make sure your database server is running at c-<cluster>.<id>.postgres.cosmos.azure.com:6432.

PrismaClientKnownRequestError
    at Mn.handleRequestError (@prisma/client/runtime/library.js)
    at Mn.request (@prisma/client/runtime/library.js)

Wrapped at the Functions host level as:

Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.queue_worker
 ---> RpcException: Result: Failure

What we've observed

  • The failures are transient — the same function succeeds on subsequent runs.
  • No deployment or config change correlates with the onset; frequency has simply increased over time.
  • The endpoint uses the pooled connection port 6432.

Questions for the community / Microsoft

  1. Are there known causes of brief coordinator-node unavailability on Azure Cosmos DB for PostgreSQL (e.g., maintenance windows, failovers, automatic scaling, node restarts) that would produce short "can't reach server" windows on port 6432?
  2. Is port 6432 (managed PgBouncer/pooler) more susceptible to these drops than the direct 5432 port, and is one recommended over the other for serverless/Functions workloads?
  3. What is the recommended way to diagnose whether these are node restarts/failovers vs. client-side connection pool exhaustion? Which metrics/logs should we check (e.g., in Azure Monitor / cluster metrics)?
  4. Best-practice guidance for resilient connections from Azure Functions + Prisma (connection limits, timeouts, retry strategy) against this service?

Any pointers appreciated. We also plan to open a support ticket with Azure for the underlying availability investigation.

Azure Database for PostgreSQL

1 answer

  1. Vinodh247-1375 43,346 Reputation points Volunteer Moderator
    2026-07-03T16:03:36.95+00:00
    1. Cause: Yes. Short outages can happen because of failovers, maintenance, scaling operations, or restarts of the managed PgBouncer on port 6432. The pattern you describe (transient failures that recover on their own) is consistent with these platform events.
    2. 6432 vs. 5432
    • 6432 (pooler): slightly more prone to drops, but recommended for Functions.
    • 5432 (direct): more stable, but carries a risk of connection exhaustion.
    Stay on 6432.
    3. How to diagnose. Check:
    • Azure Monitor: node restarts, failover events, connection drops
    • Activity log: maintenance and failover entries
    • Metrics: connection counts, CPU spikes
    • App logs: whether the call timed out or failed immediately
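To make the "timed out or failed immediately" distinction visible in the app logs, something like the following could wrap each database call. This is only a sketch: `classifyFailure`, `withFailureReport`, and the 4-second threshold are hypothetical names and values chosen for illustration. The error codes are Prisma's documented ones (P1001: can't reach the server, P1002: server reached but timed out, P2024: timed out waiting for a connection from Prisma's own pool), read from `code` or `errorCode` depending on the error class.

```typescript
// Hypothetical helper for logging; not part of Prisma or Azure Functions.
type FailureKind =
  | "pool-exhaustion"      // client side: Prisma's pool had no free connection (P2024)
  | "unreachable-timeout"  // connectivity error that took a long time to surface
  | "unreachable-fast"     // connectivity error that surfaced almost immediately
  | "other";

interface FailureReport {
  kind: FailureKind;
  code: string | undefined;
  elapsedMs: number;
}

// Assumed threshold: slower failures are logged as timeouts. Tune to your connect_timeout.
const TIMEOUT_THRESHOLD_MS = 4000;

function classifyFailure(err: unknown, elapsedMs: number): FailureReport {
  // Known request errors carry `code`; initialization errors carry `errorCode`.
  const e = err as { code?: string; errorCode?: string } | null | undefined;
  const code = e?.code ?? e?.errorCode;
  let kind: FailureKind = "other";
  if (code === "P2024") {
    kind = "pool-exhaustion";
  } else if (code === "P1001" || code === "P1002") {
    kind = elapsedMs >= TIMEOUT_THRESHOLD_MS ? "unreachable-timeout" : "unreachable-fast";
  }
  return { kind, code, elapsedMs };
}

// Runs `op`, and on failure logs a classified report before rethrowing the original error.
async function withFailureReport<T>(
  op: () => Promise<T>,
  log: (report: FailureReport) => void
): Promise<T> {
  const start = Date.now();
  try {
    return await op();
  } catch (err) {
    log(classifyFailure(err, Date.now() - start));
    throw err;
  }
}
```

In the function from the question this might be called as `withFailureReport(() => prisma.queueJob.findMany(), (r) => context.log(r))`. A run of "pool-exhaustion" entries would point at the client, while "unreachable" entries that line up with restart or failover events in Azure Monitor would point at the platform.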

    4. Best practices
    • Retry (a must): 3 to 5 attempts with backoff
    • Keep the Prisma connection limit low (the connection_limit parameter in the connection URL)
    • Use the pooler (6432)
    • Avoid bursts of new connections on startup
    • Set sane timeouts (5–10 s)
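As a rough illustration of the connection-limit and timeout points, the settings can be expressed as query parameters on the Prisma connection URL. The helper name `tunePrismaUrl` and the specific values are assumptions for this sketch. `connection_limit`, `pool_timeout`, `connect_timeout`, and `pgbouncer` are parameters Prisma documents for PostgreSQL URLs; whether `pgbouncer=true` is still needed depends on your Prisma and PgBouncer versions, so check the Prisma docs before relying on it.

```typescript
// Hypothetical helper: adds pool and timeout settings to an existing Prisma connection URL.
interface PoolTuning {
  connectionLimit: number;   // Prisma `connection_limit`: max connections in Prisma's pool
  poolTimeoutSec: number;    // Prisma `pool_timeout`: seconds to wait for a free pooled connection
  connectTimeoutSec: number; // Prisma `connect_timeout`: seconds to wait when opening a connection
}

function tunePrismaUrl(baseUrl: string, tuning: PoolTuning): string {
  const url = new URL(baseUrl);
  // Existing parameters (for example sslmode) are preserved; these four are set or overwritten.
  url.searchParams.set("connection_limit", String(tuning.connectionLimit));
  url.searchParams.set("pool_timeout", String(tuning.poolTimeoutSec));
  url.searchParams.set("connect_timeout", String(tuning.connectTimeoutSec));
  url.searchParams.set("pgbouncer", "true");
  return url.toString();
}

// Assumed starting values for a timer-triggered function; adjust after measuring.
const exampleTuning: PoolTuning = {
  connectionLimit: 2,
  poolTimeoutSec: 10,
  connectTimeoutSec: 10,
};
```

The result would be passed to Prisma as the datasource URL, for example through the `DATABASE_URL` environment variable. Creating the Prisma client once per worker process rather than once per invocation also helps with the burst-on-startup point.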

    This is normal transient behaviour. Fix it with retries and connection tuning, not by switching ports.
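A minimal sketch of the retry side, assuming exponential backoff with random jitter and retrying only on Prisma's connectivity codes (P1001, P1002). `withRetry`, `backoffDelay`, `isPrismaConnectivityError`, and the option names are hypothetical; the attempt count follows the 3 to 5 range suggested above, and the delays are placeholders to tune.

```typescript
interface RetryOptions {
  attempts: number;                       // total tries, including the first
  baseDelayMs: number;                    // backoff ceiling before the 2nd attempt
  maxDelayMs: number;                     // upper bound for any single wait
  isTransient: (err: unknown) => boolean; // only these errors are retried
  sleep?: (ms: number) => Promise<void>;  // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Exponential backoff ceiling: base, 2*base, 4*base, ... capped at maxDelayMs.
function backoffDelay(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt - 1));
}

// P1001 = can't reach server, P1002 = server reached but timed out.
function isPrismaConnectivityError(err: unknown): boolean {
  const e = err as { code?: string; errorCode?: string } | null | undefined;
  const code = e?.code ?? e?.errorCode;
  return code === "P1001" || code === "P1002";
}

async function withRetry<T>(op: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      // Give up on non-transient errors or when the attempts are used up.
      if (!opts.isTransient(err) || attempt === opts.attempts) break;
      // Jitter: wait a random time between 0 and the backoff ceiling.
      const ceiling = backoffDelay(attempt, opts.baseDelayMs, opts.maxDelayMs);
      await sleep(Math.random() * ceiling);
    }
  }
  throw lastError;
}
```

In the timer function this could look like `withRetry(() => prisma.queueJob.findMany(), { attempts: 4, baseDelayMs: 500, maxDelayMs: 5000, isTransient: isPrismaConnectivityError })`. Only retry operations that are safe to repeat, and keep the total wait well inside the function's execution timeout.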
