Issue connecting to primary replica on AAG environment
Hello.
I have an Always On availability group environment with three replicas: SQL01 (primary), SQL02 (secondary), and SQL03 (secondary), running Microsoft SQL Server 2019 (RTM) - 15.0.2000.5 (X64).
I ran into an issue where I could not connect to SQL01 via SQL Server Management Studio (SSMS). I then failed over to SQL02 as primary, tried connecting to SQL01 again, and the connection succeeded.
When the issue occurred, I confirmed that:
-I could remote into the servers hosting all three SQL nodes
-SQL Server services were running on all three nodes
-The AAG cluster was up (checked in Failover Cluster Manager)
-I could connect directly to SQL02 and SQL03, but not to SQL01
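For reference, replica state can also be checked from a node that still accepts connections (for example SQL02) with a query along these lines. This is only a sketch using the standard Always On catalog views and DMVs; it was not part of the checks listed above. As far as I know, when this runs on a secondary it returns only that replica's local row, so the full picture of all three replicas is only visible from the primary.

```sql
-- Sketch: show role, connection state and health of availability replicas.
-- Assumption: run on an instance that is still reachable (e.g. SQL02).
SELECT
    ag.name                            AS availability_group,
    ar.replica_server_name,
    ars.role_desc,                     -- PRIMARY / SECONDARY / RESOLVING
    ars.operational_state_desc,        -- populated for the local replica only
    ars.connected_state_desc,          -- CONNECTED / DISCONNECTED
    ars.synchronization_health_desc,
    ars.last_connect_error_number,
    ars.last_connect_error_description
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = ars.replica_id
JOIN sys.availability_groups AS ag
    ON ag.group_id = ars.group_id
ORDER BY ag.name, ar.replica_server_name;
```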
Below is the detailed chronology:
-11:55 AM: the application backend started getting database login errors
-12:00 PM: the following was logged:
"A time-out occurred while waiting for buffer latch -- type 2, bp xxx, page xxx, stat xxx, database id: xxx, allocation unit Id: xxx, task xxx : 0, waittime 300 seconds, flags xxx, owning task xxx. Not continuing to wait."
-12:01 PM: the following were logged:
"Windows Server Failover Cluster did not receive a process event signal from SQL Server hosting availability group 'AAG' within the lease timeout period."
"Always On Availability Groups connection with secondary database terminated for primary database 'xxx' on the availability replica 'SQL03' with Replica ID: {xxx}. This is an informational message only. No user action is required."
"Always On Availability Groups connection with secondary database terminated for primary database 'xxx' on the availability replica 'SQL02' with Replica ID: {xxx}. This is an informational message only. No user action is required."
After that, I tried to connect to SQL01 a few times but failed (it was still failing 30 minutes after the issue started). I then manually failed over to SQL02 as primary, tried to connect to SQL01 again, and succeeded.
Any idea why I could not connect to SQL01 at first, but could connect after failing over to SQL02? Usually, when a network disruption happens between the SQL nodes, I can connect to them again as soon as the connection between the nodes is re-established.
Thank you.