How A Forced Failover Can Cause Data Loss In SQL Server?

I was working with one my clients in Always-On and unfortunately in the process of fixing an error, one of the members in our team forcefully failed over Always-On availability group from one instance to another. There is a need to understand the risks associated with force failover as it can lead to data loss.

This occurs because the target replica may be unable to communicate with the primary replica, making it impossible to ensure that the databases are synchronized. Figure below illustrates how a forced failover can cause data loss on the primary replica and how this loss can extend to a secondary replica.

Forced Failover Data Loss In SQL Server

1. In the previous example, the primary replica's last hardened LSN is 100 before it goes offline, while the asynchronous secondary replica's LSN is 50.

2. After a forced failover, the secondary replica becomes the new primary and sets its last hardened LSN to 50.

3. When the old primary comes back online, its synchronization status is suspended.

4. If synchronization is resumed, the old primary will sync with the new primary, sending its last hardened LSN as 100. Upon finding the new primary's last hardened LSN is 50, it will roll back its transaction log to LSN 50 and start accepting transaction log blocks from the new primary from that point onward. Thus, data loss is propagated from the primary to the secondary replica if synchronization is resumed.

Refer the related articles -
Data Synchronization Modes In Always On In SQL Server
How Does A Synchronous And Asynchronous Secondary Replica Resynchronize With The Primary Replica?

0 comments:

Post a Comment