Automatic, Planned, And Forced Failover Sequence Of Actions In SQL Server

This article covers the sequence of actions the Always-On high availability takes when the primary replica goes offline. When a primary replica goes offline, the failover target assumes the primary role, recovers its databases, and brings them online as the new primary databases. When the former primary replica becomes available, it switches to the secondary role, and its databases become secondary databases.

Let's look at the three types of failover:

  • Automatic failover (without data loss)
  • Planned manual failover (without data loss)
  • Forced failover (with possible data loss)

You can check the below table that summarizes the supported failover types under various availability and failover modes. For each pair of availability replicas, the effective availability and failover mode is determined by the combination of the modes of the primary replica and one or more secondary replicas.

Failover Sequence Of Action Always On SQL Server

Automatic Failover Internal Actions

To enable automatic failover, the conditions outlined in the table with common topologies must be met. An automatic failover triggers the following sequence of actions:

  • The state of the primary databases is changed to DISCONNECTED if the primary replica is still running, and all clients are disconnected.
  • The secondary database rolls forward any log entries waiting in the recovery queue.
  • At this point, the secondary replica becomes the new primary replica, and its databases become the primary databases. Immediately the new replica rolls back uncommitted transactions. Rollback occurs in the background while clients continue to use the database. As part of the internal process by the Always-On, the committed transactions are not rolled back.
  • Until the secondary databases connect and resynchronize with the new primary database, they are marked as NOT SYNCHRONIZED. Before rollback recovery starts, secondary databases can connect to the new primary databases and quickly transition to the SYNCHRONIZED state.
  • When the original primary replica comes back online, it becomes the secondary replica, and its databases become secondary databases. It resynchronizes with the corresponding primary replica and databases. Once the databases have resynchronized, failover is possible again, but in the reverse direction this time.

Planned Failover Internal Actions

A planned failover triggers the following sequence of actions:

  • At first, the Windows Server Failover Cluster (WSFC) is notified in order to ensure that there are no any new user transactions occur on the original primary replica databases.
  • The database on the secondary replica rolls forward any logs waiting in the recovery queue.
  • Now, the secondary replica becomes primary replica, and the primary replica turns into secondary replica.
  • The new primary replica rolls back any uncommitted transactions and brings the databases online as the primary databases.
  • As part of the process, none of the committed transactions will be rolled back. Additionally, the database will remain marked as “NOT SYNCHRONIZED” until the secondary databases connect and resynchronizes with the primary replica database.
  • Once the new secondary replica comes online, it resynchronizes with the new primary replica databases. When everything is back online and synchronizing as before, it becomes eligible for the reverse failover. Meaning, from new primary replica to the old primary replica. 


Forced Failover Internal Actions


A forced failover triggers the following sequence of actions:

  • The WSFC sends a request to the primary replica to go offline, ensuring no new user transactions occur on the original primary databases.
  • The secondary database rolls forward any logs waiting in its recovery queue.
  • The secondary replica then becomes the new primary replica, while the original primary replica assumes the role of the new secondary replica.
  • The new primary replica rolls back any uncommitted transactions and brings the databases online as primary databases.
  • Until the secondary databases connect and resynchronize with the new primary database, they are marked as NOT SYNCHRONIZING. Committed transactions are not rolled back during this process.
  • When the original primary replica comes back online, it becomes the secondary replica, and its databases become secondary databases. It resynchronizes with the corresponding primary replica and its databases. After resynchronization, failover is possible again, but in the reverse direction.

Note: Force failover can be risky and can cause data loss. Perform forced failover only if you want to restore service to your availiability database immediately. 

You may also refer: Data Synchronization Modes In Always On In SQL Server

0 comments:

Post a Comment