SQLArena

ASYNC_NETWORK_IO Wait Type In SQL Server

Like the ASYNC_IO_COMPLETION wait type, ASYNC_NETWORK_IO pertains to throughput; however, it focuses on the throughput of the network connection between your SQL Server instance and its clients, rather than the storage subsystem. It's important to note that ASYNC_NETWORK_IO waits are normal and do not necessarily indicate a network issue, as these waits can occur even when querying the SQL Server instance directly on the server.

What is the ASYNC_NETWORK_IO Wait Type?

ASYNC_NETWORK_IO waits typically happen when client applications are unable to process query results quickly enough or when there are network performance issues. Most often, the former is the case, as many applications handle SQL Server results row-by-row or struggle with large data volumes, causing SQL Server to wait to send query results over the network. This waiting period is logged as the ASYNC_NETWORK_IO wait type. Additionally, ASYNC_NETWORK_IO waits can occur when querying remote databases via a linked server.

Sys.dm_os_wait_stats

You can check the ASYNC_NETWORK_IO wait type using the sys.dm_os_wait_stats DMV. We might not require a complicated environment to test the ASYNC_NETWORK_IO wait type.

Before running the DMV, lets clear the information from the DMV with a simple DBCC command.

use master
go
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

Once the information is cleared from the DMV, lets run a simple SELECT statement from the SQL Server instance using SSMS.

select * from DBA_Async_Test

After querying from the above table, we can see a spike in the ASYNC_NETWORK_IO wait in the sys.dm_os_wait_stats.

select * from sys.dm_os_wait_stats where wait_type='Async_network_io'

We saw a spike in the wait type because the SQL Server Instance was not able to get all the result in the SSMS as fast as the instance itself. When the instance is not able to get the results fast on to the SSMS, we start to see more of this wait type.

How To Lower ASYNC_NETWORK_IO Waits?

The answer is pretty simple. To lower ASYNC_NETWORK_IO Wait, we just must modify the query by limiting the result set as much as possible. Let’s look at the simple example as to how we can lower the wait type.

We will be clearing the wait stat information from sys.dm_os_wait_stats again to record the reading one more time.

use master
go
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

This time, we will be limiting the records from to take to just TOP 100.

select TOP 100 * from DBA_Async_Test

If you notice that the ASYNC_NETWORK_IO wait has been minimized. The SSMS was able to get the results fast enough and had not to wait for the completion of the query for longer duration. We can also reduce the ASYNC_NETWORK_IO wait by adding WHERE conditions to our query.

Note: If you still think that the above test case failed in your scenario, then please check the network configuration. At times, there are possibilities that the network configuration as well can slow down fetching the result sets.

We can check the network configuration by opening the task manager and navigating on to the “Ethernet” option.

If you observe high network utilization along with increased ASYNC_NETWORK_IO wait times, the network might be the bottleneck. In such instances, consulting your network administrator could be beneficial. Network configurations are complex, involving various components such as switches, routers, firewalls, network cables, drivers, firmware, and possibly virtualized operating systems. Each of these elements can impact network throughput and contribute to ASYNC_NETWORK_IO waits.

Bottomline

ASYNC_NETWORK_IO waits happen when an application requests query results from a SQL Server instance over the network and is unable to process the returned results quickly enough. While these waits are generally normal, unusually high wait times might indicate changes in query results or network issues. To reduce application related ASYNC_NETWORK_IO wait times, consider decreasing the number of rows and/or columns returned to the application.

You may also refer :

ASYNC_IO_COMPLETION Wait Type In SQL Server

Categories: DBA

FIX –‘ The specified @proxy_name ('Cmd_Exec') does not exist.’ In SQL Server

Leave a Reply

I was working on a client project to move the database jobs from one server to another. While we were in the middle of the job migration, one of the job prompted an error saying –‘ The specified @proxy_name ('Cmd_Exec') does not exist.’ Once I found the solution and fixed in the production, I wanted to share the same fix on my blog as well so that all of us are aware of this simple fix.

The CmdExec is a feature by Microsoft to define executable program or operating system command by using SQL Server Management Studio. My client had a job step that was executing the powershell script. That same step was generating the error for them.

However, for more details on CmdExec, please refer to this document.

How to fix “The specified @proxy_name(‘Cmd_Exec’) does not exist”?

I am sure that, when you had received the error message, the respective credentials were not present. Please see the image below of my sample credential.

Steps to fix the error:

1. To fix the error, navigate to the credentials option (Expand Instance --> Security --> Credentials) in your SSMS.

2. Create the credential. In my case, I have created MyCreds as the credential name and tagged MyCred_Login as the identity. Set a password for the credential you are creating.

3. Now, navigate to the proxies option (Expand SQL Server Agent --> Proxies --> Operating System (CmdExec)) in your SSMS.

4. Create a new proxy either by using the GUI or through script. The SQL Server uses sp_add_proxy system stored procedure to create the proxy internally.

Creating Proxy using script:

USE [msdb]
GO
EXEC msdb.dbo.sp_add_proxy @proxy_name=N'Cmd_Exec',@credential_name=N'MyCreds',@enabled=1
GO
EXEC msdb.dbo.sp_grant_proxy_to_subsystem @proxy_name=N'Cmd_Exec', @subsystem_id=3
GO
EXEC msdb.dbo.sp_grant_proxy_to_subsystem @proxy_name=N'Cmd_Exec', @subsystem_id=12
GO

Once the above steps are followed and your credential and proxies are set right, the SSMS will no longer give the error message.

Bottomline

Cmd_Exec error gets generated when any job step has an execution command either at the operting system level or powershell. Following the above steps can get rid of the error message.

Categories: Errors

ASYNC_IO_COMPLETION Wait Type In SQL Server

Leave a Reply

The ASYNC_IO_COMPLETION wait type is commonly encountered by a database administrator. It occurs every time SQL Server performs a file-related action at storage level and waits for it to complete. As the wait type is related to storage, you might encounter more often when a backup happens. However, if you encounter the ASYNC_IO_COMPLETION wait type.

What is the ASYNC_IO_COMPLETION Wait Type?

The ASYNC_IO_COMPLETION wait type occurs when a task is waiting for storage-related action to complete. In general, this wait type is more seen while a backup is happening in your server. The SQL Server monitors the signal from the storage subsystem for faster completion. In case of a delay from the storage side, the ASYNC_IO_COMPLETION wait type occurs. To be simpler, the faster your storage subsystem, the lower you see the ASYNC_IO_COMPLETION wait type.

If you are seeing this wait type in your server, it means that your SQL Server task is communicating with the storage subsystem and there should not be any matter of concern. It generally flags the wait type when a database creation or a database backup operation takes place. You should take the wait type into consideration when the wait times are higher than the baseline measurements.

Sys.dm_os_wait_stats

You can check the ASYNC_IO_COMPLETION wait type using the sys.dm_os_wait_stats DMV.

Before running the DMV, lets clear the information from the sys.dm_os_wait_stats DMV using a simple DBCC command.

use master
go
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

I will now run a full backup in my server. As soon as I execute the full backup, the information gets loaded into the DMV.

select * from sys.dm_os_wait_stats where wait_type='ASYNC_IO_COMPLETION'

ASYNC_IO_COMPLETION Wait Type SQL Server

How to Lower ASYNC_IO_COMPLETION Wait Type?

In most cases, high ASYNC_IO_COMPLETION wait is due to a database backup. To find if the wait type was because of a backup, then query the DMV with other backup waits.

select * from sys.dm_os_wait_stats where wait_type
in ('ASYNC_IO_COMPLETION', 'BACKUPIO','BACKUPBUFFER')

ASYNC_IO_COMPLETION Wait Type SQL Server 3

You can also try to lower the ASYNCH_IO_COMPLETION waits by configuring instant file initialization. Instant file initialization is a feature introduced by Microsoft for Windows 2003 to speed up the disk space allocation.

Bottomline

The ASYNC_IO_COMPLETION wait type is triggered by actions related to the storage subsystem within your SQL Server instance, such as database backups and new database creation. While these waits are generally normal, unusually high wait times can suggest storage issues. Before involving your storage administrator, confirm there is a true performance issue. One way to do this is by checking storage latency, as high latency can increase ASYNC_IO_COMPLETION wait times. Also, verify if the increased wait times coincide with database backups. Enabling instant file initialization by adding your SQL Server service account to the Perfmon volume maintenance tasks local policy can help reduce ASYNC_IO_COMPLETION wait times.

You may also refer:

BACKUPBUFFER Wait Type In SQL Server

Categories: Interview Questions

Automatic, Planned, And Forced Failover Sequence Of Actions In SQL Server

Leave a Reply

This article covers the sequence of actions the Always-On high availability takes when the primary replica goes offline. When a primary replica goes offline, the failover target assumes the primary role, recovers its databases, and brings them online as the new primary databases. When the former primary replica becomes available, it switches to the secondary role, and its databases become secondary databases.

Let's look at the three types of failover:

Automatic failover (without data loss)
Planned manual failover (without data loss)
Forced failover (with possible data loss)

You can check the below table that summarizes the supported failover types under various availability and failover modes. For each pair of availability replicas, the effective availability and failover mode is determined by the combination of the modes of the primary replica and one or more secondary replicas.

Failover Sequence Of Action Always On SQL Server

Automatic Failover Internal Actions

To enable automatic failover, the conditions outlined in the table with common topologies must be met. An automatic failover triggers the following sequence of actions:

The state of the primary databases is changed to DISCONNECTED if the primary replica is still running, and all clients are disconnected.
The secondary database rolls forward any log entries waiting in the recovery queue.
At this point, the secondary replica becomes the new primary replica, and its databases become the primary databases. Immediately the new replica rolls back uncommitted transactions. Rollback occurs in the background while clients continue to use the database. As part of the internal process by the Always-On, the committed transactions are not rolled back.
Until the secondary databases connect and resynchronize with the new primary database, they are marked as NOT SYNCHRONIZED. Before rollback recovery starts, secondary databases can connect to the new primary databases and quickly transition to the SYNCHRONIZED state.
When the original primary replica comes back online, it becomes the secondary replica, and its databases become secondary databases. It resynchronizes with the corresponding primary replica and databases. Once the databases have resynchronized, failover is possible again, but in the reverse direction this time.

Planned Failover Internal Actions

A planned failover triggers the following sequence of actions:

At first, the Windows Server Failover Cluster (WSFC) is notified in order to ensure that there are no any new user transactions occur on the original primary replica databases.
The database on the secondary replica rolls forward any logs waiting in the recovery queue.
Now, the secondary replica becomes primary replica, and the primary replica turns into secondary replica.
The new primary replica rolls back any uncommitted transactions and brings the databases online as the primary databases.
As part of the process, none of the committed transactions will be rolled back. Additionally, the database will remain marked as “NOT SYNCHRONIZED” until the secondary databases connect and resynchronizes with the primary replica database.
Once the new secondary replica comes online, it resynchronizes with the new primary replica databases. When everything is back online and synchronizing as before, it becomes eligible for the reverse failover. Meaning, from new primary replica to the old primary replica.

Forced Failover Internal Actions

A forced failover triggers the following sequence of actions:

The WSFC sends a request to the primary replica to go offline, ensuring no new user transactions occur on the original primary databases.
The secondary database rolls forward any logs waiting in its recovery queue.
The secondary replica then becomes the new primary replica, while the original primary replica assumes the role of the new secondary replica.
The new primary replica rolls back any uncommitted transactions and brings the databases online as primary databases.
Until the secondary databases connect and resynchronize with the new primary database, they are marked as NOT SYNCHRONIZING. Committed transactions are not rolled back during this process.
When the original primary replica comes back online, it becomes the secondary replica, and its databases become secondary databases. It resynchronizes with the corresponding primary replica and its databases. After resynchronization, failover is possible again, but in the reverse direction.

Note: Force failover can be risky and can cause data loss. Perform forced failover only if you want to restore service to your availiability database immediately.

You may also refer: Data Synchronization Modes In Always On In SQL Server

Categories: DBA

How A Forced Failover Can Cause Data Loss In SQL Server?

Leave a Reply

I was working with one my clients in Always-On and unfortunately in the process of fixing an error, one of the members in our team forcefully failed over Always-On availability group from one instance to another. There is a need to understand the risks associated with force failover as it can lead to data loss.

This occurs because the target replica may be unable to communicate with the primary replica, making it impossible to ensure that the databases are synchronized. Figure below illustrates how a forced failover can cause data loss on the primary replica and how this loss can extend to a secondary replica.

1. In the previous example, the primary replica's last hardened LSN is 100 before it goes offline, while the asynchronous secondary replica's LSN is 50.

2. After a forced failover, the secondary replica becomes the new primary and sets its last hardened LSN to 50.

3. When the old primary comes back online, its synchronization status is suspended.

4. If synchronization is resumed, the old primary will sync with the new primary, sending its last hardened LSN as 100. Upon finding the new primary's last hardened LSN is 50, it will roll back its transaction log to LSN 50 and start accepting transaction log blocks from the new primary from that point onward. Thus, data loss is propagated from the primary to the secondary replica if synchronization is resumed.

Categories: Interview Questions

How Does A Synchronous And Asynchronous Secondary Replica Resynchronize With The Primary Replica?

Leave a Reply

It is always annoying to see either of the replicas failing, be it a primary or a secondary. However, Always-On Availability Group has an excellent feature to cover us from the situation.

In this article, we will be covering as to how a synchronous or an asynchronous secondary replica resynchronizes with primary replica.

Synchronous And Asynchronous Secondary Resynchronization With Primary Replica

Synchronous Secondary Replica Resynchronizing With The Primary Replica

1. When the synchronous secondary replica goes offline, its status changes from Synchronized to 'Not Synchronizing.' Once this status change occurs, the primary replica stops waiting for an acknowledgment from the secondary that it has hardened a commit and begins treating it as an asynchronous replica. This ensures that commits on the primary replica are not delayed by an unhealthy synchronous secondary replica.

2. When the secondary replica is brought back online, it establishes a connection with the primary replica and sends its End of Log (EOL) Log Sequence Number (LSN) to the primary replica.

3. Upon receiving this, the primary replica begins sending the log blocks that were hardened after the EOL LSN to the secondary replica.

4. As the secondary replica starts receiving and hardening these log blocks, its status changes to Synchronizing. This indicates that the secondary replica is connected to the primary and is catching up, essentially functioning as an asynchronous replica.

5. The secondary replica continues to harden the log blocks, apply the hardened transactions with the REDO thread, and send this information back to the primary replica.

6. This process continues until the Last Hardened (LH) LSN of both the primary and secondary replicas match. Once they do, the status of the secondary replica changes to Synchronized, and the primary replica starts treating it as a synchronous replica.

7. The primary replica begins waiting for an acknowledgment from the secondary replica for the commit before informing the user that the transaction has been committed successfully.

Asynchronous Secondary Replica Resynchronize With The Primary Replica

1. When the asynchronous secondary replica goes offline, its status changes from Synchronizing to 'Not Synchronizing.' The primary replica responds in the same way.

2. Once the secondary replica is brought back online, it establishes a connection with the primary replica and sends its End of Log (EOL) LSN to the primary replica.

3. Upon receiving this, the primary replica begins sending the log blocks that were hardened after the EOL LSN to the secondary replica.

4. As soon as the secondary replica starts receiving and hardening these log blocks, its status changes to Synchronizing. This indicates that the secondary replica is connected to the primary and is catching up.

Refer - Data Synchronization Modes In Always On In SQL Server

Categories: DBA