Azure Kubernetes .NET Core App to Azure SQL Database Intermittent Error 258


The problem was an infrastructure issue at Azure.

There is a known issue within Azure Network where the DHCP lease is lost whenever a disk attach/detach happens on some VM fleets. There is a fix rolling out at the moment to regions. I'll check to see when an Azure Status update will be published for this.

The problem disappeared, so it appears as if the fix has been rolled out globally.

For anyone else running into this issue in the future, you can identify it by establishing an SSH connection into the node (not the pod). Do an ls -al /var/log/ and identify all the syslog files and run the following grep on each file.

cat /var/log/syslog | grep 'carrier'

If the log contains any Lost carrier and Gained carrier messages, there is some sort of network issue. In our case it was the DHCP lease.
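Running the grep by hand on every file gets tedious once the logs have rotated. Below is a minimal sketch of a helper that does it in one pass. The function name scan_carrier_events is made up for illustration, and the sketch assumes rotated logs follow the usual syslog, syslog.1, syslog.2.gz naming and that gzip and awk are present on the node. It matches the same 'carrier' string as the grep above and prefixes each hit with the file it came from.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scan every syslog file in a directory for carrier events.
# Assumes rotated logs are named syslog, syslog.1, syslog.2.gz, and so on.

scan_carrier_events() {
    local log_dir="${1:-/var/log}"   # default location; pass another directory if needed
    local file
    for file in "$log_dir"/syslog*; do
        [ -f "$file" ] || continue   # skip when the glob matched nothing
        case "$file" in
            *.gz) gzip -dc -- "$file" ;;   # older rotated logs are usually compressed
            *)    cat -- "$file" ;;
        esac | awk -v f="$file" '/carrier/ { print f ": " $0 }'
    done
}
```

On the node you would call it as scan_carrier_events /var/log, most likely with sudo or as root, since syslog files are often not world-readable. No output means no carrier messages were found in any of the files.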



from my research, it appears as if this specific timeout is related to the connection timeout rather than the command timeout

I don't think so. The call stack goes through System.Data.SqlClient.SqlCommand.ExecuteScalar(), so it's running a query after a successful connection.

This is a CommandTimeout, caused by the client abandoning a long-running command. The default CommandTimeout is 30 seconds.

To troubleshoot why the command is taking a long time, start with the Query Store and the related Query Performance Insight.

There's some noise about this error on GitHub, but I don't see any evidence that there's anything other than ordinary command timeouts going on. For example, if you run

using (var con = new SqlConnection(constr))
{
    con.Open();
    var sql = @"waitfor delay '01:00:00'";
    var cmd = con.CreateCommand();
    //cmd.CommandTimeout = 0;
    cmd.CommandText = sql;
    try
    {
        Console.WriteLine(DateTime.Now);
        cmd.ExecuteNonQuery();
    }
    catch (Exception ex)
    {
        Console.WriteLine(DateTime.Now);
        Console.WriteLine(ex);
    }
}

you'll get (with Microsoft.Data.SqlClient):

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean isAsync, Int32 timeout, Boolean asyncWrite)
   at Microsoft.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String methodName)
   at Microsoft.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at SqlClientTest.Program.Main(String[] args) in C:\Users\david\source\repos\SqlClientTest\SqlClientTest\Program.cs:line 34

Or slightly different for System.Data.SqlClient (which you appear to be using):

System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
   --- End of inner exception stack trace ---
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, Boolean sendToPipe, Int32 timeout, Boolean asyncWrite, String methodName)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()

The difference between

System.ComponentModel.Win32Exception (258): The wait operation timed out.

and

System.ComponentModel.Win32Exception (258): Unknown error 258

is probably just the availability of the Win32Exception descriptions on Windows vs Linux.