Bug #15831 SQL_ATTR_CONNECTION_DEAD fails to report dead connections when server goes down
Submitted: 17 Dec 2005 20:52 Modified: 14 Aug 2007 14:55
Reporter: Jonathan Gilbert Email Updates:
Status: Duplicate Impact on me:
None 
Category:Connector / ODBC Severity:S2 (Serious)
Version:3.51.12-2 OS:Windows (Windows XP)
Assigned to: CPU Architecture:Any

[17 Dec 2005 20:52] Jonathan Gilbert
Description:
As part of testing my application's graceful recovery from the rare event of the database server connection being unexpectedly dropped in the middle of a command executing, I forcibly shut down the server while my application was waiting for a command to complete. The application got an error described as "Lost connection while executing command", native error 2013 and SQL state "HYT00". However, when I called SQLGetConnAttr with SQL_ATTR_CONNECTION_DEAD, it reported that the connection was in fact not dead.

How to repeat:
I have not put together a recipe for this, but the cause can be easily discerned by investigating the mechanism whereby myodbc3/utility.c avoids calling mysql_ping() repeatedly. A timestamp is stored in the DBC structure (the last_query_time member), and if the elapsed time since the last call has not exceeded one hour (by default -- the CHECK_IF_ALIVE macro defined in myodbc3/myodbc3.h), it assumes the connection is still alive. This mechanism does not take errors into account, and so, since one successful call was made within the last hour, it returns false, bypassing the call to mysql_ping().

Suggested fix:
In every function exposed by the ODBC API, if an error is returned, reset the DBC structure's last_query_time member to 0, so that the next call to check_if_server_is_alive() will result in an actual call to mysql_ping(). The mysql_ping() may actually reconnect to the server, and this is fine, but it will provide the opportunity for the application to detect when the server really is down.
[18 Dec 2005 11:39] Vasily Kishkin
Sorry...I was not able to reproduce the bug. Could you please create and attach some simple test case ?
[18 Dec 2005 16:36] Jonathan Gilbert
Test case to demonstrate the bug. (Win32)

Attachment: test.c (text/plain), 3.30 KiB.

[18 Dec 2005 16:39] Jonathan Gilbert
I have attached a test case. To compile it, make sure your build environment is configured, and then run:

cl test.c odbc32.lib user32.lib

This should produce test.exe (at least, it does for me, using a Visual Studio .NET 2003 Command Prompt).

When run, the output looks like this:

-------8<----snip----8<-------
[x:\myodbctestcase]test
Dummy window handle: 0x089C0B6C

] SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv): SQL_SUCCESS
] SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, pSQL_OV_ODBC3, 0): SQL_SUCCESS
] SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc): SQL_SUCCESS
] SQLDriverConnect(hDbc, hWnd, CONNSTR, sizeof(CONNSTR), OutConnectionString, Bu
fferLength, &BufferLength, SQL_DRIVER_COMPLETE_REQUIRED): SQL_SUCCESS
] SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt): SQL_SUCCESS

Now, please shut down the database server, or disconnect from the network, or
otherwise make the database connection become unavailable. When you are
certain the server is no longer reachable, press enter.

] SQLExecDirect(hStmt, SAMPLE_QUERY, sizeof(SAMPLE_QUERY)): SQL_ERROR

^-- If the above returned SQL_SUCCESS, then the test will need to be re-run,
as the connection is either still available, or its loss has not been
detected by the ODBC driver.

If it returned an error as expected, then the following call should indicate
that the ODBC driver is aware that the connection is lost:

] SQLGetConnectAttr(hDbc, SQL_ATTR_CONNECTION_DEAD, &IsDead, BufferLength, &Buff
erLength): SQL_SUCCESS

The value returned by SQLGetConnectAttr is: IsDead == 0

The ODBC driver is not indicating that the connection was dropped.

[x:\myodbctestcase]
-------8<----snip----8<-------

I did shut down the server between the initialization code and the call to SQLExecDirect, which is why it returned SQL_ERROR instead of SQL_SUCCESS. However, the value of SQL_ATTR_CONNECTION_DEAD is returned as 0 -- not dead.

If the bug is not showing itself, then the last line of output should read:

-------8<----snip----8<-------
The ODBC driver is correctly reporting that the connection was dropped.
-------8<----snip----8<-------
[18 Dec 2005 20:43] Vasily Kishkin
Thank for the test case. I was able to reproduce the bug.
[31 Dec 2005 6:58] Jonathan Gilbert
Work-around for people smitten by this bug, until an official update is posted:

The bug is caused by utility.c:check_if_server_is_alive() not taking into account errors returned in response to previous commands. Instead, all it takes into account is the amount of time that has elapsed since the last call to mysql_ping(). The code for this looks like this:

int check_if_server_is_alive(DBC FAR *dbc)
{
    time_t seconds= (time_t) time((time_t*) 0);
    int result= 0;
    if ( (ulong) (seconds - dbc->last_query_time) >= CHECK_IF_ALIVE )
    {
        if ( mysql_ping(&dbc->mysql) &&
             mysql_errno(&dbc->mysql) == CR_SERVER_GONE_ERROR )
            result= 1;
    }
    dbc->last_query_time= seconds;
    return result;
}

The key line here is the 'if':

    if ( (ulong) (seconds - dbc->last_query_time) >= CHECK_IF_ALIVE )

On x86, this gets compiled down to a CMP opcode with the value of CHECK_IF_ALIVE as an immediate parameter. CHECK_IF_ALIVE is defined to be 3600, so CMP instruction's operand will be 0x00000E10.

I located this instruction in the myodbc3.dll file (which gets installed to %SystemRoot%\System32) and patched it to render the above 'if' to:

    if ( (ulong) (seconds - dbc->last_query_time) >= 0 )

This is obviously always true, so mysql_ping always gets called. This may result in a small decrease in performance, especially when the server is remote, but if you need the SQL_ATTR_CONNECTION_DEAD to work, perhaps for testing, you can apply the following byte changes to the file:

[x:\windows\system32]fc /b myodbc3.dll.old myodbc3.dll
Comparing files myodbc3.dll.old and MYODBC3.DLL
0000A88D: 10 00
0000A88E: 0E 00

[x:\windows\system32]

This small hack eliminated the bug and enabled me to test my code for handling dropped connections :-)
[8 Feb 2007 7:16] Mikko metsola
Hi,

is there a fix for this in 3.51.x?

We are experiencing similar kind of problems with win32 Falcon5.2.0-alpha.

rgds, MMe
[14 Aug 2007 14:55] Bogdan Degtyariov
This bug is a duplicate of bug#14639