Bug #12898 query causes core dump in libmysql after mysql server restart
Submitted: 31 Aug 2005 6:48 Modified: 9 Apr 2008 9:33
Reporter: Matt Provost Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1.10a OS:Linux (Fedora Core 2)
Assigned to: CPU Architecture:Any

[31 Aug 2005 6:48] Matt Provost
Description:
Several users have reported crashes in the program rtgpoll (part of rtg.sourceforge.net) when the mysql server restarts while it's running. As far as I can tell, the crash is occuring in the mysql driver.

On Fedora, it dies with:
*** glibc detected *** double free or corruption (!prev): 0x088e7848 ***        
Aborted (core dumped)                                                           

Here is a backtrace from the produced core:

(gdb) bt                                                                        
#0  0x003fb7a2 in ?? () from /lib/ld-linux.so.2                                 
#1  0x0043b7d5 in raise () from /lib/tls/libc.so.6                              
#2  0x0043d149 in abort () from /lib/tls/libc.so.6                              
#3  0x0046f27a in __fsetlocking () from /lib/tls/libc.so.6                      
#4  0x00475abf in malloc_trim () from /lib/tls/libc.so.6                        
#5  0x00475e3a in free () from /lib/tls/libc.so.6                               
#6  0x00aec6c3 in my_no_flags_free () from /usr/lib/libmysqlclient_r.so.12      
#7  0x00afe2d6 in vio_delete () from /usr/lib/libmysqlclient_r.so.12            
#8  0x00ae77f6 in end_server () from /usr/lib/libmysqlclient_r.so.12            
#9  0x00ae739f in simple_command () from /usr/lib/libmysqlclient_r.so.12        
#10 0x00ae9b40 in mysql_close () from /usr/lib/libmysqlclient_r.so.12           
#11 0x00ae9c94 in mysql_close () from /usr/lib/libmysqlclient_r.so.12           
#12 0x00ae9865 in mysql_reconnect () from /usr/lib/libmysqlclient_r.so.12       
#13 0x00ae73a8 in simple_command () from /usr/lib/libmysqlclient_r.so.12        
#14 0x00ae9fac in mysql_send_query () from /usr/lib/libmysqlclient_r.so.12      
#15 0x00aea1a8 in mysql_real_query () from /usr/lib/libmysqlclient_r.so.12      
#16 0x003692e7 in __db_insert (table=0x8875bf0 "ifInOctets_45", iid=801, 
    insert_val=11003836, insert_rate=183345.38603411079) at                     
librtgmysql.c:173                                                               
#17 0x08049ebe in poller (thread_args=0xbfe765dc) at rtgsnmp.c:317              
#18 0x005ee341 in __pthread_initialize_minimal () from                          
/lib/tls/libpthread.so.0                                                        
#19 0x004dafee in clone () from /lib/tls/libc.so.6

The code that is calling it is:

int __db_insert(char *table, int iid, unsigned long long insert_val, double insert_rate) {
        MYSQL mysql = getmysql();

        char *query;

        char *table_esc;

        int result;

        table_esc = escape_string(table_esc, table);
        
        asprintf(&query,
                "INSERT INTO `%s` (id,dtime,counter,rate) VALUES (%i,NOW(),%llu,%.6f)",         
                table_esc, iid, insert_val, insert_rate);
       
        free(table_esc);
        debug(HIGH, "Query = \"%s\"\n", query);

        /* now execute the query */

        result = mysql_real_query(&mysql, query, strlen(query));

        free(query);

        if (result == 0) {
                return TRUE;
        } else {
                debug(LOW, "MySQL error: %s\n", mysql_error(&mysql));
                return FALSE;
        }
}

You can also see the project source at:
http://cvs.sourceforge.net/viewcvs.py/rtg/rtg/src/

The mysql driver is librtgmysql.c. The rtg mysql driver is a shared library that is loaded with libtool at runtime.

Could it be some interaction with mysql manipulating the connection handle MYSQL which is stored in thread local storage? I don't see any problems with this on FreeBSD but it happens every time on Fedora. You can see that it's using the thread safe mysql library, but maybe the reconnect functionality isn't thread safe? I'd like to determine whether the problem is in our code or the mysql driver.

How to repeat:
Run the rtgpoll daemon. While it's running, restart the mysql server. The next query will attempt to reconnect and core dump.
[31 Aug 2005 9:08] Andrey Hristov
Hi,
as you have a reproducing case with everything set up, could you build a debug version of libmysqlclient_r and then post again the backtrace (which should include line numbers). You can build from source by using compile-pentium-debug-no-bdb script from the BUILD directory and the install the client libraries from libmysql/. 

Thank you in advance!
[31 Aug 2005 15:24] Matt Provost
The user rebuilt a debug version and now gets this:

[08/31 10:05:58 Queue ready, broadcasting thread go condition.]                 
Error: Freeing unallocated data at line 2281, 'libmysql.c'                      
Error: Freeing unallocated data at line 2282, 'libmysql.c'                      
Error: Freeing unallocated data at line 2283, 'libmysql.c'                      
Error: Freeing unallocated data at line 2284, 'libmysql.c'                      
Error: Freeing unallocated data at line 127, 'net.c'                            
Error: Freeing unallocated data at line 2281, 'libmysql.c'                      
Error: Freeing unallocated data at line 2282, 'libmysql.c'                      
Error: Freeing unallocated data at line 2283, 'libmysql.c'                      
Error: Freeing unallocated data at line 2284, 'libmysql.c'                      
Error: Freeing unallocated data at line 34, 'viosocket.c'                       
Error: Freeing unallocated data at line 2281, 'libmysql.c'                      
Error: Freeing unallocated data at line 2282, 'libmysql.c'                      
Error: Freeing unallocated data at line 2283, 'libmysql.c'                      
Error: Freeing unallocated data at line 2284, 'libmysql.c'                      
Error: Freeing unallocated data at line 2281, 'libmysql.c'                      
Error: Freeing unallocated data at line 2282, 'libmysql.c'                      
Error: Freeing unallocated data at line 2283, 'libmysql.c'                      
Error: Freeing unallocated data at line 2284, 'libmysql.c'                      
Error: Freeing unallocated data at line 34, 'viosocket.c'                       
Error: Freeing unallocated data at line 2281, 'libmysql.c'                      
Error: Freeing unallocated data at line 2282, 'libmysql.c'        
Error: Freeing unallocated data at line 2283, 'libmysql.c'                      
Error: Freeing unallocated data at line 2284, 'libmysql.c'                      
                                                                                
<at this point, output stops, and program hangs>

There's no core dump anymore. Did they build it correctly, or is this showing the error?
[31 Aug 2005 15:29] Brandon Ewing
While I don't have much experience with the debug version of MySQL, I believe I followed the instructions correctly:

I downloaded the mysql source, and ran BUILD/compile-pentium-debug-no-bdb, then copied libmysql_r/.libs/libmysqlclient_r.* to the correct directories (.so's to /usr/lib, .la and .a to /usr/lib/mysql).  I then rebuilt rtg, started the poller, and restarted the MySQL server.  It created the following output to stdout (and this is where I don't know if it's normal, because it never actually crashed and generated a core):

[No Resp = 0] [SNMP Errs = 0] [Slow = 0] [PollTime = 44.617s] Next Poll: 10...9...8...7...6...5...4...3...2...1...
[08/31 10:05:58 Queue ready, broadcasting thread go condition.]
Error: Freeing unallocated data at line 2281, 'libmysql.c'
Error: Freeing unallocated data at line 2282, 'libmysql.c'
Error: Freeing unallocated data at line 2283, 'libmysql.c'
Error: Freeing unallocated data at line 2284, 'libmysql.c'
Error: Freeing unallocated data at line 127, 'net.c'
Error: Freeing unallocated data at line 2281, 'libmysql.c'
Error: Freeing unallocated data at line 2282, 'libmysql.c'
Error: Freeing unallocated data at line 2283, 'libmysql.c'
Error: Freeing unallocated data at line 2284, 'libmysql.c'
Error: Freeing unallocated data at line 34, 'viosocket.c'
Error: Freeing unallocated data at line 2281, 'libmysql.c'
Error: Freeing unallocated data at line 2282, 'libmysql.c'
Error: Freeing unallocated data at line 2283, 'libmysql.c'
Error: Freeing unallocated data at line 2284, 'libmysql.c'
Error: Freeing unallocated data at line 2281, 'libmysql.c'
Error: Freeing unallocated data at line 2282, 'libmysql.c'
Error: Freeing unallocated data at line 2283, 'libmysql.c'
Error: Freeing unallocated data at line 2284, 'libmysql.c'
Error: Freeing unallocated data at line 34, 'viosocket.c'
Error: Freeing unallocated data at line 2281, 'libmysql.c'
Error: Freeing unallocated data at line 2282, 'libmysql.c'
Error: Freeing unallocated data at line 2283, 'libmysql.c'
Error: Freeing unallocated data at line 2284, 'libmysql.c'
 
<at this point, output stops, and program hangs>

Like I said, I'm not sure this is what you're asking for, as it never generates a core to backtrace. :/

This is on MySQL 4.0.25, CentOS-4
[1 Sep 2005 17:05] Hartmut Holzgraefe
to me this looks like either an invalid pointer passed to mysql_close()
or mysql_close() being called twice for the same MYSQL *?
[2 Sep 2005 5:45] Hartmut Holzgraefe
well, looking at the backtrace again i see

#10 0x00ae9b40 in mysql_close () from /usr/lib/libmysqlclient_r.so.12          

#11 0x00ae9c94 in mysql_close () from /usr/lib/libmysqlclient_r.so.12    

mysql_close() only calls itself when freeing master or slave entries 
that have been set with the mysql_set_master() and mysql_add_slave()
calls. as these are not used at all here this is another indication of
memory corruption

maybe several threads are trying to perform the 

   mysql_close(); exit();

sequence in rtgpoll.c at the same time? could wrapping it up with a mutex help? 
anyway, looks like an application problem and not a mysql bug to me ...
[2 Sep 2005 5:46] Matt Provost
Yes I think it might be closing the connection twice, but it's happening in the driver not in our code. We just call mysql_real_query, but if it detects that the server restarted then it looks like it closes the connection (twice) and tries to reconnect. We aren't calling mysql_close anywhere in this code path. The only place we call mysql_close is from a function pushed onto the pthread_cleanup stack for when the program exits. Is there something in the driver that is closing it twice when it detects that it's been disconnected from the server?
[2 Sep 2005 7:13] Matt Provost
Each thread has it's own connection that it stores in thread-local storage, so they shouldn't be using the same MYSQL *.

Looking at the debug output:

Error: Freeing unallocated data at line 2281, 'libmysql.c'
Error: Freeing unallocated data at line 2282, 'libmysql.c'
Error: Freeing unallocated data at line 2283, 'libmysql.c'
Error: Freeing unallocated data at line 2284, 'libmysql.c'

Those lines are:

my_free((gptr) mysql->host_info,MYF(MY_ALLOW_ZERO_PTR));
my_free(mysql->user,MYF(MY_ALLOW_ZERO_PTR));
my_free(mysql->passwd,MYF(MY_ALLOW_ZERO_PTR));
my_free(mysql->db,MYF(MY_ALLOW_ZERO_PTR));

in mysql_close. Those pointers are freed on line 2299 also in mysql_close:

mysql->host_info=mysql->user=mysql->passwd=mysql->db=0;

The other debug line that is interesting is:

Error: Freeing unallocated data at line 34, 'viosocket.c'

That is:

my_free((gptr) vio,MYF(0));

in vio_delete, which is called by end_server which is called by mysql_close. So to me it looks like it's trying to close the same MYSQL twice, but I don't see how it's failing either of the two cases for master or slave.

If it's the in RTG code I'd love to fix it but I don't see how what we're doing could affect this.
[24 Sep 2005 17:50] Valeriy Kravchuk
Please, look at the http://bugs.mysql.com/bug.php?id=13457. It is a bug report about similar problem with calling mysql_close after server shutdown. 

Note the last comment from Miguel - compare the sequence of calls in your code and in mysql.cc: mysql_close is called in mysql_end (normal exit) and sql_real_connect (upon reconnection, if needed) functions ONLY! You application should do the same.

Inform, please, about the results of comparison and change you application if needed.
[26 Sep 2005 1:46] Matt Provost
We're not calling mysql_close in our code. It looks like it's being called from within the mysql driver in mysql_reconnect when it discovers that the connection has gone away and tries to reconnect. This is the entire function:

int __db_insert(char *table, int iid, unsigned long long insert_val, double insert_rate) {
	MYSQL mysql = getmysql();

	char *query;

	char *table_esc;

	int result;

	table_esc = escape_string(table_esc, table);

	asprintf(&query, 
		"INSERT INTO `%s` (id,dtime,counter,rate) VALUES (%i,NOW(),%llu,%.6f)",
		table_esc, iid, insert_val, insert_rate);

	free(table_esc);

	debug(HIGH, "Query = \"%s\"\n", query);

	/* now execute the query */

	result = mysql_real_query(&mysql, query, strlen(query));

	free(query);

	if (result == 0) {
		return TRUE;
	} else {
		debug(LOW, "MySQL error: %s\n", mysql_error(&mysql));
		return FALSE;
	}
}

As you can see, we only call mysql_real_query and that's where it crashes, but I don't see how we can do anything to affect the mysql_close that is happening inside mysql_real_query.
[12 Oct 2005 3:00] MySQL Verification Team
I was unable for to repeat the behavior reported testing simples
application with my own. It is possible for you to create a
standalone C test case which can shows the issue ?

Thanks in advance.
[13 Nov 2005 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[9 Apr 2008 9:33] Susanne Ebrecht
I closed this bug. There aren't any corruption anymore by using newer versions of MySQL and operating system.