Bug #50399 mysql-client hangs on read()
Submitted: 17 Jan 2010 20:21 Modified: 13 Apr 2013 9:56
Reporter: Alexey Vlasov Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: General Severity:S2 (Serious)
Version:5.0.70, 5.1.47 OS:Linux
Assigned to: CPU Architecture:Any
Tags: hang mysql-client read()

[17 Jan 2010 20:21] Alexey Vlasov
Description:
While correcting awfull work of my clients' sites I noticed that the problem concerns the hanging up of the mysql-client, when it is busy with requests. By the way it doesn't even matter if the request goes to the base through TCP or through the local socket. Some request can hardly be hung up, for example, the test request:
SELECT SQL_BUFFER_RESULT SQL_NO_CACHE * FROM mysql.db
(50 kb data in answer)
It hangs with probability one of 50.
There are requests which hang with probability of 100%.

If I send the SIGTSTP or SIGTRAP signal to the hung up process it revives and continues to read normally further the data from a socket. By the way, if you take those requests that hang from 100% probability and for example simply put in any place a blank (i.e. to change the quantity byte) then they gonna stop hanging. I checked it on 10 servers, with different versions of mysql-clients (from 5.0.37 and to 5.1.42), different versions lib from 2.6 to 2.10, the kernels 2.6.24-28, and everywhere all results appeared equally pitiable.

How to repeat:
mysql_connect ('host', 'user', 'pass');
mysql_select_db ('db');

for (;;)
{
        PRINT date('Y-m-d H:i:s')."\t...\t"; _flush();
        $beg = time();
        $q = mysql_query ("SELECT SQL_BUFFER_RESULT SQL_NO_CACHE * FROM db");
        $diff = time() - $beg;
        PRINT "$diff\t";
        if (mysql_fetch_array ($q))
                PRINT "OK\n";
        else
                PRINT "ERROR\n";
        _flush();
        sleep (1);
}

2010-01-17 23:15:49     ...     1       OK
2010-01-17 23:15:51     ...     0       OK
2010-01-17 23:15:52     ...     0
[17 Jan 2010 20:23] Alexey Vlasov
trace

Attachment: trace.txt (text/plain), 10.01 KiB.

[18 Jan 2010 11:02] Sveta Smirnova
Thank you for the report.

Which exact version do you use? Please aslo provide output of SHOW PROCESSLIST during hang and your configuration file.
[18 Jan 2010 17:19] Alexey Vlasov
mysql -V
mysql  Ver 14.12 Distrib 5.0.70, for pc-linux-gnu (x86_64) using readline 5.2

mysqld -V
mysqld  Ver 5.0.70-log for pc-linux-gnu on x86_64 (Gentoo Linux mysql-5.0.70-r1)
[18 Jan 2010 17:25] Alexey Vlasov
My hung up process 63325889.

Attachment: ps_list.txt (text/plain), 6.83 KiB.

[18 Jan 2010 17:28] Alexey Vlasov
[mysqld]
user=mysql
bind-address = 111.222.111.222
default-character-set = cp1251
default-collation = cp1251_general_ci
basedir=/usr
datadir=/var/db/mysql_aux2
flush_time = 1800
pid-file=/var/run/mysqld/mysqld_aux2.pid
socket=/var/run/mysqld/mysqld_aux2.socket

tmpdir=/tmp
;skip-innodb
innodb_file_per_table
innodb_data_home_dir = 
innodb_data_file_path = /var/db/mysql_aux2/innodb/ibdata1:100M:autoextend
innodb_log_group_home_dir = /var/db/mysql_aux2/innodb
innodb_flush_log_at_trx_commit = 0
innodb_buffer_pool_size = 256M
innodb_log_buffer_size = 256M
;innodb_log_file_size = 512M
innodb_additional_mem_pool_size = 128M

default-table-type = InnoDB

log-error=/var/db/mysql_aux2/mysql-err.log
;log-long-format
log-slow-queries
;log-bin=/var/db/mysql_aux2/mysql-bin.log
;log=/var/db/mysql_aux2/mysql-gen.log

old-passwords
skip-name-resolve

myisam_recover

set-variable = long_query_time=10

set-variable = back_log=1024
set-variable = max_connect_errors=10000000
set-variable = max_connections=640000
set-variable = connect_timeout=20
set-variable = wait_timeout=600
set-variable = interactive_timeout=600

set-variable = table_cache=1000
set-variable = thread_cache_size=16
set-variable = max_tmp_tables=8192
set-variable = max_heap_table_size=64M
set-variable = tmp_table_size=256M
set-variable = max_join_size=50000000

set-variable = key_buffer_size=512M
set-variable = read_buffer_size=128K
set-variable = read_rnd_buffer_size=64K
set-variable = sort_buffer=128M
set-variable = join_buffer_size=64M
set-variable = net_buffer_length=64K

set-variable = query_cache_type=1
set-variable = query_cache_size=256M

set-variable = max_allowed_packet=16M

set-variable = ft_min_word_len=3

[client]
character-sets-dir=/usr/share/mysql/charsets
default-character-set = cp1251
;socket=/var/run/mysqld/mysqld.socket
[19 Jan 2010 7:45] Sveta Smirnova
Thank you for the feedback.

I can not repeat described behavior in my environment. Could you please try to rewrite test in C, so we are sure this is not the problem of PHP. This can look like:

#include <stdio.h>
#include <mysql.h>
#include <assert.h>
#include <string.h>
#include <time.h>

int main(int argc, char **argv)
{
	MYSQL conn;
	MYSQL_RES *result;
	MYSQL_FIELD *field;
	int OK;
	int count = 0;
	int timeout = 10;
	
	struct tm *current;
	time_t now;

	const char* query1= "SELECT SQL_BUFFER_RESULT SQL_NO_CACHE * FROM air_schedule_price";
	int query1len = strlen(query1);

	mysql_init(&conn);

	if (!mysql_real_connect(&conn, "127.0.0.1", "root", "", "test", 3350, NULL,0)) {
		printf("Error: %s\n", mysql_error(&conn));
		exit(1);
	}

	while (1 == 1) {
		OK = mysql_real_query (&conn, query1, query1len);
		assert(0 == OK);
		time(&now);
		current = localtime(&now);
		printf("%.4d-%.2d-%.2d %.2d:%.2d:%.2d\t.....\t", current->tm_year+1900, current->tm_mon+1, current->tm_mday, current->tm_hour, current->tm_min, current->tm_sec);
		result = mysql_store_result(&conn);
		if (mysql_fetch_row(result))
			printf("OK\n");
		else
			printf("ERROR\n");
		mysql_free_result(result);
		sleep(1);
	}
}

Also I assume this is the same case: http://dmih.livejournal.com/524026.html?style=mine
[19 Jan 2010 7:46] Sveta Smirnova
Please also try with flush_time=0 and query_cache_size=0
[19 Jan 2010 15:12] Alexey Vlasov
Actually I did not say that a problem was only in the client, which compiles in PHP. It is peculiar to all clients, including console mysql.

> dmih.livejournal.com
It is the same what I write here. We are trying to repair it with Dmitry ourselves.

Your code can't be compiled, here the errors:
mysql_bug.c: In function 'main':
mysql_bug.c:26: warning: incompatible implicit declaration of built-in function 'exit'
/tmp/ccQm2V1w.o: In function `main':
mysql_bug.c:(.text+0x42): undefined reference to `mysql_init'
mysql_bug.c:(.text+0x7a): undefined reference to `mysql_real_connect'
mysql_bug.c:(.text+0x8b): undefined reference to `mysql_error'
mysql_bug.c:(.text+0xbd): undefined reference to `mysql_real_query'
mysql_bug.c:(.text+0x153): undefined reference to `mysql_store_result'
mysql_bug.c:(.text+0x160): undefined reference to `mysql_fetch_row'
mysql_bug.c:(.text+0x184): undefined reference to `mysql_free_result'
collect2: ld returned 1 exit status
[19 Jan 2010 15:27] Alexey Vlasov
With flush_time=0 and query_cache_size=0 also hangs up.
[19 Jan 2010 18:44] Sveta Smirnova
Thank you for the feedback.

> mysql_bug.c:(.text+0x42): undefined reference to `mysql_init'

This means you didn't provide path to mysql includes. Easiest way to do it is use mysql_config program: `mysql_config --include --libs`

Could you also please attach gdb to problem server and run `thread apply all bt full`? Probably something with server too.
[20 Jan 2010 11:42] Alexey Vlasov
$ ./mysql_bug_l11
2010-01-20 14:30:15     .....   OK
2010-01-20 14:30:16     .....   OK
2010-01-20 14:30:17     .....   OK
2010-01-20 14:30:18     .....   OK
2010-01-20 14:30:19     .....   OK
2010-01-20 14:30:20     .....   OK

$ strace -p 22964
...
read(3, "\0\2\1%\7gb_peru\7gb_peru\1Y\1Y\1Y\1Y\1Y\1Y\1"..., 16384) = 3888
read(3, "\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y@\0\0>\1%\rgb_x_smo"..., 16384) = 1448
read(3, "\1Y\1Y\1Y\1Y\1Y\1N\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y0"..., 16384) = 16384
brk(0x7b7000)                           = 0x7b7000
read(3, "x_minju487\1Y\1Y\1Y\1Y\1Y\1Y\1N\1Y\1Y\1Y\1Y\1"..., 16384) = 5336
read(3, "se\1Y\1Y\1Y\1Y\1Y\1Y\1N\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1"..., 16384) = 1448
read(3, "evro\tgb_shevro\1Y\1Y\1Y\1Y\1Y\1Y\1N\1Y\1Y\1"..., 16384) = 1448
read(3, "\1Y\1Y\1Y\1Y\1N\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y@\0\0"..., 16384) = 1448
read(3,
hung

$ kill -SIGTRAP 22964

"18\1Y\1Y\1Y\1Y\1Y\1Y\1N\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1Y\1"..., 16384) = 16119
brk(0x7d9000)                           = 0x7d9000
write(1, "2010-01-20 14:35:42\t.....\tOK\n"..., 29) = 29
brk(0x799000)                           = 0x799000
brk(0x795000)                           = 0x795000
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0})
...
[20 Jan 2010 13:17] Alexey Vlasov
after hanging request
+----+------+---------------------+-------+---------+------+-------+------------------+
| Id | User | Host                | db    | Command | Time | State | Info             |
+----+------+---------------------+-------+---------+------+-------+------------------+
| 1  | root | 192.168.0.15:59142 | mysql | Sleep   | 137  |       |                  |
| 14 | root | 192.168.0.11:36918  |       | Query   | 0    |       | show processlist |
+----+------+---------------------+-------+---------+------+-------+------------------+
and backtrace
[20 Jan 2010 13:18] Alexey Vlasov


Attachment: bt_full.txt (text/plain), 10.66 KiB.

[26 Jan 2010 12:03] Sveta Smirnova
Thank you for the feedback.

I still can not repeat described behavior. Please indicate which package do you use (filename you downloaded). If you compiled mysqld yourself provide configuration options.
[1 Feb 2010 9:51] Alexey Vlasov
I don't know, may be the problem concerns Gentoo. Try to install the latest stage3 and to compile MySQL 5.0.70 there via emerge. Optimization options don't matter at all, I got query hanging with and without them.

By the way, to help queries that always hang try FLUSH TABLES. After this command it stops hanging at all.
[1 Feb 2010 10:00] Sveta Smirnova
Thank you for the feedback.

I am afraid I have no suitable box where I can install Gentoo. Also we have several bug which were repeatable only with binaries provided by OS vendors.

Please try our latest 5.0.90 sources/binaries available from http://dev.mysql.com/downloads/mysql/5.0.html#downloads and inform us if problem still repeatable for you.
[2 Mar 2010 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[7 Apr 2010 11:28] Alexey Vlasov
Here the trace for queries that always hang.

Server:
[pid 24606] <... write resumed> )       = 4240
[pid 24606] write(36356, "um wp-image-4231\" style=\"margin-"..., 4240 <unfinished ...>
[pid 24606] <... write resumed> )       = 4240
[pid 24606] write(36356, "\223\320\273\320\260\320\262\320\260 18:</em><a href=\"http:";..., 4248 <unfinished ...>
[pid 24606] <... write resumed> )       = 4248
[pid 24606] write(36356, "://slovobelova.ru/?p=  \"> </a>\r\n"..., 456 <unfinished ...>
[pid 24606] <... write resumed> )       = 456
[pid 24606] sched_setscheduler(24606, SCHED_OTHER, { 8 } <unfinished ...>
[pid 24606] <... sched_setscheduler resumed> ) = -1 EINVAL (Invalid argument)
[pid 24606] read(36356,  <unfinished ...>
[pid 24606] <... read resumed> 0x2aab181368b0, 4) = -1 EAGAIN (Resource temporarily unavailable)
[pid 24606] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1],  <unfinished ...>
[pid 24606] <... rt_sigprocmask resumed> [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
[pid 24606] rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP],  <unfinished ...>
[pid 24606] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid 24606] fcntl(36356, F_SETFL, O_RDWR <unfinished ...>
[pid 24606] <... fcntl resumed> )       = 0
[pid 24606] read(36356,  <unfinished ...>

Client:
read(6, "\320\260\321\217</a>\r\n<a href=\"http://slovobe"..., 2112) = 1448
read(6, "\272\321\203\321\201\321\202\320\260 </a>\r\n\r\n<a href=\"http:/"..., 16384) = 1448
read(6, "\214\321\207\320\270\320\272-\321\201-\320\277\320\260\320\273\321\214\321\207\320\270\320\272</a>\r\n<a"..., 16384) = 488
read(6, ">\r\n<a href=\"http://slovobelova.ru"..., 16384) = 1448
read(6, "\321\213</a>\r\n<a href=\"http://slovobelo"..., 6254) = 2792
read(6, "\320\260 \321\210\320\272\320\276\320\273\321\214\320\275\320\276\320\274 \321\203\321\202\321\200\320\265\320\275\320\275\320"..., 3462) = 3462
read(6, "\336\35\0&\0044230\0011\0232010-03-16 12:44:40\0232"..., 16384) = 778
read(6, "um wp-image-4231\" style=\"margin-t"..., 6872) = 4240
read(6, "\223\320\273\320\260\320\262\320\260 18:</em><a href=\"http:/"..., 2632) = 2632
read(6, "M\t\0'\003916\0011\0232009-12-03 22:30:41\02320"..., 16384) = 1616
read(6, "://slovobelova.ru/?p=  \"> </a>\r\n<"..., 16384) = 456
read(6,
[7 Apr 2010 11:35] Alexey Vlasov
By the way my Gentoo and MySQL are compiled from the sources.
[26 May 2010 20:30] DJ Miller
For what it's worth, I'm seeing a similar issue on CentOS 5.3, MySql 5.0.77; but not on another CentOS 5.3 box with MySql 5.0.45.

My problem case is different, though:

mysql_real_connect()
mysql_use_result() (on a query that selects 80-some rows)
mysql_fetch_row() x3
mysql_free_result()
mysql_close()

repeated about four times.

(There's a lot of processing in between these statements that's not included here.)

The fourth time I get through this process, the mysql_free_result() call hangs.

I read through the docs and noticed that you're not supposed to call mysql_free_result on a result set you've opened with mysql_use_result until you've reached the end with mysql_fetch_row, but if I try to throw a loop in front of mysql_free_result to call mysql_fetch_row first, that statement hangs.
[3 Jun 2010 15:13] Alexey Vlasov
MySQL-client-5.1.47-1.glibc23.x86_64.rpm also hangs.
[17 Aug 2010 20:07] Sveta Smirnova
Thank you for the feedback.

Could you also please test with version 5.1.49: this can be same as bug #52528 which is not repeatable in version 5.1.49.
[17 Sep 2010 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[1 Oct 2010 18:31] Alexey Vlasov
If the client is on the server with the kernel 2.6.32 (MySQL server can be on any kernel) the bug of the first type it is not played back (when it hangs 50/50). It is actual for all MySQL versions.

The bug of the second type (100% hanging) can't be played back yet as there're still not enough websites on servers with 2.6.32 kernel.
[26 Apr 2011 9:56] Alexey Vlasov
In general, I can say there is no bugs on kernels >= 2.6.32.
[27 Apr 2011 10:10] Valeriy Kravchuk
So, looks like this was more a problem of older Linux kernels than of MySQL code.
[11 Apr 2013 18:49] Angie Hinrichs
Yes, it's 2013 but my group is developing on kernel version 2.6.18 (mysql 5.0.67), and I encountered this problem a couple days ago, after developing on this system for years with no such problems.  

In case it helps anyone else, after I did a 'RESET QUERY CACHE', the problem disappeared!  No more hanging mysql_fetch_row (... __read_nocancel).
[13 Apr 2013 9:56] Alexey Vlasov
You can also make OPTIMIZE TABLE it will help.
Actually I had such hangings also on new linux kernels but very, very rare.