| Bug #14851 | killing a long-running sub-query crashes mysqld | ||
|---|---|---|---|
| Submitted: | 11 Nov 2005 0:40 | Modified: | 14 Feb 2006 2:55 |
| Reporter: | Timothy Smith | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server | Severity: | S1 (Critical) |
| Version: | 4.1.16 | OS: | Any (all) |
| Assigned to: | Oleksandr Byelkin | CPU Architecture: | Any |
[14 Nov 2005 4:01]
Timothy Smith
I've repeated this with a simpler test case: create table a (id int primary key); create table b (id int unsigned not null); insert into a values (1); set @c = (select count(*) from a); insert into a select id + @c from a; -- Hit <Up><Enter> enough times to have 4096 rows in table a insert into b select id from a; -- This query will take forever; if you kill it from mysqladmin, -- it will crash mysqld with a seg fault select id from a where id in (select distinct id from b); Note that the "distinct" is important. If that is removed, then the query still takes forever, but it will not crash the server to kill it. Regards, Timothy
[14 Nov 2005 4:17]
Timothy Smith
I said "forever" in the previous comment, but that's not true. It just takes a long time....
[16 Nov 2005 15:13]
Oleksandr Byelkin
Thank you for bugreport!
I can't repeat this bug any more. I tried following mysql-test script:
create table t1 (id int primary key);
create table t2 (id int unsigned not null);
let $1 = 4097;
while ($1)
{
eval insert into t1 values ($1);
dec $1;
}
insert into t2 select id from t1;
connect (con1, localhost, root,,);
connection con1;
create table t3 (kill_id int);
insert into t3 values(connection_id());
select id from t1 where id in (select distinct id from t2);
connection default;
select ((@id := kill_id) - kill_id) from t3;
sleep 3;
kill @id;
connection con1;
-- error 2013
reap;
disconnect con1;
connection default;
drop table t1, t2, t3;
Also I tried your short test suite on my linux comp, and even full data test suite on debx86. server kill connection smoothly. If I lost something in the procedure please reopen bug with some comments.
[16 Nov 2005 22:09]
Timothy Smith
I could not repeat this with the mysqltest binary. However, I
removed the "drop table" statement from the end of the test.
After running the test (and it returned "ok"), I then connected
with the 'mysql' client and ran the select from the test case,
then suspended that mysql and did 'mysqladmin proc' and
'mysqladmin kill <ID>', and it crashed the server.
I have set this up on mysqldev@production so you can use the
exact setup I have.
I just did 'bk pull' in my 4.1 tree, and built it with this
command (as user tsmith):
#################################################
cd ~/m/41/a
~/m/build > log 2>&1
make install
#################################################
Then, as user mysqldev, I did:
#################################################
d=/nfstmp1/tsmith/production/bug14851
rm -rf $d
mkdir $d
cd ~tsmith/m/x/41/bk
./bin/mysql_install_db --no-defaults --basedir="$PWD" --datadir="$d"
./libexec/mysqld --no-defaults --basedir="$PWD" \
--datadir="$d" --skip-networking --socket=mysql.sock &
./bin/mysqltest --no-defaults --socket="$d/mysql.sock" \
--user=root test <<'EOF'
drop table if exists t1, t2, t3;
create table t1 (id int primary key);
create table t2 (id int unsigned not null);
let $1 = 4097;
while ($1)
{
eval insert into t1 values ($1);
dec $1;
}
insert into t2 select id from t1;
connect (con1, localhost, root,,);
connection con1;
create table t3 (kill_id int);
insert into t3 values(connection_id());
select id from t1 where id in (select distinct id from t2);
connection default;
select ((@id := kill_id) - kill_id) from t3;
sleep 3;
kill @id;
connection con1;
-- error 2013
reap;
disconnect con1;
connection default;
/* drop table t1, t2, t3; */
EOF
./bin/mysql --no-defaults --user=root \
--socket="$d/mysql.sock" test
#################################################
-- Now at the mysql> prompt...
select id from t1 where id in (select distinct id from t2);
-- Now hit ^Z, and at the shell prompt:
#################################################
./bin/mysqladmin --no-defaults --user=root \
--socket="$d/mysql.sock" proc
./bin/mysqladmin --no-defaults --user=root \
--socket="$d/mysql.sock" kill 4
#################################################
Regards,
[16 Nov 2005 22:21]
Timothy Smith
log of a session showing the exact commands run as mysqldev@production
Attachment: typescript (application/octet-stream, text), 146.27 KiB.
[16 Nov 2005 22:32]
Timothy Smith
By the way, I tried this on 5.0.15 and it does *not* crash there. It seems that only 4.1 is affected. Timothy
[21 Nov 2005 15:24]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/32466
[21 Nov 2005 15:37]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/32468
[15 Jan 2006 20:09]
Konstantin Osipov
Approved with a few comments by email.
[18 Jan 2006 11:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/33363
[18 Jan 2006 11:32]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/33364
[18 Jan 2006 11:49]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/33366
[12 Feb 2006 11:51]
Oleksandr Byelkin
bugfix pushed into 5.1.7, 5.0.19, 4.1.19 Problem was that killing a longruning query with subqueries may lead to a mysql server crash due to incorrect attempt to free resources in case of kill
[14 Feb 2006 2:55]
Paul DuBois
Noted in 4.1.19, 5.0.19, 5.1.7 changelogs.
Killing a long-running query containing a subquery could cause
a server crash. (Bug #14851)

Description: Using data from the user (see private comment for details), the following query is run: SELECT ID, Text FROM vocez WHERE ID IN (SELECT DISTINCT( ValID) FROM atr WHERE AtrID='A00630'); This query never returns. When this query is killed from another session, mysqld seg faults in JOIN::join_free(). How to repeat: I tested this on Linux using a fresh pull from mysql-4.1 tree. I ran it with --no-defaults, so there are no my.cnf settings required. See following private comments for where to get user's data. Back trace is below. Just copy the data files into the test db, run the server, and then issue the above select query. If I issue just the inside select statement, it returns ~5800 values in ~60 seconds. (gdb) run Starting program: /nfstmp1/tsmith/m/41/m/bin/mysqld --no-defaults --basedir=/users/tsmith/m/41/m --datadir=/users/tsmith/m/41/m/data --tmpdir=/users/tsmith/m/41/m/tmp --log-error=/users/tsmith/m/41/m/data/log.err --socket=mysql.sock --port=33410 --server-id=33410 --log-bin --gdb [Thread debugging using libthread_db enabled] [New Thread 1302656 (LWP 3043)] [New Thread 31341488 (LWP 3046)] [New Thread 71216048 (LWP 3047)] [New Thread 50719664 (LWP 3048)] [New Thread 81705904 (LWP 3049)] [New Thread 107908016 (LWP 3050)] [New Thread 92195760 (LWP 3051)] [New Thread 118537136 (LWP 3052)] [New Thread 6314928 (LWP 3053)] [New Thread 6564784 (LWP 16656)] [Thread 6564784 (zombie) exited] [New Thread 6564784 (LWP 30209)] [New Thread 6900656 (LWP 12180)] [Thread 6900656 (zombie) exited] [New Thread 6900656 (LWP 12182)] Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 6564784 (LWP 30209)] JOIN::join_free (this=0x8b31ce8, full=true) at sql_string.h:146 146 if (alloced) (gdb) bt full #0 JOIN::join_free (this=0x8b31ce8, full=true) at sql_string.h:146 tab = (JOIN_TAB *) 0x69137af4 end = (JOIN_TAB *) 0x69137b20 _db_func_ = 0x8aee148 "h\026�\b0\036d" _db_file_ = 0x8af0fd8 "��0\b" _db_level_ = 145956072 _db_framep_ = (char **) 0x6420a8 #1 0x081720a0 in JOIN::cleanup (this=0x8b31ce8) at sql_select.cc:1534 _db_func_ = 0x8aee148 "h\026�\b0\036d" _db_file_ = 0x8af0fd8 "��0\b" _db_level_ = 145678664 _db_framep_ = (char **) 0x6420e8 #2 0x08172086 in JOIN::cleanup (this=0x8aee148) at sql_select.cc:1530 _db_func_ = 0x8aed260 "\b�0\b" _db_file_ = 0x8af0fd8 "��0\b" _db_level_ = 145676072 _db_framep_ = (char **) 0x642138 #3 0x081fb351 in st_select_lex_unit::cleanup (this=0x8aed260) at sql_union.cc:602 error = 0 _db_func_ = 0x82eaf01 "�\003" _db_file_ = 0x8ae8a88 '\217' <repeats 200 times>... _db_level_ = 138038105 _db_framep_ = (char **) 0xec join = (JOIN *) 0x8aee148 sl = (SELECT_LEX *) 0x8aed148 #4 0x0817214f in JOIN::cleanup (this=0x8aed728) at sql_select.cc:1546 lex_unit = (SELECT_LEX_UNIT *) 0x8aed260 _db_func_ = 0x81722d7 "\203�0\205�uw\203�\fS�q���\203�\020\205�ug\213\207L\005" _db_file_ = 0x8aed728 "h\017�\b�\037d" _db_level_ = 145690820 _db_framep_ = (char **) 0x8aed080 #5 0x08172372 in mysql_select (thd=0x8af0eb0, rref_pointer_array=0x8af10c4, tables=0x8aed080, wild_num=0, fields=@0x8af1040, conds=0x8aed5a0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2156153344, result=0x8aed718, unit=0x8af0ef8, select_lex=0x8af0fd8) at sql_select.cc:1633 err = 6562260 free_join = true _db_func_ = 0x6421d8 "" _db_file_ = 0x8af0ec4 "\030\r�\b�ή\b�ή\b " _db_level_ = 8 _db_framep_ = (char **) 0x6421d0 join = (JOIN *) 0x8aed728 #6 0x0816eaeb in handle_select (thd=0x8af0eb0, lex=0x8af0eec, result=0x8aed718) at sql_select.cc:177 res = 145690584 select_lex = (SELECT_LEX *) 0x8af0fd8 _db_func_ = 0x812790d "\203�\020\211C\004\213]���U\211�S\203�\020\213]\b�\003\b�2\b�5L�I\b�4��\203�\020\211C\004\213]���U\211�\203�\f�u\020�u\f\213E\b�p\004�qO" _db_file_ = 0x1 <Address 0x1 out of bounds> _db_level_ = 12 _db_framep_ = (char **) 0xaf0eec #7 0x0814654d in mysql_execute_command (thd=0x8af0eb0) at sql_parse.cc:2096 result = (class select_result *) 0x8aed718 res = -1 lex = (LEX *) 0x8af0eec select_lex = (SELECT_LEX *) 0x8af0fd8 tables = (TABLE_LIST *) 0x8aed6b0 unit = (SELECT_LEX_UNIT *) 0x8af0ef8 _db_func_ = 0x84afac0 "" _db_file_ = 0x642bb0 "�+d" _db_level_ = 1 _db_framep_ = (char **) 0x82f4c89 #8 0x0814b5af in mysql_parse (thd=0x8af0eb0, inBuf=0x8aecef0 "SELECT ID, Text FROM vocez WHERE ID IN (SELECT DISTINCT( ValID) FROM atr WHERE AtrID='A00630')", length=145690348) at sql_parse.cc:4339 lex = (LEX *) 0x8af0eec _db_func_ = 0x8af0eb0 "h�2\b\b-�\b`�I\bx�2\b�&�\b\030\r�\b�ή\b�ή\b " _db_file_ = 0x3 <Address 0x3 out of bounds> _db_level_ = 145690288 _db_framep_ = (char **) 0x642998 #9 0x081450b0 in dispatch_command (command=COM_QUERY, thd=0x8af0eb0, packet=0x8ae8eb9 "", packet_length=95) at sql_parse.cc:1499 packet_end = 0x8aecf4e "" net = (NET *) 0x8af1408 error = false _db_func_ = 0x4274acd7 "\201�\035\203" _db_file_ = 0x4274b090 "]�\215\001�G/" _db_level_ = 6563880 _db_framep_ = (char **) 0x0 #10 0x08144ae2 in do_command (thd=0x8af0eb0) at sql_parse.cc:1312 packet = 0x8ae8eb8 "\001" old_timeout = 30 packet_length = 95 net = (NET *) 0x8af1408 command = COM_QUERY _db_func_ = 0x8125c7e "\203�\020\213]���U\211�VS\213]\b\203�\fS�m" _db_file_ = 0x8af202c "\020�\b" _db_level_ = 8192 _db_framep_ = (char **) 0x1000 #11 0x0814408a in handle_one_connection (arg=0xc0c0c07) at sql_parse.cc:1044 error = 145955588 net = (NET *) 0x8af1408 thd = (class THD *) 0x8af0eb0 launch_time = 202116103 set = {__val = {0 <repeats 32 times>}} #12 0x4274898c in start_thread () from /lib/tls/libpthread.so.0 No symbol table info available. #13 0x426b67da in clone () from /lib/tls/libc.so.6 No symbol table info available. This was just after I did 'kill 2' to kill the sub-select query. Suggested fix: Unknown. Perhaps the crash is related to the fact that the query never finishes.