Bug #27119 server crash with integer division by zero during filesort on huge result
Submitted: 14 Mar 2007 8:41 Modified: 27 May 2007 9:38
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S1 (Critical)
Version:5.0.38BK OS:Microsoft Windows (windows)
Assigned to: Damien Katz CPU Architecture:Any
Tags: bfsm_2007_04_05, crash, filesort

[14 Mar 2007 8:41] Shane Bester
Description:
A select causes a server crash due to division by zero, when the table contains a few billion records.

The query is:

create table `t2` as SELECT SQL_BIG_RESULT c1, c2, count(c3) as overlap FROM t1 GROUP BY c1, c2 HAVING count(c3) >=5;

The stack trace contained the following functions:

merge_buffers()
merge_many_buff()
filesort()
create_sort_index()
JOIN::exec()
mysql_select()
handle_select()
mysql_execute_command()
mysql_parse()
dispatch_command()
do_command()
handle_one_connection()
pthread_start()
_callthreadstart()
_threadstart()

See the attached files for more debugging information..

How to repeat:
##warning: large testcase. need huge tmpdir and datadir.
##be sure myisam_max_sort_file_size is large (500G)
##be sure myisam_sort_buffer_size is large (300M)
##be sure sort_buffer_size is large (300M).

drop table if exists `t1`;
create table `t1` (`c1` tinyint unsigned not null,`c2` tinyint unsigned not null,`c3` tinyint unsigned not null, key `c1` (`c1`,`c2`,`c3`)) engine=myisam pack_keys=1 default charset=utf8;
alter table `t1` disable keys;
insert into t1 values (1,2,3),(4,5,6),(6,7,8),(1,4,7);
insert into t1 select * from t1; #8
insert into t1 select * from t1; #16
insert into t1 select * from t1; #32
insert into t1 select * from t1; #64
insert into t1 select * from t1; #128
insert into t1 select * from t1; #256
insert into t1 select * from t1; #512
insert into t1 select * from t1; #1024
insert into t1 select * from t1; #2048
insert into t1 select * from t1; #4096
insert into t1 select * from t1; #8172
insert into t1 select * from t1; #16384
insert into t1 select * from t1; #32768
insert into t1 select * from t1; #65536
insert into t1 select * from t1; #131072
insert into t1 select * from t1; #262144
insert into t1 select * from t1; #524288
insert into t1 select * from t1; #1048576
insert into t1 select * from t1; #2097152
insert into t1 select * from t1; #4194304
insert into t1 select * from t1; #8388608
insert into t1 select * from t1; #16777216
insert into t1 select * from t1; #33554432
insert into t1 select * from t1; #67108864
insert into t1 select * from t1; #134217728
insert into t1 select * from t1; #268435456
insert into t1 select * from t1; #536870912
insert into t1 select * from t1; #1073741824
insert into t1 select * from t1; #2147483648
insert into t1 select * from t1 limit 2147483646; #4294967294
alter table t1 enable keys;
drop table if exists `t2`;
create table `t2` as SELECT SQL_BIG_RESULT c1, c2, count(c3) as overlap FROM t1 GROUP BY c1, c2 HAVING count(c3) >=5;  #crashes after ~45 minutes..

Suggested fix:
.
[14 Mar 2007 8:43] Shane Bester
some debugger outputs from 5.0.38

Attachment: bug27119_debugger_info_5.0.38_windows.txt (text/plain), 10.65 KiB.

[15 Mar 2007 14:59] Shane Bester
i couldn't repeat a crash on linux 5.0.38, so maybe it's windows specific, again:

mysql> create table `t2` as SELECT SQL_BIG_RESULT c1, c2, count(c3) as overlap FROM t1
    -> GROUP BY c1, c2 HAVING count(c3) >=5;
Query OK, 4 rows affected (14 hours 2 min 34.14 sec)
Records: 4  Duplicates: 0  Warnings: 0
[12 Apr 2007 20:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/24425

ChangeSet@1.2437, 2007-04-12 16:01:52-04:00, dkatz@damien-katzs-computer.local +2 -0
  Bug #27119  	server crash with integer division by zero during filesort on huge result
  
  Fixed by changing an index variable from a signed int to unsigned int (uint).
[12 Apr 2007 20:23] Damien Katz
I've been unable to reproduce this bug myself, but from the stack traces and
dumps, it was easy to find the culprit. I'm going to see if Shane can apply my patches and varify it fixes the problem.

Because of the difficulty and length of time required to reproduce this (many hours) we won't be adding an explicit test for it.
[19 Apr 2007 18:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/24932

ChangeSet@1.2436, 2007-04-19 14:52:09-04:00, dkatz@damien-katzs-computer.local +2 -0
  Polishing: note added.
  ---
  Bug #27119  	server crash with integer division by zero during filesort on huge result
  
  Fixed by changing an index variable from a signed int to unsigned int (uint).
[19 Apr 2007 19:13] Damien Katz
While I've found the code flaw, I've not been able to reproduce this bug. I let my test machine run for week before giving up (it appeared to spend most all of its time in "alter table t1 enable keys;" statement). Reviewers keep this in mind.

Because of the extremely long times to reproduce the bug (> 1 week), I wont be submitting any test case.
[30 Apr 2007 20:14] Damien Katz
The previous commit does fix potential bugs, however it's not clear it fixes this bug.

Based on the stack trace provided, the value of *maxbuffer must have been a number N where: (0x07FFFFFF & N) == 0x07FFFFFF. But those values should be illegal, so guarding against them. Need to figure out how its arriving at that value.

I'm investigating ways to duplicate the bug to answer these questions.
[1 May 2007 19:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25843

ChangeSet@1.2436, 2007-05-01 15:03:47-04:00, dkatz@damien-katzs-computer.local +2 -0
  Bug #27119  	server crash with integer division by zero during filesort on huge result
[16 May 2007 20:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26877

ChangeSet@1.2485, 2007-05-16 16:14:13-04:00, dkatz@damien-katzs-computer.local +2 -0
  Bug #27119  	server crash with integer division by zero during filesort on huge result
  
  Added checks to detect integer overflow and fixed other bugs on the error path.
[17 May 2007 21:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26932

ChangeSet@1.2487, 2007-05-17 17:54:31-04:00, dkatz@damien-katzs-computer.local +1 -0
  Bug #27119  	server crash with integer division by zero during filesort on huge result
  
  Fixed a compiler warning on platforms where uint != ulong from the first pushed fix.
[18 May 2007 0:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26936

ChangeSet@1.2490, 2007-05-17 20:45:33-04:00, dkatz@damien-katzs-computer.local +1 -0
  Bug #27119  	server crash with integer division by zero during filesort on huge result
  
  Fixed a problem and compiler warning on 64bit platforms so that they only allocated UINT_MAX number of BUFFPEKS.
[22 May 2007 17:01] Bugs System
Pushed into 5.1.19-beta
[22 May 2007 17:03] Bugs System
Pushed into 5.0.44
[27 May 2007 9:38] Paul Dubois
Noted in 5.0.44, 5.1.19 changelogs.

A large filesort could result in a division by zero error and a
server crash.