Bug #26943 64bit mysql 5 hangs on removing duplicate after query distinct,group by,order by
Submitted: 7 Mar 2007 23:57 Modified: 21 Nov 2009 15:18
Reporter: Vic Parat Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.27, 5.0.22, 5.0.68 OS:Linux (linux(centos 4.4 server))
Assigned to: CPU Architecture:Any

[7 Mar 2007 23:57] Vic Parat
Description:
After running the following query the 64 bit version of mysql5 hangs on "Removing Duplicates" state in show processlist:(php5)
	  $qry_displaycustomerproducts = "SELECT DISTINCTROW
	  ".TABLE_ORDERITEMS.".var_prodno_items FROM ".TABLE_ORDERITEMS."  GROUP BY ".TABLE_ORDERITEMS.".var_prodno_items,".TABLE_ORDERITEMS.".int_custno_items having ".TABLE_ORDERITEMS.".var_prodno_items like 'lb%' and  ".TABLE_ORDERITEMS.".int_custno_items='".$int_custno."' order by ".TABLE_ORDERITEMS.".var_prodno_items desc";

This query works fine on same version of mysql but 32bit.
None working machine:
Connection id:          459861
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.22-log
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /var/lib/mysql/mysql.sock

show processlist\G (truncated)
     Id: 458290
   User: traindata
   Host: localhost
     db: lumin_train_db
Command: Query
   Time: 17
  State: Removing duplicates
   Info: SELECT DISTINCT
          tbl_orderitems.var_prodno_items FROM tbl_orderitems  GROUP BY tbl_orderitems.var_

working machine:
Connection id:          134
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.22-Debian_0ubuntu6.06.2
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1

How to repeat:
	  $qry_displaycustomerproducts = "SELECT DISTINCTROW
	  ".TABLE_ORDERITEMS.".var_prodno_items FROM ".TABLE_ORDERITEMS."   where ".TABLE_ORDERITEMS.".var_prodno_items like 'lb%' and  ".TABLE_ORDERITEMS.".int_custno_items='".$int_custno."' order by ".TABLE_ORDERITEMS.".var_prodno_items desc";

Suggested fix:
Fixed by changing query to:
	  $qry_displaycustomerproducts = "SELECT DISTINCTROW
	  ".TABLE_ORDERITEMS.".var_prodno_items FROM ".TABLE_ORDERITEMS."   where ".TABLE_ORDERITEMS.".var_prodno_items like 'lb%' and  ".TABLE_ORDERITEMS.".int_custno_items='".$int_custno."' order by ".TABLE_ORDERITEMS.".var_prodno_items desc";

Now works in both 64bit and 32bit.
[8 Mar 2007 19:32] Valeriy Kravchuk
Thank you for a problem report. Please, try to repeat with a newer version, 5.0.27 at least. In case of the same problem with your original query, please, send the results of SHOW FULL PROCESSLIST when your query hangs.
[8 Mar 2007 21:08] Vic Parat
As you requested:

mysql> status
--------------
mysql  Ver 14.12 Distrib 5.0.27, for redhat-linux-gnu (x86_64) using readline 5.0

Connection id:          35
Current database:       lumin_train_db
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.27-log
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /var/lib/mysql/mysql.sock

mysql> show full processlist\G
*************************** 1. row ***************************
     Id: 1
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 315460
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 2
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 8
  State: Has read all relay log; waiting for the slave I/O thread to update it
   Info: NULL
*************************** 3. row ***************************
     Id: 4
   User: replication_user
   Host: 72.52.154.136:48676
     db: NULL
Command: Binlog Dump
   Time: 315408
  State: Has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL
*************************** 4. row ***************************
     Id: 54
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: show full processlist
*************************** 5. row ***************************
     Id: 55
   User: traindata
   Host: localhost
     db: lumin_train_db
Command: Query
   Time: 20
  State: Removing duplicates
   Info: SELECT DISTINCTROW
         tbl_orderitems.var_prodno_items FROM tbl_orderitems
GROUP BY
tbl_orderitems.var_prodno_items,tbl_orderitems.int_custno_items
having tbl_orderitems.var_prodno_items like 'lb%' and
tbl_orderitems.int_custno_items='13352' order by
tbl_orderitems.var_prodno_items desc
5 rows in set (0.00 sec)
[9 Mar 2007 6:45] Valeriy Kravchuk
Please, send the results of SHOW CREATE TABLE and SHOW TABLE STATUS for that tbl_orderitems table. How many rows are returned?
[21 Mar 2007 13:48] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.37. In case of the same results, please, send your my.cnf file content and the results of "free" Linux command.
[21 Apr 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[8 Jun 2009 22:23] Zack H
I believe I have encountered this same issue. I am running version 5.0.68 on Fedora OS. I have a query that ran fine daily for over 2 years and then hung on the "removing duplicates" state for 1 week before I noticed. The same query hangs when run on my master or multiple slaves. 

Query: SELECT DISTINCT member_id FROM wish_list WHERE STATUS IN ("active","active_vacation","hold") GROUP BY member_id, rank HAVING COUNT(*)>1;

*** The query returns 146 rows in 11 secs without the "DISTINCT", every time

The create table:

CREATE TABLE `wish_list` (
  `id` mediumint(8) NOT NULL auto_increment,
  `member_id` int(8) NOT NULL default '0',
  `company_id` int(10) unsigned NOT NULL default '0',
  `status` enum('active','deleted','pending','completed','converted_to_reminder','active_vacation','frozen','hold','merged_on_amg_release','merged_after_amg_release') NOT NULL default 'active',
  `attributes` tinyint(3) unsigned NOT NULL default '1',
  `time_add` int(11) NOT NULL default '0',
  `time_status_change` int(11) NOT NULL default '0',
  `auto_order` enum('yes','no') NOT NULL default 'no',
  `rank` decimal(13,5) NOT NULL,
  `update_timestamp` timestamp NOT NULL default '0000-00-00 00:00:00' on update CURRENT_TIMESTAMP,
  PRIMARY KEY  (`id`),
  KEY `status` (`status`),
  KEY `nbs_id_status` (`nbs_id`,`status`),
  KEY `time_add` (`time_add`),
  KEY `member_id` (`member_id`,`status`)
) ENGINE=MyISAM AUTO_INCREMENT=877142 DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC
[9 Jun 2009 8:54] Valeriy Kravchuk
Please, send the results of

EXPLAIN SELECT DISTINCT member_id FROM wish_list WHERE STATUS IN
("active","active_vacation","hold") GROUP BY member_id, rank HAVING COUNT(*)>1;

and

SHOW TABLE STATUS LIKE 'wish_list'\G
[31 Jul 2009 6:55] Valeriy Kravchuk
Zack,

I see this in your explain results:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	wish_list	ALL	status	\N	\N	\N	716846	Using where; Using temporary; Using filesort

So, table is read entirely and, without DISTINCT, it is fast enough. Slowdown may be related only to temporary tables or filesort part. Please, check if you have enough space on disk for temporary tables (df -k) and send the results of:

show global variables;

statement and

free

Linux command.
[31 Aug 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[7 Oct 2009 9:33] Sveta Smirnova
Zack,

thank you for the feedback.

I can not create table with definition provided: get error "ERROR 1072 (42000): Key column 'nbs_id' doesn't exist in table" Additionally I can not repeat described behavior. Please upgrade to current version 5.0.86, try with it and if problem still exists provide correct definition of the table.
[21 Oct 2009 15:18] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.86 or 5.1.39, and inform about the results.
[22 Nov 2009 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".