Bug #31791 MySQL Server hangs on "Query End" state that cannot be killed.
Submitted: 23 Oct 2007 15:57 Modified: 28 Nov 2007 12:28
Reporter: Dipin Hora Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.1.22-rc OS:Linux (Debian Etch)
Assigned to: CPU Architecture:Any
Tags: 5.1.22-rc, hang, query end, server

[23 Oct 2007 15:57] Dipin Hora
Description:
We have a target partitioned table and a source non-partitioned table. We are using Kettle as the ETL tool to process the data. The process runs correctly for a while and then one of the selects against the partitioned table gets stuck in the "Query End" state as follows:

|  2 | etluser | 10.10.20.66:62712 | test | Query   |  224 | query end | SELECT col1, col2, col3, col4, modify_on from table1 where col1 = '?' and col2 = '?' and col3 = '?'|

The select is a simple select that does a lookup based on an index that runs without any issues for over 100,000 rows before the issues surfaces. There are no messages in the mysql.err file or anywhere else. There is no stack trace of anything else as the server doesn't crash but just hangs. There is no IO or CPU activity occurring after it hangs. Attempting to shut down the server results in the server becoming totally unresponsive and the process to be killed to resume operation. Once resumed, the tables are marked as being left open and need to be checked/repaired to correct that issue.

This is a critical issue as we cannot load our data into the server as a result and would not be able to use 5.1 for our non production reporting system as a result. If there is any way for us to provide you more information to diagnose the issue please let us know and we will do what we can to help out.

How to repeat:
We are able to repeat the issue fairly quickly and easily if needed but we are not able to identify a simple test case to repeat the problem with.
[23 Oct 2007 16:17] Sveta Smirnova
Thank you for the report.

Please provide output of SHOW CREATE TABLE for the target partitioned table and the source non-partitioned table. Also provide output of SHOW TABLE STATUS for both tables.
[23 Oct 2007 16:33] Dipin Hora
Create table for the target:

CREATE TABLE `downloads` (
  `transfer_id` varchar(32) NOT NULL,
  `transfer_type` varchar(10) DEFAULT NULL,
  `package_id` varchar(40) DEFAULT NULL,
  `torrent_id` varchar(64) NOT NULL,
  `torrent_rg` varchar(20) DEFAULT NULL,
  `counter` int(11) DEFAULT '0',
  `session_id` varchar(32) NOT NULL,
  `node_id` varchar(40) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  `transfer_role` varchar(8) DEFAULT NULL,
  `storage_proxy_id` int(11) DEFAULT NULL,
  `sp_expiration_time` datetime DEFAULT NULL,
  `start_time` datetime DEFAULT NULL,
  `complete_time` datetime DEFAULT NULL,
  `transfer_bps` bigint(20) DEFAULT NULL,
  `cancel_time` datetime DEFAULT NULL,
  `cancel_reason` varchar(13) DEFAULT NULL,
  `bytes_from_peers` bigint(20) DEFAULT '-1',
  `bytes_from_sps` bigint(20) DEFAULT '-1',
  `bytes_from_dsps` bigint(20) DEFAULT '-1',
  `percent_completed` double DEFAULT NULL,
  `modify_on` datetime DEFAULT NULL,
  `bytes_from_peers_udp` bigint(20) DEFAULT '-1',
  `total_peers` int(11) DEFAULT '-1',
  `total_connections` int(11) DEFAULT '-1',
  `total_connections_udp` int(11) DEFAULT '-1',
  `rss_url_id` bigint(20) DEFAULT NULL,
  `peer_id` varchar(40) DEFAULT NULL,
  `actual_bytes_from_sps` bigint(20) DEFAULT '-1',
  `actual_bytes_from_peers` bigint(20) DEFAULT '-1',
  `actual_bytes_from_peers_udp` bigint(20) DEFAULT '-1',
  `bytes_from_http` bigint(20) DEFAULT '-1',
  `actual_bytes_from_http` bigint(20) DEFAULT '-1',
  `has_http_url` tinyint(1) DEFAULT '0',
  `is_sequential` tinyint(1) DEFAULT '0',
  `referer_url` varchar(1024) DEFAULT NULL,
  `event_type` char(3) DEFAULT NULL,
  `retry_count` int(11) DEFAULT '0',
  `referer_domain` varchar(255) DEFAULT NULL,
  KEY `IX_downloads_1` (`package_id`),
  KEY `IX_downloads_2` (`session_id`),
  KEY `FK_downloads_1` (`torrent_id`),
  KEY `FK_downloads_2` (`torrent_rg`),
  KEY `FK_downloads_3` (`storage_proxy_id`),
  KEY `FK_downloads_4` (`node_id`),
  KEY `FK_downloads_5` (`user_id`),
  KEY `IX_downloads_modify_on` (`modify_on`),
  KEY `IX_downloads_lookup` (`torrent_id`,`node_id`,`transfer_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 /*!50100 PARTITION BY RANGE (to_days(modify_on)) (PARTITION downloads_min VALUES LESS THAN (732950) ENGINE = MyISAM, PARTITION downloads_200610 VALUES LESS THAN (732981) ENGINE = MyISAM, PARTITION downloads_200611 VALUES LESS THAN (733011) ENGINE = MyISAM, PARTITION downloads_200612 VALUES LESS THAN (733042) ENGINE = MyISAM, PARTITION downloads_200701 VALUES LESS THAN (733073) ENGINE = MyISAM, PARTITION downloads_200702 VALUES LESS THAN (733101) ENGINE = MyISAM, PARTITION downloads_200703 VALUES LESS THAN (733132) ENGINE = MyISAM, PARTITION downloads_200704 VALUES LESS THAN (733162) ENGINE = MyISAM, PARTITION downloads_200705 VALUES LESS THAN (733193) ENGINE = MyISAM, PARTITION downloads_200706 VALUES LESS THAN (733223) ENGINE = MyISAM, PARTITION downloads_200707 VALUES LESS THAN (733254) ENGINE = MyISAM, PARTITION downloads_200708 VALUES LESS THAN (733285) ENGINE = MyISAM, PARTITION downloads_200709 VALUES LESS THAN (733315) ENGINE = MyISAM, PARTITION downloads_200710 VALUES LESS THAN (733346) ENGINE = MyISAM, PARTITION downloads_200711 VALUES LESS THAN (733376) ENGINE = MyISAM, PARTITION downloads_200712 VALUES LESS THAN (733407) ENGINE = MyISAM, PARTITION downloads_max VALUES LESS THAN MAXVALUE ENGINE = MyISAM) */

Create table for the source:

CREATE TABLE `downloads_inc` (
  `transfer_id` varchar(32) NOT NULL,
  `transfer_type` varchar(10) DEFAULT NULL,
  `package_id` varchar(40) DEFAULT NULL,
  `torrent_id` varchar(64) NOT NULL,
  `torrent_rg` varchar(20) DEFAULT NULL,
  `counter` int(11) DEFAULT '0',
  `session_id` varchar(32) NOT NULL,
  `node_id` varchar(40) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  `transfer_role` varchar(8) DEFAULT NULL,
  `storage_proxy_id` int(11) DEFAULT NULL,
  `sp_expiration_time` datetime DEFAULT NULL,
  `start_time` datetime DEFAULT NULL,
  `complete_time` datetime DEFAULT NULL,
  `transfer_bps` bigint(20) DEFAULT NULL,
  `cancel_time` datetime DEFAULT NULL,
  `cancel_reason` varchar(13) DEFAULT NULL,
  `bytes_from_peers` bigint(20) DEFAULT '-1',
  `bytes_from_sps` bigint(20) DEFAULT '-1',
  `bytes_from_dsps` bigint(20) DEFAULT '-1',
  `percent_completed` double DEFAULT NULL,
  `modify_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `bytes_from_peers_udp` bigint(20) DEFAULT '-1',
  `total_peers` int(11) DEFAULT '-1',
  `total_connections` int(11) DEFAULT '-1',
  `total_connections_udp` int(11) DEFAULT '-1',
  `rss_url_id` bigint(20) DEFAULT NULL,
  `peer_id` varchar(40) DEFAULT NULL,
  `actual_bytes_from_sps` bigint(20) DEFAULT '-1',
  `actual_bytes_from_peers` bigint(20) DEFAULT '-1',
  `actual_bytes_from_peers_udp` bigint(20) DEFAULT '-1',
  `bytes_from_http` bigint(20) DEFAULT '-1',
  `actual_bytes_from_http` bigint(20) DEFAULT '-1',
  `has_http_url` tinyint(1) DEFAULT '0',
  `is_sequential` tinyint(1) DEFAULT '0',
  `referer_url` varchar(1024) DEFAULT NULL,
  `event_type` char(3) DEFAULT NULL,
  `retry_count` int(11) DEFAULT '0',
  `referer_domain` varchar(255) DEFAULT NULL,
  KEY `IX_downloads_1` (`package_id`),
  KEY `IX_downloads_2` (`session_id`),
  KEY `FK_downloads_1` (`torrent_id`),
  KEY `FK_downloads_2` (`torrent_rg`),
  KEY `FK_downloads_3` (`storage_proxy_id`),
  KEY `FK_downloads_4` (`node_id`),
  KEY `FK_downloads_5` (`user_id`),
  KEY `IX_downloads_modify_on` (`modify_on`),
  KEY `IX_downloads_inc_lookup` (`event_type`,`torrent_id`,`node_id`,`transfer_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

Actual query that hung:

SELECT transfer_bps, bytes_from_peers, bytes_from_sps, bytes_from_dsps, percent_completed, modify_on, bytes_from_peers_udp, total_peers, total_connections, total_connections_udp, actual_bytes_from_sps, actual_bytes_from_peers, actual_bytes_from_peers_udp, bytes_from_http, actual_bytes_from_http, complete_time FROM downloads WHERE transfer_id = '0D8D552BDD6B4EEC85344533AC6CB7DB'  AND torrent_id = '00000000000000000000000099F1C0113CF671B3DC2214F6332B732D9664811B'  AND node_id = 'DC4D17947D114996AB69DA1E71C21483'
[23 Oct 2007 16:34] Dipin Hora
Show table status for the two tables:

mysql> show table status where name in ('downloads', 'downloads_inc');
+---------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name          | Engine | Version | Row_format | Rows   | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+---------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| downloads     | MyISAM |      10 | Dynamic    | 467625 |            415 |   194118568 |               0 |    139538432 |         0 |           NULL | 2007-10-16 19:00:45 | 2007-10-23 11:39:21 | 2007-10-22 18:58:45 | latin1_swedish_ci |     NULL | partitioned    |         | 
| downloads_inc | MyISAM |      10 | Dynamic    | 797775 |            362 |   289394736 | 281474976710655 |    190315520 |         0 |           NULL | 2007-10-16 15:24:48 | 2007-10-22 18:56:22 | 2007-10-22 18:56:22 | latin1_swedish_ci |     NULL |                |         | 
+---------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
2 rows in set (0.00 sec)
[28 Oct 2007 12:28] Valeriy Kravchuk
Please, send the results of:

EXPLAIN SELECT transfer_bps, bytes_from_peers, bytes_from_sps, bytes_from_dsps, percent_completed,
modify_on, bytes_from_peers_udp, total_peers, total_connections, total_connections_udp,
actual_bytes_from_sps, actual_bytes_from_peers, actual_bytes_from_peers_udp,
bytes_from_http, actual_bytes_from_http, complete_time FROM downloads WHERE transfer_id =
'0D8D552BDD6B4EEC85344533AC6CB7DB'  AND torrent_id =
'00000000000000000000000099F1C0113CF671B3DC2214F6332B732D9664811B'  AND node_id =
'DC4D17947D114996AB69DA1E71C21483'\G
[27 Nov 2007 10:50] Bjorn Julander
I seem to have a similar situation, I'm really curios if this is
the same bug bug triggered in a different way? 
I have an alter table of a MERGE table (changing the union to a 
new set), which occasionally hangs in state end infinitely.
Just as in the bug description above CPU and IO activity is zero.

At this state, mysql is usually unresponsive. New connections
are not allowed, a normal stop of the system '/etc/ini.d/mysql stop' will
not stop the server, a hard kill of mysqld is needed. If we had
a previous shell-connect to the database some commands can be issued.
'show processlist' is such a command. 'show status' will not work
and no form of select of any database will work.

I'm currently unable to repeat this in a systematic way in a resonable time, 
although our application does hang on most servers tried after some
time (several days).

Some more details:
We have tested with several versions of mysql in the 5.0.X series, but not
yet in 5.1.X, exact version does not seem to matter. We have for sure this
issue in 5.0.45, 5.0.24 and 5.0.26-pro.
OS is in all cases RedHat Enterprise 4 update 4
Kernel version is: 2.6.9-34.ELsmp
All running in 64-bit mode.
We are running some different HW setups, mainly two dualcore AMDs.

The program basically has a couple of threads that is run from 
java using mysql-connector/J, both versions in 5.0.X and 5.1.x have been 
tested. What they exercise is roughly:
  1) create new table 
  2) alter a merge table to also include new table
  3) input data into new table (using load data local infile)
  4) occasionally reverse by removing a old table by first updating merge table
     and then dropping the now excluded table.

Any clues if this is the same issue triggered in a different way 
or a totally different problem?
[29 Nov 2007 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".