Bug #42174 Select count(*) with subquery consuming multi-gigs of memory, 5.0.67 & 4.1.22
Submitted: 16 Jan 2009 22:46 Modified: 19 Jan 2009 7:06
Reporter: Daniel Smythe Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: General Severity:S2 (Serious)
Version:5.0.67 & 4.1.22 OS:Linux
Assigned to: CPU Architecture:Any
Tags: count, Memory, memory leak, subqueries

[16 Jan 2009 22:46] Daniel Smythe
Description:
Executing a SELECT count(*) from joined tables with a subquery causes extreme memory consumption, upwards of 10GB per query.

We've come across this recently in both MySQL versions 5.0.67 and 4.1.22 where a simple SELECT count(*) with a subquery consumes ~10GB of RAM on tables that are extremely small.

The Query which causes the extreme memory consumption is:

SELECT
  count(*)
FROM
  pligg_pageviews, pligg_links
WHERE
  `pv_type`='story'
AND
  link_id IN(
    SELECT DISTINCT
      link_id
    FROM
      pligg_links
    WHERE
      link_id = pv_page_id);

When running this query on a slave server which is relatively stable, I loop through the memory usage at 1 second intervals while the query executes:

[root@slaveMySQLserver ~]# while [ 1 ] ; do ps auwx | grep 19920 | grep -v grep | awk '{print $6}' ; sleep 1 ; done
2685792
2685792 <--- Statically consuming 2.6 G
2685792
2685792
2685792
2685792
2685792
2685792
2685792
2963008 <--- Query begins
3274536
3586984
3897832
4209776
4522900
4835444
5147948
5460068
5773800
6088656
6402508
6717052
7031000
7344664
7658304
7973936
8287788
8601960
8916628
9231536 <--- Query finishes executing @ 9G

When finished, there are times when all the memory is not de-allocated also. We're experiencing MySQL instances consuming upwards of 22GB Virtual memory due to this issue and finally taking our servers down. Any combination of FLUSH commands does not release the memory as well, and we must resort to restarting the instance if we catch it in time. It's even worse when multiple of these queries execute simultaneously.

We do receive the expected output, However the memory consumption while it runs is what the issue is. It can't be right.

How to repeat:
 Table: pligg_links
Create Table: CREATE TABLE `pligg_links` (
  `link_id` int(20) NOT NULL auto_increment,
  `link_author` int(20) NOT NULL default '0',
  `link_status` enum('discard','queued','published','abuse','duplicated') NOT NULL default 'discard',
  `link_randkey` int(20) NOT NULL default '0',
  `link_votes` int(20) NOT NULL default '0',
  `link_reports` int(20) NOT NULL default '0',
  `link_comments` int(20) NOT NULL default '0',
  `link_karma` decimal(10,2) NOT NULL default '0.00',
  `link_modified` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `link_date` timestamp NOT NULL default '0000-00-00 00:00:00',
  `link_published_date` timestamp NOT NULL default '0000-00-00 00:00:00',
  `link_category` int(11) NOT NULL default '0',
  `link_lang` int(11) NOT NULL default '1',
  `link_url` varchar(200) NOT NULL default '',
  `link_url_title` text,
  `link_title` text NOT NULL,
  `link_title_url` varchar(255) default NULL,
  `link_content` text NOT NULL,
  `link_summary` text,
  `link_tags` text,
  `link_field1` varchar(255) NOT NULL default '',
  `link_field2` varchar(255) NOT NULL default '',
  `link_field3` varchar(255) NOT NULL default '',
  `link_field4` varchar(255) NOT NULL default '',
  `link_field5` varchar(255) NOT NULL default '',
  `link_field6` varchar(255) NOT NULL default '',
  `link_field7` varchar(255) NOT NULL default '',
  `link_field8` varchar(255) NOT NULL default '',
  `link_field9` varchar(255) NOT NULL default '',
  `link_field10` varchar(255) NOT NULL default '',
  `link_field11` varchar(255) NOT NULL default '',
  `link_field12` varchar(255) NOT NULL default '',
  `link_field13` varchar(255) NOT NULL default '',
  `link_field14` varchar(255) NOT NULL default '',
  `link_field15` varchar(255) NOT NULL default '',
  PRIMARY KEY  (`link_id`),
  KEY `link_author` (`link_author`),
  KEY `link_url` (`link_url`),
  KEY `link_date` (`link_date`),
  KEY `link_published_date` (`link_published_date`),
  FULLTEXT KEY `link_url_2` (`link_url`,`link_url_title`,`link_title`,`link_content`,`link_tags`),
  FULLTEXT KEY `link_tags` (`link_tags`),
  FULLTEXT KEY `link_search` (`link_title`,`link_content`,`link_tags`)
) ENGINE=MyISAM AUTO_INCREMENT=1481 DEFAULT CHARSET=utf8
1 row in set (0.04 sec)

mysql> SHOW CREATE TABLE pligg_pageviews\G
*************************** 1. row ***************************
       Table: pligg_pageviews
Create Table: CREATE TABLE `pligg_pageviews` (
  `pv_id` int(10) unsigned NOT NULL auto_increment,
  `pv_type` enum('story','out','profile') NOT NULL default 'story',
  `pv_page_id` int(11) NOT NULL default '0',
  `pv_datetime` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `pv_user_id` int(11) NOT NULL default '0',
  `pv_user_ip` varchar(64) default '0',
  PRIMARY KEY  (`pv_id`),
  KEY `pv_type` (`pv_type`,`pv_page_id`,`pv_user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=26263 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> select count(*) from pligg_pageviews;
+----------+
| count(*) |
+----------+
| 26253 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from pligg_links;
+----------+
| count(*) |
+----------+
| 1480 |
+----------+
1 row in set (0.00 sec)

mysql> SHOW TABLE STATUS LIKE 'pligg_links'\G
*************************** 1. row ***************************
           Name: pligg_links
         Engine: MyISAM
        Version: 9
     Row_format: Dynamic
           Rows: 1480
 Avg_row_length: 1364
    Data_length: 2019200
Max_data_length: 4294967295
   Index_length: 3322880
      Data_free: 0
 Auto_increment: 1481
    Create_time: 2008-10-22 02:29:21
    Update_time: 2009-01-16 09:18:32
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

mysql> SHOW TABLE STATUS LIKE 'pligg_pageviews'\G
*************************** 1. row ***************************
           Name: pligg_pageviews
         Engine: MyISAM
        Version: 9
     Row_format: Dynamic
           Rows: 26262
 Avg_row_length: 33
    Data_length: 874184
Max_data_length: 4294967295
   Index_length: 630784
      Data_free: 0
 Auto_increment: 26263
    Create_time: 2008-10-22 02:29:21
    Update_time: 2009-01-16 10:40:12
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

Suggested fix:

The query itself CAN be shortened:

mysql> SELECT count(*) FROM pligg_pageviews WHERE pv_type = 'story';

+----------+
| count(*) |
+----------+
|    11329 |
+----------+
1 row in set (0.05 sec)

And we have informed the developers of Pligg and the module Page Statistics v 0.2 of the issue and urged them to change the code as soon as possible - however it sure does seem to us that this is poor use of memory and should not be consuming multiple gigabytes of memory during execution.
[17 Jan 2009 13:48] Valeriy Kravchuk
Thank you for a problem report. Please, send the results of:

EXPLAIN SELECT
  count(*)
FROM
  pligg_pageviews, pligg_links
WHERE
  `pv_type`='story'
AND
  link_id IN(
    SELECT DISTINCT
      link_id
    FROM
      pligg_links
    WHERE
      link_id = pv_page_id)\G

from your environment.
[19 Jan 2009 2:16] Daniel Smythe
Here's the Explain output. It is a considerable number of rows, however still seems out of place to me:

mysql> EXPLAIN SELECT count(*) FROM pligg_pageviews,pligg_links WHERE `pv_type`='story' AND link_id IN(SELECT DISTINCT link_id FROM pligg_links WHERE link_id=pv_page_id);
+----+--------------------+-----------------+--------+---------------+---------+---------+------------------------------------+-------+-------------------------------------------+
| id | select_type        | table           | type   | possible_keys | key     | key_len | ref                                | rows  | Extra                                     |
+----+--------------------+-----------------+--------+---------------+---------+---------+------------------------------------+-------+-------------------------------------------+
|  1 | PRIMARY            | pligg_pageviews | ref    | pv_type       | pv_type |       1 | const                              | 11359 | Using where; Using index                  |
|  1 | PRIMARY            | pligg_links     | index  | NULL          | PRIMARY |       4 | NULL                               |  1480 | Using where; Using index                  |
|  2 | DEPENDENT SUBQUERY | pligg_links     | eq_ref | PRIMARY       | PRIMARY |       4 | newsurl.pligg_pageviews.pv_page_id |     1 | Using where; Using index; Using temporary |
+----+--------------------+-----------------+--------+---------------+---------+---------+------------------------------------+-------+-------------------------------------------+
3 rows in set (0.00 sec)

mysql> select 11357*1480;
+------------+
| 11357*1480 |
+------------+
|   16808360 |
+------------+
1 row in set (0.00 sec)

We're still only talking about 3MB of data or so.

mysql> show table status like 'pligg_links';
+-------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| Name        | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation       | Checksum | Create_options | Comment |
+-------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| pligg_links | MyISAM |       9 | Dynamic    | 1480 |           1364 |     2019200 |      4294967295 |      3322880 |         0 |           1481 | 2008-10-22 02:29:21 | 2009-01-16 09:18:32 | NULL       | utf8_general_ci |     NULL |                |         |
+-------------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
1 row in set (0.00 sec)

mysql> show table status like 'pligg_pageviews';
+-----------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| Name            | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation       | Checksum | Create_options | Comment |
+-----------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
| pligg_pageviews | MyISAM |       9 | Dynamic    | 26262 |             33 |      874184 |      4294967295 |       630784 |         0 |          26263 | 2008-10-22 02:29:21 | 2009-01-16 10:40:12 | NULL       | utf8_general_ci |     NULL |                |         |
+-----------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+
1 row in set (0.00 sec)
[19 Jan 2009 2:22] Daniel Smythe
And also, the specific binaries we are using - perhaps using x86_64 binaries is part of the issue:

mysql-max-4.1.22-unknown-linux-gnu-x86_64-glibc23
mysql-5.0.67-linux-x86_64-glibc23
[19 Jan 2009 7:06] Sveta Smirnova
This is duplicate of bug #42037