Bug #37411 Threads staying in the "Sending Data" for long periods of time
Submitted: 14 Jun 2008 19:21 Modified: 19 Nov 2008 18:45
Reporter: Robin McMillon Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.0.51a OS:Solaris
Assigned to: Assigned Account CPU Architecture:Any

[14 Jun 2008 19:21] Robin McMillon
Description:
Using the pre-built binary downloaded from MySQL on a Sun x4450.

version= 5.0.51a-log 
version_comment=MySQL Community Server (GPL)                            
version_compile_machine = x86_64                                               
version_compile_os = pc-solaris2.10                                      

Description:

MySQL has serious periodic slowdowns with queries spending a long time in the "Sending Data" state.  These same queries are issued regularly 24/7 but we only see problems periodically.   This has happened multiple times in the past two days.  Nothing runs on this machine but MySQL.  This information is from June 13, 2008.

Queries involved are of the type:  (check show processlist output for others)

(1)
select page0_.CONTENTID as CONTENTID, page0_.CHILD_POSITION as CHILD_P13_, page0_.PARENTID as PARENTID, page0_.SPACEID as SPACEID, page0_.TITLE as TITLE, page0_.VERSION as VERSION, page0_.CREATOR as CREATOR, page0_.CREATIONDATE as CREATION6_, page0_.LASTMODIFIER as LASTMODI7_, page0_.LASTMODDATE as LASTMODD8_, page0_.VERSIONCOMMENT as VERSIONC9_, page0_.PREVVER as PREVVER, page0_.CONTENT_STATUS as CONTENT11_ from CONTENT page0_, SPACES space1_ where page0_.CONTENTTYPE='PAGE' and ((lower(space1_.SPACEKEY)='XXXX'  and page0_.SPACEID=space1_.SPACEID)and(lower(page0_.TITLE)='XXXX' )and(page0_.PREVVER is null )and(page0_.CONTENT_STATUS='XXXX' ));

(2)
select page0_.CONTENTID as CONTENTID0_, comments1_.CONTENTID as CONTENTID1_, children2_.CONTENTID as CONTENTID2_, page0_.CHILD_POSITION as CHILD_P13_0_, page0_.PARENTID as PARENTID0_, page0_.SPACEID as SPACEID0_, page0_.TITLE as TITLE0_, page0_.VERSION as VERSION0_, page0_.CREATOR as CREATOR0_, page0_.CREATIONDATE as CREATION6_0_, page0_.LASTMODIFIER as LASTMODI7_0_, page0_.LASTMODDATE as LASTMODD8_0_, page0_.VERSIONCOMMENT as VERSIONC9_0_, page0_.PREVVER as PREVVER0_, page0_.CONTENT_STATUS as CONTENT11_0_, comments1_.PAGEID as PAGEID1_, comments1_.PARENTCOMMENTID as PARENTC21_1_, comments1_.TITLE as TITLE1_, comments1_.VERSION as VERSION1_, comments1_.CREATOR as CREATOR1_, comments1_.CREATIONDATE as CREATION6_1_, comments1_.LASTMODIFIER as LASTMODI7_1_, comments1_.LASTMODDATE as LASTMODD8_1_, comments1_.VERSIONCOMMENT as VERSIONC9_1_, comments1_.PREVVER as PREVVER1_, comments1_.CONTENT_STATUS as CONTENT11_1_, children2_.PAGEID as PAGEID2_, children2_.PARENTCOMMENTID as PARENTC21_2_, children2_.TITLE as TITLE2_, children2_.VERSION as VERSION2_, children2_.CREATOR as CREATOR2_, children2_.CREATIONDATE as CREATION6_2_, children2_.LASTMODIFIER as LASTMODI7_2_, children2_.LASTMODDATE as LASTMODD8_2_, children2_.VERSIONCOMMENT as VERSIONC9_2_, children2_.PREVVER as PREVVER2_, children2_.CONTENT_STATUS as CONTENT11_2_, children2_.PARENTCOMMENTID as PARENTC21___, children2_.CONTENTID as CONTENTID__ from CONTENT page0_ left outer join CONTENT comments1_ on page0_.CONTENTID=comments1_.PAGEID left outer join CONTENT children2_ on comments1_.CONTENTID=children2_.PARENTCOMMENTID, SPACES space3_ where page0_.CONTENTTYPE='XXXX' and ((lower(space3_.SPACEKEY)='XXXX'  and page0_.SPACEID=space3_.SPACEID)and(lower(page0_.TITLE)='XXXX' )and(page0_.PREVVER is null )and(page0_.CONTENT_STATUS='XXXX' )) order by children2_.CREATIONDATE;

(3)
select contentent0_.CONTENTID as x0_0_, contentent0_.VERSION as x1_0_, contentent0_.LASTMODIFIER as x2_0_, contentent0_.LASTMODDATE as x3_0_, contentent0_.VERSIONCOMMENT as x4_0_ from CONTENT contentent0_ where (contentent0_.PREVVER=XXXX )or(contentent0_.CONTENTID=XXXX ) order by  contentent0_.VERSION desc;

(4)
select defaulthib0_.id as id, defaulthib0_.name as name, defaulthib0_.password as password, defaulthib0_.email as email, defaulthib0_.created as created, defaulthib0_.fullname as fullname from users defaulthib0_ where (defaulthib0_.name='XXXX' );

(5)
select localmembe0_.groupid as groupid__, localmembe0_.userid as userid__, defaulthib1_.id as id0_, defaulthib1_.name as name0_, defaulthib1_.password as password0_, defaulthib1_.email as email0_, defaulthib1_.created as created0_, defaulthib1_.fullname as fullname0_ from local_members localmembe0_ inner join users defaulthib1_ on localmembe0_.userid=defaulthib1_.id where localmembe0_.groupid=XXXX

The affected table structures are:

CREATE TABLE `CONTENT` (
  `CONTENTID` bigint(20) NOT NULL default '0',
  `CONTENTTYPE` varchar(255) NOT NULL default '',
  `TITLE` varchar(255) default NULL,
  `VERSION` int(11) default NULL,
  `CREATOR` varchar(255) default NULL,
  `CREATIONDATE` datetime default NULL,
  `LASTMODIFIER` varchar(255) default NULL,
  `LASTMODDATE` datetime default NULL,
  `VERSIONCOMMENT` mediumtext,
  `PREVVER` bigint(20) default NULL,
  `CONTENT_STATUS` varchar(255) default NULL,
  `SPACEID` bigint(20) default NULL,
  `PARENTID` bigint(20) default NULL,
  `MESSAGEID` varchar(255) default NULL,
  `PAGEID` bigint(20) default NULL,
  `PARENTCOMMENTID` bigint(20) default NULL,
  `USERNAME` varchar(255) default NULL,
  `DRAFTPAGEID` varchar(255) default NULL,
  `DRAFTSPACEKEY` varchar(255) default NULL,
  `DRAFTTYPE` varchar(255) default NULL,
  `DRAFTPAGEVERSION` int(11) default NULL,
  `CHILD_POSITION` int(11) default NULL,
  PRIMARY KEY  (`CONTENTID`),
  KEY `c_parentcommid_idx` (`PARENTCOMMENTID`),
  KEY `c_title_idx` (`TITLE`),
  KEY `c_spaceid_idx` (`SPACEID`),
  KEY `c_parentid_idx` (`PARENTID`),
  KEY `c_username_idx` (`USERNAME`),
  KEY `c_prevver_idx` (`PREVVER`),
  KEY `c_messageid_idx` (`MESSAGEID`),
  KEY `c_contenttype_idx` (`CONTENTTYPE`),
  KEY `c_pageid_idx` (`PAGEID`),
  CONSTRAINT `FK6382C05917D4A070` FOREIGN KEY (`PREVVER`) REFERENCES `CONTENT` (`CONTENTID`),
  CONSTRAINT `FK6382C05974B18345` FOREIGN KEY (`PARENTID`) REFERENCES `CONTENT` (`CONTENTID`),
  CONSTRAINT `FK6382C0598C38FBEA` FOREIGN KEY (`PAGEID`) REFERENCES `CONTENT` (`CONTENTID`),
  CONSTRAINT `FK6382C059B2DC6081` FOREIGN KEY (`SPACEID`) REFERENCES `SPACES` (`SPACEID`),
  CONSTRAINT `FK6382C059B97E9230` FOREIGN KEY (`PARENTCOMMENTID`) REFERENCES `CONTENT` (`CONTENTID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `SPACES` (
  `SPACEID` bigint(20) NOT NULL default '0',
  `SPACENAME` varchar(255) default NULL,
  `SPACEKEY` varchar(255) NOT NULL default '',
  `SPACEDESCID` bigint(20) default NULL,
  `HOMEPAGE` bigint(20) default NULL,
  `CREATOR` varchar(255) default NULL,
  `CREATIONDATE` datetime default NULL,
  `LASTMODIFIER` varchar(255) default NULL,
  `LASTMODDATE` datetime default NULL,
  `SPACETYPE` varchar(255) default NULL,
  `SPACEGROUPID` bigint(20) default NULL,
  PRIMARY KEY  (`SPACEID`),
  UNIQUE KEY `SPACEKEY` (`SPACEKEY`),
  KEY `s_spacedescid_idx` (`SPACEDESCID`),
  KEY `s_spacegroupid_idx` (`SPACEGROUPID`),
  KEY `s_homepage_idx` (`HOMEPAGE`),
  CONSTRAINT `FK9228242D11B7BFEE` FOREIGN KEY (`HOMEPAGE`) REFERENCES `CONTENT` (`CONTENTID`),
  CONSTRAINT `FK9228242D16994414` FOREIGN KEY (`SPACEGROUPID`) REFERENCES `SPACEGROUPS` (`SPACEGROUPID`),
  CONSTRAINT `FK9228242D2C72D3D2` FOREIGN KEY (`SPACEDESCID`) REFERENCES `CONTENT` (`CONTENTID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `users` (
  `id` bigint(20) NOT NULL default '0',
  `name` varchar(255) NOT NULL default '',
  `password` varchar(255) default NULL,
  `email` varchar(255) default NULL,
  `created` datetime default NULL,
  `fullname` varchar(255) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Possibly related:
We have also found that if we set innodb_thread_concurrency > 1, we periodically get a backlog which causes load on the machine to rapidly skyrocket (go from < 0.5 to > 30 in a couple of minutes) and MySQL to use >95% of a CPU.  Queries are *extremely* slow during these time periods.  The problem is rapidly fixed by dropping innodb_thread_concurrency = 1.

More information will be attached in a few minutes.

                                   

How to repeat:
It's periodic but I'm not sure what triggers the server to slow down or those queries to cause problems periodically as they are being run constantly.  We have been unable to repeat it under a test load.
[15 Jun 2008 10:36] Valeriy Kravchuk
Thank you for a problem report. How many CPUs/cores do you have?

Please, send also the EXPLAIN results for the queries you mentioned. Looks like you already included all the other information in the .rtf file attached.
[16 Jun 2008 15:50] Robin McMillon
CPUs/cores: We have 4 dual-core Intel Xeon 2.93GHz "Tigerton" CPUs

I will attach the EXPLAIN results as a file since they are too long for the comment field.
[16 Jun 2008 15:51] Robin McMillon
EXPLAIN results for queries mentioned in original report

Attachment: mysql_bug_report_attachment2.rtf (application/rtf, text), 9.10 KiB.

[17 Jun 2008 17:29] Heikki Tuuri
Vasil,

can you look at this performance problem? This could be 'thread thrashing'.

'Sending data' means that rows are read from one table to further processing. Under congestion, I think that state is a very common thread state.

Regards,

Heikki
[28 Jul 2008 15:00] Knut Hellebø
I am seeing identical behaviour on a Solaris 8 system (V440). This happens when running FLEXnet Manager from Acresso along with MySQL. I currently run with innodb_thread_concurrency set to 0 and occasionally see very high load on the server. When setting this variable to 1, the load becomes OK after a while, but the processes are still running in "Sending Data" status inside MySQL.
Here's OS and MySQL details:

uname -a
SunOS bgedsu12 5.8 Generic_117350-54 sun4u sparc SUNW,Sun-Fire-V440

+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| version                 | 5.0.45                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | sparc                        |
| version_compile_os      | sun-solaris2.8               |
+-------------------------+------------------------------+

Knut Hellebø
[7 Aug 2008 0:42] James Day
Knut, you should experiment to find the innodb_thread_concurrency value that produces the fastest improvement. That will be your high load optimal setting. It may be higher than 1, it's worth trying 2,4,6,8,16,32 and adjusting further from the one that is best.
[19 Oct 2008 18:01] Jiri Hlinka
my.cnf from server. most variables are changed inside running mysql server as in my-innodb-heavy-4G.cnf

Attachment: my.cnf (application/octet-stream, text), 5.44 KiB.

[19 Oct 2008 18:07] Jiri Hlinka
result of SHOW INNODB STATUS while server is in high load.

Attachment: show-innodb-status-high-load.dump (application/octet-stream, text), 8.18 KiB.

[19 Oct 2008 18:09] Jiri Hlinka
SHOW MUTEX STATUS result

Attachment: show-mutex-status.dump (application/octet-stream, text), 472 bytes.

[19 Oct 2008 18:45] Valeriy Kravchuk
As this is the most "waited" mutex:

| buf0buf.c   |  545 |   171362 |

your problem is related to buf_pool->mutex. So, smaller number of concurrent InnoDB threads, smaller buffer pool, split_buf_pool_mutex patch (search for it) or 5.1.x may help you to some extent. 

Please, check if the problem is repeatable with 5.1.28.
[20 Oct 2008 5:34] Jiri Hlinka
[19 Oct 20:45] Valeriy Kravchuk:
Thank you for your reply.
Is it possible to solve it by increasing innodb_buffer_pool_size as the problem starts when data dont fit into memory (as described here: http://aggregator.foolab.org/node/44598)?
[30 Oct 2008 14:44] Heikki Tuuri
Jiri,

http://bugs.mysql.com/file.php?id=10466

the printout shows no contention at all inside InnoDB. Also, the file I/O is small.

It is processing several huge SELECT queries and InnoDB is doing 2 million record reads per second. Does MySQL optimize the queries properly?

Regards,

Heikki
[30 Oct 2008 14:49] Heikki Tuuri
Robin McMillon,

can you please print SHOW INNODB STATUS\G during the slowdown?

Regards,

Heikki
[20 Nov 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".