Bug #64837 MySQL Crashing
Submitted: 2 Apr 2012 2:40 Modified: 24 Oct 2012 23:33
Reporter: Ron Davis Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: General Severity:S2 (Serious)
Version:5.1.62-log OS:Linux (CentOS release 5.8 (Final))
Assigned to: CPU Architecture:Any

[2 Apr 2012 2:40] Ron Davis
Description:
First, please forgive me as this is the first time i have ever needed to submit a bug. With over 7 years working with MySQL i have never found occasion to actually need to put in a bug report. So please forgive if i have done something incorrectly. Thanks in advanced.

At a high level we it appears that under high load MySQL seems to crash and restart. The difficulty is that each time it crashes it appears to be a different sort of error. And at times there are different queries reported. We are unable to locate anything that sticks out enough to identify as another bug or find a fix.

Hardware Details: 
Hardware: Dell 2950 Generation 3
CPU: 4 x Intel(R) Xeon(R) CPU           E5310  @ 1.60GHz
Memory: 32 GB of memory
HardDrives: Raid 10, Dell 6i PERC

Operating System Details: CentOS 5.8 (Patched to latest yesterday)
kernel: 2.6.18-308.1.1.el5 #1 SMP Wed Mar 7 04:16:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

I will attached Crashes

How to repeat:
At this point we are unable to repeat. We only see the problem when the server appears to be under high load.

Some other interesting facts:
1. There are two slaves off the master. They never crash
2. We see crashes @ 300QPS and 2500QPS, but mostly it seems to happen a lot during high load times. But not always.

Suggested fix:
Here is the troubleshooting we have attempted:
1. We replaced PERC battery. Still Continued to Crash
2. We replaced Memory on RAID Controller. Still Continued to Crash
3. We failed over the master to one of the slaves, problem followed to start on the new master on new hardware. (So we don't think it is Hardware related)
3. Problem started on 5.1.32, we upgraded to 5.1.62. Still Continued to Crash
4. We patched server from 5.3 to 5.8.  Still Continued to Crash

We are in the process of going through any query changes that we have made recently, but are unable to find anything that interesting.

I am completely out of ideas as to what to even try next in order to narrow this down.

We are currently in the process of bring up another server to run whatever tests we can think to run.
[2 Apr 2012 2:41] Ron Davis
MySQL Error Log

Attachment: error.txt (text/plain), 64.94 KiB.

[2 Apr 2012 2:45] Ron Davis
my.cnf

Attachment: my.txt (text/plain), 1.83 KiB.

[2 Apr 2012 2:46] Ron Davis
my-server, my.cnf overrides.

Attachment: my-server.txt (text/plain), 2.55 KiB.

[2 Apr 2012 6:55] Valeriy Kravchuk
Please, send the output of

free

Linux command. Send also the results of:

show create table TAX_QUEUE\G
show table status like 'TAX_QUEUE';

(this is the table from the first crash I see in your error log).
[2 Apr 2012 14:23] Ron Davis
Thanks for the response.

Free Output
             total       used       free     shared    buffers     cached
Mem:      32940056   10304476   22635580          0     751800    6374224
-/+ buffers/cache:    3178452   29761604
Swap:      2031608          0    2031608

root@localhost (PROD_BILLING) mysql> show create table TAX_QUEUE\G
*************************** 1. row ***************************
       Table: TAX_QUEUE
Create Table: CREATE TABLE `TAX_QUEUE` (
  `ar_trans_id` int(11) unsigned NOT NULL,
  PRIMARY KEY (`ar_trans_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

root@localhost (PROD_BILLING) mysql> show table status like 'TAX_QUEUE'\G
*************************** 1. row ***************************
           Name: TAX_QUEUE
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 2029
 Avg_row_length: 40
    Data_length: 81920
Max_data_length: 0
   Index_length: 0
      Data_free: 11534336
 Auto_increment: NULL
    Create_time: 2012-03-30 23:15:19
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)
[4 Apr 2012 18:03] Ron Davis
More information. 

To this point the crashes have been isolated to our master database server (db2-billing). We now can create crashes on the slave as well. 

How to recreate:
Create a script to execute the following query rapidly (in our case we are able to get this to run around 600QPS:

SELECT ai.ar_trans_id, ar.ar_trans_type_id, ar.user_id, ar.amount, ar.post_date, ar.start_date, ar.stop_date, ar.batch_number,
                ar.debit_gl_id, ar.credit_gl_id, ar.customer_id, ar.account_id, ar.agency_id, ar.region_id,
                IFNULL(sd.description, IFNULL(atcd.description, ar.description)) as description
            FROM AGED_ITEM ai
            JOIN AR_TRANSACTION ar ON ar.ar_trans_id = ai.ar_trans_id
                        LEFT JOIN AR_TRANS_CUSTOM_DESCRIPTION atcd ON atcd.ar_trans_id = ar.ar_trans_id
                        LEFT JOIN AR_TRANS_STATIC_DESCRIPTION atsd ON atsd.ar_trans_id = ar.ar_trans_id
                        LEFT JOIN STATIC_DESCRIPTION sd ON sd.static_description_id = atsd.static_description_id
            LEFT JOIN TAX_TRANSACTION tt ON tt.ar_trans_id = ar.ar_trans_id
                    LEFT JOIN AR_TRANSACTION taxParent ON taxParent.ar_trans_id = tt.taxed_trans_id
            WHERE ai.customer_id = 263013
             AND (ar.credit_gl_id IN ( '509','510','511','521','522','535') OR (ar.ar_trans_type_id = 18 AND taxParent.credit_gl_id IN ( '509','510','511','521','522','535' )))

Then start replication from the master. Both slaves seem to crash at roughly the same time. 

If it pleases, attached are some table creates for tables in question:

| AGED_ITEM | CREATE TABLE `AGED_ITEM` (
  `customer_id` int(11) NOT NULL DEFAULT '0',
  `ar_trans_id` int(11) unsigned NOT NULL,
  `unpaid_amount` decimal(10,2) NOT NULL DEFAULT '0.00',
  `trans_date` date NOT NULL DEFAULT '0000-00-00',
  `date_modified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`ar_trans_id`),
  KEY `trans_date` (`trans_date`),
  KEY `idx_aged_item` (`customer_id`,`date_modified`) USING BTREE,
  CONSTRAINT `AGED_ITEM_ibfk_1` FOREIGN KEY (`ar_trans_id`) REFERENCES `AR_TRANSACTION` (`ar_trans_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `AR_TRANSACTION` (
  `ar_trans_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `ar_trans_type_id` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `date_entered` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `user_id` int(11) NOT NULL DEFAULT '0',
  `amount` decimal(10,2) NOT NULL DEFAULT '0.00',
  `post_date` date NOT NULL DEFAULT '0000-00-00',
  `batch_number` varchar(32) NOT NULL DEFAULT '',
  `sj_number` varchar(32) NOT NULL DEFAULT '',
  `debit_gl_id` smallint(6) NOT NULL DEFAULT '0',
  `credit_gl_id` smallint(6) NOT NULL DEFAULT '0',
  `customer_id` int(11) unsigned NOT NULL DEFAULT '0',
  `account_id` int(11) unsigned DEFAULT NULL,
  `start_date` date NOT NULL DEFAULT '0000-00-00',
  `stop_date` date NOT NULL DEFAULT '0000-00-00',
  `description` varchar(45) DEFAULT NULL,
  `invoice_id` int(10) unsigned DEFAULT NULL,
  `child_trans_id` int(11) unsigned DEFAULT NULL,
  `parent_trans_id` int(11) DEFAULT NULL,
  `billed_trans_id` int(11) unsigned DEFAULT NULL,
  `currency_id` char(3) NOT NULL DEFAULT '',
  `actual_stop_date` date DEFAULT NULL,
  `glid_id` int(10) unsigned DEFAULT NULL,
  `billing_attempted` datetime DEFAULT NULL,
  `billing_collected` datetime DEFAULT NULL,
  `hard_decline` date DEFAULT NULL,
  `agency_id` int(10) unsigned NOT NULL DEFAULT '0',
  `region_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `user_defined_1` varchar(48) DEFAULT NULL,
  `user_defined_2` varchar(48) DEFAULT NULL,
  `user_defined_3` varchar(48) DEFAULT NULL,
  `user_defined_4` varchar(48) DEFAULT NULL,
  `user_defined_5` varchar(48) DEFAULT NULL,
  `jurisdiction_id` smallint(5) unsigned DEFAULT NULL,
  `tax_trans_id` int(11) DEFAULT NULL,
  `partial_payment` double(10,2) DEFAULT NULL,
  `last_updated_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`ar_trans_id`),
  KEY `idx_post_date` (`post_date`),
  KEY `isx_ar_trans_type_id` (`ar_trans_type_id`),
  KEY `idx_batch_number` (`batch_number`),
  KEY `idx_debit` (`debit_gl_id`),
  KEY `idx_credit_debit` (`credit_gl_id`,`debit_gl_id`),
  KEY `account_id` (`account_id`),
  KEY `child_trans_id` (`child_trans_id`),
  KEY `idx_billed_trans_id` (`billed_trans_id`),
  KEY `parent_trans_id` (`parent_trans_id`),
  KEY `user_defined_4` (`user_defined_4`),
  KEY `stop_date` (`stop_date`),
  KEY `customer_id` (`customer_id`,`billing_collected`,`billed_trans_id`),
  KEY `idx_last_updated_on` (`last_updated_on`),
  KEY `invoice_id` (`invoice_id`),
  KEY `user_defined_5` (`user_defined_5`)
) ENGINE=InnoDB AUTO_INCREMENT=351939786 DEFAULT CHARSET=latin1

CREATE TABLE `AR_TRANS_CUSTOM_DESCRIPTION` (
  `ar_trans_id` int(11) unsigned NOT NULL,
  `description` char(45) NOT NULL DEFAULT '',
  PRIMARY KEY (`ar_trans_id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `AR_TRANS_STATIC_DESCRIPTION` (
  `ar_trans_id` int(11) unsigned NOT NULL,
  `static_description_id` smallint(5) unsigned NOT NULL,
  PRIMARY KEY (`ar_trans_id`) USING BTREE,
  KEY `Index_2` (`static_description_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

 CREATE TABLE `TAX_TRANSACTION` (
  `ar_trans_id` int(11) unsigned NOT NULL,
  `tax_type` char(2) NOT NULL DEFAULT '',
  `geocode_id` smallint(5) unsigned DEFAULT NULL,
  `customer_type_id` tinyint(1) DEFAULT NULL,
  `taxed_trans_id` int(11) unsigned DEFAULT NULL,
  PRIMARY KEY (`ar_trans_id`),
  KEY `taxed_trans_id` (`taxed_trans_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
[13 May 2012 17:24] MySQL Verification Team
Can we get the output of:  SHOW CREATE TABLE `STATIC_DESCRIPTION`;
[13 May 2012 17:44] MySQL Verification Team
BTW, if this is a slave server running 5.1.62, you have to *disable the query cache* in my.cnf.  There is a memory override bug that can lead to these spurious crashes all over the show.  While that bug is triggerable, there's no point in debugging any other memory corruptions.

So, you must eliminate the bug #64624 before we can continue here.
[14 May 2012 14:51] Ron Davis
Thanks for the response Shane. Below is the show for STATIC_DESCRIPTION.

| STATIC_DESCRIPTION | CREATE TABLE `STATIC_DESCRIPTION` (
  `static_description_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `static_description_type_id` smallint(5) unsigned NOT NULL,
  `description` char(64) NOT NULL DEFAULT '',
  PRIMARY KEY (`static_description_id`) USING BTREE,
  UNIQUE KEY `description` (`description`),
  KEY `FK_STATIC_DESCRIPTION_1` (`static_description_type_id`),
  CONSTRAINT `FK_STATIC_DESCRIPTION_1` FOREIGN KEY (`static_description_type_id`                                        ) REFERENCES `STATIC_DESCRIPTION_TYPE` (`static_description_type_id`)
) ENGINE=InnoDB AUTO_INCREMENT=976 DEFAULT CHARSET=latin1 |

Also, when we first started seeing this issue, only the master was crashing. Not the slave, so i am not sure that bug #64624 is relative here. That said, we were able to take the query above posted on Apr 4th, run it at almost 3K queries per second, with replication and see the slave crash.
[14 May 2012 14:56] Ron Davis
As an update to this. We were unable to find any cause and just ended up breaking the above query into multiple calls made by PHP. This has since stopped this issue.  

To this day we cannot explain WHY simultaneously running said query along with updates either from replication or web traffic caused a crash, but we were able get it to stop by just not running said query. :-) Not the best solution, but it did work.
[24 Sep 2012 23:33] MySQL Verification Team
Please try latest version. Thanks.
[25 Oct 2012 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".