MySQL Bugs: #16383: Repeated errors like "InnoDB: Error: record field 5 len 4294965509" in log file

Bug #16383	Repeated errors like "InnoDB: Error: record field 5 len 4294965509" in log file
Submitted:	11 Jan 2006 18:25	Modified:	17 Jan 2006 8:02
Reporter:	Nathan Goff	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	5.0.17	OS:	Linux (Red Hat WS 3.0u4)
Assigned to:	Marko Mäkelä	CPU Architecture:	Any

Description:
I keep seeing errors like this: "InnoDB: Error: record field 5 len 4294965509" in my log files.  I can't find anything on the web related to these errors.  I have 1 master server and 8 replication servers.  It only happens on one of the replication servers.  It does not appear to crash the server or really do anything negative to it, but I am worried it is a sign of deeper problems.

How to repeat:
Do not know how to repeat.

I am looking at this more and have discovered that it is coming from the master database, not the slave.  Our error reporting script was reporting the wrong server name.  I am going to attach our log file that shows a bunch of very interesting information.  Not sure what it all means, but appears to be related to the "record field 5 len" messages.

Thank you for a problem report. Please, send (or uload) the whole error log from your master and describe the hardware used (RAM size etc.)

SHOW CREATE TABLE `Regression_Data` results may be useful to.

Create table syntax:

CREATE TABLE `Regression_Data` (\n  `TestArea` varchar(245) NOT NULL default '',\n  `TestCase` varchar(245) NOT NULL default '',\n  `Result` enum('UNKNOWN','PASS','FAIL','UNEXPECTED_PASS','EXPECTED_FAIL','CORE_DUMP','TIMED_OUT','UNTESTED','UNKNOWN_FAIL','RUNNING') NOT NULL default 'UNKNOWN',\n  `Host` varchar(245) default NULL,\n  `Host_OS` varchar(245) default NULL,\n  `Time_Execute` bigint(11) NOT NULL default '0',\n  `Time_Compare` bigint(11) NOT NULL default '0',\n  `Time_Total` bigint(11) NOT NULL default '0',\n  `Result_TS` datetime NOT NULL default '0000-00-00 00:00:00',\n  `Last_Modified` datetime NOT NULL default '0000-00-00 00:00:00',\n  `Diff_Hash` varchar(40) default NULL,\n  `Resdir` varchar(100) NOT NULL default '',\n  `Regression_Run` bigint(20) NOT NULL default '0',\n  `Latest_Flag` tinyint(1) NOT NULL default '0',\n  PRIMARY KEY  (`TestArea`,`TestCase`,`Result_TS`,`Regression_Run`),\n  KEY `Result_Index` (`Result`,`Result_TS`,`TestArea`,`TestCase`,`Regression_Run`),\n  KEY `Regression_Run` (`Regression_Run`,`Diff_Hash`),\n  CONSTRAINT `Regression_Data_ibfk_1` FOREIGN KEY (`Regression_Run`) REFERENCES `Regression_Runs` (`Regression_Run`)\n) ENGINE=InnoDB DEFAULT CHARSET=latin1

Hardware:

- Dell rackmount 850
- Raid 1+0 4 SCSI disks
- Xeon 3.6Ghz
- 4GB memory

Nathan,

did you run CHECK TABLE? On which table?

That can print the errors about a record field length.

Earlier in the log we see other type of corruption: there is a secondary index entry, but it has no clustered index record.

If you run CHECK TABLE on that QA_Results.Regression_Data, what does mysqld print to the .err log?

Assigning this to Marko.

Regards,

Heikki

Nathan, Marko,

Nathan, can you resolve some of the stack traces in the .err log?

The InnoDB instance was created from an InnoDB Hot Backup in November 2005:

"
051111 14:34:54 mysqld started
051111 14:34:54 [Warning] Changed limits: max_open_files: 1024 max_connections: 100 table_cache: 457
InnoDB: The log file was created by ibbackup --apply-log at
InnoDB: ibbackup 051111 13:54:19
InnoDB: NOTE: the following crash recovery is part of a normal restore.
051111 14:34:55 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
051111 14:34:56 InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 8 3474552844.
InnoDB: Doing recovery: scanned up to log sequence number 8 3474552844
InnoDB: Last MySQL binlog file position 0 367053103, file name ./orw-gbsmysql-bin.000027
051111 14:34:56 InnoDB: Started; log sequence number 8 3474552844
"

Log files were made bigger:

"
051111 14:49:10 InnoDB: Log file /gbs/gbs_db/ib_logfile0 did not exist: new to be created
InnoDB: Setting log file /gbs/gbs_db/ib_logfile0 size to 1000 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000
051111 14:49:24 InnoDB: Log file /gbs/gbs_db/ib_logfile1 did not exist: new to be created
InnoDB: Setting log file /gbs/gbs_db/ib_logfile1 size to 1000 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000
051111 14:49:38 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
051111 14:49:39 InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 8 3474568204.
InnoDB: Doing recovery: scanned up to log sequence number 8 3474568204
InnoDB: Last MySQL binlog file position 0 13556, file name ./mysql-bin.000004
"

But something probably went wrong, since mysqld crashed in 17 minutes. It crashed many times during the next 32 days.

Some of the crashes were in table Regression_Data.

Then the crashes for some reason ended, but on Jan 3 suddenly we have corruption:

"
Version: '5.0.15-standard-log' socket: '/tmp/mysql.sock' port: 3306 MySQL Community Edition - Standard (GPL)
060103 11:00:32 InnoDB: error clustered record for sec rec not found
InnoDB: index `Regression_Run` of table `QA_Results/Regression_Data`
"

The problem might be a result of some error that was made in restoring the InnoDB Hot Backup. For example, if some of the .ibd files was not the correct one, we might have corruption in the database.

It might also be some bug in InnoDB Hot Backup or InnoDB.

Regards,

Heikki

All crashes reported in the log have occurred at 0x814e7f8 and have very similar stack traces. I would very much like to have some of those stack traces resolved.

I would also like to see a page dump of one of the pages affected.
Nathan, would it be possible for you to run mysqld under gdb and set a breakpoint to rem0rec.c:1318 (the error print in rec_validate())? When the breakpoint is reached (presumably in CHECK TABLE QA_Results.Regression_Data), please execute the following commands in gdb:

p *offsets@*offsets
i lo
bt
p/x rec[-30]@100
call buf_page_print(buf_frame_align(rec))

Nathan,
if I remember correctly, you have licenses of InnoDB Hot Backup 2.0 but not 3.0. Thus, at the time of restoring from the backup, the tables must have been in the old ROW_FORMAT=REDUNDANT format. MySQL/InnoDB 5.0.3 introduced a more compact row format, ROW_FORMAT=COMPACT, which is the default. The compact row format is not supported by InnoDB Hot Backup 2.0. (Please contact us for a free license upgrade.)

From the error log, it looks like some of the tables have been converted to the compact format. Can you remember when you might have done such conversions? The conversion could be done with OPTIMIZE TABLE or ALTER TABLE, or by restoring an SQL dump. This could help us in troubleshooting.

Also, is it possible that any *.ibd or *.frm files have been manually copied between database instances? You know, the *.ibd files alone do not necessarily represent a consistent view of the database. The redo logs (ib_logfile*) are needed as well.

To answer some of your questions....yes we had an old version of hot backup and tried to setup this server intitially using it, but it failed as you can see in the logs.  We then got the beta version from InnoDB that works with 5.0.  It appeared to have worked correctly.

There really is no easy way for us to run it in gdb.  It is the master for 8 slaves that run at our development sites all around the globe.  Our division of 200+ engineers come to a halt when we take down the database.  It drives our automated build and packaging system.

Here are the stacktraces you asked for:

0x814e7f8 _Z17mysql_lock_tablesP3THDPP8st_tablejjPb + 856
0x95e7c8 (?)
(nil)
0x82a67e9 btr_estimate_number_of_different_key_vals + 1061
0x82a4d98 btr_cur_parse_del_mark_set_clust_rec + 48
0x829a1a4 btr_index_rec_validate + 1232
0x827f063 row_purge + 627
0x827adcf row_truncate_table_for_mysql + 2231
0x81f3335 _ZN9ha_myisam6createEPKcP8st_tableP24st_ha_create_information + 1177
0x81a13ca _Z15select_describeP4JOINbbbPKc + 2906
0x8192f97 _ZN4JOIN4execEv + 2695
0x8198ca9 _ZN13st_table_list5printEP3THDP6String + 381
0x818f8b2 _Z9frm_erroriP8st_tablePKcii + 390
0x81903cd _ZN4JOIN7prepareEPPP4ItemP13st_table_listjS1_jP8st_orderS7_S1_S7_P13st_select_lexP18st_select_lex_unit + 513
0x818c6ba _Z7openfrmP3THDPKcS2_jjjP8st_table + 5450
0x81621ad _ZN15sys_var_thd_bitD1Ev + 5
0x823cfe6 _Z15gmt_sec_to_TIMEP13st_mysql_timelPK17st_time_zone_info + 198
0x823ce82 _Z11sec_to_TIMEP13st_mysql_timell + 318
0x823cf80 _Z15gmt_sec_to_TIMEP13st_mysql_timelPK17st_time_zone_info + 96
0x823ab7d _ZN16Sensitive_cursorC1EP3THDP13select_result + 57
0x823b98f _ZN19Materialized_cursorD0Ev + 143
0x8166a3a _Z21mysql_execute_commandP3THD + 4770
0x8168d82 _Z21mysql_execute_commandP3THD + 13802
0x8160aa6 _ZN20sys_var_thd_sql_mode28symbolic_mode_representationEP3THDmPm + 34
0x81605d1 _ZN7set_var6updateEP3THD + 13
0x815fab1 _ZN18sys_var_rand_seed26updateEP3THDP7set_var + 5
0x958341 (?)
0x7d8fee (?)

---

0x814e7f8 _Z17mysql_lock_tablesP3THDPP8st_tablejjPb + 856
0x95e7c8 (?)
(nil)
0x82a67e9 btr_estimate_number_of_different_key_vals + 1061
0x82a4d98 btr_cur_parse_del_mark_set_clust_rec + 48
0x829a1a4 btr_index_rec_validate + 1232
0x827c0ef row_drop_table_for_mysql + 3807
0x81f302b _ZN9ha_myisam6createEPKcP8st_tableP24st_ha_create_information + 399
0x81a1129 _Z15select_describeP4JOINbbbPKc + 2233
0x8192ff3 _ZN4JOIN4execEv + 2787
0x8198ff5 _ZN13st_select_lex5printEP3THDP6String + 565
0x8192fa2 _ZN4JOIN4execEv + 2706
0x8198ca9 _ZN13st_table_list5printEP3THDP6String + 381
0x818f8b2 _Z9frm_erroriP8st_tablePKcii + 390
0x81903cd _ZN4JOIN7prepareEPPP4ItemP13st_table_listjS1_jP8st_orderS7_S1_S7_P13st_select_lexP18st_select_lex_unit + 513
0x8227ee8 _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 632
0x8227a9c _Z17init_slave_threadP3THD14SLAVE_THD_TYPE + 356
0x8182d55 _Z7yyparsePv + 76549
0x8162160 _ZN26sys_var_sync_binlog_periodD1Ev + 8
0x823cfe6 _Z15gmt_sec_to_TIMEP13st_mysql_timelPK17st_time_zone_info + 198
0x823ce82 _Z11sec_to_TIMEP13st_mysql_timell + 318
0x823cf80 _Z15gmt_sec_to_TIMEP13st_mysql_timelPK17st_time_zone_info + 96
0x823ab7d _ZN16Sensitive_cursorC1EP3THDP13select_result + 57
0x823b98f _ZN19Materialized_cursorD0Ev + 143
0x8166a3a _Z21mysql_execute_commandP3THD + 4770
0x8168d82 _Z21mysql_execute_commandP3THD + 13802
0x8160aa6 _ZN20sys_var_thd_sql_mode28symbolic_mode_representationEP3THDmPm + 34
0x81605d1 _ZN7set_var6updateEP3THD + 13
0x815fab1 _ZN18sys_var_rand_seed26updateEP3THDP7set_var + 5
0x958341 (?)
0x7d8fee (?)

Our master server crashed again last night.  It crashed after we did an alter table on the gbs/Build table.  Now it is complaining about the data dictionary.  I went through the suggestions on the web to find orphaned tables, but I didn't find any.  The database appears to be running fine now.  It crashed last night around 8PM Pacific time.  I don't know if this is related to the len messages.

I am working on the stacktraces for the crashes right now.

Stacktrace:

0x8151dc8  + 135601608
0x95e7c8  + 9824200
0x286fd000  + 678416384
0x8274534  + 136791348
0x82726c8  + 136783560
0x827511f  + 136794399
0x8273681  + 136787585
0x8273221  + 136786465
0x8273464  + 136787044
0x8277b4a  + 136805194
0x81f7722  + 136279842
0x81ad6a0  + 135976608
0x81acba6  + 135973798
0x81678ca  + 135690442
0x816c80a  + 135710730
0x816431e  + 135676702
0x8163e49  + 135675465
0x8163351  + 135672657
0x958341  + 9798465
0x7d8fee  + 8228846

Nathan,

Unfortunately, those stack traces do not make any sense, i.e., the call stacks are impossible. This is not the first time when mysqld reports useless stack traces. It would be nice if you could attach gdb to the mysqld process, because we have found gdb to produce accurate stack traces.

Would it be possible for you to set up a test system from a (hot or cold) backup of the data files? Or could you make one of the replication slaves the new master and take the old master offline?

When the system is offline, I'd run mysqld inside gdb while invoking CHECK TABLE on those tables that have caused problems.

Our database rapidly went downhill after I sent those messages.  It started crashing on just about anything relating to modifying foreign keys, as well as inserting data into a table where a foreign key contraint would fail.  We stopped it and ran a mysqldump of all our data.  Then started with a fresh datadir and reloaded our entire database.  It is much happier now.  We are not seeing the "len" errors yet.  I suppose this bug can be closed and I will re-open if we see it again.

Thank you for all your help.

Very well, I will close the bug for now.

It is a pity that we could not find out the definite cause of the corruption. I would recommend you to run memtest86 or a similar memory-tester tool on the server. If the memory subsystem is at fault, it will cause corruption sooner or later. If the corruption occurs in a clustered index, you will not be able to get an SQL dump of the database.

We're having a similar problem, in MySQL 5.1.36, but it is crashing MySQL.  It is a new machine, replacing an older machine that is almost identical.  The older machine did not crash with the same SQL (no SQL changes).  The new machine was built using backups from the old machine:

Linux db03 2.6.24-1-amd64 Debian GNU/Linux lenny/sid
Linux db04 2.6.26-2-amd64 Debian GNU/Linux 5.0 (this one is crashing)

Here's one such error:

[ERROR] MySQL thread id 703704, query id 47377652 report01.company.com 10.1.4.126 dbname Copying to tmp table
[ERROR] SELECT /* Reporting Query 261 */
[ERROR] 	l.label_id, 
[ERROR] 	l.label_name, 
[ERROR] 	rls.display_artist_name as release_display_artist_name, 
[ERROR] 	rls.release_name, 
[ERROR] 	rls.release_id, 
[ERROR] 	va.video_id, 
[ERROR] 	va.video_name, 
[ERROR] 	va.video_type, 
[ERROR] 	va.artist_name as video_artist_name, 
[ERROR] 	va.tv, 
[ERROR] 	rls.upc, 
[ERROR] 	SUM(rrd.total_sales ) total_sales, 
[ERROR] 	SUM(rrd.rightsholder_net ) AS net_earnings, 
[ERROR] 	SUM( rrd.total_sales * aser.rate ) total_sales_nc, 
[ERROR] 	SUM( rrd.rightsholder_net * aser.rate ) AS rightsholder_net_nc, 
[ERROR] 	SUM( (rrd.total_sales * (100 - IF(l.dlabel_display_percent IS NOT NULL AND l.dlabel_display_percent >0, l.dlabel_display_percent
[ERROR] InnoDB: Submit a detailed bug report to http://bugs.mysql.com
[ERROR]  InnoDB: error clustered record for sec rec not found
[ERROR] InnoDB: index `royalty_journal_id` of table `dbname`.`royalty_report_detail`
[ERROR] InnoDB: sec index record PHYSICAL RECORD: n_fields 2; compact format; info bits 0
[ERROR]  0: len 4; hex 00163bec; asc ; ;;
[ERROR]  1: len 4; hex 0854f18f; asc T ;;
[ERROR] InnoDB: clust index record PHYSICAL RECORD: n_fields 1; compact format; info bits 0
[ERROR]  0: len 8; hex 696e66696d756d00; asc infimum ;;
[ERROR] TRANSACTION 14D426C3D, ACTIVE 6 sec, process no 12231, OS thread id 1264200016 fetching rows, thread declared inside InnoDB 318
[ERROR] mysql tables in use 27, locked 0
[ERROR] MySQL thread id 703704, query id 47377652 report01.company.com 10.1.4.126 dbname Copying to tmp table
[ERROR] SELECT /* Reporting Query 261 */
[ERROR] 	l.label_id, 
[ERROR] 	l.label_name, 
[ERROR] 	rls.display_artist_name as release_display_artist_name, 
[ERROR] 	rls.release_name, 
[ERROR] 	rls.release_id, 
[ERROR] 	va.video_id, 
[ERROR] 	va.video_name, 
[ERROR] 	va.video_type, 
[ERROR] 	va.artist_name as video_artist_name, 
[ERROR] 	va.tv, 
[ERROR] 	rls.upc, 
[ERROR] 	SUM(rrd.total_sales ) total_sales, 
[ERROR] 	SUM(rrd.rightsholder_net ) AS net_earnings, 
[ERROR] 	SUM( rrd.total_sales * aser.rate ) total_sales_nc, 
[ERROR] 	SUM( rrd.rightsholder_net * aser.rate ) AS rightsholder_net_nc, 
[ERROR] 	SUM( (rrd.total_sales * (100 - IF(l.dlabel_display_percent IS NOT NULL AND l.dlabel_display_percent >0, l.dlabel_display_percent
[ERROR] InnoDB: Submit a detailed bug report to http://bugs.mysql.com
[ERROR]  InnoDB: error clustered record for sec rec not found
[ERROR] InnoDB: index `royalty_journal_id` of table `dbname`.`royalty_report_detail`

Sheeri, it does not look like the same bug. You are seeing "InnoDB: error clustered record for sec rec not found". Can you please file a new ticket for that?

apologies, the comment on 13 Jan 2006 2:06 by Heikki Tuuri had that "InnoDB: error clustered record for sec rec not found" message so I thought it was related.  I have created http://bugs.mysql.com/bug.php?id=52136 with this issue.