MySQL Bugs: #107593: mysql community edition 5.7.16 prompts that the table is corrupted

Bug #107593	mysql community edition 5.7.16 prompts that the table is corrupted
Submitted:	19 Jun 2022 1:21	Modified:	20 Jun 2022 13:28
Reporter:	peiyang zhang
Status:	Not a Bug
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	5.7.16	OS:	CentOS (7.5.1804)
Assigned to:		CPU Architecture:	x86 (32)

Description:
2022-06-19T01:01:51.152569+08:00 0 [Warning] 'user' entry 'root@localhost' ignored in --skip-name-resolve mode.
2022-06-19T01:01:51.153197+08:00 0 [Warning] 'user' entry 'py_mark_by_will@localhost' ignored in --skip-name-resolve mode.
2022-06-19T01:01:51.153927+08:00 0 [Warning] 'db' entry 'sys mysql.sys@localhost' ignored in --skip-name-resolve mode.
2022-06-19T01:01:51.154144+08:00 0 [Warning] 'proxies_priv' entry '@ root@localhost' ignored in --skip-name-resolve mode.
2022-06-19T01:01:51.199470+08:00 0 [Warning] 'tables_priv' entry 'sys_config mysql.sys@localhost' ignored in --skip-name-resolve mode.
2022-06-19T01:01:51.250390+08:00 0 [Note] Event Scheduler: Loaded 0 events
2022-06-19T01:01:51.250574+08:00 1 [Note] Event Scheduler: scheduler thread started with id 1
2022-06-19T01:01:51.250676+08:00 0 [Note] /usr/local/mysql-5.7.16-linux-glibc2.5-x86_64/bin/mysqld: ready for connections.
Version: '5.7.16-log'  socket: '/tmp/mysql_jydb.sock'  port: 43306  MySQL Community Server (GPL)
2022-06-19T01:03:07.520944+08:00 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=0, page number=80], should be [page id: space=1426, page number=857190]
2022-06-19T01:03:07.521194+08:00 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of page [page id: space=1426, page number=857190]. You may have to recover from a backup.
2022-06-19T01:03:07.521222+08:00 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex c66dffe000000050000475eb000017f400000002db586b48000000000000000000000000000000ec01bc02be0000003800000000000003b2000000000002dc310000000000000000001c00000000
0000000000000000000000000000000.....  //A lot of content is omitted here  
InnoDB: End of page dump
2022-06-19T01:03:07.599732+08:00 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 3329097696, calculated checksums for field1: crc32 3650104114/884774814, innodb 1809404915, none 3735928559, stored checksum in field2 3329097696, calculated checksums for field2: crc32 3650104114/884774814, innodb 992001978, none 3735928559,  page LSN 2 3680004935, low 4 bytes of LSN at page end 3680004935, page number (if stored to page already) 80, space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be a freshly allocated page
2022-06-19T01:03:07.599749+08:00 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to scan your table for corruption. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery.
2022-06-19T01:03:34.437277+08:00 6 [Note] Aborted connection 6 to db: 'unconnected' user: 'zabbix' host: 'localhost' (Got an error writing communication packets)
2022-06-19T01:03:34.441824+08:00 4 [Note] Aborted connection 4 to db: 'unconnected' user: 'zabbix' host: 'localhost' (Got an error writing communication packets)
2022-06-19T01:03:34.445628+08:00 5 [Note] Aborted connection 5 to db: 'unconnected' user: 'zabbix' host: 'localhost' (Got an error writing communication packets)
2022-06-19T01:04:04.623789+08:00 0 [Note] InnoDB: Buffer pool(s) load completed at 220619  1:04:04
2022-06-19T01:19:09.726287+08:00 0 [Note] Giving 1 client threads a chance to die gracefully
2022-06-19T01:19:09.726380+08:00 0 [Note] Shutting down slave threads
2022-06-19T01:19:11.726703+08:00 0 [Note] Forcefully disconnecting 1 remaining clients
2022-06-19T01:19:11.726767+08:00 0 [Note] Event Scheduler: Killing the scheduler thread, thread id 1
2022-06-19T01:19:11.726792+08:00 0 [Note] Event Scheduler: Waiting for the scheduler thread to reply
2022-06-19T01:19:11.726867+08:00 0 [Note] Event Scheduler: Stopped
2022-06-19T01:19:11.726891+08:00 0 [Note] Event Scheduler: Purging the queue. 0 events
2022-06-19T01:19:11.727206+08:00 0 [Note] Binlog end
2022-06-19T01:19:11.730994+08:00 0 [Note] Shutting down plugin 'rpl_semi_sync_slave'
2022-06-19T01:19:11.731062+08:00 0 [Note] Shutting down plugin 'rpl_semi_sync_master'
2022-06-19T01:19:11.731122+08:00 0 [Note] Stopping ack receiver thread
2022-06-19T01:19:11.731261+08:00 0 [Note] unregister_replicator OK
2022-06-19T01:19:11.731282+08:00 0 [Note] Shutting down plugin 'ngram'
 .....
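
The header of the dumped page can be cross-checked against the values InnoDB prints. The following is a minimal sketch, assuming the usual InnoDB page header layout in which the first four big-endian 32-bit fields are the stored checksum, the page number, and the previous and next page pointers. The function name `parse_fil_header_prefix` and the constants are made up for illustration; only the hex prefix comes from the dump above.

```python
import struct

# First 16 bytes of the page dump from the log above, as hex.
DUMP_PREFIX = "c66dffe000000050000475eb000017f4"

# Page number the server asked for, taken from the [ERROR] line.
EXPECTED_PAGE_NO = 857190


def parse_fil_header_prefix(hex_prefix):
    """Decode the first four big-endian 32-bit fields of a page header."""
    raw = bytes.fromhex(hex_prefix)[:16]
    checksum, page_no, prev_page, next_page = struct.unpack(">IIII", raw)
    return {
        "checksum": checksum,
        "page_no": page_no,
        "prev": prev_page,
        "next": next_page,
    }


header = parse_fil_header_prefix(DUMP_PREFIX)
print(header["checksum"])                     # 3329097696
print(header["page_no"])                      # 80
print(header["page_no"] == EXPECTED_PAGE_NO)  # False
```

The decoded checksum and page number agree with the "stored checksum in field1" and "page number (if stored to page already)" values in the log, and the page number differs from the one that was requested, which is what the first [ERROR] line reports.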

How to repeat:
Suspected to be a bug. The symptom is that the database instance restarts continuously. After investigation, one partition of a partitioned table was found to contain a bad block.

Suggested fix:
Fix this bug in 5.7.16

Add more error messages.

Attachment: more_info.txt (text/plain), 47.92 KiB.

I looked at the source code and found what seems to be a memory problem. The specific information is as follows:

ut0ut.cc:

	/* we abort here because if unknown error code is given, this could
	mean that memory corruption has happened and someone's error-code
	variable has been overwritten with bogus data */
	ut_error;

	/* NOT REACHED */
	return("Unknown error");
}

I don't know what caused it. I need your help to solve it. Thank you very much!

I have run a bad block check on the hard disk at the operating system layer and also checked the physical disk for problems. Physical bad blocks on the disk can be ruled out.

Hi Mr. zhang,

Thank you for your bug report.

However, it is not a bug.

As you can clearly see in the log, the checksum stored in the page differs from every checksum calculated for it. That means that you have had some transient error in your hardware, either in the RAM modules or on disk. Transient errors cannot be caught, or are rarely caught, by hardware diagnostic tools.
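
The comparison can be reproduced from the log line itself. This is only an illustrative sketch: the helper `field1_checksums` is hypothetical, and the string is the relevant fragment of the InnoDB note from the report with the hard-wrapped lines rejoined.

```python
import re

# Fragment of the InnoDB note from the report (wrapped lines rejoined).
LOG = (
    "Uncompressed page, stored checksum in field1 3329097696, "
    "calculated checksums for field1: crc32 3650104114/884774814, "
    "innodb 1809404915, none 3735928559"
)


def field1_checksums(line):
    """Return (stored, [calculated, ...]) for field1 of the log fragment."""
    stored = int(re.search(r"stored checksum in field1 (\d+)", line).group(1))
    calc_part = line.split("calculated checksums for field1:", 1)[1]
    # \b keeps the "32" in "crc32" from being picked up as a number.
    calculated = [int(n) for n in re.findall(r"\b\d+\b", calc_part)]
    return stored, calculated


stored, calculated = field1_checksums(LOG)
print(stored in calculated)  # False
```

The stored value matches none of the values calculated under the listed algorithms, which is why the page is reported as corrupt. The "none" value 3735928559 is 0xDEADBEEF in hexadecimal.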

We recommend using reliable hardware, such as ECC RAM modules (2-bit error detection and 1-bit error correction), as well as RAID or RAID 10 disk arrays.
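
To illustrate what "2 bits checking and 1 bit correcting" means, here is a toy sketch using an extended Hamming code over 4 data bits. It is an assumption chosen for brevity, not the code used by any particular ECC module; real modules protect much wider words. The names `encode` and `decode` are made up for this example.

```python
def encode(data):
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    word = [p1, p2, d1, p4, d2, d3, d4]  # positions 1..7
    p0 = 0
    for bit in word:
        p0 ^= bit
    return [p0] + word                    # index 0 holds the overall parity


def decode(code):
    """Return (data_bits, status); data_bits is None if uncorrectable."""
    code = list(code)
    s1 = code[1] ^ code[3] ^ code[5] ^ code[7]
    s2 = code[2] ^ code[3] ^ code[6] ^ code[7]
    s4 = code[4] ^ code[5] ^ code[6] ^ code[7]
    syndrome = s1 + 2 * s2 + 4 * s4
    overall = 0
    for bit in code:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single error at index `syndrome`
        # (index 0 is the overall parity bit itself) and flip it back.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: detected only.
        return None, "uncorrectable"
    return [code[3], code[5], code[6], code[7]], status


word = encode([1, 0, 1, 1])

one_flip = list(word)
one_flip[5] ^= 1
print(decode(one_flip))   # ([1, 0, 1, 1], 'corrected')

two_flips = list(word)
two_flips[2] ^= 1
two_flips[6] ^= 1
print(decode(two_flips))  # (None, 'uncorrectable')
```

A single flipped bit is repaired transparently, while two flipped bits are reported rather than silently passed on. Without such protection, a flipped bit in memory can be written to disk as part of a page and only surface later as a checksum mismatch.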

Not a bug.