Bug #100640 MyISAM table is damaged and read_only is turned on, causing startup failure
Submitted: 26 Aug 2020 6:11 Modified: 26 Aug 2020 13:02
Reporter: lei yue Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S2 (Serious)
Version:>=5.7.29, 5.7.31 OS:CentOS (7.5)
Assigned to: CPU Architecture:x86

[26 Aug 2020 6:11] lei yue
Description:
When my.cnf contains the configuration as follows:
[mysqld]
read_only=1

Follow the test steps of the test students, first create a user:
mysql> create user test@'%' identified by '1234';
Query OK, 0 rows affected (0.01 sec)

Then close mysqld; use kill -9, the operation is as follows:
shell> kill -9 $(pidof mysqld)

Then restart mysqld, as follows:
shell> mysqld_safe --defaults-file=/etc/mysql/my.cnf &

At this time the problem reappeared, mysqld failed to start, we checked the error log, the information is as follows:
...
2020-08-18T10:13:55.506415+08:00 0 [ERROR] /usr/local/mysql/bin/mysqld: Table './mysql/user' is marked as crashed and should be repaired
2020-08-18T10:13:55.506553+08:00 0 [ERROR] Fatal error: Can't open and lock privilege tables: Table './mysql/user' is marked as crashed and should be repaired
2020-08-18T10:13:55.506658+08:00 0 [ERROR] Fatal error: Failed to initialize ACL/grant/time zones structures or failed to remove temporary table files.
2020-08-18T10:13:55.506789+08:00 0 [ERROR] Aborting
...

The difference between the code of 5.7.31 and the code of 5.7.26 is as follows. The code of 5.7.31 (sql/sql_base.cc:5549) is shown below:
...
   // Phase 3: Acquire the locks which have been requested so far.
   if (thd->mdl_context.acquire_locks(&mdl_requests, lock_wait_timeout))
     return true;

   /*
     Now when we have protection against concurrent change of read_only
     option we can safely re-check its value.
     Skip the check for FLUSH TABLES ... WITH READ LOCK and
     FLUSH TABLES ... FOR EXPORT as they are not supposed to be affected
     by read_only modes.
   */
   if (need_global_read_lock_protection &&
       !(flags & MYSQL_OPEN_SKIP_SCOPED_MDL_LOCK) &&
       !(flags & MYSQL_LOCK_IGNORE_GLOBAL_READ_ONLY) &&
       check_readonly(thd, true))
     return true;
...

How to repeat:
Turn off read_only, which is as follows:
[mysqld]
read_only=0

Then start mysqld again and find that the MyISAM table can be automatically repaired and started normally; the error log information is as follows:
...
2020-08-18T11:52:26.775553+08:00 0 [ERROR] /usr/local/mysql/bin/mysqld: Table'./mysql/user' is marked as crashed and should be repaired
2020-08-18T11:52:26.776217+08:00 0 [Warning] Checking table:'./mysql/user'
2020-08-18T11:52:26.776273+08:00 0 [ERROR] 1 client is using or hasn't closed the table properly
2020-08-18T11:52:26.882537+08:00 0 [Note] Failed to start slave threads for channel''
2020-08-18T11:52:26.906018+08:00 0 [Note] Event Scheduler: Loaded 0 events
2020-08-18T11:52:26.906480+08:00 0 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.7.31-debug-log' socket:'/var/lib/mysql/sock/mysql.sock' port: 3306 Source distribution
[26 Aug 2020 6:15] lei yue
Modify the mysql version
[26 Aug 2020 6:23] lei yue
The commit id of the code change is: 0405ebee and 05824063
[26 Aug 2020 12:15] MySQL Verification Team
Hello lei yue,

Thank you for the report and feedback.
With the provided steps I couldn't inject the corruption and hence had to manually corrupt the user.MYD before trying kill -9 to trigger restart(which failed).

-
rm -rf 100640/
bin/mysqld --initialize-insecure --basedir=$PWD --datadir=$PWD/100640 --log-error-verbosity=3 
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/100640 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3333 --log-error=$PWD/100640/log.err --mysqlx=0 --log-error-verbosity=3  --secure-file-priv="" --read_only=1 2>&1 &

-
for i in {1..5000}; do bin/mysql -uroot -S /tmp/mysql_ushastry.sock -e "create user test${i}@'%' identified by '1234';grant all on *.* to test${i}@'%';"; done

- kill -9 <pid_of_mysqld>

- excerpt from error log

2020-08-26T08:13:18.114427Z 0 [ERROR] mysqld: Table 'user' is marked as crashed and should be repaired
2020-08-26T08:13:18.114477Z 0 [ERROR] Fatal error: Failed to initialize ACL/grant/time zones structures or failed to remove temporary table files.
2020-08-26T08:13:18.114566Z 0 [ERROR] Aborting

regards,
Umesh
[26 Aug 2020 13:02] lei yue
hi,
  Yes. In our environment, we have used two methods to reproduce this problem; one is to manually create user xxx, and then kill -9 $(pidof mysqld) in a short period of time, which is the method I provided above ; The other is to create users in the for loop, and then kill -9 $(pidof mysqld), which is the method you use.
[5 May 13:02] Buchan Milne
We're currently running 5.7.27, and we're testing 5.7.33.

We ran into the same problem in our integration tests, where a backup is created (using xtrabackup) on one host, and then restored on a second host.

Our integration tests are failing on this issue, where the restored host cannot start with:

2021-05-05T11:45:50.321516Z 0 [ERROR] /usr/bin/mysqld_real: Table './mysql/user' is marked as crashed and should be repaired
2021-05-05T11:45:50.321559Z 0 [ERROR] Fatal error: Can't open and lock privilege tables: Table './mysql/user' is marked as crashed and should be repaired

We configure all of our hosts with:
[mysqld]
read_only

We have tooling that performs failovers as necessary, in the event of all hosts rebooting, manual intervention is required to promote a host to writable, but we would prefer to have a write outage than a "split brain" scenario at any point.

This now leaves us with a (read-only) availability risk, where any host that experiences a power failure may not start MySQL, if table recovery is required.

Alternatively, we need a more complex approach that starts MySQL without networking with read-only=OFF first, waits for it to start, and then restart it again with networking and read-only=ON.

Or, this bugs needs more attention.

Also, this seems to be a duplicate of https://bugs.mysql.com/bug.php?id=100283 which is also verified.