Bug #98743 Invalid entry on mysql.gtid_executed table is unrecoverable
Submitted: 26 Feb 2020 1:02 Modified: 26 Feb 2020 6:45
Reporter: Yoshihiko Matsuzaki Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.7,8.0, 5.7.29, 8.0.19 OS:Any
Assigned to: CPU Architecture:Any

[26 Feb 2020 1:02] Yoshihiko Matsuzaki
Description:
If mysql.gtid_executed table accidentally gets an invalid entry and then mysqld is restarted, mysqld will never be started with "Malformed GTID set specification" message.

This issue was found on one of my databases without any manual modification on mysql.gtid_executed table (i.e. the invalid entry on mysql.gtid_executed table naturally happened), and also I have ruled out the possibility of binlog corruption while troubleshooting because the issue persisted even though I removed all binary logs, disabled binary logging by --disable-log-bin option, and removed log-bin option from my.cnf.

As per the source code, the only code path to emit the "Malformed GTID set specification" message under binary logging disabled is, read_gtid_executed_from_table() function call in mysqld_main() as described later. The function reads GTID from mysql.gtid_executed table, then I think it's an anomaly on mysql.gtid_executed table.

We haven't identified what the invalid entry on mysql.gtid_executed table and how it happened because we don't have any option to forcibly start the process to see the content of mysql.gtid_executed table.

---

MySQL tries to extract GTID from mysql.gtid_executed table while starting up in mysqld_main() in sql/mysqld.cc:
https://github.com/mysql/mysql-server/blob/mysql-8.0.19/sql/mysqld.cc#L6607

If mysql.gtid_executed table has an invalid entry, add_gtid_text() will raise a server error and startup sequence will be aborted, because read_gtid_executed_from_table() in sql/rpl_gtid_state.cc calls fetch_gtid() in sql/rpl_gtid_persist.cc, and then fetch_gtid() calls add_gtid_text():
https://github.com/mysql/mysql-server/blob/mysql-5.7.23/sql/rpl_gtid_state.cc#L788
https://github.com/mysql/mysql-server/blob/mysql-5.7.23/sql/rpl_gtid_persist.cc#L785
https://github.com/mysql/mysql-server/blob/mysql-5.7.23/sql/rpl_gtid_set.cc#L598

Almost all other GTID initialization only happens when binary logging is enabled (opt_log_bin is set) in mysqld_main(). However, only read_gtid_executed_from_table() always happens regardless of gtid_mode and binary logging configuration:

https://github.com/mysql/mysql-server/blob/mysql-8.0.19/sql/mysqld.cc#L6607
====================
  if (!opt_initialize && !opt_initialize_insecure) {
    // Initialize executed_gtids from mysql.gtid_executed table.
    if (gtid_state->read_gtid_executed_from_table() == -1) unireg_abort(1);
  }

  if (opt_bin_log) {
    /*
      Initialize GLOBAL.GTID_EXECUTED and GLOBAL.GTID_PURGED from
      gtid_executed table and binlog files during server startup.
    */
    Gtid_set *executed_gtids =
        const_cast<Gtid_set *>(gtid_state->get_executed_gtids());
...
====================

In other words, there's no way to opt out of read_gtid_executed_from_table() execution, and also there's no automatic recovery on the table.

This behavior is reproduced in the latest 5.7/8.0 release (MySQL 5.7.29 and 8.0.19).

How to repeat:
This steps are just to emulate the issue. As described earlier, I found this problem on a natural anomaly on mysql.gtid_executed table.

1) Insert an invalid entry to mysql.gtid_executed to emulate the problem (instead of waiting for natural occurrence)

mysql> INSERT INTO mysql.gtid_executed VALUES ("invalid", 0, 0);

2) Restart the process

3) mysqld won't restart with the following error

mysqld: Malformed GTID set specification 'invalid:0-0'.
2020-02-25T06:33:51.651531Z 0 [ERROR] [MY-010119] [Server] Aborting

Suggested fix:
There could be several options to address this issue.

1) Implemeting a way to opt out of read_gtid_executed_from_table() execution while starting up. For example, moving the execution into the if(opt_log_bin) block, wrapping the execution with if block switching by gtid_mode/log_bin, or implementing a command-line option to skip GTID initialization/recovery.

2) Relaxing GTID initialization/recovery to not abort startup sequence even though initialization is failed, especially in the case of gtid_mode = OFF/OFF_PERMISSIVE.

3) Automatic table recovery when any anomaly found on mysql.gtid_executed table during GTID initialization/recovery.
[26 Feb 2020 6:45] MySQL Verification Team
Hello Yoshihiko-San,

Thank you for the report and feedback.
I was able to reproduce using your steps on 5.7.29 and 8.0.19(5.6 not applicable) builds. Also, if you see *INSERT* was warning "Please do not modify the gtid_executed table. This is a mysql internal system table to store GTIDs for committed transactions. Modifying it can lead to an inconsistent GTID state.". 

Development would still require normal way to reproduce this though. If you are able to provide a reliable test case then please let us know. Thank you!

regards,
Umesh
[26 Feb 2020 6:51] MySQL Verification Team
- 5.7.29

rm -rf 98743/
bin/mysqld --initialize-insecure --basedir=$PWD --datadir=$PWD/98743 --log-error-verbosity=3
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --log-error-verbosity=3 --secure-file-priv="" --skip-name-resolve --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1 &

 bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.29-log MySQL Community Server (GPL)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select * from mysql.gtid_executed;
Empty set (0.00 sec)

mysql>  INSERT INTO mysql.gtid_executed VALUES ("invalid", 0, 0);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                                             |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 3129 | Please do not modify the gtid_executed table. This is a mysql internal system table to store GTIDs for committed transactions. Modifying it can lead to an inconsistent GTID state. |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)

mysql> \q
Bye
[1]+  Done                    bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --log-error-verbosity=3 --secure-file-priv="" --skip-name-resolve --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1

-- restart instance

 bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --log-error-verbosity=3 --secure-file-priv="" --skip-name-resolve --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1 &
[1] 6822
.
.

[1]+  Exit 1                  bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --log-error-verbosity=3 --secure-file-priv="" --skip-name-resolve --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1

- extract from error log

mysqld: Malformed GTID set specification 'invalid:0-0'.
2020-02-26T06:20:50.362878Z 0 [ERROR] Aborting

2020-02-26T06:20:50.362893Z 0 [Note] Binlog end
[26 Feb 2020 6:52] MySQL Verification Team
- 8.0.19

rm -rf 98743/
bin/mysqld --initialize-insecure --basedir=$PWD --datadir=$PWD/98743 --log-error-verbosity=3
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3333 --log-error=$PWD/98743/log.err --mysqlx=0 --log-error-verbosity=3  --secure-file-priv="" --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1 &

bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 8.0.19 MySQL Community Server - GPL

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select * from mysql.gtid_executed;
Empty set (0.01 sec)

mysql> INSERT INTO mysql.gtid_executed VALUES ("invalid", 0, 0);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                                             |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 3129 | Please do not modify the gtid_executed table. This is a mysql internal system table to store GTIDs for committed transactions. Modifying it can lead to an inconsistent GTID state. |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)

mysql> \q
Bye
[1]+  Done                    bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --mysqlx=0 --log-error-verbosity=3 --secure-file-priv="" --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1

-- Try to bring up

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)

mysql> \q
Bye
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --mysqlx=0 --log-error-verbosity=3 --secure-file-priv="" --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1

.

[1]+  Exit 1                  bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/98743 --core-file --socket=/tmp/mysql_ushastry.sock --port=3333 --log-error=$PWD/98743/log.err --mysqlx=0 --log-error-verbosity=3 --secure-file-priv="" --gtid_mode=ON --log-bin --log-slave-updates --enforce-gtid-consistency --server-id=1 2>&1

-
mysqld: Malformed GTID set specification 'invalid:0-0'.
2020-02-26T05:52:15.176997Z 0 [Note] [MY-012922] [InnoDB] Waiting for purge to start
mysqld: Malformed GTID set specification 'invalid:0-0'.
2020-02-26T05:52:15.230311Z 0 [ERROR] [MY-010119] [Server] Aborting