Bug #82505 discarding tablespace head to inconsistence cluster and terminating nodes
Submitted: 9 Aug 2016 11:36 Modified: 26 Nov 2018 14:42
Reporter: Shahriyar Rzayev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S1 (Critical)
Version:5.7.14 OS:Ubuntu (16.04)
Assigned to: CPU Architecture:Any

[9 Aug 2016 11:36] Shahriyar Rzayev
Description:
Hi dear all,
With group replication on node3 run following:

CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) /*!50100 TABLESPACE `innodb_file_per_table` */ ENGINE=InnoDB AUTO_INCREMENT=70193 DEFAULT CHARSET=latin1 MAX_ROWS=1000000 COMPRESSION='lz4' ENCRYPTION='N'

With node3:
sysbench --db-driver=mysql --mysql-table-engine=InnoDB --mysql-db=dbtest --mysql-table=sbtest1 --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox14116.sock --num-threads=100 --test=/home/sh/REPOS/sysbench/sysbench/tests/db/oltp.lua --max-requests=0 run

Again with node3:

node3 [localhost] {msandbox} (dbtest) > alter table sbtest1 discard tablespace;
Query OK, 0 rows affected (0.06 sec)

node3 [localhost] {msandbox} (dbtest) > drop table sbtest1;
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

From node3 error log:

2016-08-09T11:16:07.058128Z 20 [ERROR] Slave SQL for channel 'group_replication_recovery': Could not execute Update_rows event on table dbtest.sbtest1; InnoDB: Tablespace has been
 discarded for table 'sbtest1', Error_code: 1814; Tablespace has been discarded for table 'sbtest1', Error_code: 1814; Table 'dbtest.sbtest1' doesn't exist, Error_code: 1146; hand
ler error HA_ERR_NO_SUCH_TABLE; the event's master log mysql-bin.000004, end_log_pos 2622069, Error_code: 1814
2016-08-09T11:16:07.058142Z 20 [Warning] Slave: InnoDB: Tablespace has been discarded for table 'sbtest1' Error_code: 1814
2016-08-09T11:16:07.058147Z 20 [Warning] Slave: Tablespace has been discarded for table 'sbtest1' Error_code: 1814
2016-08-09T11:16:07.058150Z 20 [Warning] Slave: Table 'dbtest.sbtest1' doesn't exist Error_code: 1146
2016-08-09T11:16:07.058154Z 20 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysq
l-bin.000004' position 2621778

Interesting thing "Table 'dbtest.sbtest1' doesn't exist" and "Tablespace has been discarded for table 'sbtest1'" errors appears, but while running drop statement there was an error due to --super-read-only.

As a result I lost whole cluster.

How to repeat:
See description.
[10 Aug 2016 10:57] MySQL Verification Team
Hi Shahriyar,

Thank you for the report.
I tried with a 3 node cluster, with the provided steps but still not seeing the issue which you observed. Could you please provide exact conf files used in your environment, and repeatable steps? Also, did you dynamically enable super_read_only anywhere?

--
[umshastr@hod03]~/bugs/sysbench:  sysbench/sysbench --test=sysbench/tests/db/oltp.lua --oltp-table-size=1000000  --mysql-table-engine=innodb --mysql-db=db1 --mysql-user=root --mysql-socket=/tmp/mysql_node3.sock  prepare
sysbench 0.5:  multi-threaded system evaluation benchmark

Creating table 'sbtest1'...
Inserting 1000000 records into 'sbtest1'

-- node3

[umshastr@hod03]/export/umesh/server/source/bugs/mysql-5.7.14-labs: bin/mysql -uroot -S /tmp/mysql_node3.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 20
Server version: 5.7.14-labs-gr080-log Source distribution

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 3c418a33-5ed8-11e6-b847-0010e05f3e06 | hod03       |        3301 | ONLINE       |
| group_replication_applier | 4237a281-5ed8-11e6-ba10-0010e05f3e06 | hod03       |        3302 | ONLINE       |
| group_replication_applier | 460c6d1e-5ed8-11e6-bb6f-0010e05f3e06 | hod03       |        3303 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> alter table sbtest1 tablespace=innodb_file_per_table;
Query OK, 0 rows affected (3.53 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>  show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) /*!50100 TABLESPACE `innodb_file_per_table` */ ENGINE=InnoDB AUTO_INCREMENT=7000003 DEFAULT CHARSET=latin1 MAX_ROWS=1000000
1 row in set (0.00 sec)

mysql> alter table sbtest1 COMPRESSION='lz4' ENCRYPTION='Y';
Query OK, 1000000 rows affected (16.62 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest1 COMPRESSION='lz4' ENCRYPTION='N';
Query OK, 1000000 rows affected (20.06 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) /*!50100 TABLESPACE `innodb_file_per_table` */ ENGINE=InnoDB AUTO_INCREMENT=7000003 DEFAULT CHARSET=latin1 MAX_ROWS=1000000 COMPRESSION='lz4' ENCRYPTION='N'
1 row in set (0.00 sec)

--

[umshastr@hod03]~/bugs/sysbench:  sysbench/sysbench --test=sysbench/tests/db/oltp.lua --mysql-table-engine=innodb --mysql-db=db1 --mysql-user=root --mysql-socket=/tmp/mysql_node3.sock  --max-requests=0 --num-threads=100 run
WARNING: Both max-requests and max-time are 0, running endless test
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 100
Random number generator seed is 0 and will be ignored

Threads started!

-- After, 15-20 minutes while sysbench is still running

mysql> alter table sbtest1 discard tablespace;
Query OK, 0 rows affected (0.08 sec)

mysql> drop table sbtest1;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 3c418a33-5ed8-11e6-b847-0010e05f3e06 | hod03       |        3301 | ONLINE       |
| group_replication_applier | 4237a281-5ed8-11e6-ba10-0010e05f3e06 | hod03       |        3302 | ONLINE       |
| group_replication_applier | 460c6d1e-5ed8-11e6-bb6f-0010e05f3e06 | hod03       |        3303 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.replication_connection_status\G
*************************** 1. row ***************************
             CHANNEL_NAME: group_replication_applier
               GROUP_NAME: 8a94f357-aab4-11df-86ab-c80aa9429562
              SOURCE_UUID: 8a94f357-aab4-11df-86ab-c80aa9429562
                THREAD_ID: NULL
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET: 3c418a33-5ed8-11e6-b847-0010e05f3e06:1-2,
8a94f357-aab4-11df-86ab-c80aa9429562:1-7200691
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: group_replication_recovery
               GROUP_NAME:
              SOURCE_UUID:
                THREAD_ID: NULL
            SERVICE_STATE: OFF
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
2 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.replication_group_member_stats\G
*************************** 1. row ***************************
                      CHANNEL_NAME: group_replication_applier
                           VIEW_ID: 14708203987832098:3
                         MEMBER_ID: 460c6d1e-5ed8-11e6-bb6f-0010e05f3e06
       COUNT_TRANSACTIONS_IN_QUEUE: 0
        COUNT_TRANSACTIONS_CHECKED: 7200688
          COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 5791
TRANSACTIONS_COMMITTED_ALL_MEMBERS: 3c418a33-5ed8-11e6-b847-0010e05f3e06:1-2,
8a94f357-aab4-11df-86ab-c80aa9429562:1-5925318
    LAST_CONFLICT_FREE_TRANSACTION: 8a94f357-aab4-11df-86ab-c80aa9429562:7200691
1 row in set (0.00 sec)

Thanks,
Umesh
[10 Aug 2016 10:59] Shahriyar Rzayev
node1 config

Attachment: my.sandbox.cnf (application/octet-stream, text), 3.04 KiB.

[10 Aug 2016 11:00] Shahriyar Rzayev
node2 config

Attachment: my.sandbox.cnf (application/octet-stream, text), 2.97 KiB.

[10 Aug 2016 11:00] Shahriyar Rzayev
node3 config

Attachment: my.sandbox.cnf (application/octet-stream, text), 3.04 KiB.

[10 Aug 2016 11:01] Shahriyar Rzayev
I have followed this guide -> http://mysqlhighavailability.com/mysql-group-replication-a-quick-start-guide/
[10 Aug 2016 11:43] MySQL Verification Team
Thank you for the conf files, did you alter table and enabled/disabled encryption? 

Thanks,
Umesh
[10 Aug 2016 12:18] Shahriyar Rzayev
I have enabled encryption, then alter table.
Could you please check it with encrypted table? I just wonder if I have paste wrong create statement here.
[10 Aug 2016 15:47] MySQL Verification Team
Finally, observed this issue on my test box.
I'm not sure whether it is intended behavior but observed with below sequence of steps:

-- node3 

node3>STOP GROUP_REPLICATION;
Query OK, 0 rows affected (8.13 sec)

node3>alter table sbtest1 discard tablespace;
Query OK, 0 rows affected (0.04 sec)

node3>START GROUP_REPLICATION;
Query OK, 0 rows affected (2.89 sec)

node3>drop table sbtest1;
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
node3>
node3>drop table sbtest1;
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
node3>

^^ Above error observed only when node3 is in "recovering", if it is in "ONLINE" mode then ALTER+DROP succeeds without any issue

node1>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | de2a9dea-5f0c-11e6-80b1-0010e05f3e06 | hod03       |       14115 | ONLINE       |
| group_replication_applier | df873fd3-5f0c-11e6-8223-0010e05f3e06 | hod03       |       14116 | ONLINE       |
| group_replication_applier | e0d8d72b-5f0c-11e6-835d-0010e05f3e06 | hod03       |       14117 | RECOVERING   |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

-- extract from node3

2016-08-10T15:34:35.350780Z 0 [Note] Plugin group_replication reported: 'connected to x.x.x.x 10303'
2016-08-10T15:34:35.351111Z 37 [ERROR] Slave SQL for channel 'group_replication_applier': Could not execute Update_rows event on table db1.sbtest1; InnoDB: Tablespace has been discarded for table 'sbtest1', Error_code: 1814; Tablespace has been discarded for table 'sbtest1', Error_code: 1814; Table 'db1.sbtest1' doesn't exist, Error_code: 1146; handler error HA_ERR_NO_SUCH_TABLE; the event's master log FIRST, end_log_pos 224, Error_code: 1814
2016-08-10T15:34:35.351126Z 37 [Warning] Slave: InnoDB: Tablespace has been discarded for table 'sbtest1' Error_code: 1814
2016-08-10T15:34:35.351132Z 37 [Warning] Slave: Tablespace has been discarded for table 'sbtest1' Error_code: 1814
2016-08-10T15:34:35.351137Z 37 [Warning] Slave: Table 'db1.sbtest1' doesn't exist Error_code: 1146
2016-08-10T15:34:35.351141Z 37 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'FIRST' position 0
2016-08-10T15:34:35.351152Z 37 [ERROR] Plugin group_replication reported: 'The applier thread execution was aborted. Unable to process more transactions, this member will now leave the group.'
2016-08-10T15:34:35.351170Z 34 [ERROR] Plugin group_replication reported: 'Fatal error during execution on the Applier process of Group Replication. The server will now leave the group.'
2016-08-10T15:34:35.351188Z 34 [Warning] Plugin group_replication reported: 'Skipping leave operation: member already left the group.'
2016-08-10T15:34:35.351430Z 34 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
[11 Aug 2016 6:33] Shahriyar Rzayev
Did you lose all nodes?
For me it was impossible to start others as well after this error.
[6 Sep 2016 4:29] MySQL Verification Team
Hi Shahriyar,

I have discussed internally with dev's on this and concluded that this is expected/intended behavior. I'm converting this issue to doc request so that such cases are explained and documented properly in our manual.

Thanks,
Umesh
[26 Nov 2018 14:42] David Moss
Posted by developer:
 
Thanks for your feedback. In the meantime full documentation has been added:
https://dev.mysql.com/doc/refman/8.0/en/group-replication.html

and it covers the behaviors you observed. Closing.
[5 Mar 2019 13:13] WANG GUANGYOU
node3>STOP GROUP_REPLICATION;
Query OK, 0 rows affected (8.13 sec)

node3>alter table sbtest1 discard tablespace;
Query OK, 0 rows affected (0.04 sec)

I suspect that alter table sbtest1 discard tablespace; can not be executed. For  after run STOP GROUP_REPLICATION, the node become read only