Bug #91477 Slave writes less when connected intermediary master with blackhole engine
Submitted: 28 Jun 16:12 Modified: 29 Jun 10:30
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.7.22 OS:Any
Assigned to: CPU Architecture:Any

[28 Jun 16:12] Sveta Smirnova
Description:
Multi-threaded slave (slave_parallel_workers=8) writes less when connected to intermediary master with blackhole engine. Practically it writes only in 2 threads, compared to situation when intermediary master uses InnoDB storage engine.

How to repeat:
1. Start 3 servers from mysql-test directory: 
./mtr --start --mysqld=--master-info-repository=table --mysqld=--relay-log-info-repository=table --suite=rpl rpl_row_img_eng_full --mysqld=--gtid_mode=ON --mysqld=--log-slave-updates --mysqld=--enforce-gtid-consistency --mysqld=--binlog-format=row &
2. Update slave_parallel_type and slave_parallel_workers on two of them:
mysql> set global slave_parallel_type='logical_clock';
Query OK, 0 rows affected (0.00 sec)

mysql> set global slave_parallel_workers=8;
Query OK, 0 rows affected (0.00 sec)
3. Create replication chain:
blackhole> CHANGE MASTER TO master_host='127.0.0.1', master_port=13000, master_user='root', MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 1 warning (0.47 sec)

blackhole> start slave;
Query OK, 0 rows affected (0.02 sec)

slave> CHANGE MASTER TO master_host='127.0.0.1', master_port=13001, master_user='root', MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 1 warning (0.39 sec)

slave> start slave;
Query OK, 0 rows affected (0.02 sec)
4. Create database sbtest and loaded 1 table into it:
master> create database sbtest;
Query OK, 1 row affected (0.04 sec)

$ ~/build/sysbench/bin/sysbench --threads=16 --events=0 --time=0 --mysql-host=127.0.0.1 --mysql-port=13000 --mysql-user=root ~/build/sysbench/share/sysbench/oltp_read_write.lua --tables=1 --table_size=10000 prepareWARNING: Both event and time limits are disabled, running an endless test
sysbench 1.1.0-2e6b7d5 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 10000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
5. Change engine to Blackhole on intermediary master:
blackhole> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

blackhole> alter table sbtest1 engine=blackhole;
Query OK, 10000 rows affected (0.40 sec)
Records: 10000  Duplicates: 0  Warnings: 0

blackhole> set sql_log_bin=1;
Query OK, 0 rows affected (0.00 sec)
6. Run load test:
$ ~/build/sysbench/bin/sysbench --threads=16 --events=0 --time=0 --mysql-host=127.0.0.1 --mysql-port=13000 --mysql-user=root ~/build/sysbench/share/sysbench/oltp_read_write.lua --tables=1 --table_size=10000 run
WARNING: Both event and time limits are disabled, running an endless test
sysbench 1.1.0-2e6b7d5 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 16
Initializing random number generator from current time

Initializing worker threads...

Threads started!
7. After few minutes check progress on slave:
slave> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
               ...
              Master_Log_File: mysqld-bin.000001
          Read_Master_Log_Pos: 10457166
               Relay_Log_File: Thinkie-relay-bin.000002
                Relay_Log_Pos: 5735135
        Relay_Master_Log_File: mysqld-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...            
          Exec_Master_Log_Pos: 5734920
              Relay_Log_Space: 10457590
...              
        Seconds_Behind_Master: 185
...        
           Retrieved_Gtid_Set: 7ebc5de8-793b-11e8-912f-30b5c2208a0f:1-5128
            Executed_Gtid_Set: 7ebc5de8-793b-11e8-912f-30b5c2208a0f:1-2309
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

slave> select * from replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+-------------------------------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION                     | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+-------------------------------------------+-------------------+--------------------+----------------------+
|              |         1 |        28 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:2331 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         2 |        29 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:2270 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         3 |        30 | ON            |                                           |                 0 |                    | 0000-00-00 00:00:00  |
|              |         4 |        31 | ON            |                                           |                 0 |                    | 0000-00-00 00:00:00  |
|              |         5 |        32 | ON            |                                           |                 0 |                    | 0000-00-00 00:00:00  |
|              |         6 |        33 | ON            |                                           |                 0 |                    | 0000-00-00 00:00:00  |
|              |         7 |        34 | ON            |                                           |                 0 |                    | 0000-00-00 00:00:00  |
|              |         8 |        35 | ON            |                                           |                 0 |                    | 0000-00-00 00:00:00  |
+--------------+-----------+-----------+---------------+-------------------------------------------+-------------------+--------------------+----------------------+
8 rows in set (0.01 sec)
I found only two threads are running and one of them is very behind of another one.

8. Now test with InnoDB. Stop load, wait until slave catches up, drop table sbtest1 and re-create it. Now intermediary master also uses InnoDB storage engine:
master> drop table sbtest1;
Query OK, 0 rows affected (0.16 sec)

$ ~/build/sysbench/bin/sysbench --threads=16 --events=0 --time=0 --mysql-host=127.0.0.1 --mysql-port=13000 --mysql-user=root ~/build/sysbench/share/sysbench/oltp_read_write.lua --tables=1 --table_size=10000 prepare
WARNING: Both event and time limits are disabled, running an endless test
sysbench 1.1.0-2e6b7d5 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 10000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

blackhole> show create table sbtest1;
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                        |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest1 | CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=10001 DEFAULT CHARSET=latin1 |
+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
9. Restart load:
$ ~/build/sysbench/bin/sysbench --threads=16 --events=0 --time=0 --mysql-host=127.0.0.1 --mysql-port=13000 --mysql-user=root ~/build/sysbench/share/sysbench/oltp_read_write.lua --tables=1 --table_size=10000 run
WARNING: Both event and time limits are disabled, running an endless test
sysbench 1.1.0-2e6b7d5 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 16
Initializing random number generator from current time

Initializing worker threads...

Threads started!
10. Now slave is perfectly catching up the master, because all 8 parallel workers are busy and do their job:
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
...               
              Master_Log_File: mysqld-bin.000001
          Read_Master_Log_Pos: 12427193
               Relay_Log_File: Thinkie-relay-bin.000002
                Relay_Log_Pos: 12414040
        Relay_Master_Log_File: mysqld-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...            
          Exec_Master_Log_Pos: 12413825
              Relay_Log_Space: 12427617
...              
        Seconds_Behind_Master: 1
...        
           Retrieved_Gtid_Set: 7ebc5de8-793b-11e8-912f-30b5c2208a0f:1-5177
            Executed_Gtid_Set: 7ebc5de8-793b-11e8-912f-30b5c2208a0f:1-5168:5173
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql> select * from replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+-------------------------------------------+-------------------+--------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION                     | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+-------------------------------------------+-------------------+--------------------+----------------------+
|              |         1 |        28 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5242 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         2 |        29 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5239 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         3 |        30 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5240 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         4 |        31 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5241 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         5 |        32 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5245 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         6 |        33 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5244 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         7 |        34 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5246 |                 0 |                    | 0000-00-00 00:00:00  |
|              |         8 |        35 | ON            | 7ebc5de8-793b-11e8-912f-30b5c2208a0f:5243 |                 0 |                    | 0000-00-00 00:00:00  |
+--------------+-----------+-----------+---------------+-------------------------------------------+-------------------+--------------------+----------------------+
8 rows in set (0.00 sec)
[28 Jun 19:53] Sveta Smirnova
This is repeatable with version 8.0.11 too
[29 Jun 10:30] Umesh Shastry
Hello Sveta,

Thank you for the report!
Verified as described. 

Regards,
Umesh
[16 Jul 16:21] Yura Sorokin
MTR test case to reproduce the problem (.test)

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug91477.test (application/octet-stream, text), 4.63 KiB.

[16 Jul 16:21] Yura Sorokin
MTR test case to reproduce the problem (.cnf)

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug91477.cnf (application/octet-stream, text), 493 bytes.

[16 Jul 16:22] Yura Sorokin
Put the attached files into the 'rpl' MTR test suite
('mysql-test/suite/rpl/t')