Bug #70426 mismatch between slave_worker_info and replication_execute_status_by_worker
Submitted: 26 Sep 2013 1:10 Modified: 24 Oct 2013 13:27
Reporter: Giuseppe Maxia Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.7.2 OS:Linux
Assigned to: CPU Architecture:Any
Tags: Monitoring, parallel, replication

[26 Sep 2013 1:10] Giuseppe Maxia
Description:
According to the documentation (http://dev.mysql.com/doc/refman/5.7/en/replication-execute-status-by-worker-table.html)

"The replication_execute_status_by_worker table has these columns:

WORKER_ID

The worker identifier (same value as the Worker_id column in the mysql.slave_worker_info table).
"

However, there are two problems:
* the table mysql.slave_worker_info has no "Worker_id" field;
*  Assuming that the column in mysql.slave_worker_info is "Id", which exists, the values do not match:

select id, relay_log_name,relay_log_pos, master_log_name,master_log_pos from mysql.slave_worker_info;select worker_id,service_state,last_seen_transaction from performance_schema . replication_execute_status_by_worker;
+----+-----------------------+---------------+------------------+----------------+
| id | relay_log_name        | relay_log_pos | master_log_name  | master_log_pos |
+----+-----------------------+---------------+------------------+----------------+
|  1 | ./qa-relay-bin.000006 |     147165231 | mysql-bin.000002 |      147165021 |
|  2 | ./qa-relay-bin.000006 |     148397777 | mysql-bin.000002 |      148397567 |
|  3 | ./qa-relay-bin.000006 |     148640659 | mysql-bin.000002 |      148640449 |
|  4 | ./qa-relay-bin.000006 |     149067681 | mysql-bin.000002 |      149067471 |
|  5 | ./qa-relay-bin.000006 |     149265614 | mysql-bin.000002 |      149265404 |
+----+-----------------------+---------------+------------------+----------------+
5 rows in set (0.00 sec)

+-----------+---------------+---------------------------------------------+
| worker_id | service_state | last_seen_transaction                       |
+-----------+---------------+---------------------------------------------+
|         0 | ON            | 71f8cf76-2644-11e3-bc64-00265531a0cc:411949 |
|         1 | ON            | 71f8cf76-2644-11e3-bc64-00265531a0cc:414634 |
|         2 | ON            | 71f8cf76-2644-11e3-bc64-00265531a0cc:415320 |
|         3 | ON            | 71f8cf76-2644-11e3-bc64-00265531a0cc:416480 |
|         4 | ON            | 71f8cf76-2644-11e3-bc64-00265531a0cc:417030 |
+-----------+---------------+---------------------------------------------+
5 rows in set (0.00 sec)

How to repeat:
Start parallel replication with N channels. 

Run 
select id, relay_log_name,relay_log_pos, master_log_name,master_log_pos from mysql.slave_worker_info;
select worker_id,service_state,last_seen_transaction from performance_schema . replication_execute_status_by_worker;
[26 Sep 2013 10:04] MySQL Verification Team
Hello Giuseppe,

Thank you for the bug report.
Verified as described.

Thanks,
Umesh
[26 Sep 2013 10:10] MySQL Verification Team
Agree with the reporter:

* the table mysql.slave_worker_info has no "Worker_id" field;

mysql> select version();
+----------------------------------------------+
| version()                                    |
+----------------------------------------------+
| 5.7.2-m12-enterprise-commercial-advanced-log |
+----------------------------------------------+
1 row in set (0.00 sec)

mysql> desc mysql.slave_worker_info;
+----------------------------+---------------------+------+-----+---------+-------+
| Field                      | Type                | Null | Key | Default | Extra |
+----------------------------+---------------------+------+-----+---------+-------+
| Id                         | int(10) unsigned    | NO   | PRI | NULL    |       |
| Relay_log_name             | text                | NO   |     | NULL    |       |
| Relay_log_pos              | bigint(20) unsigned | NO   |     | NULL    |       |
| Master_log_name            | text                | NO   |     | NULL    |       |
| Master_log_pos             | bigint(20) unsigned | NO   |     | NULL    |       |
| Checkpoint_relay_log_name  | text                | NO   |     | NULL    |       |
| Checkpoint_relay_log_pos   | bigint(20) unsigned | NO   |     | NULL    |       |
| Checkpoint_master_log_name | text                | NO   |     | NULL    |       |
| Checkpoint_master_log_pos  | bigint(20) unsigned | NO   |     | NULL    |       |
| Checkpoint_seqno           | int(10) unsigned    | NO   |     | NULL    |       |
| Checkpoint_group_size      | int(10) unsigned    | NO   |     | NULL    |       |
| Checkpoint_group_bitmap    | blob                | NO   |     | NULL    |       |
+----------------------------+---------------------+------+-----+---------+-------+
12 rows in set (0.00 sec)

http://dev.mysql.com/doc/refman/5.7/en/replication-execute-status-by-worker-table.html

Above table definition contradicts with the manual statement "same value as the Worker_id column in the mysql.slave_worker_info table".. if it meant for "id" then it is a doc bug too.. 

*  Assuming that the column in mysql.slave_worker_info is "Id", which exists, the values do not match:

mysql> select id, relay_log_name,relay_log_pos, master_log_name,master_log_pos from mysql.slave_worker_info;
+----+---------------------------------+---------------+-------------------------+----------------+
| id | relay_log_name                  | relay_log_pos | master_log_name         | master_log_pos |
+----+---------------------------------+---------------+-------------------------+----------------+
|  1 |                                 |             0 |                         |              0 |
|  2 |                                 |             0 |                         |              0 |
|  3 |                                 |             0 |                         |              0 |
|  4 |                                 |             0 |                         |              0 |
|  5 |                                 |             0 |                         |              0 |
|  6 |                                 |             0 |                         |              0 |
|  7 |                                 |             0 |                         |              0 |
|  8 |                                 |             0 |                         |              0 |
|  9 |                                 |             0 |                         |              0 |
| 10 | ./cluster-repo-relay-bin.000004 |          9522 | cluster-repo-bin.000001 |          12909 |
+----+---------------------------------+---------------+-------------------------+----------------+
10 rows in set (0.00 sec)

mysql> select worker_id,service_state,last_seen_transaction from performance_schema . replication_execute_status_by_worker;
+-----------+---------------+-----------------------------------------+
| worker_id | service_state | last_seen_transaction                   |
+-----------+---------------+-----------------------------------------+
|         0 | ON            |                                         |
|         1 | ON            |                                         |
|         2 | ON            |                                         |
|         3 | ON            |                                         |
|         4 | ON            |                                         |
|         5 | ON            |                                         |
|         6 | ON            |                                         |
|         7 | ON            |                                         |
|         8 | ON            |                                         |
|         9 | ON            | a666e4ac-2744-11e3-82ec-5628c563a202:45 |
+-----------+---------------+-----------------------------------------+
10 rows in set (0.00 sec)

^^
[24 Oct 2013 13:27] Jon Stephens
Documented fix as follows in the 5.7.3 changelog:

    The WORKER_ID column of the replication_execute_status_by_worker
    table did not use the internal id column values from the
    mysql.slave_worker_info table as expected.
      
Closed.