Description:
I noticed something after recently changing servers to use the new lockless output with performance_schema_show_processlist = NO.
This relates specifically to how the SQL and I/O replication threads are shown.
I'm not sure if it should be a bug or an important difference as code that uses processlist output may need to be aware of these differences to recognise these threads and currently it's not 100% clear how to distinguish these threads from normal user threads.
How to repeat:
Seen on 8.0.34 but I believe this is most likely to be a general problem on later innovation releases including 8.3.0 too.
I can the following:
root@somehost [(none)]> show processlist;
+--------+-------------------+----------------------------------------+--------------------+---------+-------+----------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------------+----------------------------------------+--------------------+---------+-------+----------------------------------------------------------+------------------+
| 105966 | orchestrator | orchestratorapp-AAAA.example.com:54851 | NULL | Sleep | 3 | | NULL |
| 105885 | orchestrator | orchestratorapp-AAAA.example.com:55881 | NULL | Sleep | 5 | | NULL |
| 105886 | orchestrator | orchestratorapp-AAAA.example.com:55889 | NULL | Sleep | 5 | | NULL |
| 41342 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for source to send event | NULL |
| 41343 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Query | 0 | Replica has read all relay log; waiting for more updates | NULL |
| 41344 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Query | 0 | Waiting for an event from Coordinator | NULL |
| 41345 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Query | 11962 | Waiting for an event from Coordinator | NULL |
| 41346 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41347 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41348 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41349 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41350 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41351 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41352 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41353 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41354 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41355 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41356 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41357 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41358 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41359 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41360 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41361 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41362 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41363 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41364 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41365 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41366 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41367 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41368 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41369 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41370 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41371 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41372 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41373 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41374 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 41375 | orchestrator | orchestratorapp-BBBB.example.com | NULL | Connect | 39837 | Waiting for an event from Coordinator | NULL |
| 106049 | orchestrator | orchestratorapp-BBBB.example.com:60939 | NULL | Sleep | 4 | | NULL |
| 106050 | orchestrator | orchestratorapp-BBBB.example.com:60941 | NULL | Sleep | 4 | | NULL |
| 106051 | orchestrator | orchestratorapp-BBBB.example.com:60951 | NULL | Sleep | 4 | | NULL |
| 105908 | orchestrator | orchestratorapp-AAAA.example.com:34401 | NULL | Sleep | 3 | | NULL |
| 105909 | orchestrator | orchestratorapp-AAAA.example.com:34411 | NULL | Sleep | 3 | | NULL |
| 105911 | orchestrator | orchestratorapp-AAAA.example.com:57201 | NULL | Sleep | 5 | | NULL |
| 99952 | root | localhost | NULL | Query | 0 | executing | show processlist |
+--------+-------------------+----------------------------------------+--------------------+---------+-------+----------------------------------------------------------+------------------+
59 rows in set (0.00 sec)
root@somehost [(none)]> show global variables like '%processlist%';
+-------------------------------------+-------+
| Variable_name | Value |
+-------------------------------------+-------+
| performance_schema_show_processlist | ON |
+-------------------------------------+-------+
1 row in set (0.01 sec)
root@somehost [(none)]> set global performance_schema_show_processlist = OFF;
Query OK, 0 rows affected (0.00 sec)
root@somehost [(none)]> show processlist;
+--------+-------------------+----------------------------------------+-----------+---------+-------+----------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------------+----------------------------------------+-----------+---------+-------+----------------------------------------------------------+------------------+
| 41342 | system user | connecting host | NULL | Connect | 40500 | Waiting for source to send event | NULL |
| 41343 | system user | | NULL | Query | 0 | Replica has read all relay log; waiting for more updates | NULL |
| 41344 | system user | | NULL | Query | 0 | Waiting for an event from Coordinator | NULL |
| 41345 | system user | | NULL | Query | 608 | Waiting for an event from Coordinator | NULL |
| 41346 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41347 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41348 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41349 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41350 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41351 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41352 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41353 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41354 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41355 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41356 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41357 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41358 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41359 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41360 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41361 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41362 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41363 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41364 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41365 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41366 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41367 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41368 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41369 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41370 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41371 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41372 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41373 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41374 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 41375 | system user | | NULL | Connect | 40500 | Waiting for an event from Coordinator | NULL |
| 99952 | root | localhost | NULL | Query | 0 | init | show processlist |
| 107060 | orchestrator | orchestratorapp-AAAA.example.com:40507 | NULL | Sleep | 3 | | NULL |
| 107061 | orchestrator | orchestratorapp-AAAA.example.com:40513 | NULL | Sleep | 3 | | NULL |
| 107083 | orchestrator | orchestratorapp-AAAA.example.com:59721 | NULL | Sleep | 3 | | NULL |
| 107086 | orchestrator | orchestratorapp-AAAA.example.com:43257 | NULL | Sleep | 5 | | NULL |
| 107087 | orchestrator | orchestratorapp-AAAA.example.com:43269 | NULL | Sleep | 5 | | NULL |
| 107151 | orchestrator | orchestratorapp-AAAA.example.com:38947 | NULL | Sleep | 5 | | NULL |
| 107239 | orchestrator | orchestratorapp-BBBB.example.com:34417 | NULL | Sleep | 4 | | NULL |
| 107240 | orchestrator | orchestratorapp-BBBB.example.com:34437 | NULL | Sleep | 4 | | NULL |
| 107241 | orchestrator | orchestratorapp-BBBB.example.com:34427 | NULL | Sleep | 4 | | NULL |
+--------+-------------------+----------------------------------------+-----------+---------+-------+----------------------------------------------------------+------------------+
58 rows in set (0.00 sec)
The output has been somewhat manipulated to remove real data but you get the idea.
1. With performance_schema_show_processlist = OFF you see:
User = "system user", Host = ""
2. With performance_schema_show_processlist = ON you see:
User/Host = username/hostname of the connection that initiated the replication thread.
That user/host may no longer be present so it's misleading to show this information, as the the SQL / I/O threads created are created on the system so where they were connected from is largely I believe irrelevant. I think the output is incorrect and misleading.
I personally think this is a bug. I guess if the behaviour of performance_schema.processlist (implicit from the setting used) may have been like this forever but I just missed it.
Basically I was looking for a safe/good way to find, recognise and manipulate these threads for certain workflows. The command "Connect" may be one way to do the filtering but to cleanly distinguish I/O from SQL threads requires being aware of all States that both thread types can have. When catching certain error conditions this may be tricky. Also for GR there may be other similar complications.
Why do I show this with orchestrator/orchestratorapp. In the case of this server replication was reconfigured remotely by orchestrator which uses a different username and comes from a remote connection from a different host and so the logging of the replication threads in show processlist is somewhat confusing.
Suggested fix:
I consider this a bug and think the values shown when performance_schema_show_processlist = OFF should be shown even when performance_schema_show_processlist = ON, and the performance_schema.processlist output should match in a similar manner.
If the change is deliberate and explicit it should be mentioned. https://dev.mysql.com/doc/refman/8.0/en/performance-schema-processlist-table.html seems very vague and refers to https://dev.mysql.com/doc/refman/8.0/en/performance-schema-system-variables.html#sysvar_pe... but the specifics of how replication threads are handled and displayed is completely missing and the difference shown above is missing entirely.
So thoughts on this would be welcome. Fixing the bug would be good if you agree the behaviour and output might be confusing.