Bug #110615 Some MTR tests in group_replication suite fail because of timeouts
Submitted: 5 Apr 2023 10:42 Modified: 6 Apr 2023 13:04
Reporter: Varun Nagaraju (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:Tests: Group Replication Severity:S7 (Test Cases)
Version:8.0.32 OS:Linux
Assigned to: CPU Architecture:Any
Tags: Contribution

[5 Apr 2023 10:42] Varun Nagaraju
Description:
Some group_replication tests fail on high load with either a timeout or a result content mismatch with additional warning message regarding debug sync point.

./mtr group_replication.gr_json_array_unique_key_conflict{,,,,,,,,,,,,,,,,,,,,,,} --force --max-test-fail=0 --retry=0 --retry-failure=0 --parallel=20 --repeat=10
Logging: /home/varun/percona-server-4165/repo/mysql-test/mysql-test-run.pl  .......... group_replication.gr_json_array_unique_key_conflict group_replication.gr_json_array_unique_key_conflict group_replication.gr_json_array_unique_key_conflict --force --max-test-fail=0 --retry=0 --retry-failure=0 --parallel=20 --repeat=10
MySQL Version 8.0.32
Checking supported features
 - Binaries are debug compiled
Using 'all' suites
Collecting tests
Checking leftover processes
Removing old var directory
Creating var directory '/home/varun/percona-server-4165/bld/mysql-test/var'
Installing system database
Using parallel: 20

==============================================================================
                  TEST NAME                   WORKER RESULT  TIME (ms) COMMENT
------------------------------------------------------------------------------
[  0%] group_replication.gr_json_array_unique_key_conflict w13 [ pass ]  15072
[  0%] group_replication.gr_json_array_unique_key_conflict w5  [ pass ]  15149
[  1%] group_replication.gr_json_array_unique_key_conflict w18 [ pass ]  15031
[  1%] group_replication.gr_json_array_unique_key_conflict w4  [ pass ]  15249
.
.
.
.
.
[ 30%] group_replication.gr_json_array_unique_key_conflict w1  [ pass ]  12504
[ 30%] group_replication.gr_json_array_unique_key_conflict w4  [ pass ]  12480
[ 31%] group_replication.gr_json_array_unique_key_conflict w12 [ fail ]
        Test ended at 2023-04-05 12:58:55

CURRENT_TEST: group_replication.gr_json_array_unique_key_conflict
mysqltest: At line 158: Error in sync_with_master.inc
In included file ./include/sync_slave_sql.inc: 159
included from ./include/sync_slave_sql_with_master.inc: 79
included from ./include/gr_parallel_local_and_remote_transactions.inc: 172
included from /home/varun/percona-server-4165/repo/mysql-test/suite/group_replication/t/gr_json_array_unique_key_conflict.test: 69
.
.
.
.
connection server_1;
ERROR: sync_slave_sql.inc failed on connection 'server_1'
ERROR: use_gtids='1'
ERROR: _saved_gtids='3a81b776-d383-11ed-bf65-2c33587b06a6:1-5:1000002-1000003'
ERROR: _saved_file='server-binary-log.000001'
ERROR: _saved_pos='2386'
ERROR: _saved_channel_name=
ERROR: timeout='60'
ERROR: result='-1'
ERROR: error type: Timeout after 60 seconds.
safe_process[165943]: Child process: 165944, exit: 1

Result content mismatch case:

CURRENT_TEST: group_replication.gr_primary_key_on_invisible_column_conflict 
--- /tmp/results/mysql-test/suite/group_replication/r/gr_primary_key_on_invisible_column_conflict.result2023-02-11 03:09:56.000000000 +0300 
+++ /tmp/results/mysql-test/var/2/log/gr_primary_key_on_invisible_column_conflict.reject2023-02-11 04:30:09.740995165 +0300 
@@ -152,6 +152,8 @@ 
# remote server was ordered first. If the test case is 
# is positive scenario, no error will be seen here. 
[connection server1] 
+Warnings: 
+Warning1639debug sync point wait timed out 
[connection server1] 
SET @@GLOBAL.DEBUG=@debug_save; 
############################################################ 
mysqltest: Result content mismatch

How to repeat:
Run the group replication tests which invoke mysql-test/include/gr_parallel_local_and_remote_transactions.inc under high load.

Suggested fix:
Invoking DEBUG_SYNC=RESET immediately after sending the 'waiting' signal might cause the signal to be lost even before it wakes up the thread which is waiting for the signal. So, clearing DEBUG_SYNC should be done as part of the cleanup of the test.
[5 Apr 2023 10:43] Varun Nagaraju
I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Attachment: Bug#110615.patch (text/x-patch), 61.30 KiB.

[6 Apr 2023 13:04] MySQL Verification Team
Hello Varun Nagaraju,

Thank you for the report and contribution.
I'm able to reproduce but only on debug builds.
Please ensure to re-send the patch via "Contribution" tab. Otherwise we would not be able to accept it.

regards,
Umesh
[6 Apr 2023 13:09] Varun Nagaraju
Uploading the patch via contribution

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: Bug#110615.patch (text/x-patch), 61.30 KiB.