Bug #116535 | Plugin instructed the server to rollback the current transaction | ||
---|---|---|---|
Submitted: | 4 Nov 2024 4:56 | Modified: | 29 Dec 2024 12:42 |
Reporter: | Krishnadas K P | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S3 (Non-critical) |
Version: | 8.0.39-0ubuntu0.24.04.2 | OS: | Ubuntu (24.04) |
Assigned to: | MySQL Verification Team | CPU Architecture: | x86 |
Tags: | group replication, rollback transaction |
[4 Nov 2024 4:56]
Krishnadas K P
[8 Nov 2024 15:45]
MySQL Verification Team
Hi,

I did not manage to reproduce this, but even if I had, I am not 100% sure it would be a bug. In a multimaster setup, some transactions running in parallel can conflict, and those are rolled back. MySQL does not know that you are using only one server for writes, so this rule is still enforced in a multimaster setup. I'm double-checking this with the GR team, but in the meantime it would be great if you could help me reproduce it. At the moment, pushing a large number of transactions to a GR in multimaster mode did not reproduce the problem, so it must depend on a specific mixture of transactions.

Thanks
[11 Nov 2024 5:22]
Krishnadas K P
Thank you for looking into this. I understand this is a hard case to reproduce. I can reproduce it consistently with my performance testing script, which tests my app; this MySQL cluster is the app's database component.

As mentioned already, I am not writing to multiple nodes, so rollbacks caused by cross-node conflicts should not happen. I therefore have to assume the conflicts are between queries running on the same node. My concern with this behavior is: if the queries do conflict, why do the rollbacks stop when I call group_replication_switch_to_single_primary_mode() and promptly return when I call group_replication_switch_to_multi_primary_mode()? What checks does single-primary mode lack that multi-primary mode has, that might be giving these queries a pass? Either single-primary mode is allowing queries that should have been rolled back, or multi-primary mode is rolling back queries that should have been allowed.

In addition, I am not able to get diagnostics from the logs on what is causing the conflict/rollback. I have enabled the general log, and all I can see is that some queries are rolling back, with no other information. https://bugs.mysql.com/84730 asked for more observability into such cases but doesn't seem to have been implemented.

Please let me know if you need logs etc. for looking into this. Once again, thank you for looking into this.
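Although the logs carry no detail, Group Replication does expose per-member certification counters in `performance_schema.replication_group_member_stats`. The following is a minimal sketch, assuming a Python DB-API 2.0 cursor connected to a group member, of sampling those counters before and after a test run in each mode. The helper names `read_gr_counters` and `counter_delta` are hypothetical. The counters show how many transactions were checked and rolled back, not which rows conflicted.

```python
# Counter columns of performance_schema.replication_group_member_stats.
# The helper names below are illustrative, not part of any MySQL API.
COUNTERS = (
    "COUNT_TRANSACTIONS_CHECKED",
    "COUNT_CONFLICTS_DETECTED",
    "COUNT_TRANSACTIONS_LOCAL_PROPOSED",
    "COUNT_TRANSACTIONS_LOCAL_ROLLBACK",
)

STATS_QUERY = (
    "SELECT MEMBER_ID, " + ", ".join(COUNTERS)
    + " FROM performance_schema.replication_group_member_stats"
)


def read_gr_counters(cursor):
    """Return {member_id: {counter: value}} using a DB-API 2.0 cursor."""
    cursor.execute(STATS_QUERY)
    return {
        row[0]: dict(zip(COUNTERS, (int(v) for v in row[1:])))
        for row in cursor.fetchall()
    }


def counter_delta(before, after):
    """Per-member difference between two snapshots from read_gr_counters."""
    return {
        member: {name: after[member][name] - before[member][name]
                 for name in COUNTERS}
        for member in after
        if member in before
    }
```

Taking one snapshot before the load test and one after, once in single-primary mode and once in multi-primary mode, would show whether `COUNT_CONFLICTS_DETECTED` and `COUNT_TRANSACTIONS_LOCAL_ROLLBACK` grow only in multi-primary mode.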
[12 Nov 2024 15:32]
MySQL Verification Team
Hi,

Yes, the FR from Bug#84730 would let us easily find what's wrong. If you have an idea of how I can reproduce this more easily, please share :) I assume you cannot share your testing procedure as it is part of your app? I am working with the GR team to find out how to proceed.

Kind regards
[14 Nov 2024 5:17]
Krishnadas K P
Thank you. My tests mostly make API requests to the app rather than run SQL queries directly, so I might have a bit of trouble replicating that exactly. However, I will try to find a way to reproduce this consistently with scripts. Would it be helpful to share logs from single-primary mode (without any rollbacks) and multi-primary mode (with rollbacks)?
[20 Nov 2024 9:00]
Krishnadas K P
I have an update on this. I ran similar load tests against a Galera cluster and did not see the transaction rollback issue. This seems to be isolated to the multi-primary mode Group Replication configuration.
[29 Nov 2024 12:42]
MySQL Verification Team
> Would it be helpful to share logs from single primary mode (without any rollbacks) and multi primary mode (with rollbacks) ?

Logs might help, but the best case would be a script that does something similar to what your app is doing and reproduces the problem, as I'm failing to reproduce this myself.
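One possible shape for such a script, as a sketch only. The workload (concurrent single-row updates over a small set of overlapping rows), the table `t(id, v)`, the `connect` factory, and the function names are all assumptions; the thread does not establish which mixture of transactions triggers the rollback. `connect` is expected to return a DB-API 2.0 connection to the one member that receives writes.

```python
import random
import threading
from collections import Counter

# Error text from the bug title; used to tell GR rollbacks from other errors.
ROLLBACK_TEXT = "Plugin instructed the server to rollback the current transaction"


def worker(connect, n_txn, n_rows, tally, seed):
    """Run n_txn single-row update transactions and tally the outcomes."""
    rng = random.Random(seed)
    conn = connect()  # assumed: DB-API 2.0 connection to one group member
    try:
        for _ in range(n_txn):
            cur = conn.cursor()
            try:
                # Hypothetical table t(id, v). The id is an int formatted
                # into the SQL to stay independent of the driver paramstyle.
                cur.execute(
                    "UPDATE t SET v = v + 1 WHERE id = %d" % rng.randrange(n_rows)
                )
                conn.commit()
                tally["committed"] += 1
            except Exception as exc:
                conn.rollback()
                kind = "gr_rollback" if ROLLBACK_TEXT in str(exc) else "other_error"
                tally[kind] += 1
            finally:
                cur.close()
    finally:
        conn.close()


def run_load(connect, n_workers=8, n_txn=1000, n_rows=10):
    """Start n_workers threads and return the combined outcome counts."""
    tallies = [Counter() for _ in range(n_workers)]  # one per thread, no sharing
    threads = [
        threading.Thread(target=worker, args=(connect, n_txn, n_rows, tallies[i], i))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(tallies, Counter())
```

Running `run_load` once in single-primary mode and once in multi-primary mode, with the same `connect` target, would give a direct comparison of the `gr_rollback` count between the two modes.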
[30 Dec 2024 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".