Description:
I think there is a real problem behind BUG #91008. I added an update there, but that bug is closed, so it will probably not be looked at again.
We hit a problem very similar to the one described there.
In a nutshell:
- the master was changing a table with only 2 columns,
- replication stopped.
More information, which I also posted to 91008:
2024-10-16T10:24:23.319479Z 64 [Warning] [MY-010584] [Repl] Replica SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at source log binlog.000361, end_log_pos 401108027; Error in cleaning up after an event preceding the commit; the group log file/position: binlog.000361 401107180, Error_code: MY-000001
2024-10-16T10:24:23.319598Z 64 [ERROR] [MY-010584] [Repl] Replica SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at source log binlog.000361, end_log_pos 401108027; Error 'Got error 125 - 'Transaction has been rolled back' during COMMIT' on query. Default database: ''. Query: 'COMMIT', Error_code: MY-001180
2024-10-16T10:24:23.319660Z 63 [Warning] [MY-010584] [Repl] Replica SQL for channel '': ... The replica coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: MY-001756
We are running ndbd 8.0.34.
Replication suddenly stopped with this error:
Slave_IO_State: Waiting for source to send event
Master_Host: master_host
Master_User: repliaction_user
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000361
Read_Master_Log_Pos: 591918710
Relay_Log_File: relay_file
Relay_Log_Pos: 39318280
Relay_Master_Log_File: binlog.000361
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1180
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at source log binlog.000361, end_log_pos 401108027. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 401107180
Relay_Log_Space: 57718384
The only action taken was to execute "start slave;". No skip, nothing else, and the slave reached the last position of the master binlog.
BUT.
It looks like mysqld skipped the problematic transaction.
In the binlog I can see that the value was:
# at 401108471
#241016 12:24:23 server id 41 end_log_pos 401108536 CRC32 0x7dc8061e Write_rows: table id 272
# at 401108536
#241016 12:24:23 server id 31 end_log_pos 401108609 CRC32 0xe20d63df Write_rows: table id 1008
# at 401108609
#241016 12:24:23 server id 41 end_log_pos 401108743 CRC32 0xa759b900 Write_rows: table id 477
# at 401108743
#241016 12:24:23 server id 41 end_log_pos 401108818 CRC32 0xc1e1b813 Write_rows: table id 479 flags: STMT_END_F
### INSERT INTO `mysql`.`ndb_apply_status`
### SET
### @1=41
### @2=77507911831519245
### @3=''
### @4=0
### @5=0
### INSERT INTO `database1`.`table_1`
### SET
### @1='key_1'
### @2=38
### INSERT INTO `database2`.`table_2`
### SET
### @1='val_1'
### @2='val_2'
### @3=''
### @4=1
### @5=1729074263
### @6=1
### @7=1
### @8='longer value'
### INSERT INTO `database2`.`table_2`
I checked on the master:
### INSERT INTO `database1`.`table_1`
### SET
### @1='key_1'
### @2=38
The value 38 was set for key_1 on the master; however, once the slave had caught up with the master, it still had the value 36 for key_1.
It looks like mysqld skipped this transaction on the slave side.
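The check described above can be sketched as a few queries to run on the slave (and the last one on the master too). This is only a rough sketch: the column name `id` on `database1`.`table_1` is a placeholder, since the real column names are not shown in the binlog output, and server id 41 is taken from the binlog events quoted above.

```sql
-- Per-worker applier errors; this is the table that Last_Error points to.
SELECT CHANNEL_NAME, WORKER_ID, SERVICE_STATE,
       LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE, LAST_ERROR_TIMESTAMP
  FROM performance_schema.replication_applier_status_by_worker;

-- Last epoch the slave recorded as applied from the master (server id 41).
-- The failed transaction carries epoch 77507911831519245 in the binlog.
SELECT server_id, epoch, log_name, start_pos, end_pos
  FROM mysql.ndb_apply_status
 WHERE server_id = 41;

-- Run on both master and slave and compare.
-- `id` is an assumed name for the key column (@1 in the binlog output).
SELECT *
  FROM database1.table_1
 WHERE id = 'key_1';
```

If the epoch on the slave has moved past the one in the failed transaction while the row still shows the old value, that would be consistent with the transaction having been skipped, although I cannot confirm this is what happened internally.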
How to repeat:
Please try to set up replication and perform updates on a 2-column table.
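A minimal sketch of such a test, assuming NDB Cluster replication is already configured between two clusters. The column names and types below are guesses chosen to match the two values seen in the binlog, not the real schema, and I have not confirmed that this alone triggers the error.

```sql
-- On the master. Column names and types are assumptions.
CREATE DATABASE IF NOT EXISTS database1;

CREATE TABLE database1.table_1 (
  id  VARCHAR(64) NOT NULL PRIMARY KEY,
  val INT NOT NULL
) ENGINE=NDBCLUSTER;

INSERT INTO database1.table_1 (id, val) VALUES ('key_1', 36);

-- Repeated updates of the same row, ideally under concurrent load.
UPDATE database1.table_1 SET val = val + 1 WHERE id = 'key_1';
UPDATE database1.table_1 SET val = val + 1 WHERE id = 'key_1';

-- Run on both master and slave once the slave has caught up;
-- the values should be identical.
SELECT id, val FROM database1.table_1 WHERE id = 'key_1';
```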