Bug #116682 replication breaks when update on 2 col table
Submitted: 15 Nov 15:05 Modified: 15 Nov 19:38
Reporter: Marek Kretkiewicz Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Cluster: Replication Severity:S3 (Non-critical)
Version:8.0.34 OS:Red Hat
Assigned to: MySQL Verification Team CPU Architecture:Any

[15 Nov 15:05] Marek Kretkiewicz
Description:
I think that there is something in BUG #91008. I made an update there but it's in closed, so you are not gonna to check it again.

We made very similar problem like there.

In a nut shell:
- master was changing table with only 2 columns,
- replication stopped

More information which I passed to 91008.

2024-10-16T10:24:23.319479Z 64 [Warning] [MY-010584] [Repl] Replica SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at source log binlog.000361, end_l
og_pos 401108027; Error in cleaning up after an event preceding the commit; the group log file/position: binlog.000361 401107180, Error_code: MY-000001
2024-10-16T10:24:23.319598Z 64 [ERROR] [MY-010584] [Repl] Replica SQL for channel '': Worker 1 failed executing transaction 'ANONYMOUS' at source log binlog.000361, end_log
_pos 401108027; Error 'Got error 125 - 'Transaction has been rolled back' during COMMIT' on query. Default database: ''. Query: 'COMMIT', Error_code: MY-001180
2024-10-16T10:24:23.319660Z 63 [Warning] [MY-010584] [Repl] Replica SQL for channel '': ... The replica coordinator and worker threads are stopped, possibly leaving data in
 inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to probl
ems. In such cases you have to examine your data (see documentation for details). Error_code: MY-001756

we do have ndbd 8.0.34.
Suddenly it stopped with this error.

       Slave_IO_State: Waiting for source to send event
                  Master_Host: master_host
                  Master_User: repliaction_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000361
          Read_Master_Log_Pos: 591918710
               Relay_Log_File: relay_file
                Relay_Log_Pos: 39318280
        Relay_Master_Log_File: binlog.000361
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1180
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'ANONYMOUS' at source log binlog.000361, end_log_pos 401108027. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 401107180
              Relay_Log_Space: 57718384

and the only action was to execute "start slave;". No skip, nothing, it has reach last position of master binlog.

BUT.

It looks that mysqld has skipped problematic transaction.
I can see in binlog, that value was:

# at 401108471
#241016 12:24:23 server id 41  end_log_pos 401108536 CRC32 0x7dc8061e   Write_rows: table id 272
# at 401108536
#241016 12:24:23 server id 31  end_log_pos 401108609 CRC32 0xe20d63df   Write_rows: table id 1008
# at 401108609
#241016 12:24:23 server id 41  end_log_pos 401108743 CRC32 0xa759b900   Write_rows: table id 477
# at 401108743
#241016 12:24:23 server id 41  end_log_pos 401108818 CRC32 0xc1e1b813   Write_rows: table id 479 flags: STMT_END_F
### INSERT INTO `mysql`.`ndb_apply_status`
### SET
###   @1=41
###   @2=77507911831519245
###   @3=''
###   @4=0
###   @5=0
### INSERT INTO `database1`.`table_1`
### SET
###   @1='key_1'
###   @2=38
### INSERT INTO `database2`.`table_2`
### SET
###   @1='val_1'
###   @2='val_2'
###   @3=''
###   @4=1
###   @5=1729074263
###   @6=1
###   @7=1
###   @8='longer value'
### INSERT INTO `database2`.`table_2`

I've checked on master 

### INSERT INTO `database1`.`table_1`
### SET
###   @1='key_1'
###   @2=38

val 38 was set for key_1, however when slave was synch to master had value 36 for key_1.

It looks that mysqld has skipped this transaction on slave side.

How to repeat:
Pleas try to set up replication, and perform updates on 2 columns table.
[15 Nov 19:38] MySQL Verification Team
Duplicate of Bug #91008