Bug #119628 Undo Truncation Bug, not fixed in 8.0, leads to Data Corruption.
Submitted: 5 Jan 22:20
Reporter: Jean-François Gagné Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[5 Jan 22:20] Jean-François Gagné
Description:
Hi,

Bug#112262, reported 5 Sep 2023 for 8.0 was fixed only in 8.4 and not in 8.0 (should have been fixed in 8.0.37 released at the same time as 8.4.0).  It is very easy to verify that this bug still affects 8.0.44.

After a failed Automated Undo Log Truncation because of super_read_only, a trunc.log file is left on disk, and on the next MySQL startup, this file triggers an Undo Tablespace Reconstruction (visible in log verbosity 3).  This Undo Tablespace Reconstruction leads to Data Corruption in case of a MySQL Crash with an uncommitted transaction.  As shown in How to repeat, the data of the uncommitted transaction is present in InnoDB after restart (v = 2 for id = 1), while we would expect this transaction to be rolled back by crash recovery (and the value to be restored to v = 1 for id = 1).

In addition to the case of a MySQL Crash, I think corruption can be triggered in case of preparing a hot backup, but I have not verified this.  Also, I think that if a non-slow Shutdown is done (without a full InnoDB Purge), the Undo Tablespace Reconstruction might lead to losing pointers to delete-marked rows (and updated rows), which will prevent the InnoDB Purge to garbage-collect deleted rows (and could lead to other nasty situations caused by rows having invalid rollback pointer).

Many thanks for looking into this (I think this is as simple as back-porting the fix from 8.4 to 8.0),

Jean-François Gagné

How to repeat:
# Create a sandbox for our tests.
# (we need replication-crash-safety as we will kill mysqld)
# (innodb_purge_rseg_truncate_frequency=1 for quick automated truncation)
# (the pv command is a trick to time command execution)
{
  v=8.0.44; d=${v//./_}
  dbda="--repl-crash-safe -c innodb_purge_rseg_truncate_frequency=1 -c innodb_max_undo_log_size=12582912"
  dbdeployer deploy replication $v $dbda | pv -tN dbdepl. > /dev/null
  cd ~/sandboxes/rsandbox_$d
}
  dbdepl.: 0:00:13

# Create a table for our tests, and configure replicas.
# (replica 1 is sro: automated truncation will fail there)
{
  ./m <<< "
    CREATE DATABASE test_jfg;
    CREATE TABLE test_jfg.t1(id INT PRIMARY KEY, v INT)"

  ./s1 <<< "SET PERSIST super_read_only = 1"
}

# Some functions to make tests more compact.
function inserts() {
  ./m test_jfg <<< "TRUNCATE TABLE t1"
  seq 1 1000000 |
    while read i; do echo "($i,$(($i % 10)))"; done |
    paste -s -d "$(printf ',%.0s' {1..1000})\n"     |
    sed -e 's/.*/INSERT INTO t1 values &;/'         |
    sed -e '1s/^/BEGIN; /;$s/$/ COMMIT;/'           |
    ./m test_jfg | pv -tN insert
}

function replication_wait() {
  local sms="$(./m <<< "SHOW MASTER STATUS\G")"
  local file="$(awk '$1 == "File:"{print $2}' <<< "$sms")"
  local pos="$(awk '$1 == "Position:"{print $2}' <<< "$sms")"

  for n in s{1..2}; do
    ./$n <<< "SELECT MASTER_POS_WAIT('$file', $pos)" > /dev/null
  done | pv -tN rplwait
}

# Generate two trunc.log files in node1.
# (unclear why, but below sometimes only generate 1 file)
# (with a single trunc.log file, there will not be data corruption)
# (rerun the commands until there are two trunc.log files in node1)
{
  for i in 1 2; do inserts; replication_wait; sleep 2; done
  ls -l node{1,2}/data/undo*
}
  insert: 0:00:06
  rplwait: 0:00:01
   insert: 0:00:06
  rplwait: 0:00:01
-rw-r----- 1 jfgagne staff 16777216 Jan  5 16:40 node1/data/undo_001
-rw-r----- 1 jfgagne staff 16777216 Jan  5 16:40 node1/data/undo_002
-rw-r----- 1 jfgagne staff    16384 Jan  5 16:40 node1/data/undo_1_trunc.log
-rw-r----- 1 jfgagne staff    16384 Jan  5 16:40 node1/data/undo_2_trunc.log
-rw-r----- 1 jfgagne staff 16777216 Jan  5 16:40 node2/data/undo_001
-rw-r----- 1 jfgagne staff 16777216 Jan  5 16:40 node2/data/undo_002

# With two trunc.log files, crash MySQL with an uncommitted transaction.
# (cd in a sub-shell for the commands to also work for node2)
#
# Below, after killing mysqld, the transaction was not rolled back by crash recovery.
# (with a rollback, we would have v = 1 for id = 1, but we have v = 2 which is the uncommitted value)
# (running the same commands in node2 leads to a good rollback on startup)

( cd node1/; ./use test_jfg; )

slave1 [localhost:23446] {msandbox} (test_jfg) > SET GLOBAL read_only = off; SELECT * FROM t1 ORDER BY id LIMIT 1; BEGIN; UPDATE t1 SET v = v+1; SELECT * FROM t1 ORDER BY id LIMIT 1;
Query OK, 0 rows affected (0.00 sec)

+----+------+
| id | v    |
+----+------+
|  1 |    1 |
+----+------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 1000000 rows affected (1.88 sec)
Rows matched: 1000000  Changed: 1000000  Warnings: 0

+----+------+
| id | v    |
+----+------+
|  1 |    2 |
+----+------+
1 row in set (0.00 sec)

slave1 [localhost:23446] {msandbox} (test_jfg) > system ls -l data/undo*
-rw-r----- 1 jfgagne staff 50331648 Jan  5 16:40 data/undo_001
-rw-r----- 1 jfgagne staff 16777216 Jan  5 16:40 data/undo_002
-rw-r----- 1 jfgagne staff    16384 Jan  5 16:40 data/undo_1_trunc.log
-rw-r----- 1 jfgagne staff    16384 Jan  5 16:40 data/undo_2_trunc.log
slave1 [localhost:23446] {msandbox} (test_jfg) > system kill -9 $(cat data/mysql_sandbox*.pid)
slave1 [localhost:23446] {msandbox} (test_jfg) > SELECT * FROM t1 ORDER BY id LIMIT 1;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    14
Current database: test_jfg

+----+------+
| id | v    |
+----+------+
|  1 |    2 |
+----+------+
1 row in set (0.02 sec)

Suggested fix:
Back-port the fix for Bug#112262 in 8.0.