Bug #101978 Generated columns (virtual & stored) do not update on replica
Submitted: 11 Dec 2020 20:19 Modified: 20 Apr 18:09
Reporter: Vasilis Kioustelidis Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:8.0.22 OS:CentOS (7.9.2009)
Assigned to: CPU Architecture:x86 ( Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz)
Tags: functional index, functional index, replication, virtual column

[11 Dec 2020 20:19] Vasilis Kioustelidis
Description:
I have an original replication setup consisting of two OLTP mysql instances (5.7.31 on Centos 7) and two OLAP (5.7.30 on Solaris). The schemas on the OLAP servers differ from OLTP in two significant ways:
1. several tables have virtual columns that are build by applying:
* a cast to date on a VARCHAR timestamp
* a hashing function on BLOB
2. the tables that have the above enhancements have been provided with indexes on the generated columns
The original replication setup has been working perfectly for several months withou issue.
I decided to add three new OLAP devices in an effort to remove the solaris nodes AND upgrade to the latest version of mysql.

The three new instances use mysql 8.0.14 (on Centos 7) and get replication from one the OLAP nodes. This new setup was also working as expected and everything was being updated as normal.

The next step was to decouple the three new OLAP nodes from the old OLAP nodes and have them receive replication directly from the OLTP nodes. I first upgraded the new nodes to 8.0.22 without issue and I switched their master configuration to the OLTP servers. The row updates started coming through as expected, the computed columns were complete however the indexes on the computed columns stopped updating.

I can retrieve data using the indexes up to the point when I made the change but since then the indexed receive NO updates. I also tried to create copies of the indexes (different name) and again I noticed that the new indexes were complete up to the creation point-in-time but they received no updates after that.

How to repeat:
I've managed to replicate the behaviour by creating a small lab using docker. I've created two docker instances named olap_replica_v8.0.22_a and olap_replica_v8.0.22_b. The instance olap_replica_v8.0.22_a holds the 'original' data and olap_replica_v8.0.22_b is the replica that contains the same table+generated columns+indexes.

The original data table on olap_replica_v8.0.22_a is as follows:
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| ID | int unsigned | NO | PRI | NULL | auto_increment |
| TM_TS | varchar(14) | YES | | NULL | |
| DATA | blob | YES | | NULL | |
+-------+--------------+------+-----+---------+----------------+

Next I created the replica table on olap_replica_v8.0.22_b as follows:
+--------+--------------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+-------------------+
| ID | int unsigned | NO | PRI | NULL | auto_increment |
| TM_TS | varchar(14) | YES | | NULL | |
| DATA | blob | YES | | NULL | |
| s_date | date | YES | MUL | NULL | STORED GENERATED |
| v_date | date | YES | MUL | NULL | VIRTUAL GENERATED |
| s_hash | char(40) | YES | MUL | NULL | STORED GENERATED |
| v_hash | char(40) | YES | MUL | NULL | VIRTUAL GENERATED |
+--------+--------------+------+-----+---------+-------------------+

And the indexes on the computed columns:
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| log_data | 0 | PRIMARY | 1 | ID | A | 10 | NULL | NULL | | BTREE | | | YES | NULL |
| log_data | 1 | k_s_date | 1 | s_date | A | 1 | NULL | NULL | YES | BTREE | | | YES | NULL |
| log_data | 1 | k_s_hash | 1 | s_hash | A | 10 | NULL | NULL | YES | BTREE | | | YES | NULL |
| log_data | 1 | k_v_date | 1 | v_date | A | 1 | NULL | NULL | YES | BTREE | | | YES | NULL |
| log_data | 1 | k_v_hash | 1 | v_hash | A | 10 | NULL | NULL | YES | BTREE | | | YES | NULL |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

Finally I populated the original data using random values:
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+
| ID | TM_TS | DATA |
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+
| 1 | 20201211 | 0x302E36313537343138363136323632363534302E36313537343138363136323632363534302E36313537343138363136323632363534 |
| 2 | 20201211 | 0x302E3331313939303536343937393630343033302E3331313939303536343937393630343033302E3331313939303536343937393630343033 |
| 3 | 20201211 | 0x302E37313237323732373037353238363839302E37313237323732373037353238363839302E37313237323732373037353238363839 |
| 4 | 20201211 | 0x302E36323736363436383935353931373732302E36323736363436383935353931373732302E36323736363436383935353931373732 |
| 5 | 20201211 | 0x302E3030303134313636363237303932343432313238302E3030303134313636363237303932343432313238302E3030303134313636363237303932343432313238 |
| 6 | 20201211 | 0x302E3131373731343239333431303733353438302E3131373731343239333431303733353438302E3131373731343239333431303733353438 |
| 7 | 20201211 | 0x302E35383831343634393839373435343931302E35383831343634393839373435343931302E35383831343634393839373435343931 |
| 8 | 20201211 | 0x302E35383735383936343533373431383431302E35383735383936343533373431383431302E35383735383936343533373431383431 |
| 9 | 20201211 | 0x302E3137333530383736303638303931383336302E3137333530383736303638303931383336302E3137333530383736303638303931383336 |
| 10 | 20201211 | 0x302E3130343737343839383031353638343335302E3130343737343839383031353638343335302E3130343737343839383031353638343335 |
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+

The same rows on the replica however are different (stored columns come up as NULL) and the indexes do not work:
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+--------+------------+--------+------------------------------------------+
| ID | TM_TS | DATA | s_date | v_date | s_hash | v_hash |
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+--------+------------+--------+------------------------------------------+
| 1 | 20201211 | 0x302E36313537343138363136323632363534302E36313537343138363136323632363534302E36313537343138363136323632363534 | NULL | 2020-12-11 | NULL | 768b42b27c912d774b4112c94c000235384a86da |
| 2 | 20201211 | 0x302E3331313939303536343937393630343033302E3331313939303536343937393630343033302E3331313939303536343937393630343033 | NULL | 2020-12-11 | NULL | 89fb345354fa330cfa83ceee888f6da10f0aaf3e |
| 3 | 20201211 | 0x302E37313237323732373037353238363839302E37313237323732373037353238363839302E37313237323732373037353238363839 | NULL | 2020-12-11 | NULL | 50656f30f4265bf1e330b8e54eff3c7e7e941cb7 |
| 4 | 20201211 | 0x302E36323736363436383935353931373732302E36323736363436383935353931373732302E36323736363436383935353931373732 | NULL | 2020-12-11 | NULL | 3db8647974731b9185310c779517eed590a2643e |
| 5 | 20201211 | 0x302E3030303134313636363237303932343432313238302E3030303134313636363237303932343432313238302E3030303134313636363237303932343432313238 | NULL | 2020-12-11 | NULL | bfbc30a6d13f8e21ad9263f8f6c70c2de5f26e8d |
| 6 | 20201211 | 0x302E3131373731343239333431303733353438302E3131373731343239333431303733353438302E3131373731343239333431303733353438 | NULL | 2020-12-11 | NULL | aa1194aaa2f95c6bf1f050c5bcac8899249b8d9e |
| 7 | 20201211 | 0x302E35383831343634393839373435343931302E35383831343634393839373435343931302E35383831343634393839373435343931 | NULL | 2020-12-11 | NULL | 9a2d77ed383dc5731cd740ca585b67b77e760306 |
| 8 | 20201211 | 0x302E35383735383936343533373431383431302E35383735383936343533373431383431302E35383735383936343533373431383431 | NULL | 2020-12-11 | NULL | caed9858d0278b5a18c6a580bb6cca7716da6acb |
| 9 | 20201211 | 0x302E3137333530383736303638303931383336302E3137333530383736303638303931383336302E3137333530383736303638303931383336 | NULL | 2020-12-11 | NULL | a757ea18725fa483f7d5ffd257db8eff6583b685 |
| 10 | 20201211 | 0x302E3130343737343839383031353638343335302E3130343737343839383031353638343335302E3130343737343839383031353638343335 | NULL | 2020-12-11 | NULL | f8e3f8936f3918f3409375df01921f567ae2a875 |
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+--------+------------+--------+------------------------------------------+

mysql> select * from virtual_test.log_data where v_date=CURDATE();
Empty set (0.00 sec)

mysql> select * from virtual_test.log_data where s_date=CURDATE();
Empty set (0.01 sec)

mysql> select * from virtual_test.log_data where v_hash='f8e3f8936f3918f3409375df01921f567ae2a875';
Empty set (0.01 sec)

mysql> select * from virtual_test.log_data where v_hash=s_hash;
Empty set (0.00 sec)

mysql> select v_hash,s_hash from virtual_test.log_data where v_hash=s_hash;
Empty set (0.00 sec)

Finally, I recreated all generated columns and indexes and the values were computed correctly:

ALTER TABLE virtual_test.log_data DROP COLUMN `s_date`,DROP COLUMN `v_date`,DROP COLUMN `s_hash`,DROP COLUMN `v_hash`, DROP KEY `k_s_date`,DROP KEY `k_v_date`,DROP KEY `k_s_hash`,DROP KEY `k_v_hash`, ADD COLUMN `s_date` date GENERATED ALWAYS AS (cast(`tm_ts` as date)) STORED, ADD COLUMN `v_date` date GENERATED ALWAYS AS (cast(`tm_ts` as date)) VIRTUAL, ADD COLUMN `s_hash` char(40) GENERATED ALWAYS AS (sha(hex(`DATA`))) STORED, ADD COLUMN `v_hash` char(40) GENERATED ALWAYS AS (sha(hex(`DATA`))) VIRTUAL, ADD KEY `k_s_date`(s_date), ADD KEY `k_v_date`(v_date), ADD KEY `k_s_hash`(s_hash), ADD KEY `k_v_hash`(v_hash) ;

mysql> select * from virtual_test.log_data where v_date=CURDATE();
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+------------+------------+------------------------------------------+------------------------------------------+
| ID | TM_TS | DATA | s_date | v_date | s_hash | v_hash |
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+------------+------------+------------------------------------------+------------------------------------------+
| 1 | 20201211 | 0x302E36313537343138363136323632363534302E36313537343138363136323632363534302E36313537343138363136323632363534 | 2020-12-11 | 2020-12-11 | 768b42b27c912d774b4112c94c000235384a86da | 768b42b27c912d774b4112c94c000235384a86da |
| 2 | 20201211 | 0x302E3331313939303536343937393630343033302E3331313939303536343937393630343033302E3331313939303536343937393630343033 | 2020-12-11 | 2020-12-11 | 89fb345354fa330cfa83ceee888f6da10f0aaf3e | 89fb345354fa330cfa83ceee888f6da10f0aaf3e |
| 3 | 20201211 | 0x302E37313237323732373037353238363839302E37313237323732373037353238363839302E37313237323732373037353238363839 | 2020-12-11 | 2020-12-11 | 50656f30f4265bf1e330b8e54eff3c7e7e941cb7 | 50656f30f4265bf1e330b8e54eff3c7e7e941cb7 |
| 4 | 20201211 | 0x302E36323736363436383935353931373732302E36323736363436383935353931373732302E36323736363436383935353931373732 | 2020-12-11 | 2020-12-11 | 3db8647974731b9185310c779517eed590a2643e | 3db8647974731b9185310c779517eed590a2643e |
| 5 | 20201211 | 0x302E3030303134313636363237303932343432313238302E3030303134313636363237303932343432313238302E3030303134313636363237303932343432313238 | 2020-12-11 | 2020-12-11 | bfbc30a6d13f8e21ad9263f8f6c70c2de5f26e8d | bfbc30a6d13f8e21ad9263f8f6c70c2de5f26e8d |
| 6 | 20201211 | 0x302E3131373731343239333431303733353438302E3131373731343239333431303733353438302E3131373731343239333431303733353438 | 2020-12-11 | 2020-12-11 | aa1194aaa2f95c6bf1f050c5bcac8899249b8d9e | aa1194aaa2f95c6bf1f050c5bcac8899249b8d9e |
| 7 | 20201211 | 0x302E35383831343634393839373435343931302E35383831343634393839373435343931302E35383831343634393839373435343931 | 2020-12-11 | 2020-12-11 | 9a2d77ed383dc5731cd740ca585b67b77e760306 | 9a2d77ed383dc5731cd740ca585b67b77e760306 |
| 8 | 20201211 | 0x302E35383735383936343533373431383431302E35383735383936343533373431383431302E35383735383936343533373431383431 | 2020-12-11 | 2020-12-11 | caed9858d0278b5a18c6a580bb6cca7716da6acb | caed9858d0278b5a18c6a580bb6cca7716da6acb |
| 9 | 20201211 | 0x302E3137333530383736303638303931383336302E3137333530383736303638303931383336302E3137333530383736303638303931383336 | 2020-12-11 | 2020-12-11 | a757ea18725fa483f7d5ffd257db8eff6583b685 | a757ea18725fa483f7d5ffd257db8eff6583b685 |
| 10 | 20201211 | 0x302E3130343737343839383031353638343335302E3130343737343839383031353638343335302E3130343737343839383031353638343335 | 2020-12-11 | 2020-12-11 | f8e3f8936f3918f3409375df01921f567ae2a875 | f8e3f8936f3918f3409375df01921f567ae2a875 |
+----+----------+----------------------------------------------------------------------------------------------------------------------------------------+------------+------------+------------------------------------------+------------------------------------------+
[11 Dec 2020 20:25] Vasilis Kioustelidis
The CPU and server architecture is:
   Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz on ProLiant DL360p Gen8
The OS release is:
   CentOS Linux release 7.9.2009 (Core)
[12 Dec 2020 3:13] Vasilis Kioustelidis
This file contains the statements to run on primary & replica to create tables & columns

Attachment: log_data.sql (application/octet-stream, text), 1.69 KiB.

[12 Dec 2020 3:19] Vasilis Kioustelidis
This is a docker instance creation template

Attachment: mysql_docker_template.sh (application/octet-stream, text), 925 bytes.

[16 Dec 2020 9:50] MySQL Verification Team
Hello Vasilis,

Thank you for the report and test case.
Observed that 8.0.22 is affected.

regards,
Umesh
[16 Dec 2020 9:51] MySQL Verification Team
MySQL Server 8.0.22 test results

Attachment: 101978_8.0.22.results.txt (text/plain), 22.72 KiB.

[18 Jan 15:59] Vasilis Kioustelidis
I've just updated the nodes to 8.0.23 however the bug still persists. Is there something I can do to fix this in the new version (e.g. some parameter in the my.ini file)?
[3 Feb 16:23] Pedro Figueiredo
Posted by developer:
 
Hi!

Updates and different versions of MySQL aren't needed to reproduce the issue.
By executing the followin MTR test with any version equal or higher than 8.0.14 (didn't test below), issue is reproducible (*with ROW format ONLY*):

```
--source include/have_binlog_format_row.inc
--source include/master-slave.inc

--source include/rpl_connection_master.inc
CREATE TABLE `log_data` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `tm_ts` VARCHAR(14) DEFAULT NULL,
  `data` BLOB,
  PRIMARY KEY (`ID`)
) ENGINE = InnoDB;

--source include/sync_slave_sql_with_master.inc
ALTER TABLE
  `log_data`
ADD
  COLUMN `s_len` INT GENERATED ALWAYS AS (
    LENGTH(`data`)
  ) STORED,
ADD
  COLUMN `s_date` DATE GENERATED ALWAYS AS (
    CAST(`tm_ts` AS DATE)
  ) STORED,
ADD
  COLUMN `v_date` DATE GENERATED ALWAYS AS (
    CAST(`tm_ts` AS DATE)
  ) VIRTUAL,
ADD
  COLUMN `s_hash` char(40) GENERATED ALWAYS AS (
    SHA(
      HEX(`data`)
    )
  ) STORED,
ADD
  COLUMN `v_hash` char(40) GENERATED ALWAYS AS (
    SHA(
      HEX(`data`)
    )
  ) VIRTUAL,
ADD
  KEY `k_s_len` (s_len),
ADD
  KEY `k_s_date` (s_date),
ADD
  KEY `k_v_date`(v_date),
ADD
  KEY `k_s_hash`(s_hash),
ADD
  KEY `k_v_hash`(v_hash);

--source include/rpl_connection_master.inc
INSERT INTO `log_data` (`tm_ts`, `data`) 
SELECT * 
FROM 
  (
    WITH RECURSIVE sequence AS (
      SELECT 
        1 AS level 
      UNION ALL 
      SELECT 
        level + 1 AS value 
      FROM 
        sequence 
      WHERE 
        sequence.level < 10
    ) 
    SELECT 
      CURDATE()+ 0, 
      REPEAT(
        rand(), 
        3
      ) 
    FROM 
      sequence
  ) AS F;

--source include/sync_slave_sql_with_master.inc
SELECT * FROM `log_data`;
--let $assert_text= Stored indexes not NULL count is 0 should be 10
--let $assert_cond= "[SELECT COUNT(1) FROM log_data WHERE s_date IS NOT NULL AND s_hash IS NOT NULL]" = "0"
--source include/assert.inc

UPDATE `log_data` SET `tm_ts` = "20210203";
SELECT * FROM `log_data`;    
--let $assert_text= Stored indexes not NULL count is 10 AS it should be
--let $assert_cond= "[SELECT COUNT(1) FROM log_data WHERE s_date IS NOT NULL AND s_hash IS NOT NULL]" = "10"
--source include/assert.inc

--source include/rpl_connection_master.inc
INSERT INTO `log_data` (`tm_ts`, `data`) 
SELECT * 
FROM 
  (
    WITH RECURSIVE sequence AS (
      SELECT 
        1 AS level 
      UNION ALL 
      SELECT 
        level + 1 AS value 
      FROM 
        sequence 
      WHERE 
        sequence.level < 10
    ) 
    SELECT 
      CURDATE()+ 0, 
      REPEAT(
        rand(), 
        3
      ) 
    FROM 
      sequence
  ) AS F;

--source include/sync_slave_sql_with_master.inc
SELECT * FROM `log_data`;

--let $assert_text= Stored indexes not NULL count is 10 should be 20
--let $assert_cond= "[SELECT COUNT(1) FROM log_data WHERE s_date IS NOT NULL AND s_hash IS NOT NULL]" = "10"
--source include/assert.inc

--source include/rpl_connection_master.inc
DROP TABLE `log_data`;
--source include/rpl_end.inc

```

Thank you for your time.

Regards,
[20 Apr 18:09] Margaret Fisher
Posted by developer:
 
Changelog entry added for MySQL 8.0.24:

 With row-based replication in use (binlog_format=ROW), stored generated columns that existed only on the replica were not being updated when the other fields in the row were inserted or updated. These columns are now evaluated when the row image is unpacked, and their value is updated in the row image before it is applied.