Bug #87847 table with json data type,checksum table result inconsistent
Submitted: 23 Sep 2017 7:11 Modified: 16 Nov 2017 9:51
Reporter: Bin Hong (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: JSON Severity:S3 (Non-critical)
Version:mysql 5.7, 5.7.19 OS:MacOS
Assigned to: CPU Architecture:Any

[23 Sep 2017 7:11] Bin Hong
Description:
The table include json data type, when execute checksum table statment, it's result  may be inconsistent.

How to repeat:
create table t_json (a json);
insert t_json values('{"key1": "value1", "key2": "value2"}');
checksum table t_json;

--------------------------------------------------- 
mysql> show create table t_json; 
+--------+--------------------------------------------------------------------------------------+ 
| Table | Create Table | 
+--------+--------------------------------------------------------------------------------------+ 
| t_json | CREATE TABLE `t_json` ( 
`a` json DEFAULT NULL 
) ENGINE=InnoDB DEFAULT CHARSET=utf8 | 
+--------+--------------------------------------------------------------------------------------+ 
1 row in set (0.00 sec) 

mysql> select * from t_json 
-> ; 
+--------------------------------------+ 
| a | 
+--------------------------------------+ 
| {"key1": "value1", "key2": "value2"} | 
+--------------------------------------+ 
1 row in set (0.00 sec) 

mysql> checksum table t_json; 
+-------------+------------+ 
| Table | Checksum | 
+-------------+------------+ 
| test.t_json | 3873575678 | 
+-------------+------------+ 
1 row in set (0.00 sec) 

mysql> checksum table t_json; 
+-------------+------------+ 
| Table | Checksum | 
+-------------+------------+ 
| test.t_json | 2665545437 | 
+-------------+------------+ 
1 row in set (0.00 sec) 

mysql> checksum table t_json; 
+-------------+------------+ 
| Table | Checksum | 
+-------------+------------+ 
| test.t_json | 3873575678 | 
+-------------+------------+ 
1 row in set (0.00 sec) 

mysql> checksum table t_json; 
+-------------+------------+ 
| Table | Checksum | 
+-------------+------------+ 
| test.t_json | 3232111891 | 
+-------------+------------+ 
1 row in set (0.00 sec) 

Debugging found that the a checksum_crc32's POS parameter may be incorrect

    frame #0: 0x0000000100ea04b9 mysqld`binary_log::checksum_crc32(crc=2282172566, pos=")", length=12) at binlog_event.h:421

  row_crc= checksum_crc32(row_crc, f->ptr, f->pack_length());
 f->ptr pointer maybe error

Suggested fix:
  

The switch clause may have lost the MYSQL_TYPE_JSON condition 

sql_table.cc  mysql_checksum_table function

add   case MYSQL_TYPE_JSON:
 switch (f->type()) {
                case MYSQL_TYPE_BLOB:
                case MYSQL_TYPE_VARCHAR:
                case MYSQL_TYPE_GEOMETRY:
                case MYSQL_TYPE_BIT:
                case MYSQL_TYPE_JSON:
                {
                  String tmp;
                  f->val_str(&tmp);
                  row_crc= checksum_crc32(row_crc, (uchar*) tmp.ptr(),
                           tmp.length());
                  break;

Following is the debug information after the fixed
 

Process 9364 stopped
* thread #2, stop reason = step over
    frame #0: 0x0000000100cbe025 mysqld`mysql_checksum_table(thd=0x00000001041c4000, tables=0x00000001044277c0, check_opt=0x00000001041c6928) at sql_table.cc:10579
   10576	                case MYSQL_TYPE_JSON:
   10577	                {
   10578	                  String tmp;
-> 10579	                  f->val_str(&tmp);
   10580	                  row_crc= checksum_crc32(row_crc, (uchar*) tmp.ptr(),
   10581	                           tmp.length());
   10582	                  break;
(lldb)
Process 9364 stopped
* thread #2, stop reason = step over
    frame #0: 0x0000000100cbe044 mysqld`mysql_checksum_table(thd=0x00000001041c4000, tables=0x00000001044277c0, check_opt=0x00000001041c6928) at sql_table.cc:10580
   10577	                {
   10578	                  String tmp;
   10579	                  f->val_str(&tmp);
-> 10580	                  row_crc= checksum_crc32(row_crc, (uchar*) tmp.ptr(),
   10581	                           tmp.length());
   10582	                  break;
   10583	                }
(lldb) p tmp.ptr()
(const char *) $0 = 0x0000000103ef5930 "{"key1": "value1", "key2": "value2"}"
(lldb) p tmp.length()
(size_t) $1 = 36
[23 Sep 2017 7:17] Bin Hong
Since Filed_json inherits from Filed_blob
+--Field_str (abstract)
|  +--Field_longstr
|  |  +--Field_string
|  |  +--Field_varstring
|  |  +--Field_blob
|  |     +--Field_geom
|  |     +--Field_json

So the switch condition should have MYSQL_TYPE_JSON
[23 Sep 2017 7:56] Bin Hong
mysql_checksum_table function f->type() switch add MYSQL_TYPE_JSON case

Attachment: mysql_checksum_table_switch_add_json_cond.patch (application/octet-stream, text), 462 bytes.

[23 Sep 2017 15:34] Bin Hong
test case

Attachment: json_filed_checksum.test (application/octet-stream, text), 231 bytes.

[23 Sep 2017 15:35] Bin Hong
test result

Attachment: json_filed_checksum.result (application/octet-stream, text), 213 bytes.

[25 Sep 2017 8:38] MySQL Verification Team
Hello Hong bin,

Thank you for the report.
Verified as described with 5.7.19 build(not reproducible on 8.0.3).

Thanks,
Umesh
[25 Sep 2017 8:40] MySQL Verification Team
test results

Attachment: 87847.results (application/octet-stream, text), 7.57 KiB.

[25 Sep 2017 8:40] MySQL Verification Team
Please note that in order to submit contributions you must first sign the Oracle Contribution Agreement (OCA). For additional information please check http://www.oracle.com/technetwork/community/oca-486395.html.
If you have any questions, please contact the MySQL community team.
[27 Sep 2017 18:19] Richard Morris
This issue came up recently when we started using the JSON column.  We are currently running MySQL 5.7.18 (community) in a master/slave setup.  The replication is checked using the mysqlrplsync tool and was complaining about differences between the master and slave for non-empty tables that has a JSON column.

Doing a similar check as the reporter did, I was able to see that a very basic table created on the master and then replicated to the slave would produce different CHECKSUM TABLE values even though visually the information was identical.

I see that it doesn't seem to be an issue in MySQL 8.0.3 but I hope the resolution is backported into the 5.7 branch.  Right now mysqlrplsync is reporting errors that are most likely non-existent.
[9 Oct 2017 13:24] MySQL Verification Team
Hi hong bin,

I have informed community team and confirmed that they have received your signed OCA and they will get back to you this week.

Regards,
Umesh
[16 Nov 2017 9:51] Erlend Dahl
Fixed in 8.0.0 under the heading of

Bug#23535703 WRONG TABLE CHECKSUM ON TABLES WITH JSON COLUMNS
[16 Nov 2017 15:14] Richard Morris
Will this be fixed in the 5.7 branch?  From the description above, it seemed have already been resolved in 8.0.3.  

If it's not going to be fixed in the 5.7 branch, when will MySQL 8 be released?

Thanks.
[26 Nov 2020 13:50] Vojtech Kurka
By the way, this bug is causing us a headache when migrating from 5.7 to 8.0
Because the checksum on master (5.7) is different from the checksum on replica (8.0.21)

A workaround is using mysqldump and compare the text files.