Bug #88754 table id overflow causes replication out of sync
Submitted: 5 Dec 2017 8:07 Modified: 5 Dec 2017 13:06
Reporter: Team TXSQL
Status: Verified
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 5.5.24, 5.5.58 OS: Any
Assigned to: CPU Architecture: Any

[5 Dec 2017 8:07] Team TXSQL
Description:
When binlog_format is "row", each DML statement writes a Table_map_event to the binlog, and each Table_map_event carries a table_id. The table_id is represented as ulong in the binlog, but on the slave it is read into TABLE_LIST.table_id, which is of type uint. If table_id is larger than 17179869184, the value overflows the narrower type, and the affected events in the relay log are silently skipped, causing data inconsistency between master and slave.
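
The narrowing described above can be illustrated with a minimal, self-contained sketch. It is not MySQL code: narrow_table_id and table_id_fits are hypothetical helpers, and the sketch assumes that ulong is 64 bits wide and uint is 32 bits wide, modelled here with fixed-width types.

```cpp
#include <cstdint>

// Hypothetical helper (not a MySQL function): mimics copying a binlog
// table_id into a 32-bit field such as one declared as uint.
// Assumption: ulong is 64 bits and uint is 32 bits on the platform.
std::uint32_t narrow_table_id(std::uint64_t binlog_table_id) {
    // Conversion to an unsigned type keeps the value modulo 2^32,
    // so the high bits are silently dropped.
    return static_cast<std::uint32_t>(binlog_table_id);
}

// Returns true when the id survives the round trip through the 32-bit field.
bool table_id_fits(std::uint64_t binlog_table_id) {
    return narrow_table_id(binlog_table_id) == binlog_table_id;
}
```

Under these assumptions, narrow_table_id(17179869185ULL) yields 1, and table_id_fits returns false for any id above 4294967295, the largest value a 32-bit unsigned field can hold.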

How to repeat:
Use the attached patch to add a DBUG_EXECUTE_IF point to the function assign_new_table_id that manually assigns a table_id larger than 17179869184, then issue the following statements on the master:

-- configure binlog_format=row first

1) create table tbl(id int);
2) flush tables;
3) set debug='+d,before_assign_new_table_id';
4) insert into tbl values(1);

Then check whether the inserted row can be retrieved from the slave.

Result: the events are visible in the relay log, but the row cannot be found on the slave, because the events are skipped due to the type overflow.
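
One plausible way a truncated id could lead to skipped events is a lookup miss: the table is registered under the narrowed id, while the row event asks for the full-width id. The sketch below only illustrates that idea. The TableMap class and row_event_finds_table are hypothetical names, and the report does not confirm that the server behaves exactly this way.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a table_id -> table mapping on the slave.
// Assumption: the id is narrowed to 32 bits before registration, while
// row events look the table up by their full-width id.
class TableMap {
public:
    // Registers a table under the id after it has passed through a 32-bit field.
    void register_table(std::uint64_t binlog_table_id, const std::string& name) {
        std::uint32_t stored = static_cast<std::uint32_t>(binlog_table_id);
        tables_[stored] = name;
    }

    // Looks the table up by the full-width id carried in the row event.
    // Returns nullptr when no table is registered under that id.
    const std::string* find(std::uint64_t row_event_table_id) const {
        auto it = tables_.find(row_event_table_id);
        return it == tables_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::uint64_t, std::string> tables_;
};

// Returns true if a row event carrying this id would find its table.
bool row_event_finds_table(std::uint64_t table_id) {
    TableMap map;
    map.register_table(table_id, "tbl");
    return map.find(table_id) != nullptr;
}
```

In this model an id such as 42 is found, while 17179869185 is registered under 1 and the lookup by the original value comes back empty, which is consistent with the silent skip observed in the test case.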

Suggested fix:
Adjust the type of m_table_id in the events and of table_id in TABLE_LIST so that the two are consistent.
[5 Dec 2017 8:08] Team TXSQL
patch to add debug point

Attachment: 0001-add-debug-point-to-make-table-id-larger-than-1717986.patch (application/octet-stream, text), 783 bytes.

[5 Dec 2017 13:06] MySQL Verification Team
Hello Team TXSQL!

Thank you for the report and test case.
Observed this with 5.5.58 build.

Thanks,
Umesh
[5 Dec 2017 13:12] MySQL Verification Team
5.5.58 test results

Attachment: 88754_5.5.58.results (application/octet-stream, text), 11.90 KiB.