MySQL Bugs: #67352: table_id is defined differently in sql/table.h vs sql/log

Bug #67352	table_id is defined differently in sql/table.h vs sql/log_event.h
Submitted:	24 Oct 2012 8:39	Modified:	30 Jan 2013 15:47
Reporter:	Valeriy Kravchuk
Status:	Closed
Category:	MySQL Server: Row Based Replication ( RBR )	Severity:	S2 (Serious)
Version:	5.1, 5.5, 5.6	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	replication, ulong

Description:
The binary log structure assumes that table_id in an event can be ulong, but the value is then assigned to the table_id field of the TABLE structure, which is defined as uint. Where ulong is wider than uint, the assignment can overflow and cause problems during replication (events are not applied to the correct table).

See https://bugs.launchpad.net/percona-server/+bug/1070255 for more details on when overflow can happen and real impact for replication.
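
To illustrate the overflow described above, here is a minimal sketch, not MySQL code. The type aliases and the functions narrow_table_id() and ids_collide() are hypothetical names; the aliases assume a platform where ulong is 64 bits and uint is 32 bits. It shows that assigning the wide ID to the narrow field keeps only the low 32 bits, so two different table IDs can become indistinguishable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two declarations quoted in this report.
// Assumption: ulong is 64 bits wide (LP64) and uint is 32 bits wide.
using event_table_id = std::uint64_t;   // models "ulong m_table_id" in log_event.h
using opened_table_id = std::uint32_t;  // models "uint table_id" in table.h

// Models assigning the event's table ID to the narrower field.
// Conversion to an unsigned type is well defined in C++: the value is
// reduced modulo 2^32, so the high bits are dropped without any warning
// at run time.
inline opened_table_id narrow_table_id(event_table_id id) {
    return static_cast<opened_table_id>(id);
}

// Two distinct event IDs collide after narrowing when they differ by a
// multiple of 2^32.
inline bool ids_collide(event_table_id a, event_table_id b) {
    return a != b && narrow_table_id(a) == narrow_table_id(b);
}
```

Under these assumptions, ids_collide(0, 4294967296ULL) is true: table ID UINT_MAX+1 is stored as 0 in the narrow field.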

How to repeat:
Just do code review.

[openxs@chief mysql-5.6]$ grep -n 'table_id;' sql/table.h
1569:  uint          table_id; /* table id (from binlog) for opened table */
[openxs@chief mysql-5.6]$ grep -n 'table_id;' sql/log_event.h
3794:  ulong get_table_id() const        { return m_table_id; }
3859:  ulong          m_table_id;
3983:  ulong get_table_id() const        { return m_table_id; }
4078:  ulong       m_table_id;  /* Table ID */

Suggested fix:
Use ulong for table_id everywhere?

Thank you for the bug report.

d:\build\5.5-2012-10-10>type sql\table.h | findstr table_id;
  uint          table_id; /* table id (from binlog) for opened table */

d:\build\5.5-2012-10-10>type sql\log_event.h | findstr table_id;
  ulong get_table_id() const        { return m_table_id; }
  ulong          m_table_id;
  ulong get_table_id() const        { return m_table_id; }
  ulong       m_table_id;       /* Table ID */

d:\build\5.5-2012-10-10>

I think it should be ulonglong everywhere, because ulong can be 32 bits wide while table_id (in the table map event) is 48 bits wide. For example, on Win64, long is 32 bits.
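
A rough sketch of why a 32-bit type cannot hold the 48-bit value from the event. The helpers store_6_bytes(), load_6_bytes(), round_trip() and survives_32_bit_holder() are hypothetical and written for this illustration only; they assume the 6 bytes are laid out least significant byte first and are not the server's own serialization routines.

```cpp
#include <cassert>
#include <cstdint>

// Largest value that fits in 6 bytes (48 bits).
constexpr std::uint64_t MAX_6_BYTE_ID = (1ULL << 48) - 1;
static_assert(MAX_6_BYTE_ID == 281474976710655ULL, "6-byte maximum");

// Hypothetical helper: write the low 48 bits of id into buf,
// least significant byte first (assumed layout).
inline void store_6_bytes(unsigned char *buf, std::uint64_t id) {
    for (int i = 0; i < 6; ++i)
        buf[i] = static_cast<unsigned char>((id >> (8 * i)) & 0xFF);
}

// Hypothetical helper: read 6 bytes back into a 64-bit value.
inline std::uint64_t load_6_bytes(const unsigned char *buf) {
    std::uint64_t id = 0;
    for (int i = 0; i < 6; ++i)
        id |= static_cast<std::uint64_t>(buf[i]) << (8 * i);
    return id;
}

// Store and reload through a 64-bit holder: lossless for any 48-bit ID.
inline std::uint64_t round_trip(std::uint64_t id) {
    unsigned char buf[6];
    store_6_bytes(buf, id);
    return load_6_bytes(buf);
}

// Same round trip, but the decoded value is kept in a 32-bit variable,
// as happens where long (or uint) is 32 bits. Returns false when the
// ID is changed by the narrowing.
inline bool survives_32_bit_holder(std::uint64_t id) {
    const std::uint32_t narrow = static_cast<std::uint32_t>(round_trip(id));
    return narrow == id;
}
```

With a 64-bit holder every ID up to 281474976710655 survives the round trip; with a 32-bit holder anything above 4294967295 does not.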

Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Fixed in 5.6+.

Documented as follows in the 5.6.11 and 5.7.1 changelogs:

        Table IDs used in replication were defined as type ulong on the 
        master and uint on the slave. In addition, the maximum value for 
        table IDs in binary log events is 6 bytes (281474976710655). This 
        combination of factors led to the following issues:

            * Data could be lost on the slave when a table was assigned an
              ID greater than uint.

            * Table IDs greater than 281474976710655 were written to the
              binary log as 281474976710655.

            * This led to a stopped slave when the slave encountered two
              tables having the same table ID.

        To fix these problems, table IDs are now defined by both master 
        and slave as type ulonglong but constrained to a range of 0 to 
        281474976710655, restarting from 0 when it exceeds this value.
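
The wrap-around rule in the changelog text can be sketched as follows. This is only an illustration of the described behaviour, not the committed patch; MAX_TABLE_ID and next_table_id() are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound of the allowed range: the largest 6-byte integer.
constexpr std::uint64_t MAX_TABLE_ID = (1ULL << 48) - 1;
static_assert(MAX_TABLE_ID == 281474976710655ULL, "6-byte maximum");

// Hypothetical helper: given the last ID handed out, return the next one.
// IDs are held in a 64-bit variable but stay within 0..MAX_TABLE_ID;
// once the maximum has been used, numbering restarts from 0.
inline std::uint64_t next_table_id(std::uint64_t current) {
    return current >= MAX_TABLE_ID ? 0 : current + 1;
}
```

For example, next_table_id(MAX_TABLE_ID) yields 0, so an ID is never produced that would not fit in the 6 bytes of the binary log event.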

Closed.

5.6$ bzr log -r 4680
------------------------------------------------------------
revno: 4680
committer: Libing Song <libing.song@oracle.com>
branch nick: mysql-5.6
timestamp: Wed 2013-01-30 11:42:40 +0800
message:
  Bug#14801955 TABLE_ID IS DEFINED DIFFERENTLY IN SQL/TABLE.H VS SQL/LOG_EVENT.H
  
  Problem
  =======
  Table id used by replication was defined as ulong on master
  (TABLE_SHARE::table_map_id), but as uint on slave (TABLE_LIST::table_id).
  It caused the problems below:
  * Data was lost on slave if table id was greater than UINT_MAX.
  * Slave stopped because it found two tables with the same table id,
    e.g. t1's id is 0 and t2's id is UINT_MAX+1.
  * Table id in Table_map_log_event is 6 bytes long, so all table ids
    greater than the max 6-byte integer (281474976710655)
    were binlogged as 281474976710655.
  
  Fix
  ===
  Table id on both master and slave is now defined as ulonglong,
  but table id is confined to the range 0 to 281474976710655 (max value
  of a 6-byte integer). Table id is allocated in an auto-increment way.
  It restarts from 0 when it exceeds 281474976710655.
  
  I also removed the code to check the dummy event on slave from
  log_event.cc. Dummy event appeared only between 5.1.5 and 5.1.11
  which were still generating old rows log events. FYI, Dummy event
  was removed by revision sp1r-mats@mysql.com-20060531172152-28107.