Bug #41101 set storage_engine and set create_table are not written to the binlog
Submitted: 28 Nov 2008 15:31 Modified: 27 Apr 2009 9:35
Reporter: Mark Callaghan Email Updates:
Status: In progress Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.0.67 OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: binlog, replication, SET, storage_engine, table_type

[28 Nov 2008 15:31] Mark Callaghan
Description:
Run:
set storage_engine=innodb;
create table t(i int);

And then run:
set table_type=innodb;
create table s(i int);

And then dump the binlog data for these commands. The create table commands are in the binlog. The 'set storage_engine=innodb' and 'set table_type=innodb' commands are not.

Then use the binlog to recover a new master from a backup and the post-backup binlogs. The new master now uses the wrong table types. 

How to repeat:
Above

Suggested fix:
Replication is full of inconsistencies like this. Why must we remember all of the special cases? This makes MySQL much more expensive to operate than it needs to be. 

Things that make the outcome on a slave different than on a master must be done on the slave, not on the events written to the binlog.

There should only be two ways to prevent a statement that modifies database state from getting written to the binlog:
* set sql_log_bin=0
* a clause on the SQL statement
[28 Nov 2008 16:12] MySQL Verification Team
Thank you for the bug report.
[28 Nov 2008 16:15] Mark Callaghan
It would help if the documentation for the the set storage_engine and set table_type commands had a link to the docs that describe their impact on master and slave state.

http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#option_mysqld_storage_...

http://dev.mysql.com/doc/refman/5.0/en/replication-solutions-diffengines.html
[15 Dec 2008 15:20] Sergei Golubchik
It was intentional - to allow slave to use different storage engines for tables.
The ENGINE=... clause in the CREATE TABLE is replicated, if you want to force the engine to be the same.

(and no, I'm not advocating for this decision, only explaining that there *was* one, and the behavior is intentional)
[23 Mar 2009 20:15] Mats Kindahl
I'm sorry, but I can't view this as a bug since:

1. As pointed out by Serg, it is an intentional design to allow different
   engines on master and slave. If we binlog the storage_engine, it is 
   impossible to use different default storage engines on master and slave.

2. For the case that one want to use an explicit storage engine, there is a
   workaround where one uses the ENGINE= to set the engine.

3. When restoring the backup, the backup image should contain settings for the
   storage_engine variable and not the binary log.

4. The setting will have to be replicated for each statement, since it is
   impossible to know where replication will start. This will impact the size
   of the binlog. This is necessary since the following two alternatives (to
   avoid writing the storage_engine statement for each statement) are, IMHO,
   worse.

   If one chooses to just binlog the actual set statement, that will require
   the DBA to know the location of the setting and start replication from
   that.

   If one choses to generate a set storage_engine statement on starting
   replication, this will require the master to search the binary logs backward
   to find the last used setting.

   Now, suppose that we need to do this for all possible settings of server
   variables, not only storage_engine...

5. Fixing the problem is a significant undertaking and is likely to lead to a
   number of subtle bugs and situations that are hard to handle.

As an example of a problem that could occur: suppose that the restore is done to intentionally switch storage engine and therefore the storage_engine is set to the new storage engine. In that case, the tables will be restored with the old engine instead of the new one.

Statement-based replication is by design intended to replicate statements, not state.
[26 Mar 2009 10:35] Andrei Elkin
It's hard to disagree with earlier comments.
We should not add yet another @@session variable to the existing
context replicated with the query event.
We would rather to agree that the hard-coded `create table () engine=maria'
has the top preference. Without `engine=' clause @@session decides,
on the slave sql thread side as well.

However, it might be that some users would like to encapsulate
a part of @@session context with the query to replicate

 set storage_engine=innodb;
 create table t(i int);

as

 create table t (i int) engine=innodb;

A new option for converting the @@session into `engine=' clause could
be provided.

What do you think?
[23 Apr 2009 17:39] Andrei Elkin
Discussed with Sinisa on #support

[16:35]<sinisa> andrei, there is only one way that I see that this bug would
		be fixed so that behaviour is not changed ...
[16:36]<sinisa> andrei: new startup option, false by default, so that create
		table ... always gets appended the engine type ... that is the
		best that I can think of ...

So we got to an agreement about the last comment idea. I am setting to re-triage, as Sinisa
suggested.
Could you please decide if it's a feature request indeed.

Thanks!
[26 Apr 2009 3:18] Mark Callaghan
Yes, this should be a feature request. Thanks for the many responses.