Bug #56739 binary log activation sets server_id implicitly, but replication fails
Submitted: 12 Sep 2010 22:19 Modified: 21 Oct 2013 13:04
Reporter: Giuseppe Maxia Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1.50 OS:Any (Mac OSX, Linux)
Assigned to: Assigned Account CPU Architecture:Any
Tags: binary logs., replication, server-id
Triage: Triaged: D3 (Medium)

[12 Sep 2010 22:19] Giuseppe Maxia
Description:
When enabling binary logs, the server ID is automatically set to '1'.
That seems enough to use the server as a master for replication, but it doesn't.

See this example:

## Create 2 sandboxes: master and slave
make_sandbox 5.1.50 --sandbox_dir=slave --sandbox_port=7000
make_sandbox 5.1.50 --sandbox_dir=slave --sandbox_port=7500

## check the master

cd $HOME/sandboxes/master
./use -e 'select @@server_id'
+-------------+
| @@server_id |
+-------------+
|           0 |
+-------------+

## restart, enabling binary logs, but not the server-id.
 
./restart --log-bin=mysql-bin
. sandbox server started

## check the server id

./use -e 'select @@server_id'
+-------------+
| @@server_id |
+-------------+
|           1 |
+-------------+

## looks OK. Let's go to the slave.

cd $HOME/sandboxes/slave

## enable the slave by setting a server ID

./start --server-id=2
... sandbox server started

## tell the slave where the master is

./use -e 'change master to master_host="127.0.0.1", master_port=7000, master_user="msandbox", master_password="msandbox"'

./use -e 'start slave'

## the slave complains:

./use -e 'show slave status\G'
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 7000
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 4
               Relay_Log_File: mysql_sandbox7500-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: 
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 106
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Misconfigured master - server id was not set'
               Last_SQL_Errno: 0
               Last_SQL_Error: 

# now let's start again
#

## remove everything from the slave
./clear

## and remove everything from the master
cd $HOME/sandboxes/master
./clear

## restart the master, now setting server-ID explicitly 

./start --log-bin=mysql-bin --server-id=1
. sandbox server started

## Go back to the slave, and enable it again from scratch

cd $HOME/sandboxes/slave
./start --server-id=2
.. sandbox server started

./use -e 'change master to master_host="127.0.0.1", master_port=7000, master_user="msandbox", master_password="msandbox"'
./use -e 'start slave'

## Now replication works

./use -e 'show slave status\G'
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 7000
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 106
               Relay_Log_File: mysql_sandbox7500-relay-bin.000002
                Relay_Log_Pos: 251
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 106
              Relay_Log_Space: 418
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 

Note: A similar issue was already reported as Bug#3829, and fixed. But apparently there is still some hidden work in the setting of the server ID

How to repeat:
See above

Suggested fix:
If enabling binary logs has the side effect of setting the server ID, make it in such a way that it can be used for replication.
Probably removing the side effect altogether would make debugging easier.
[17 Sep 2010 19:03] Sveta Smirnova
Thank you for the report.

Verified as described.
[21 Oct 2013 13:04] Jon Stephens
Fixed in 5.7. Documented in the 5.7.3 changelog as follows:

    Formerly, it was possible to start the server with binary logging 
    enabled but no server ID specified; in such cases, the server would 
    set server_id to 1 (rather than 0) while slaves remained unable to
    connect.

    Now the --server-id option must be used when starting the server 
    with binary logging enabled, otherwise the server is unable to start.
    If --server-id=0 is used, this value is no longer changed by the
    server; in this case, updates are written to the binary log, but
    slaves are unable to connect. Using --server-id without
    specifying a value has the same effect as setting it explicitly
    to 0.

Also added relevant/version-specific info to --server-id and --log-bin option descriptions in the 5.0-5.7 versions of the Manual.

Closed.
[4 Dec 2013 10:42] Laurynas Biveinis
mysql-server$ bzr log -r 6591
------------------------------------------------------------
revno: 6591
committer: Tiago Jorge <tiago.jorge@oracle.com>
branch nick: mysql-trunk
timestamp: Mon 2013-09-30 18:08:27 +0100
message:
  BUG#11763963 - BINARY LOG ACTIVATION SETS SERVER_ID IMPLICITLY,
                 BUT REPLICATION FAILS
  
  Problem:
  When one enables binary logging without setting a server id, letting it
  default to 0, the server changes it to 1, but replication does not work.
  According to the documentation, a server id > 0, should be a legal configuration
  to allow slave to connect and replication to work.
  
  Analysis:
  Apparently, this happens since the dawn of time as a design decision, meaning
  that binlog => server_id.
  
  In mysqld.cc::main()
  
  [...]
  
    if (opt_bin_log && server_id == 0)
    {
      server_id= 1;
  #ifdef EXTRA_DEBUG
      sql_print_warning("You have enabled the binary log, but you haven't set "
                        "server-id to a non-zero value: we force server id to
  1; "
                        "updates will be logged to the binary log, but "
                        "connections from slaves will not be accepted.");
  #endif
    }
  [...]
  
  What happened is that, when a slave tried to connect, it would get an error
  stating that the master was misconfigured, even with server-id = 1.
  
  After some testing we concluded that not even the documentation was accurate
  enough. The test conducted were:
  
  1) Start master with no options. Make sure that slave cannot connect (control
  experience)
  2) Start master with only binlog active. Make sure that slave cannot connect
  (the bug situation)
  3) Start master with binlog active and server-id == 0 in the command line.
  Make sure that slave cannot connect. (as documented)
  4) Start master with binlog active and server-id == 1 in the command line.
  Change server_id to 0 using SQL before connecting slave. Make sure that slave
  cannot connect.(as documented)
  
  Situations 1) and 2) are OK and according with expected and analyzed
  behavior. With 3) and 4) things are not as documented.
  
  In Situation 3) the server starts and implicitly sets server id to 1. When
  one starts a slave after configuring the master, there is no error in "SHOW
  SLAVE STATUS". It connects and works. Bin and Relay log have a server id of
  1.
  
  In Situation 4) in which one explicitly change the server_id in runtime, the
  behavior is the same as in Situation 3). Contents of the relay log and bin
  log have a server id == 0
  
  Fix:
  For trunk only the decision was:
  - Server_id must be explicitly set if binlog is active. If not, the server
  will not start.
  - if you set server_id=0, no slaves can connect, but your statements will be
  binlogged with the provided server_id.
  - if no server_id is set, it will have the default configured value and have
  the same behaviour as the above, if the default is 0.
  
  This patch will bring the code closer to which is documented. The dynamic change
  of server-id was not addressed here.
  
  For the previous version, one thinks that the documentation should be augmented
  according to the information detailed in the bug page.