Bug #14931 GRANT goes to binlog too late (causes test failures)
Submitted: 14 Nov 2005 22:57 Modified: 27 Mar 2009 18:07
Reporter: Guilhem Bichot Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.0 OS:Linux (linux)
Assigned to: Andrei Elkin CPU Architecture:Any

[14 Nov 2005 22:57] Guilhem Bichot
Description:
Observed initially on our "hammer" machine, this failure of rpl_sp.test:
mysqldev@hammer:~/hammer-glibc23/test/mysql-cluster-5.0.16-linux-x86_64-glibc23/mysql-test> perl ./mysql-test-run.pl --no-manager --timer --sleep=10 --tmpdir=/tmp/my_build-hammer-glibc23 --skip-ndb  rpl_sp
rpl_sp                         [ fail ]

Errors are (from /home/mysqldev/hammer-glibc23/test/mysql-cluster-5.0.16-linux-x86_64-glibc23/mysql-test/var/log/mysqltest-time) :
mysqltest: Result content mismatch
(the last lines may be the most important ones)
Below are the diffs between actual and expected results:
-------------------------------------------------------
*** r/rpl_sp.result     2005-11-14 10:22:24.000000000 +0300
--- r/rpl_sp.reject     2005-11-15 01:46:32.000000000 +0300
***************
*** 287,299 ****
  insert into t1 values (15)
  master-bin.000001     #       Query   1       #       use `mysqltest1`; grant CREATE ROUTINE, EXECUTE on mysqltest1.* to "zedjzlcsjhd"@127.0.0.1
  master-bin.000001     #       Query   1       #       use `mysqltest1`; grant SELECT on mysqltest1.t1 to "zedjzlcsjhd"@127.0.0.1
- master-bin.000001     #       Query   1       #       use `mysqltest1`; grant SELECT, INSERT on mysqltest1.t2 to "zedjzlcsjhd"@127.0.0.1
  master-bin.000001     #       Query   1       #       use `mysqltest1`; create procedure foo4()                         
  deterministic
  begin
  insert into t2 values(3);
  insert into t1 values (5);
  end
  master-bin.000001     #       Query   1       #       use `mysqltest1`; insert into t2 values(3)
  master-bin.000001     #       Query   1       #       use `mysqltest1`; insert into t1 values (15)
  master-bin.000001     #       Query   1       #       use `mysqltest1`; insert into t2 values(3)
--- 287,299 ----
  insert into t1 values (15)
  master-bin.000001     #       Query   1       #       use `mysqltest1`; grant CREATE ROUTINE, EXECUTE on mysqltest1.* to "zedjzlcsjhd"@127.0.0.1
  master-bin.000001     #       Query   1       #       use `mysqltest1`; grant SELECT on mysqltest1.t1 to "zedjzlcsjhd"@127.0.0.1
  master-bin.000001     #       Query   1       #       use `mysqltest1`; create procedure foo4()
  deterministic
  begin
  insert into t2 values(3);
  insert into t1 values (5);
  end
+ master-bin.000001     #       Query   1       #       use `mysqltest1`; grant SELECT, INSERT on mysqltest1.t2 to "zedjzlcsjhd"@127.0.0.1
  master-bin.000001     #       Query   1       #       use `mysqltest1`; insert into t2 values(3)                         
  master-bin.000001     #       Query   1       #       use `mysqltest1`; insert into t1 values (15)
  master-bin.000001     #       Query   1       #       use `mysqltest1`; insert into t2 values(3)

Reason:
GRANT replies ok (send_ok()) to client (mysqltest) before it has written the GRANT statement to binlog. Then in the test, client sends another query (CREATE PROCEDURE) (via another connection), which can (race condition) get into binlog  before the GRANT does. Hence the random wrong order in SHOW BINLOG EVENTS.
REVOKE is also affected probably. CREATE|RENAME|DROP USER may too.
It's even a bigger problem (see similar BUG#14262): even though we binlog before send_ok(), there is still a chance that we binlog after releasing locks on "mysql" tables, which would be wrong. So check (and fix if needed) that too please.

How to repeat:
see description.

Suggested fix:
see description
[19 Dec 2005 18:12] Joerg Bruehe
In 5.0.17, this occurred in these tests, details (like line numbers) slightly varying:

1)
intelxeon3-5.0-standard.log   ps-protocol
sunfire100a-64bit-5.0-cluster.log   ps-protocol
sunfire100a-64bit-5.0-max.log   ps-protocol
sunfire100a-64bit-5.0-pro.log   ps-protocol
sunfire100c-5.0-cluster.log   ps-protocol
sunfire100c-5.0-max.log   ps-protocol
sunfire100c-5.0-pro.log   ps-protocol
sunfire100c-5.0-standard.log   normal

2)
cane-linuxthreads-5.0-pro-gpl.log   normal
cane-linuxthreads-5.0-standard.log   ps-protocol
hammer-glibc23-5.0-cluster.log   normal
hammer-glibc23-5.0-cluster.log   ps-protocol
hammer-glibc23-5.0-debug.log   normal
hammer-glibc23-5.0-debug.log   ps-protocol
hammer-glibc23-5.0-max.log   normal
hammer-glibc23-5.0-max.log   ps-protocol
hammer-glibc23-5.0-pro-gpl.log   normal
hammer-glibc23-5.0-pro-gpl.log   ps-protocol
hammer-glibc23-5.0-pro.log   ps-protocol
osx-tiger-x86-5.0-cluster.log   ps-protocol
osx-tiger-x86-5.0-max.log   normal
osx-tiger-x86-5.0-pro-gpl.log   ps-protocol
osx-tiger-x86-5.0-standard.log   normal

3)
powermacg5-5.0-debug.log   ps-protocol
xserve-5.0-debug.log   normal
xserve-5.0-max.log   normal
xserve-5.0-pro-cert.log   normal
xserve-5.0-pro-gpl-cert.log   normal
xserve-5.0-pro-gpl-cert.log   ps-protocol
xserve-5.0-pro-gpl.log   ps-protocol
xserve-5.0-pro.log   normal
xserve-5.0-pro.log   ps-protocol
xserve-5.0-standard.log   normal
xserve-64bit-5.0-debug.log   normal
xserve-64bit-5.0-max.log   normal
xserve-64bit-5.0-pro-cert.log   normal
xserve-64bit-5.0-pro-gpl-cert.log   normal
xserve-64bit-5.0-pro-gpl-cert.log   ps-protocol
[4 Jan 2006 14:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/607
[4 Jan 2006 14:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/608
[4 Jan 2006 14:39] Kristian Nielsen
Until this bug is fixed, to avoid the sporadic test failures masking other bugs, I have modified the test cases so they (hopefully) will no longer fail.

I added SELECT 1 between the GRANT statements and the establishment of a new connection. The server thread will finish binlogging the GRANT before running the SELECT 1, hence binlogging order will be deterministic.

As part of the bugfix, the test cases must be restored! To do this remove the SLEEP 1 (there are comments mentioning this under BUG#14931).

In mysql-5.0 this is test case rpl_sp.

In mysql-5.1-new this is test cases rpl_stm_sp and rpl_row_sp000.
[27 Mar 2009 18:07] Andrei Elkin
The bug ceased as the current sources 5.0 prove.

1. Per sending Ok before to binlog, for all alleged cases

    if (!(res= mysql_create_user(thd, lex->users_list)))
      send_ok(thd);

    if (!(res= mysql_drop_user(thd, lex->users_list)))
      send_ok(thd);

    if (!(res = mysql_revoke_all(thd, lex->users_list)))
      send_ok(thd);

    if (!(res= mysql_rename_user(thd, lex->users_list)))
      send_ok(thd);

  mysql_grant()
  ...
    write_bin_log(thd, TRUE, thd->query, thd->query_length);
  }

  rw_unlock(&LOCK_grant);
  close_thread_tables(thd);

  if (!result)
    my_ok(thd);

binlogging happens before sending OK.

2. Per possibility of binlogging after releasing a lock on the table,
   it's no longer relevant as well e.g

  mysql_create_user():

  if (some_users_created)
    write_bin_log(thd, FALSE, thd->query, thd->query_length);

  rw_unlock(&LOCK_grant);
  close_thread_tables(thd);  // *Unlocks tables*

The final point, a similar effect of sending OK too early has still existed
for INSERT (one connection) / SELECT (second connection) on myisam table:
Bug #36618  myisam insert not immediately visible to select from another client.