Bug #34647 rpl_temporary discovers more than one dump thread and fails to select
Submitted: 18 Feb 2008 21:00 Modified: 28 Jul 2008 15:24
Reporter: Andrei Elkin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1.23 and up OS:Any
Assigned to: Andrei Elkin CPU Architecture:Any
Tags: disabled, pushbuild, sporadic, test failure

[18 Feb 2008 21:00] Andrei Elkin
Description:
This is a rather rare failure of the test script 

rpl.rpl_temporary              [ fail ]

mysqltest: At line 148: query 'select id from information_schema.processlist where command='Binlog Dump' into @id' failed: 1172: Result consisted of more than one row

The result from queries just before the failure was:

The lines from 
https://intranet.mysql.com/secure/pushbuild/getlog.pl?dir=mysql-5.1-maint2&entry=msvensson...
most probably mean there were more than one (expected) instance of the dump thread on master.

The error log of the failing run 
https://intranet.mysql.com/secure/pushbuild/getlog.pl?dir=mysql-5.1-new-rpl&entry=ssandber...
does not reveal much
:
CURRENT_TEST: rpl.rpl_temporary
080216 14:21:15 [Warning] The syntax 'for replication startup options' is deprecated and will be removed in MySQL 5.2. Please use 'CHANGE MASTER' instead.
080216 14:21:15 [Note] Plugin 'InnoDB' disabled by command line option
080216 14:21:15 [Note] Event Scheduler: Loaded 0 events
080216 14:21:15 [Note] /export/home/pushbuild/pb/mysql-5.1-new-rpl/476/mysql-5.1.24-rc-pb476/sql/mysqld: ready for connections.
Version: '5.1.24-rc-pb476-log'  socket: '/tmp/pbtmp-ps_mix-6/slave.sock'  port: 10062  MySQL Pushbuild Edition, build 476
080216 14:21:15 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log '/tmp/var-ps_mix-6/log/slave-relay-bin.000001' position: 4
080216 14:21:15 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:10060',replication started in log 'FIRST' at position 4
080216 14:21:15 [Note] Slave: received end packet from server, apparent master shutdown: 
080216 14:21:15 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'master-bin.000001' at postion 2768
080216 14:21:15 [Note] /export/home/pushbuild/pb/mysql-5.1-new-rpl/476/mysql-5.1.24-rc-pb476/sql/mysqld: Normal shutdown

How to repeat:
Watch pb

Suggested fix:
Find out how more than one dump thread appeared, make changes in the test to guarantee there is only one at a time.
[31 Mar 2008 12:00] Vasil Dimov
I am seeing this on the latest 5.1-BK (5.1.24):

ChangeSet@1.2555, 2008-03-30 14:12:27+03:00, aelkin@mysql1000.(none) +3 -0

MySQL compiled with
./BUILD/autorun.sh && ./configure --with-plugins=innobase && make -j2

It does not happen if I run the test standalone:

./mysql-test-run.pl rpl_temporary

It happens if I run
./mysql-test-run.pl
or
./mysql-test-run.pl --suite=rpl

I added this command in the test (just before the failing command) to see what are the selected rows:

select * from information_schema.processlist where command='Binlog Dump';

result:

rpl.rpl_switch_stm_row_mixed 'row' [ pass ]            628
rpl.rpl_temp_table 'row'       [ skipped ]   Doesn't support --binlog-format='row'
rpl.rpl_temporary 'row'        [ fail ]

mysqltest: At line 150: query 'select id from information_schema.processlist where command='Binlog Dump' into @id' failed: 1172: Result consisted of more than one row

The result from queries just before the failure was:
< snip >
drop temporary table t3;
SET TIMESTAMP=1040324219;
SET @@session.pseudo_thread_id=2;
insert into t2 select count(*) from t3;
SET TIMESTAMP=1040324224;
SET @@session.pseudo_thread_id=2;
drop temporary table t3;
select * from t2 ORDER BY f;
f
5
7
drop table t1,t2;
create temporary table t3 (f int);
create temporary table t4 (f int);
create table t5 (f int);
select * from information_schema.processlist where command='Binlog Dump';
ID	USER	HOST	DB	COMMAND	TIME	STATE	INFO
23	root	localhost:53998	NULL	Binlog Dump	0	Has sent all binlog to slave; waiting for binlog to be updated	NULL
10	root	localhost:54034	NULL	Binlog Dump	4	Writing to net	NULL
select id from information_schema.processlist where command='Binlog Dump' into @id;

More results from queries before failure can be found in /tmp/mysql-5.1-bk/mysql-test/var/log/rpl_temporary.log

Aborting: rpl.rpl_temporary failed in default mode. 
To continue, re-run with '--force'.
Stopping All Servers
Autoreleasing /tmp/mysql-test-ports:200
[19 Jun 2008 7:01] Sven Sandberg
BUG#35763 is a dup of this.
[19 Jun 2008 7:03] Sven Sandberg
Fails also in 6.0 main: 
url: https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0&order=7
host: 'sol10-amd64-a' -max GCC
suite: rpl_binlog_row
[7 Jul 2008 11:26] Joerg Bruehe
Same test failure in the 5.1.26-rc build.

There, it occurred in
- *all* 24 test runs on i5OS (32 + 64 bit, all 6 configs each, normal + PS run),
- 6 (out of 12) tests on Solaris 10 AMD64,
- 4 (out of 12) tests on Solaris 10 x86.
[16 Jul 2008 6:55] Alexander Nozdrin
Another symptom:
rpl.rpl_temporary 'row'        [ fail ]

mysqltest: At line 149: query 'select id from information_schema.processlist where command='Binlog Dump' into @id' failed: 1172: Result consisted of more than one row

The result from queries just before the failure was:
< snip >
SET @@session.pseudo_thread_id=2;
insert into t3 select * from t1 where f>=4;
SET TIMESTAMP=1040324211;
SET @@session.pseudo_thread_id=1;
drop temporary table t3;
SET TIMESTAMP=1040324219;
SET @@session.pseudo_thread_id=2;
insert into t2 select count(*) from t3;
SET TIMESTAMP=1040324224;
SET @@session.pseudo_thread_id=2;
drop temporary table t3;
select * from t2 ORDER BY f;
f
5
7
drop table t1,t2;
create temporary table t3 (f int);
create temporary table t4 (f int);
create table t5 (f int);
select id from information_schema.processlist where command='Binlog Dump' into @id;
[17 Jul 2008 10:35] Alexander Nozdrin
A test case is disabled in 6.0 due to often failures.
[17 Jul 2008 15:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/49940

2705 Andrei Elkin	2008-07-17
      Bug #34647 rpl_temporary discovers more than one dump thread and fails to select
      
      The test failure was due to unexpected presense of an old inactive instance of 
      the dump thread.
      
      Fixed adding synchronization barrier to ensure there is no stale dump 
      thread after reset master.
[18 Jul 2008 10:01] Sven Sandberg
The approved patch is http://lists.mysql.com/commits/50009 (was not automatically added here).
[18 Jul 2008 14:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/50040

2707 Andrei Elkin	2008-07-18 [merge]
      merge
[18 Jul 2008 14:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/50041

2707 Andrei Elkin	2008-07-18 [merge]
      merge
[18 Jul 2008 14:21] Andrei Elkin
pushed to 5.1-bugteam
[22 Jul 2008 18:36] Bugs System
Pushed into 5.1.28
[22 Jul 2008 20:03] Paul DuBois
Test case change. No changelog entry needed.

Setting report to Patch queued pending push into 6.0.x.
[28 Jul 2008 14:45] Bugs System
Pushed into 6.0.7-alpha  (revid:alik@mysql.com-20080725172155-fnc73o50e4tgl23k) (version source revid:alik@mysql.com-20080725172155-fnc73o50e4tgl23k) (pib:3)
[28 Jul 2008 15:24] Paul DuBois
Test case change. No changelog entry needed.
[28 Jul 2008 16:44] Bugs System
Pushed into 5.1.28  (revid:davi.arnaut@sun.com-20080722182431-0i2f1yc4uocime9q) (version source revid:davi.arnaut@sun.com-20080722182431-0i2f1yc4uocime9q) (pib:3)
[13 Sep 2008 22:14] Bugs System
Pushed into 6.0.6-alpha  (revid:aelkin@mysql.com-20080718082055-4ogg41y7e5oiwchz) (version source revid:hakan@mysql.com-20080716105246-eg0utbybp122n2w9) (pib:3)