Bug #10747 federated test aborts with dedicated port options and --with-ndbcluster
Submitted: 19 May 2005 14:52    Modified: 22 Jun 2005 4:55
Reporter: Ingo Strüwing
Status: Not a Bug
Category: MySQL Server    Severity: S3 (Non-critical)
Version: 5.0.7    OS: Linux (Linux/x86)
Assigned to: Bugs System    CPU Architecture: Any

[19 May 2005 14:52] Ingo Strüwing
Description:
When running the federated test with --with-ndbcluster --master_port=59011 --slave_port=59012, it aborts:

TEST                            RESULT
-------------------------------------------------------
ERROR: /home/mydev/mysql-5.0-5000/mysql-test/var/run/slave.pid was not created in 400 seconds;  Aborting

mysql-test/var/log/slave.err:
CURRENT_TEST: federated
050519 17:03:31 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
050519 17:03:31 [ERROR] Do you already have another mysqld server running on port: 59012 ?
050519 17:03:31 [ERROR] Aborting

I double-checked before starting the test that _no_ mysqld was running on my system and that port numbers 59011 and 59012 were _not_ in use. During the 400-second period, however, two mysqld processes were running, one on port 59011 and the other on 59012. So the test seemed to be trying to start another slave mysqld.

If the federated test really needs another slave mysqld, then please make sure that it can be assigned a dedicated port number via an option to mysql-test-run, similar to --master_port, --slave_port, or --ndbcluster_port. I desperately want to run two or more mysql-test-run at the same time.

The funny thing is that it happens only with --with-ndbcluster and not with --skip-ndbcluster, regardless of whether the --ndbcluster_port option is present.

How to repeat:
cd mysql-test
./mysql-test-run --with-ndbcluster --master_port=59011 --slave_port=59012 federated
[29 May 2005 0:57] Jorge del Conde
Thanks for your bug report.  I was able to reproduce this with 5.0.7 from bk.
[7 Jun 2005 18:22] Patrick Galbraith
<patg>	I have a bug logged against federated by ingo saying that he can't run two simultaneous tests; he requests that I add to the test suite a way for federated to have its own master/slave server
[19:34] 	<joerg>	kent: So what is the resolution for the first build?  Plain "max" with tests, or with yassl and no tests?
[19:35] 	<patg>	kent: so, that would require the modification of mysql-test-run
[19:37] 	<kent>	patg: Why can't he run two simultaneous tests?
[19:37] 	<Matt>	kent, sorry not up-to-date with the demands for yaSSL and the status of how it currently builds. so i have no opinion
[19:37] 	*	kent wants to avoid even more servers started, it slows us down
[19:37] 	<patg>	kent: one sec, I'll post his bug
[19:38] 	<patg>	kent: http://bugs.mysql.com/bug.php?id=10747
[19:38] 	<patg>	kent: that's my first feeling. Maybe there's something else to be done for this.
[19:40] 	<kent>	patg: I heard something about the server now using more ports, is that the federated doing that?
[19:40] 	<kent>	patg: But that we just have to change so the port range reserved is a bit larger. In that case, a simple correction.
[19:40] 	<patg>	kent: all I need are two servers, and I have SLAVE_PORT and MASTER_PORT which I use. Well, really only SLAVE_PORT
[19:41] 	<patg>	kent: oh, really? 
[19:41] 	<patg>	kent: range, where is that set in mysql-test-run?
[19:42] 	<kent>	patg: I will look, just a minute.
[19:59] 	<joerg>	Yes, that was my idea about this.
[20:03] 	<kent>	patg: It is a second master that is started when cluster is used. It is more a documentation flaw, lack of error tests in the script, and bad flag name. A --master_port=1000 means the master "port range" starts at 1000.
[20:04] 	<kent>	patg: So the second master conflicts with the first slave.
[20:06] 	<patg>	kent: hmm, so --master_port's range is in conflict with --slave_port's?
[20:06] 	<kent>	patg: Yes. As he put them close, it will not work.
[20:07] 	<patg>	kent: I've asked him to come to the channel
[20:07] 	-->	ingo (mydev@10.100.50.51) has joined #engineering
[20:07] 	<ingo>	Hi.
[20:07] 	<patg>	ingo: kent tells me it is more of a range issue than a federated issue
[20:08] 	<patg>	<kent> patg: It is a second master that is started when cluster is used. It is more a documentation flaw, lack of error tests in the script, and bad flag name. A --master_port=1000 means the master "port range" starts at 1000.
[20:08] 	<patg>	[20:04]  <kent> patg: So the second master conflicts with the first slave.
[20:08] 	<kent>	ingo: The ndb team did a change, so that when --with-ndbcluster is given, it will start two masters.
[20:09] 	<patg>	kent: should we inform paul dubois about this issue?
[20:09] 	<ingo>	Hm. What I need in my environment is predictable port numbers for everything.
[20:09] 	<ingo>	I want to run several mysql-test-run at the same time.
[20:09] 	<ingo>	So every port used by them must be configurable from the command line.
[20:10] 	<ingo>	If this is possible and documented, I am pleased.
[20:10] 	<kent>	ingo: So for the 4x slave test, you want to configure 2x masters and 4x slaves? Aren't better documentation and a check in the script for conflicts a better idea?
[20:12] 	<patg>	kent: those flags need to be changed, I would say to --master_port_start_range or something.
[20:12] 	<ingo>	If the script solves the usage of un-used ports, it would be ok for me too.
[20:12] 	<patg>	kent: are there no checks to see if a port is indeed being used?
[20:12] 	<ingo>	I meant "if the script cares for the usage of un-used ports".
[20:13] 	<kent>	patg: Not in the old script, no. It calls mysqladmin to shut down possibly old servers
[20:13] 	<patg>	yikes.
[20:13] 	*	patg knows, yes, he wanted to help out with this ;)
[20:13] 	<kent>	patg: How do you test if a port is in use from Bourne shell?
[20:14] 	<patg>	kent: not sure, but I can find out.
[20:14] 	<patg>	kent: my guess, netstat -a, then grep
[20:14] 	<kent>	patg: The Perl version does have code to check port usage, but I didn't think of using it in this case. I use it to see if a shutdown is complete enough to restart the server.
[20:14] 	<patg>	kent: using netstat -a ?
[20:16] 	<kent>	ingo, patg: In this case I'm not prepared to invest time in improving the old mysql-test-run.sh script, but someone else is free to do that. You need to specify the port numbers "with some space between them". I will make a note to add the test to the new script.
[20:19] 	<patg>	kent: so, what needs to be done is that mysql-test-run.sh needs to make sure there are adequate ranges/ports for each server started given a range?
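
The "netstat -a, then grep" idea from the chat above could be sketched roughly as follows. This is only an illustration, not code from mysql-test-run: the helper names port_listed and port_in_use are hypothetical, and the grep pattern assumes that `netstat -an` prints listening sockets as an address ending in `:PORT` or `.PORT`, followed later on the line by `LISTEN`, which varies between platforms.

```sh
#!/bin/sh
# Hypothetical helpers, not part of mysql-test-run.

# port_listed PORT
# Reads `netstat -an` style output on stdin and succeeds if some line
# shows a listening socket whose local address ends in :PORT or .PORT.
port_listed() {
    grep -E "[.:]$1[[:space:]].*LISTEN" >/dev/null
}

# port_in_use PORT
# Runs netstat and checks its output. If netstat is missing, the pipe is
# empty and the port is reported as free (an assumption of this sketch).
port_in_use() {
    netstat -an 2>/dev/null | port_listed "$1"
}

for port in 59011 59012; do
    if port_in_use "$port"; then
        echo "port $port is in use"
    else
        echo "port $port looks free"
    fi
done
```

A check like this is inherently racy: another process can bind the port between the check and the server start, so it can only catch the common case of a server that is already running.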
[8 Jun 2005 8:24] Ingo Strüwing
Selecting the port numbers with some "space" between them did indeed help. From my point of view the problem is solved now. It might be good to document this behaviour somehow, though, be it as a comment in mysql-test-run or, even better, in a --help option. Or even in the reference manual.
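
The conflict and the workaround can be illustrated with a small range check. This is a minimal sketch under stated assumptions: the helper ranges_overlap is hypothetical, and the counts used (two ports for the master range when the cluster is used, one for the slave) are inferred from the discussion above rather than taken from the script itself.

```sh
#!/bin/sh
# Hypothetical helper, not part of mysql-test-run.
# ranges_overlap START_A COUNT_A START_B COUNT_B
# Succeeds (exit status 0) if the two port ranges share at least one port.
ranges_overlap() {
    a_start=$1; a_end=$(( $1 + $2 - 1 ))
    b_start=$3; b_end=$(( $3 + $4 - 1 ))
    [ "$a_start" -le "$b_end" ] && [ "$b_start" -le "$a_end" ]
}

MASTER_PORT=59011
SLAVE_PORT=59012
MASTER_COUNT=2   # assumption: a second master is started with --with-ndbcluster
SLAVE_COUNT=1    # assumption: a single slave

if ranges_overlap "$MASTER_PORT" "$MASTER_COUNT" "$SLAVE_PORT" "$SLAVE_COUNT"; then
    echo "conflict: master and slave port ranges overlap"
else
    echo "ok: port ranges are disjoint"
fi
```

With the values from the report, the master range covers 59011 and 59012, so it collides with the slave on 59012. Moving the slave port further away from the master port makes the ranges disjoint.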
[22 Jun 2005 4:55] Patrick Galbraith
Found out that a larger range of ports needs to be used. There should eventually be better handling of this in the test script, but since the script is currently being re-written, this is something that can be worked out later since it's not really a bug.