Bug #41014 falcon_bug_39708 fails in pushbuild in 6.0-rpl: "succeeded - should have failed"
Submitted: 25 Nov 2008 14:04 Modified: 19 Dec 2008 14:40
Reporter: Sven Sandberg Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0 OS:Any
Assigned to: John Embretsen CPU Architecture:Any
Tags: 6.0-rpl-green, F_TEST, falcon_bug_39708, pushbuild, test failure

[25 Nov 2008 14:04] Sven Sandberg
Description:
pushbuild failure started in 6.0-rpl on 2008-11-24:

falcon.falcon_bug_39708                  [ fail ]

CURRENT_TEST: falcon.falcon_bug_39708
mysqltest: At line 4: query 'CREATE TABLE t1(a VARCHAR(140) CHARACTER SET utf8, KEY(a)) ENGINE=falcon' succeeded - should have failed with errno 1071...
Failed to read from '/dev/shm/var-falcon-112/log/falcon_bug_39708.log', errno: 22

 - saving '/dev/shm/var-falcon-112/log/falcon.falcon_bug_39708/' to '/dev/shm/var-falcon-112/log/falcon.falcon_bug_39708/'

Retrying test, attempt(2/3)...

How to repeat:
https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0-rpl&order=109 debx86-b/falcon
xref: http://tinyurl.com/6fstev
[27 Nov 2008 16:13] Kevin Lewis
John, Can you determine why this test fails using a different MTR than we have in Falcon?  It says the CREATE failed but does not say why.  Can you reproduce this?
[27 Nov 2008 18:07] John Embretsen
Strangely enough, behaviour differs between the branches mysql-6.0-falcon-team and mysql-6.0-rpl. I started the server using:

./mysql-test-run.pl --mysqld=--falcon-page-size=2k --start-and-exit

(in mysql-6.0-rpl branch I had to set the environment variables MTR_VERSION=1 first)

--- --- --- --- --- ---
mysql-6.0-falcon-team:
--- --- --- --- --- ---

mysql> SHOW VARIABLES LIKE '%falcon_page_size%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| falcon_page_size | 2048  | 
+------------------+-------+
1 row in set (0.01 sec)

mysql> CREATE TABLE t1(a VARCHAR(140) CHARACTER SET utf8, KEY(a)) ENGINE=falcon;
ERROR 1071 (42000): Specified key was too long; max key length is 540 bytes

--- --- --- --- --- ---
mysql-6.0-rpl:
--- --- --- --- --- ---

mysql> SHOW VARIABLES LIKE '%falcon_page_size%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| falcon_page_size | 2048  | 
+------------------+-------+
1 row in set (0.00 sec)

mysql> CREATE TABLE t1(a VARCHAR(140) CHARACTER SET utf8, KEY(a)) ENGINE=falcon;
Query OK, 0 rows affected (0.08 sec)

The test expects the CREATE TABLE statement to fail, as it does in the mysql-6.0-falcon-team branch. I will try to investigate further.
[27 Nov 2008 18:46] John Embretsen
Falcon versions in the respective branches are (from storage/falcon/StorageVersion.h):

mysql-6.0-falcon-team:
----------------------
#define FALCON_VERSION	"T1.3-6"
#define FALCON_DATE		"27 November, 2008"

mysql-6.0-rpl:
----------------------
#define FALCON_VERSION	"T1.3-5"
#define FALCON_DATE		"14 November, 2008"

In other words, the -rpl branch seems to have a pretty much up-to-date version of Falcon, making this issue seem even more strange.
[28 Nov 2008 4:47] Kevin Lewis
John,

I suspect that this function is returning two different numbers because of the lack of a current storageConnection in this handler.  In other words, max_supported_key_length() might be called at a different time.  If that is the case, then I would like Lars-Erik to investigate if the storageConnection is really needed to know the page size, and thus the max key length.

uint StorageInterface::max_supported_key_length(void) const
{
	// Assume 4K page unless proven otherwise.
	if (storageConnection)
		return storageConnection->getMaxKeyLength();

	return MAX_INDEX_KEY_LENGTH_4K;  // Default value.
}
[4 Dec 2008 14:27] John Embretsen
Kevin, 

you are probably right when it comes to StorageInterface::max_supported_key_length in ha_falcon.cpp. When running the test (falcon_bug_39708) in a debugger I see that in the -rpl branches there does not seem to be a storageConnection when the method is accessed, as opposed to when doing the same in main or falcon-team branches. I did not do any further debugging in that area.

However, I have done some digging in order to figure out why the behavior is different between the branches in question. My conclusion is that it has to do with how the server is initialized before the "real" server and the test or test suite is started. The MTR does this in a way that is similar to the mysql_install_db script.

With the new MTR in the -rpl branches the server is bootstrapped with fewer options than in main and other branches. It seems that there is one specific option which makes all the difference in this case:

--loose-skip-falcon

This option is included in the boostrap invocation of mysqld in MTR v1, but is not included in MTR v2 (which is in the -rpl branches only).

Below is the full set of options from the respective mysqld initializations (I have replaced full paths with "/srcdir/"), obtained by running

./mtr --suite=falcon --start-and-exit --skip-ndb --script-debug

MTRv1 (main 6.0 and falcon branches):
--------------------------------------

mysqld  
--no-defaults 
--bootstrap 
--basedir=/srcdir/mysql-test 
--datadir=/srcdir/mysql-test/var/master-data 
--loose-skip-innodb 
--loose-skip-ndbcluster 
--loose-skip-falcon 
--disable-sync-frm 
--loose-disable-debug 
--tmpdir=. 
--core-file 
--language=/srcdir/sql/share/english 
--character-sets-dir=/srcdir/sql/share/charsets

MTR v2 (rpl branches and mysql-6.0_with_tests_from_6.0-rpl):
-------------------------------------------------------------

mysqld  
--no-defaults 
--bootstrap 
--basedir=/srcdir/mysql-test 
--datadir=/srcdir/mysql-test/var/master-data
--loose-skip-innodb 
--loose-skip-ndbcluster 
--tmpdir=. 
--core-file 
--language=/srcdir/sql/share/english 
--character-sets-dir=/srcdir/sql/share/charsets

So, it seems that if falcon is skipped during bootstrap the --falcon-page-size=2k option takes effect when the server is subsequently started (if the option is included). This probably this means that a storageConnection is obtained at the correct time.

If falcon is *not* skipped during bootstrap (and --falcon-page-size=2k is not specified at bootstrap) then to the user it seems that Falcon remembers the default value from the bootstrap (although this is not shown when doing SHOW VARIABLES). This probably means that the Falcon handler does not have a storageConnection when the table is created in this case.

I hope this makes any sense. From my point of view it seems that this requires a fix in Falcon so that options specified at server startup (as opposed to bootstrap/initialization) are adhered to, regardless of whether falcon was disabled during bootstrap or not. I don't know if --falcon-page-size is the only option affected by this.

Assigning this to Lars-Erik as suggested by Kevin earlier.
[4 Dec 2008 14:39] John Embretsen
How to provoke this issue in mysql-6.0-falcon-team branch (Unix/Linux style):

1. Compile the source as you normally do. Let's call the top-level directory "SRCDIR".

2. Initialize the database/server:

   ./scripts/mysql_install_db \
       --no-defaults \
       --srcdir=${SRCDIR} \
       --datadir=${SRCDIR}/var

3. Start mysqld:

   ./sql/mysqld \
       --no-defaults \
       --basedir=${SRCDIR} \
       --language=${SRCDIR}/sql/share/english \
       --falcon-page-size=2k

4. Connect to the server:

   ./client/mysql -u root -h 127.0.0.1

5. Verify value of falcon-page-size option and try to create table:

   mysql> use test;
   Database changed

   mysql> SHOW VARIABLES LIKE '%falcon_page_size%';
   +------------------+-------+
   | Variable_name    | Value |
   +------------------+-------+
   | falcon_page_size | 2048  |
   +------------------+-------+
   1 row in set (0.00 sec)

   mysql> CREATE TABLE t1(a VARCHAR(140) CHARACTER SET utf8, KEY(a)) ENGINE=falcon;

If the CREATE TABLE statement succeeded, you have reproduced the bug.

If you include --loose-skip-falcon when running mysql_install_db (step 2), you will get an error instead (correct behavior):

ERROR 1071 (42000): Specified key was too long; max key length is 540 bytes
[8 Dec 2008 14:24] Lars-Erik Bjørk
It does not seem that the existence of a storageConnection is the difference between these two cases. Neither of the cases have a storageConnection set when StorageInterface::max_supported_key_length is called. The storageConnection is not created until later when processing the create statement.

In the case where an error is returned (correct behavior) this is done in Index::checkMaxKeyLength, which calls Database::getMaxKeyLength to get the maximum key length.

In the case where Falcon is bootstrapped from mysql_install_db, no page-size option is given, so Falcon is bootstrapped with a default page_size of 4k.
When subsequently starting mysqld with the option --falcon-page-size=2k, the database is already created with a page-size of 4k, and the option seems to be disregarded. This results in dbb->pageSize in the previously mentioned method (Database::getMaxKeyLength) actually being 4096.

In the case where Falcon is not bootstrapped by mysql_install_db, the database will be created when starting mysqld, and the --falcon-page-size=2k option will be used.

I am not sure what is the proper behavior here, or if it is possible to switch page size without recreating the entire database (probably not?)

The best way may be to either avoid bootstrapping Falcon, or giving the page-size option to the bootstrap process (may not be suitable for mtr?)
[9 Dec 2008 14:36] John Embretsen
Let me try to summarize our discoveries so far and propose a course of action...

 - This test fails with MTR v2 because:

    a) The test (as it is now) requires a certain page-size in Falcon (2K).
       This requires a clean server restart during the test suite.

    b) Falcon's page-size can only be set at the moment when Falcon is
       initialized. Restarting the server does not help unless existing 
       Falcon tablespace files are deleted first.

   Experiments detailed above show that this could have been caused by the
   fact that MTR v2 is not including the option --loose-skip-falcon at
   bootstrap, thus "freezing" the page-size value.

It is interesting to note that if I add --loose-skip-falcon back to the bootstrap options in 

   mysql-test/mysql-test-run.pl:mysql_install_db (line 2592) and
   mysql-test/lib/v1/mysql-test-run.pl:install_db (line 3147)

   (based on mysql-6.0_with_tests_from_rpl-6.0 branch 2008-12-09)
   
and run falcon_bug_39708 both alone and before/after other tests, the test passes when using MTR v2, but fails when setting the environment variable MTR_VERSION=1.

So, it seems that MTR v2 does more cleanup of the mysql-test/var/ directory between server restarts than MTR "v1.2" (MTR_VERSION=1 in rpl branches) and MTR v1 (main branch).

Adding the skip option to the bootstrap command works because MTR v2 uses separate data dirs for bootstrap and test server. Cleaning up the test server's data dir only does not help because the bootstrap directory still contains the falcon tablespace files which store the page-size value.

My conclusion is: It's complicated. Having a specification document for MTR v2 would really help in this case and other similar cases that I'm guessing we will see once more teams start using the new MTR.

I propose the following step to fix this:

 - add back --loose-skip-falcon to bootstrap options in MTR v2 and MTR v1.2.
   (-rpl branches and mysql-6.0_with_tests_from_rpl-6.0 branch)

  This will make the test pass when MTR v2 is used (as in Pushbuild), but it
  may still fail if MTR v2 is used with environment variable MTR_VERSION=1
  (depends on test ordering).

In addition, I propose to also do the following:

 - document that the falcon_page_size variable is fixed at Falcon
   initialization time and that it is not possible to change the value after
   Falcon's default tablespaces have been created.

 - file request to change SHOW VARIABLES to show values actually used by the
   storage engine, not values that are specified on the command line but
   ignored (this applies to falcon_page_size in particular).

 - in order to save number of server restarts and test running time, modify
   falcon_bug_39708 to test for corresponding behavior using the default
   page-size (4K) (some more investigation is needed to find out if this covers
   all intentions of the original test).
[16 Dec 2008 14:07] John Embretsen
Filed new Bug#41502 (MTR v2 should not load plugins during server bootstrap) to track changes to MTR related to this issue. Once that bug is fixed, this test should stop failing with MTR v2.

Filed new Bug#41331 (SHOW VARIABLES should report values used by Falcon instead of ignored values) as a request to make the server/Falcon report the page size value actually used when running the SHOW VARIABLES command.
[19 Dec 2008 9:42] John Embretsen
Test no longer fails in mysql-6.0-rpl branch. The fix for Bug#41502 resolved this issue. Please re-open if you see it happening again.

However, I forgot to also fix the "v1" version of mysql-test-run.pl that is currently included in the rpl branches. This means that if you set the environment variable MTR_VERSION=1, the v1 version will be used and the test will fail the same way for the same reason.

I probably won't push a fix for this unless anyone requests it. Patch is as follows:

=== modified file 'mysql-test/lib/v1/mysql-test-run.pl'
--- mysql-test/lib/v1/mysql-test-run.pl 2008-11-14 08:45:32 +0000
+++ mysql-test/lib/v1/mysql-test-run.pl 2008-12-19 09:37:17 +0000
@@ -3144,6 +3144,7 @@
   mtr_add_arg($args, "--basedir=%s", $path_my_basedir);
   mtr_add_arg($args, "--datadir=%s", $data_dir);
   mtr_add_arg($args, "--loose-skip-innodb");
+  mtr_add_arg($args, "--loose-skip-falcon");
   mtr_add_arg($args, "--loose-skip-ndbcluster");
   mtr_add_arg($args, "--tmpdir=.");
   mtr_add_arg($args, "--core-file");
[19 Dec 2008 12:01] John Embretsen
Slight mind-lapse today... Please disregard the latest comment about patching the "v1" version of MTR in the rpl branches.

The patch for MTR_VERSION will not work when other falcon tests run before falcon_bug_39708. This is most likely because

  a) MTR "v1" (MTR_VERSION=1) does not clean up falcon tablespaces 
     between test runs that require server restarts.
  b) MTRv1 (in main branch) runs a sub-routine called rm_falcon_tables 
     after the servers have been stopped. The "v1" version of MTR in 
     rpl branches does not contain this function. This function deletes 
     the falcon tablespaces, which is where the value of the 
     falcon_page_size variable is stored.

So for the patch to work with MTR_VERSION=1 further changes are required, e.g. adding back the rm_falcon_tables function and calling it at appropriate places.
[19 Dec 2008 14:40] John Embretsen
Marking (at Kevin's request) this as duplicate of 

Bug #41502 - MTR v2 should not load plugins during server bootstrap

as that was the root cause for this failure (even though this bug is actually older). This way we get more accurate bug statistics.