Bug #32833 Falcon: crash creating utf8 index
Submitted: 29 Nov 2007 2:26 Modified: 3 Dec 2007 14:26
Reporter: Peter Gulutzan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:6.0.4-alpha-debug OS:Linux (SUSE 10 64-bit)
Assigned to: Kevin Lewis CPU Architecture:Any

[29 Nov 2007 2:26] Peter Gulutzan
Description:
I create a table with a UTF32 column.
I populate it with 1,000,000 rows.
I change the UTF32 column to UTF8.
I try to create an index.
Crash.

The crash does not happen if engine=innodb.

How to repeat:
delimiter //
drop table t//
drop procedure p//
CREATE TABLE t (int_column INT, char_column CHAR(5) CHARACTER SET UTF32) ENGINE=FALCON
PARTITION BY KEY(char_column)//
SET @@autocommit=0//
CREATE PROCEDURE p ()
BEGIN
  DECLARE v_rownum INT DEFAULT 0;
  DECLARE v_random_character CHAR(1);
  DECLARE v_random_string VARCHAR(50);
  WHILE v_rownum < 1000000 DO
    IF V_rownum MOD 1000 = 0 THEN COMMIT; SELECT v_rownum; END IF;
    SET @stmt1 = CONCAT('INSERT INTO t VALUES (',
    v_rownum, ',', '0x',v_rownum, ');');
SELECT v_rownum,v_random_string,@stmt1;
    PREPARE stmt1 FROM @stmt1;
    EXECUTE stmt1;
    SET v_rownum = v_rownum + 1;
    END WHILE;
  END//
DELIMITER ;
CALL p();
ALTER TABLE t MODIFY COLUMN char_column CHAR(5) CHARACTER SET utf8;
CREATE INDEX i ON t (char_column);
[29 Nov 2007 12:31] MySQL Verification Team
Thank you for the bug report.

mysql> ALTER TABLE t MODIFY COLUMN char_column CHAR(5) CHARACTER SET utf8;
Query OK, 1000000 rows affected, 65535 warnings (28.01 sec)
Records: 1000000  Duplicates: 0  Warnings: 0

mysql> CREATE INDEX i ON t (char_column);
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>
[29 Nov 2007 19:55] Kevin Lewis
Peter, what codebase did you use to reproduce this?  Usually, we should try to reproduce in the mysql-6.0-falcon tree.  We have been merging down from mysql-6.0 this week.  But this is what I get in my current code;

mysql> CREATE TABLE t (int_column INT, char_column CHAR(5) CHARACTER SET UTF32) ENGINE=FALCON;
ERROR 1115 (42000): Unknown character set: 'UTF32'

This seems to be related to WL#1213.  Our code base does not have that yet.
[29 Nov 2007 20:19] Peter Gulutzan
Hi Kevin,

when I put '6.0.4-alpha-debug' in the bug's 'Version' field,
that means I'm using mysql-6.0. I would not report a bug in
a team tree without saying so.

Yes, the utf32 character set is due to WL#1213.
I marked the bug "Falcon" because the crash only happens with Falcon.
[1 Dec 2007 16:53] Kevin Lewis
Once the WL1213 code was merged into the falcon tree, and I was able to correclty merge it into my working tree, I found that I could not reproduce this crash.  Assuming it was really a bug in the main mysql-6.0 tree, it must be fixed now.  

I adapted the testcase to the following which is different in that it insert 100 records instead of the 1,000,000 records.  I did this so the test will run faster, but it does not crash with 1,000,000 records either.  It inserts UTF32 characters incremented by 1000.

--source include/have_falcon.inc
SET STORAGE_ENGINE = Falcon;
#
# Bug #32833: : Falcon: crash creating utf8 index
#
--echo *** Bug #32833 ***

--disable_warnings
DROP TABLE IF EXISTS t1;
DROP PROCEDURE IF EXISTS p1;
--enable_warnings

--echo # Create table
CREATE TABLE t1 (int_column INT, char_column CHAR(5) CHARACTER SET UTF32) 
  ENGINE=FALCON PARTITION BY KEY(char_column);

--echo # Create and call procedure
SET @@autocommit=0;
delimiter //;
CREATE PROCEDURE p1 ()
BEGIN
  DECLARE v_rownum INT DEFAULT 0;
  DECLARE v_random_character CHAR(1);
  DECLARE v_random_string VARCHAR(50);
  WHILE v_rownum < 1000 DO
    IF V_rownum MOD 100 = 0 THEN COMMIT; END IF;
    # IF V_rownum MOD 1000 = 0 THEN SELECT v_rownum; END IF;
    SET @stmt1 = CONCAT('INSERT INTO t1 VALUES (',
      v_rownum, ',', '0x',v_rownum, ');');
    # SELECT v_rownum,v_random_string,@stmt1;
    PREPARE stmt1 FROM @stmt1;
    EXECUTE stmt1;
    SET v_rownum = v_rownum + 1000;
    END WHILE;
	SELECT v_rownum;
  END//
DELIMITER ;//
CALL p1();

--echo # Change column from UTF32 to UTF8
ALTER TABLE t1 MODIFY COLUMN char_column CHAR(5) CHARACTER SET utf8;

--echo # Add index to UTF8 column
CREATE INDEX i ON t1 (char_column);

--echo # Final cleanup
DROP TABLE t1;
DROP PROCEDURE p1;
[2 Dec 2007 20:23] Hakan Küçükyılmaz
Works now.
[3 Dec 2007 3:02] Kevin Lewis
I changed the test script a little so that there are more records put into the partitioned file and the utf32 character values are distributed in value by an odd number; 357 

This gives an even distribution of 2802 values between 0 and 1,000,000. I also disabled warnings during the select and the alter table since a lot of these utf32 values do not seem to convert nicely to utf8. That is probably expected, but someone (maybe Bar) should look at how these conversions are done and determine if it is correct.  The same conversions happen for myisam, it looks like.  This test script is mysql-test/suite/falcon/t/falcon_bug_32833.test.
[3 Dec 2007 10:02] Bugs System
Pushed into 6.0.4-alpha
[3 Dec 2007 14:26] MC Brown
A note has been added to the 6.0.4 changelog: 

Creating an index on a Falcon table with a column using UTF32 that has been converted to UTF8 would cause a server crash.
[4 Dec 2007 11:49] Alexander Barkov
Kevin,

I can confirm that many of the characters in Peter's
example are not good UTF-8 sequences. So warnings are expected.
[4 Dec 2007 11:55] Alexander Barkov
Kevin,

with your change with magic number 357 many numbers
are not valid utf8 codes. But I think it's ok.

Please make sure that mysqld without your patch does
crash with your modified test case.