Bug #32391 Character sets: crash with --character-set-server
Submitted: 14 Nov 2007 19:48 Modified: 24 Sep 2010 16:58
Reporter: Peter Gulutzan
Status: Closed
Category: MySQL Server: Charsets    Severity: S3 (Non-critical)
Version: 6.0.5-alpha-debug    OS: Linux (SUSE 10 64-bit)
Assigned to: Alexander Barkov    CPU Architecture: Any

[14 Nov 2007 19:48] Peter Gulutzan
Description:
I'm using the mysql-5.2-rpl team tree.

I try to start mysqld with --character-set-server and
one of the new character sets.
Crash.

How to repeat:
mysqld --character-set-server=utf16

Example:

linux:/home/pgulutzan # /usr/local/mysql/libexec/mysqld --user=root --character-set-server=utf16
071114 12:31:47  InnoDB: Started; log sequence number 0 1268783
mysqld: ctype-ucs2.c:1297: my_strnncollsp_utf16: Assertion `(tlen % 2) == 0' failed.
071114 12:31:47 - mysqld got signal 6;
...
[15 Nov 2007 2:13] MySQL Verification Team
Thank you for the bug report. Verified as described:

[miguel@skybr 5.2a]$ libexec/mysqld --character-set-server=utf16
071115  0:12:33  InnoDB: Started; log sequence number 0 46409
mysqld: ctype-ucs2.c:1297: my_strnncollsp_utf16: Assertion `(tlen % 2) == 0' failed.
071115  0:12:33 - mysqld got signal 6;
[6 Dec 2007 7:35] Alexander Barkov
I can't repeat with mysql-6.0.4-rpl.

Peter, can you please verify if this bug still happens?
[7 Dec 2007 18:33] Peter Gulutzan
It's still crashing with 6.0.5-alpha-debug.
[25 Nov 2008 16:44] Peter Gulutzan
It still crashes with source pulled today.

I built with BUILD/compile-pentium-debug-max on SUSE 32-bit.

The problem now is this line in strings/ctype-ucs2.c, in my_strnncollsp_utf16():
  DBUG_ASSERT((tlen % 2) == 0);
It fails because tlen == 5.
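
For context, the arithmetic behind the failed assertion can be sketched in C. UTF-16 stores every code unit in 2 bytes, so a valid UTF-16 byte string always has an even length, while a single-byte word such as "about" is 5 bytes long. The helper names below are hypothetical and are not taken from the server source; this only illustrates the length check, not the collation code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a byte string can only be well-formed UTF-16
   if its length is a whole number of 2-byte code units. */
static int utf16_length_is_valid(size_t byte_len)
{
  return (byte_len % 2) == 0;
}

/* Returns 1 if a single-byte (latin1/ASCII) word would pass the
   (len % 2) == 0 check when misread as UTF-16, and 0 otherwise. */
static int word_passes_utf16_check(const char *word)
{
  assert(word != NULL);
  return utf16_length_is_valid(strlen(word));
}
```

Calling word_passes_utf16_check("about") yields 0 because strlen("about") is 5, which matches the tlen == 5 value seen in the debugger.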

There is no my.cnf file.

Here are the last 50 lines of mysqld.trace.
T@1    : | <ha_recover
T@1    : | >ft_init_stopwords
T@1    : | | >_mymalloc
T@1    : | | | enter: Size: 344
T@1    : | | | mutex: THR_LOCK_malloc (0x9614460) locking
T@1    : | | | mutex: THR_LOCK_malloc (0x9614460) locked
T@1    : | | | mutex: THR_LOCK_malloc (0x9614460) unlocking
T@1    : | | | mutex: THR_LOCK_malloc (0x9614460) locking
T@1    : | | | mutex: THR_LOCK_malloc (0x9614460) locked
T@1    : | | | mutex: THR_LOCK_malloc (0x9614460) unlocking
T@1    : | | | exit: ptr: 0x963c870
T@1    : | | <_mymalloc
T@1    : | | >init_tree
T@1    : | | | enter: tree: 0x963c870  size: 8
T@1    : | | | >init_alloc_root
T@1    : | | | | enter: root: 0x963c99c
T@1    : | | | <init_alloc_root
T@1    : | | <init_tree
T@1    : | | >alloc_root
T@1    : | | | enter: root: 0x963c99c
T@1    : | | | >_mymalloc
T@1    : | | | | enter: Size: 8136
T@1    : | | | | mutex: THR_LOCK_malloc (0x9614460) locking
T@1    : | | | | mutex: THR_LOCK_malloc (0x9614460) locked
T@1    : | | | | mutex: THR_LOCK_malloc (0x9614460) unlocking
T@1    : | | | | mutex: THR_LOCK_malloc (0x9614460) locking
T@1    : | | | | mutex: THR_LOCK_malloc (0x9614460) locked
T@1    : | | | | mutex: THR_LOCK_malloc (0x9614460) unlocking
T@1    : | | | | exit: ptr: 0x9b891a8
T@1    : | | | <_mymalloc
T@1    : | | | exit: ptr: 0x9b891b8
T@1    : | | <alloc_root
T@1    : | | >NdbTableImpl::~NdbTableImpl
T@1    : | | | info: this: 0x960b660
T@1    : | | <NdbTableImpl::~NdbTableImpl
T@1    : | | >NdbTableImpl::~NdbTableImpl
T@1    : | | | info: this: 0x960b540
T@1    : | | <NdbTableImpl::~NdbTableImpl
T@1    : | | >NdbMutex_Destroy
T@1    : | | | info: NdbMem_Free 0x9615040
T@1    : | | <NdbMutex_Destroy
T@1    : | | >NdbMutex_Destroy
T@1    : | | | info: NdbMem_Free 0x9615020
T@1    : | | <NdbMutex_Destroy
[25 Dec 2008 10:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/62321

2823 Alexander Barkov	2008-12-25
      Bug#32391 Character sets: crash with --character-set-server
      
      Problem:
      mysqld crashed on initialization of the built-in stopwords
      when started with --character-set-server=utf16.
      ft_init_stopwords() mistakenly compared the built-in
      stopwords using "utf16" as character set, which led
      to exit on DBUG_ASSERT((slen % 2) == 0) in
      my_strnncollsp_utf16() when comparing a word with an odd length
      (for example word="about", len=5).
      
      Fix:
      - using latin1 when initializing the built-in stopwords.
      - adding conversion from latin1 to "real" multi-byte character sets
        (UCS2, UTF16, UTF32) when searching stopwords.
      - additional fix: stopwords are now searched according
        to the column collation, that is when the built-in
        stopwords are in use, the word "ABOUT" is treated as
        a stopword only in a case insensitive collation.
      
      Changeset:
      
      - New files added to test --character-set-server=utf16:
      
        mysql-test/r/ctype_utf16_def.result
        mysql-test/t/ctype_utf16_def-master.opt
        mysql-test/t/ctype_utf16_def.test
      
      - Moving the character set conversion function
        from /sql to /strings. Adding the function prototype:
      
        include/m_ctype.h
        strings/ctype.c
      
      -  Adding tests for stopword case sensitivity:
      
        mysql-test/r/fulltext.result
        mysql-test/t/fulltext.test
      
      
      - Moving most of the conversion code to /strings:
        sql/sql_string.cc
      
      - The main fix: splitting code into more separate functions,
        loading stopwords into two trees (for case sensitive and 
        case insensitive searches), adding conversion code:
        
        storage/myisam/ft_parser.c
        storage/myisam/ft_stopwords.c
        storage/myisam/ftdefs.h
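
The latin1-to-UTF16 conversion step listed under "Fix" in the commit message above can be sketched roughly as follows. This is not the code from the patch: the function name latin1_to_utf16be is hypothetical, the sketch assumes ISO 8859-1 semantics (each byte value equals its Unicode code point, which holds for the ASCII built-in stopwords), and it assumes big-endian UTF-16 output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: widen a single-byte string to big-endian UTF-16.
   Assumes each source byte equals its Unicode code point (ISO 8859-1).
   Returns the number of bytes written, or 0 if dst is too small
   (an empty source string also returns 0). */
static size_t latin1_to_utf16be(const char *src, size_t src_len,
                                unsigned char *dst, size_t dst_cap)
{
  size_t i;

  assert(src != NULL && dst != NULL);
  if (dst_cap < src_len * 2)
    return 0;

  for (i = 0; i < src_len; i++)
  {
    dst[2 * i]     = 0x00;                    /* high byte: always zero */
    dst[2 * i + 1] = (unsigned char) src[i];  /* low byte: the code point */
  }
  return src_len * 2;
}
```

After conversion, the 5-byte word "about" occupies 10 bytes, so a length check of the form (len % 2) == 0 holds for the converted stopword.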
[5 Aug 2010 6:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115055

3162 Alexander Barkov	2010-08-05
      Bug#32391 Character sets: crash with --character-set-server 
      
      Problem:
      mysqld crashed on initialization of the built-in stopwords
      when started with --character-set-server=utf16.
      ft_init_stopwords() mistakenly compared the built-in
      stopwords using "utf16" as character set, which led
      to exit on DBUG_ASSERT((slen % 2) == 0) in
      my_strnncollsp_utf16() when comparing a word with an odd length
      (for example word="about", len=5).
      
      Fix:
      Using latin1 when initializing the built-in stopwords   
      for the "tricky" character sets.
[12 Aug 2010 5:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115542

3162 Alexander Barkov	2010-08-12
      Bug#32391 Character sets: crash with --character-set-server 
      
      Problem:
      mysqld crashed on initialization of the built-in stopwords
      when started with --character-set-server=utf16.
      ft_init_stopwords() mistakenly compared the built-in 
      stopwords using "utf16" as character set, which led
      to exit on DBUG_ASSERT((slen % 2) == 0) in
      my_strnncollsp_utf16() when comparing a word with an odd length
      (for example word="about", len=5).
      
      Fix:
      Using latin1 when initializing the built-in stopwords   
      for the "tricky" character sets.
[13 Aug 2010 13:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115681

3204 Alexander Barkov	2010-08-13
      Bug#32391 Character sets: crash with --character-set-server 
      
      Problem:
      mysqld crashed on initialization of the built-in stopwords
      when started with --character-set-server=utf16.
      ft_init_stopwords() mistakenly compared the built-in 
      stopwords using "utf16" as character set, which led
      to exit on DBUG_ASSERT((slen % 2) == 0) in
      my_strnncollsp_utf16() when comparing a word with an odd length
      (for example word="about", len=5).
      
      Fix:
      Using latin1 when initializing the built-in stopwords   
      for the "tricky" character sets.
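
The fix described in the commit message above amounts to choosing a different character set for the built-in stopwords when the server character set uses more than one byte for its smallest character. A minimal sketch of that decision follows. The struct and function names are hypothetical stand-ins, and the use of a minimum-bytes-per-character field (called mbminlen here) as the test for a "tricky" character set is an assumption, not a quote from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a character set descriptor. */
typedef struct charset_sketch
{
  const char *name;
  unsigned    mbminlen;   /* minimum bytes per character */
} charset_sketch;

static const charset_sketch latin1_sketch = { "latin1", 1 };

/* Pick the character set used to load the built-in stopwords.
   Character sets whose smallest character is wider than one byte
   (ucs2, utf16, utf32) fall back to latin1; all others are kept. */
static const charset_sketch *stopword_charset(const charset_sketch *server_cs)
{
  assert(server_cs != NULL);
  return server_cs->mbminlen == 1 ? server_cs : &latin1_sketch;
}
```

With a descriptor such as { "utf16", 2 } the function returns the latin1 descriptor, while a descriptor with mbminlen 1 is returned unchanged.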
[13 Aug 2010 13:42] Alexander Barkov
Pushed into mysql-trunk-bugfixing (mysql-5.6.0-m4)
Pushed into mysql-next-mr-bugfixing (mysql-5.6.99-m5)
[19 Aug 2010 6:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/116165

3187 Alexander Barkov	2010-08-19
      Backporting Bug#32391 Character sets: crash with --character-set-server
      from mysql-trunk-bugfixing (5.6.1-m5) to mysql-5.5-bugfixing (5.5.6-m3).
[19 Aug 2010 6:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/116166

3187 Alexander Barkov	2010-08-19
      Backporting Bug#32391 Character sets: crash with --character-set-server
      from mysql-trunk-bugfixing (5.6.1-m5) to mysql-5.5-bugfixing (5.5.6-m3).
[19 Aug 2010 6:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/116167

3188 Alexander Barkov	2010-08-19
      Backporting Bug#32391 Character sets: crash with --character-set-server
      from mysql-trunk-bugfixing (5.6.1-m5) to mysql-5.5-bugfixing (5.5.6-m3).
[19 Aug 2010 6:59] Alexander Barkov
backported to mysql-5.5-bugfixing (5.5.6-m3)
[25 Aug 2010 9:23] Bugs System
Pushed into mysql-5.5 5.5.6-m3 (revid:alik@ibmvm-20100825092002-2yvkb3iwu43ycpnm) (version source revid:alik@ibmvm-20100825092002-2yvkb3iwu43ycpnm) (merge vers: 5.5.6-m3) (pib:20)
[30 Aug 2010 8:31] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@sun.com-20100830082732-n2eyijnv86exc5ci) (version source revid:alik@sun.com-20100830082732-n2eyijnv86exc5ci) (merge vers: 5.6.1-m4) (pib:21)
[30 Aug 2010 8:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100830082745-n6sh01wlwh3itasv) (version source revid:alik@sun.com-20100830082745-n6sh01wlwh3itasv) (pib:21)
[24 Sep 2010 16:58] Paul DuBois
Noted in 5.5.6, 5.6.1 changelogs.

If the server was started with character_set_server set to utf16, it
crashed during full-text stopword initialization. Now the stopword
file is loaded and searched using latin1 if character_set_server is
ucs2, utf16, or utf32. If any table was created with FULLTEXT indexes
while the server character set was ucs2, utf16, or utf32, it should
be repaired using this statement:

REPAIR TABLE tbl_name QUICK;