Bug #28468 Access violation while SELECT with utf8_general_cs comparison
Submitted: 16 May 2007 13:48 Modified: 8 Oct 2007 3:58
Reporter: Anton Myshkovsky
Status: Duplicate
Category: MySQL Server: Charsets Severity: S2 (Serious)
Version: 5.0.37 OS: Windows
Assigned to: Assigned Account CPU Architecture: Any
Tags: Contribution

[16 May 2007 13:48] Anton Myshkovsky
Description:
Access Violation.

The call stack is:
> 00000000()	
> mysqld-debug.exe!check_simple_equality(Item * left_item=0x1e8287e0, Item * right_item=0x1e828878, Item * item=0x1e828900, COND_EQUAL * cond_equal=0x2620e264)  Line 6843 + 0x28 bytes	C++

> mysqld-debug.exe!check_equality(Item * item=0x1e828900, COND_EQUAL * cond_equal=0x2620e264, List<Item> * eq_list=0x2620e214)  Line 6975 + 0x15 bytes	C++

> mysqld-debug.exe!build_equal_items_for_cond(Item * cond=0x1e828900, COND_EQUAL * inherited=0x00000000)  Line 7146 + 0x11 bytes	C++

> mysqld-debug.exe!build_equal_items(THD * thd=0x1e7ff998, Item * cond=0x1e828900, COND_EQUAL * inherited=0x00000000, List<st_table_list> * join_list=0x1e7ffda0, COND_EQUAL * * cond_equal_ref=0x1e82983c)  Line 7270 + 0xd bytes	C++

> mysqld-debug.exe!optimize_cond(JOIN * join=0x1e828a68, Item * conds=0x1e828900, List<st_table_list> * join_list=0x1e7ffda0, Item::cond_result * cond_value=0x1e8297b4)  Line 8324 + 0x22 bytes	C++

> mysqld-debug.exe!JOIN::optimize()  Line 676 + 0x32 bytes	C++

> mysqld-debug.exe!mysql_select(THD * thd=0x1e7ff998, Item * * * rref_pointer_array=0x1e7ffde0, st_table_list * tables=0x1e828600, unsigned int wild_num=0, List<Item> & fields={...}, Item * conds=0x1e828900, unsigned int og_num=0, st_order * order=0x00000000, st_order * group=0x00000000, Item * having=0x00000000, st_order * proc_param=0x00000000, unsigned __int64 select_options=2156677632, select_result * result=0x1e828a58, st_select_lex_unit * unit=0x1e7ffa40, st_select_lex * select_lex=0x1e7ffcb8)  Line 2083 + 0x8 bytes	C++

> mysqld-debug.exe!handle_select(THD * thd=0x1e7ff998, st_lex * lex=0x1e7ff9d8, select_result * result=0x1e828a58, unsigned long setup_tables_done_option=0)  Line 256 + 0x9f bytes	C++

> mysqld-debug.exe!mysql_execute_command(THD * thd=0x1e7ff998)  Line 2632 + 0x13 bytes	C++

> mysqld-debug.exe!mysql_parse(THD * thd=0x1e7ff998, char * inBuf=0x1e8282e0, unsigned int length=76)  Line 5974 + 0x9 bytes	C++

> mysqld-debug.exe!dispatch_command(enum_server_command command=COM_QUERY, THD * thd=0x1e7ff998, char * packet=0x1e8201b1, unsigned int packet_length=77)  Line 1790 + 0x1d bytes	C++

> mysqld-debug.exe!do_command(THD * thd=0x1e7ff998)  Line 1572 + 0x31 bytes	C++

> mysqld-debug.exe!handle_one_connection(void * arg=0x1e7ff998)  Line 1198 + 0x9 bytes	C++

> mysqld-debug.exe!pthread_start(void * param=0x1e75fc98)  Line 62 + 0x7 bytes	C

> mysqld-debug.exe!_callthreadstart()  Line 293 + 0xf bytes	C

> mysqld-debug.exe!_threadstart(void * ptd=0x1e801588)  Line 277	C

> kernel32.dll!7c80b683() 	

How to repeat:
-- utf8_general_cs
-- test for collations
--

SET NAMES utf8;
SET storage_engine=innoDB;

DROP DATABASE IF EXISTS MYDB;
CREATE DATABASE MYDB 
	DEFAULT CHARACTER SET utf8
	DEFAULT COLLATE utf8_general_cs;
USE MYDB;

SET @KEY = _utf8 0xD090D091D092D093D094D095D096D097D098D099D09AD09BD09CD09DD09ED09FD0A0D0A1D0A2D0A3D0A4D0A5D0A6D0A7D0A8D0A9D0AAD0ABD0ACD0ADD0AED0AFD081D0B0D0B1D0B2D0B3D0B4D0B5D0B6D0B7D0B8D0B9D0BAD0BBD0BCD0BDD0BED0BFD180D181D182D183D184D185D186D187D188D189D18AD18BD18CD18DD18ED18FD191 COLLATE utf8_general_cs;

-- primary
CREATE TABLE T1 (
	ID VARCHAR(255) NOT NULL,
	PRIMARY KEY (ID)	
);

-- insert test
INSERT INTO T1 (ID) VALUES ( @KEY );
INSERT INTO T1 (ID) VALUES ( UPPER(@KEY));
INSERT INTO T1 (ID) VALUES ( LOWER(@KEY));

-- case sensitive search (the collation is utf8_general_cs)
SELECT ID, HEX(ID) AS HID FROM T1 WHERE ID = @KEY;

Suggested fix:
Currently, the collation handler for utf8_general_cs is declared in ctype-utf8.c as follows:
static MY_COLLATION_HANDLER my_collation_cs_handler =
{
    NULL,		/* init */
    my_strnncoll_utf8_cs,
    my_strnncollsp_utf8_cs,
    my_strnxfrm_utf8,
    my_like_range_simple,
    my_wildcmp_mb,
    my_strcasecmp_utf8,
    my_instr_mb,
    my_hash_sort_utf8,
    my_propagate_simple
};
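The 'strnxfrmlen' member is missing from this initializer. Because the initializers are positional, each later entry lands one slot early and the final member, 'propagate', is left NULL, which is consistent with the call to address 00000000 from check_simple_equality in the stack above.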

It seems that 'my_collation_cs_handler' should be declared as follows:
static MY_COLLATION_HANDLER my_collation_cs_handler =
{
    NULL,		/* init */
    my_strnncoll_utf8_cs,
    my_strnncollsp_utf8_cs,
    my_strnxfrm_utf8,
    my_strnxfrmlen_utf8,
    my_like_range_simple,
    my_wildcmp_mb,
    my_strcasecmp_utf8,
    my_instr_mb,
    my_hash_sort_utf8,
    my_propagate_simple
};
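
The slot shift described above can be reproduced in isolation. The following is only a minimal sketch, not MySQL code: DEMO_HANDLER, the demo_* functions, handler_is_complete and check_handler are hypothetical names, and the struct is cut down to four slots instead of the real eleven. It shows how one missing positional initializer leaves the last member NULL, and how C99 designated initializers (assuming a compiler that supports them) would make such an omission harder to miss.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for MY_COLLATION_HANDLER. */
typedef int (*demo_fn)(void);

typedef struct demo_handler {
    demo_fn strnxfrm;
    demo_fn strnxfrmlen;
    demo_fn hash_sort;
    demo_fn propagate;   /* last member, like 'propagate' in the report */
} DEMO_HANDLER;

static int demo_strnxfrm(void)    { return 1; }
static int demo_strnxfrmlen(void) { return 2; }
static int demo_hash_sort(void)   { return 3; }
static int demo_propagate(void)   { return 4; }

/* One initializer is missing. Positional initializers fill members in
   declaration order, so demo_hash_sort lands in the strnxfrmlen slot,
   demo_propagate lands in the hash_sort slot, and the real 'propagate'
   member is zero-initialized, i.e. NULL. */
const DEMO_HANDLER broken_handler = {
    demo_strnxfrm,
    demo_hash_sort,
    demo_propagate
};

/* Designated initializers bind each function to a named member, so a
   forgotten entry cannot shift the others into the wrong slots. */
const DEMO_HANDLER fixed_handler = {
    .strnxfrm    = demo_strnxfrm,
    .strnxfrmlen = demo_strnxfrmlen,
    .hash_sort   = demo_hash_sort,
    .propagate   = demo_propagate
};

/* Returns 1 if every slot is set, 0 if any slot is NULL. */
int handler_is_complete(const DEMO_HANDLER *h)
{
    return h->strnxfrm && h->strnxfrmlen && h->hash_sort && h->propagate;
}

/* Optional sanity check: fail at a known place instead of jumping
   through a NULL pointer later. */
void check_handler(const DEMO_HANDLER *h)
{
    assert(handler_is_complete(h));
}
```

Calling check_handler(&broken_handler) in a debug build would abort at the assertion, while calling broken_handler.propagate() directly would jump through a NULL pointer, much like the 00000000() frame at the top of the stack above.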
[6 Aug 2007 11:21] Valeriy Kravchuk
Thank you for the problem report and the suggested fix. For whatever reason, I do not see utf8_general_cs in current MySQL binaries (only utf8_general_ci is supported). Please check with a newer version, 5.0.45, and let us know the results.
[6 Sep 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[5 Oct 2007 4:17] Jan Lindström
The normal MySQL distribution does not contain the UTF8 case-sensitive charset and collation. However, if you compile with HAVE_UTF8_GENERAL_CS set, you will see this bug. I have tested 5.0.45 and 5.0.46 and they still have the same problem.
[5 Oct 2007 6:22] Anton Myshkovsky
Please see the comment from Jan Lindström [5 Oct 2007 4:17].
[8 Oct 2007 3:58] Alexander Barkov
This problem was fixed in 5.0.38 as part of Bug #22378.
Please see here for details:
http://bugs.mysql.com/bug.php?id=22378

Marking as duplicate.

Many thanks for reporting anyway!