Bug #37885 row_search_for_mysql may gap lock unnecessarily with SQL comments in query
Submitted: 4 Jul 2008 17:00 Modified: 20 Jun 2010 0:58
Reporter: Jeremy Cole (Basic Quality Contributor) (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.0/5.1 OS:Any
Assigned to: Marko Mäkelä CPU Architecture:Any
Tags: gap lock, innodb, lock, qc

[4 Jul 2008 17:00] Jeremy Cole
Description:
I found this while researching Bug #35226 ("RBR event crashes slave") for a customer of ours.

In row/row0sel.c, row_search_for_mysql calls dict_accept to try to find if the query starts with SELECT to determine if it's a SELECT statement:

03404         if (trx->isolation_level <= TRX_ISO_READ_COMMITTED
03405             && prebuilt->select_lock_type != LOCK_NONE
03406             && trx->mysql_query_str) {
03407 
03408                 /* Scan the MySQL query string; check if SELECT is the first
03409                 word there */
03410                 ibool   success;
03411 
03412                 dict_accept(*trx->mysql_query_str, "SELECT", &success);
03413 
03414                 if (success) {
03415                         /* It is a plain locking SELECT and the isolation
03416                         level is low: do not lock gaps */
03417 
03418                         set_also_gap_locks = FALSE;
03419                 }
03420         }

In dict/dict0dict.c, dict_accept calls dict_scan_to to pick out the next (first) word in the query string:

02414 dict_accept(
02415 /*========*/
02416                                 /* out: if string was accepted, the pointer
02417                                 is moved after that, else ptr is returned */
02418         const char*     ptr,    /* in: scan from this */
02419         const char*     string, /* in: accept only this string as the next
02420                                 non-whitespace string */
02421         ibool*          success)/* out: TRUE if accepted */
02422 {
02423         const char*     old_ptr = ptr;
02424         const char*     old_ptr2;
02425 
02426         *success = FALSE;
02427         
02428         while (ib_isspace(*ptr)) {
02429                 ptr++;
02430         }
02431 
02432         old_ptr2 = ptr;
02433         
02434         ptr = dict_scan_to(ptr, string);
02435         
02436         if (*ptr == '\0' || old_ptr2 != ptr) {
02437                 return(old_ptr);
02438         }
02439 
02440         *success = TRUE;
02441 
02442         return(ptr + ut_strlen(string));
02443 }

In dict/dict0dict.c, dict_scan_to does not understand comments, and thus will match words from comments:

02373 dict_scan_to(
02374 /*=========*/
02375                                 /* out: scanned up to this */
02376         const char*     ptr,    /* in: scan from */
02377         const char*     string) /* in: look for this */
02378 {
02379         char    quote   = '\0';
02380 
02381         for (; *ptr; ptr++) {
02382                 if (*ptr == quote) {
02383                         /* Closing quote character: do not look for
02384                         starting quote or the keyword. */
02385                         quote = '\0';
02386                 } else if (quote) {
02387                         /* Within quotes: do nothing. */
02388                 } else if (*ptr == '`' || *ptr == '"') {
02389                         /* Starting quote: remember the quote character. */
02390                         quote = *ptr;
02391                 } else {
02392                         /* Outside quotes: look for the keyword. */
02393                         ulint   i;
02394                         for (i = 0; string[i]; i++) {
02395                                 if (toupper((int)(unsigned char)(ptr[i]))
02396                                                 != toupper((int)(unsigned char)
02397                                                 (string[i]))) {
02398                                         goto nomatch;
02399                                 }
02400                         }
02401                         break;
02402                 nomatch:
02403                         ;
02404                 }
02405         }
02406 
02407         return(ptr);
02408 }

Because of the test against old_ptr2 in dict_accept, dict_accept won't accept a word in a comment as an actual match:

02436         if (*ptr == '\0' || old_ptr2 != ptr) {
02437                 return(old_ptr);
02438         }

This will keep a dict_accept(..., "SELECT", ...) from matching a query string such as "/* SELECT */ UPDATE ..." (due to the string not being the first non-whitespace found.  However, it does fail in the case where a dict_accept(..., "SELECT", ...) will fail to match a legitimate query such as "/* foo */ SELECT ...".  It appears that this could cause set_also_gap_locks to be set incorrectly, which in my understanding would mean that gaps are locked when they shouldn't be.

How to repeat:
Unfortunately I don't have a good test case for this, but have tested that dict_scan_to doesn't understand comments and will match a word within comments by using PHP to send a query with a comment and stepping through the code with GDB.  Note that you can't use the MySQL command line client for this due to the fact that it strips SQL comments before sending the query string to the server.

Suggested fix:
One of a few solutions:

1. Use something like trx->mysql_thd->lex->sql_command == SQLCOM_SELECT instead of searching for the word "SELECT" in the query string.  (This would be the best solution IMHO, and ought to be a lot more efficient in any case.)

2. Strip comments from the query string before sending it to these functions (and perhaps even before storing the string in trx->mysql_query_str).

3. Make dict_scan_to understand the various SQL comment styles and skip over comments appropriately.
[4 Jul 2008 17:07] Jeremy Cole
This is the code I used to test things with GDB:

create table t (i int, primary key(i)) engine=innodb;
insert into t (i) values (1),(2),(3),(4),(5);

Then using PHP:

<?php

mysql_connect("127.0.0.1:3307", "root", "");
mysql_select_db("test");

mysql_query("set transaction isolation level read committed");
mysql_query("begin");
mysql_query("/* SELECT */ update t set i=4 where i=2");

?>

If you place a breakpoint on dict_accept you should be able to step through and prove that the code will match SELECT inside dict_scan_to and fail the old_ptr2 != ptr test, returning old_ptr.
[4 Jul 2008 21:08] MySQL Verification Team
Debugging on Windows Vista 64-bit (5.1 server)

Attachment: debugger-screen-shot.png (image/png, text), 118.24 KiB.

[5 Jul 2008 0:46] MySQL Verification Team
Hi Jeremy,

I was able to verify what you have reported (see attached picture) however doing a debug trace with InnoDB and MyISAM engines I noticed that the actual query ignores the comment /* SELECT */ and executes an update query which fails because a duplicate primary key according your test case:

c:\dbs>cat c:\temp\innodb.trace
network_init: general: IP Socket is 3351
mi_rnext: error: Got error: -1,  errno: 137
mi_rnext: error: Got error: -1,  errno: 137
mi_rnext: error: Got error: -1,  errno: 137
my_register_filename: error: Got error 2 on open
handle_connections_sockets: general: Waiting for connections.
dispatch_command: query: set transaction isolation level read committed
dispatch_command: query: begin
dispatch_command: query: /* SELECT */ update t set i=4 where i=2
my_message_sql: error: error: 1062  message: 'Duplicate entry '4' for key 'PRIMARY''
kill_server: error: Got error: 6 from SetEvent of smem_event_connect_request
5.1\bin\mysqld: missing DBUG_RETURN or DBUG_VOID_RETURN macro in function "?func"
5.1\bin\mysqld: missing DBUG_RETURN or DBUG_VOID_RETURN macro in function "?func"

c:\dbs>cat c:\temp\myisam.trace
network_init: general: IP Socket is 3351
mi_rnext: error: Got error: -1,  errno: 137
mi_rnext: error: Got error: -1,  errno: 137
mi_rnext: error: Got error: -1,  errno: 137
my_register_filename: error: Got error 2 on open
handle_connections_sockets: general: Waiting for connections.
dispatch_command: query: set transaction isolation level read committed
dispatch_command: query: begin
dispatch_command: query: /* SELECT */ update t set i=4 where i=2
mi_update: error: key: 0  errno: 121
my_message_sql: error: error: 1062  message: 'Duplicate entry '4' for key 'PRIMARY''
kill_server: error: Got error: 6 from SetEvent of smem_event_connect_request
5.1\bin\mysqld: missing DBUG_RETURN or DBUG_VOID_RETURN macro in function "?func"
5.1\bin\mysqld: missing DBUG_RETURN or DBUG_VOID_RETURN macro in function "?func"

So I assume the server removes the comment when executing the query in both cases.
[7 Jul 2008 14:53] Jeremy Cole
Hi Miguel,

Yes, the server removes the comment when it parses it, I was not claiming otherwise.  MySQL itself is fine.  The point is that _InnoDB's_ parser/tokenizer doesn't understand comments, and could be caused to misbehave because of it.  These are the places dict_accept is used and what it's looking for:

innobase/dict/dict0dict.c:2968: ptr = dict_accept(ptr, "ALTER", &success);
innobase/dict/dict0dict.c:2975: ptr = dict_accept(ptr, "TABLE", &success);
innobase/dict/dict0dict.c:3024:         ptr = dict_accept(ptr, "CONSTRAINT", &success);
innobase/dict/dict0dict.c:3068: ptr = dict_accept(ptr, "FOREIGN", &success);
innobase/dict/dict0dict.c:3078: ptr = dict_accept(ptr, "KEY", &success);
innobase/dict/dict0dict.c:3084: ptr = dict_accept(ptr, "(", &success);
innobase/dict/dict0dict.c:3098:         ptr = dict_accept(ptr, "(", &success);
innobase/dict/dict0dict.c:3127: ptr = dict_accept(ptr, ",", &success);
innobase/dict/dict0dict.c:3133: ptr = dict_accept(ptr, ")", &success);
innobase/dict/dict0dict.c:3161: ptr = dict_accept(ptr, "REFERENCES", &success);
innobase/dict/dict0dict.c:3222: ptr = dict_accept(ptr, "(", &success);
innobase/dict/dict0dict.c:3252: ptr = dict_accept(ptr, ",", &success);
innobase/dict/dict0dict.c:3258: ptr = dict_accept(ptr, ")", &success);
innobase/dict/dict0dict.c:3274: ptr = dict_accept(ptr, "ON", &success);
innobase/dict/dict0dict.c:3281: ptr = dict_accept(ptr, "DELETE", &success);
innobase/dict/dict0dict.c:3284:         ptr = dict_accept(ptr, "UPDATE", &success);
innobase/dict/dict0dict.c:3301: ptr = dict_accept(ptr, "RESTRICT", &success);
innobase/dict/dict0dict.c:3307: ptr = dict_accept(ptr, "CASCADE", &success);
innobase/dict/dict0dict.c:3319: ptr = dict_accept(ptr, "NO", &success);
innobase/dict/dict0dict.c:3322:         ptr = dict_accept(ptr, "ACTION", &success);
innobase/dict/dict0dict.c:3341: ptr = dict_accept(ptr, "SET", &success);
innobase/dict/dict0dict.c:3350: ptr = dict_accept(ptr, "NULL", &success);
innobase/dict/dict0dict.c:3548: ptr = dict_accept(ptr, "DROP", &success);
innobase/dict/dict0dict.c:3555: ptr = dict_accept(ptr, "FOREIGN", &success);
innobase/dict/dict0dict.c:3562: ptr = dict_accept(ptr, "KEY", &success);
innobase/row/row0sel.c:3412:            dict_accept(*trx->mysql_query_str, "SELECT", &success);
[7 Jul 2008 15:13] MySQL Verification Team
Thank you for the feedback.
[7 Jul 2008 16:57] Heikki Tuuri
Jeremy,

thank you for noticing this bug. The occurrence in row0sel.c is actually the only buggy one, as the foreign key parser already strips the comments from the SQL string.

The bug will cause InnoDB to lock gaps even on low isolation levels if the SELECT ... query is preceded with a comment in the SQL string!

In 5.1, row0sel.c calls the following function:

/**************************************************************************
Determines whether a string starts with the specified keyword. */

ibool
dict_str_starts_with_keyword(
/*=========================*/
                                        /* out: TRUE if str starts
                                        with keyword */
        void*           mysql_thd,      /* in: MySQL thread handle */
        const char*     str,            /* in: string to scan for keyword */
        const char*     keyword)        /* in: keyword to look for */
{
        struct charset_info_st* cs      = innobase_get_charset(mysql_thd);
        ibool                   success;

        dict_accept(cs, str, keyword, &success);
        return(success);
}

Fix: change dict_scan_to() to ignore SQL comments.

Assigning this to Marko.

Regards,

Heikki
[20 Aug 2008 13:19] Marko Mäkelä
Making dict_scan_to() skip comments would break dict_accept(). The right thing seems to be to skip leading comments and whitespace in dict_accept(). And an even better solution could be to rely on thd_sql_command().
[21 Aug 2008 7:13] Marko Mäkelä
I verified this with the following PHP:

mysql_connect("127.0.0.1:3306", "root", "");
mysql_select_db("test");

mysql_query("set transaction isolation level read committed");
mysql_query("begin");
mysql_query("/* foo */ SELECT * FROM t where i=2 FOR UPDATE");

With dict_accept() patched to skip leading comments in addition to leading white space, this worked. However, I would rather follow the fundamental rule 5 of qmail <http://cr.yp.to/qmail/guarantee.html>: Don't parse. Let us check 
thd_sql_command(trx->mysql_thd) == SQLCOM_SELECT instead of depending on trx->mysql_query_str. The code in row_search_for_mysql() will also become simpler and quicker:

	if (trx->isolation_level <= TRX_ISO_READ_COMMITTED
	    && prebuilt->select_lock_type != LOCK_NONE
	    && trx->mysql_thd != NULL
	    && thd_is_select(trx->mysql_thd)) {
		/* It is a plain locking SELECT and the isolation
		level is low: do not lock gaps */

		set_also_gap_locks = FALSE;
	}

I checked in sql_yacc.yy that SQLCOM_SELECT is not used for any other keyword than SELECT.
[18 Sep 2008 12:13] Heikki Tuuri
This is a low-impact bug. We will only fix this in 5.1 and later. Let us keep 5.0 as is.
[12 Mar 2009 21:35] Paul DuBois
Noted in 5.1.31, 6.0.10 changelogs.

The presence of a /* ... */ comment preceding a query could cause
InnoDB to use unnecessary gap locks.
[5 May 2010 15:19] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 15:57] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[28 May 2010 6:13] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:41] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:09] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 23:03] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 12:19] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:06] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:47] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)