Bug #41566 Quotes within comments not correctly ignored by statement parser
Submitted: 17 Dec 2008 20:45
Modified: 23 Jun 2009 13:31
Reporter: Jeff Kolesky
Status: Closed
Category: Connector / J
Severity: S3 (Non-critical)
Version: 5.1.6
OS: Any
Assigned to: Jess Balint
CPU Architecture: Any

[17 Dec 2008 20:45] Jeff Kolesky
Description:
If a statement sent through the driver contains an apostrophe inside a comment, the apostrophe is still recognized as a quote and puts the state machine in EscapeTokenizer into the "inQuotes" state. This throws off the state of the parser, which can lead to further parse errors.

How to repeat:
Run a statement such as the following through a JDBC connection:

String sql = "-- Customer's zip code will be fixed\n" +
  "update address set zip_code = 99999\n" +
  "where not regexp '^[0-9]{5}([[.-.]])?([0-9]{4})?$'";

When a statement like that is run through Connector/J, the EscapeTokenizer does not recognize that the first apostrophe is in a comment and thus sets "inQuotes" to true. From that point on the quote state is inverted, so the regular expression does not appear to be in quotes. Because the parser does not think the regular expression is quoted, the curly braces are recognized as escape sequences and are removed from the regular expression, breaking it. The server ends up seeing the SQL like this:

-- Customer's zip code will be fixed
update address set zip_code = '99999'
where not regexp '^[0-9]([[.-.]])?([0-9])?$'

There is an obvious work-around (remove the apostrophe from the comment).
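
The miscount described above can be reproduced outside the driver with a small stand-alone scanner. This is only a sketch under stated assumptions, not Connector/J's EscapeTokenizer: the class name QuoteScanSketch and the method bracesOutsideQuotes are hypothetical, only "--" comments running to end of line are handled, and doubled quotes inside strings are ignored for brevity. It lists the {...} groups that the scanner believes lie outside quotes, which are the ones a driver would go on to treat as escape sequences.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the Connector/J implementation.
class QuoteScanSketch {

    /**
     * Returns the {...} groups that the scanner considers to be outside
     * quoted strings. With skipComments == false, quote characters inside
     * "--" comments are counted, mimicking the reported behavior.
     */
    static List<String> bracesOutsideQuotes(String sql, boolean skipComments) {
        List<String> found = new ArrayList<>();
        boolean inQuotes = false;
        boolean inComment = false;
        char quoteChar = 0;
        int i = 0;
        while (i < sql.length()) {
            char c = sql.charAt(i);
            if (inComment) {
                // A "--" comment ends at the next newline.
                if (c == '\n') {
                    inComment = false;
                }
                i++;
                continue;
            }
            if (skipComments && !inQuotes && c == '-'
                    && i + 1 < sql.length() && sql.charAt(i + 1) == '-') {
                inComment = true;
                i += 2;
                continue;
            }
            if (c == '\'' || c == '"') {
                if (!inQuotes) {
                    inQuotes = true;
                    quoteChar = c;
                } else if (c == quoteChar) {
                    inQuotes = false;
                }
            } else if (c == '{' && !inQuotes) {
                int end = sql.indexOf('}', i);
                if (end < 0) {
                    break; // unterminated group; stop scanning
                }
                found.add(sql.substring(i, end + 1));
                i = end + 1;
                continue;
            }
            i++;
        }
        return found;
    }

    public static void main(String[] args) {
        String sql = "-- Customer's zip code will be fixed\n"
                + "update address set zip_code = 99999\n"
                + "where not regexp '^[0-9]{5}([[.-.]])?([0-9]{4})?$'";
        System.out.println(bracesOutsideQuotes(sql, false)); // prints [{5}, {4}]
        System.out.println(bracesOutsideQuotes(sql, true));  // prints []
    }
}
```

With comment handling switched off, the apostrophe in "Customer's" opens a quoted region that only closes at the quote before the regular expression, so both brace groups appear unquoted. With comment handling switched on, the regular expression stays inside quotes and no brace group is reported.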

Suggested fix:
--- EscapeTokenizer.java	2008-12-17 15:41:14.000000000 -0500
+++ EscapeTokenizer.java	2008-12-17 15:43:28.000000000 -0500
@@ -104,7 +104,7 @@
 				this.sawVariableUse = true;
 			}
 
-			if (c == '\'' || c == '"') {
+			if (!this.inComment && (c == '\'' || c == '"')) {
 				if (this.inQuotes && c == quoteChar) {
 					if (this.pos + 1 < this.sourceLength) {
 						if (this.source.charAt(this.pos + 1) == quoteChar) {

[12 Feb 2009 14:55] Tonci Grgin
Hi Jeff, and thanks for your report.

I am not able to repeat this behavior using the latest C/J sources and MySQL server 5.1.30, although the same code is there. Would you mind testing with the latest C/J snapshot and, if it fails, attaching a complete test case?

[12 Feb 2009 20:35] Jess Balint
fix + test

Attachment: bug41566.diff (text/x-diff), 1.15 KiB.

[2 Jun 2009 5:59] Jess Balint
Pushed for release in 5.1.8

[23 Jun 2009 13:31] Tony Bedford
An entry was added to the 5.1.8 changelog:

If there was an apostrophe in a comment in a statement that was being sent through Connector/J, the apostrophe was still recognized as a quote and put the state machine in EscapeTokenizer into the inQuotes state. This led to further parse errors.

For example, consider the following statement:

String sql = "-- Customer's zip code will be fixed\n" +
  "update address set zip_code = 99999\n" +
  "where not regexp '^[0-9]{5}([[.-.]])?([0-9]{4})?$'";

When passed through Connector/J, the EscapeTokenizer did not recognize that the first apostrophe was in a comment and thus set inQuotes to true. When that happened, the quote count was incorrect and thus the regular expression did not appear to be in quotes. With the parser not detecting that the regular expression was in quotes, the curly braces were recognized as escape sequences and were removed from the regular expression, breaking it. The server thus received SQL such as:

-- Customer's zip code will be fixed
update address set zip_code = '99999'
where not regexp '^[0-9]([[.-.]])?([0-9])?$'