Bug #113240 | Prepared statements can be statefully corrupted | ||
---|---|---|---|
Submitted: | 28 Nov 2023 1:49 | Modified: | 29 Nov 2023 18:16 |
Reporter: | Kent Ross | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 8.0+ | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | json, prepared statments |
[28 Nov 2023 1:49]
Kent Ross
[28 Nov 2023 1:55]
Kent Ross
This bug affects every version of mysql 8.0 i have been able to get my hands on, but not 5.7. This bug is not related to re-binding @a in the shell client: mysql> create table t ( -> id integer primary key, -> val json -> ); Query OK, 0 rows affected (0.00 sec) mysql> insert into t(id, val) values (1, '1'), (2, '"two"'); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> prepare stmt from 'select *, val = ? from t'; Query OK, 0 rows affected (0.00 sec) Statement prepared mysql> set @one = 1; Query OK, 0 rows affected (0.00 sec) mysql> set @two = 'two'; Query OK, 0 rows affected (0.00 sec) mysql> execute stmt using @two; +----+-------+---------+ | id | val | val = ? | +----+-------+---------+ | 1 | 1 | 0 | | 2 | "two" | 1 | +----+-------+---------+ 2 rows in set (0.00 sec) mysql> execute stmt using @one; +----+-------+---------+ | id | val | val = ? | +----+-------+---------+ | 1 | 1 | 1 | | 2 | "two" | 0 | +----+-------+---------+ 2 rows in set (0.00 sec) mysql> execute stmt using @two; +----+-------+---------+ | id | val | val = ? | +----+-------+---------+ | 1 | 1 | 0 | | 2 | "two" | 0 | +----+-------+---------+ 2 rows in set, 2 warnings (0.00 sec) mysql> select version(); +-----------+ | version() | +-----------+ | 8.2.0 | +-----------+ 1 row in set (0.00 sec)
[28 Nov 2023 11:02]
MySQL Verification Team
Hi Mr. Ross, Thank you for your bug report. However, this is not a bug. MySQL is not very much strict typing SQL machine, but it still except some type correctness. You should have viewed the warnings from the last query and would understand where you made a mistake in your queries. With those warnings corrected, we get the following results: id val val = ? 1 1 0 2 "two" 0 id val val = ? 1 1 1 2 "two" 0 id val val = ? 1 1 0 2 "two" 0 Not a bug.
[28 Nov 2023 21:23]
Kent Ross
Regardless of what you have said, this is a bug. The "corrected" example you yourself wrote fails to be able to match row 2. The warnings do not indicate a solution, but a proximate cause. The order in which the parameters are bound changes the behavior. The prepared statement executes correctly and without error any number of times with one or more different string values bound to the argument. Once the prepared statement is executed one time with a numeric value bound to that argument, it becomes impossible to ever use a string value for that argument from then on. This buggy behavior was not present in mysql server 5.7, and for what it's worth it is not present in mariadb-latest. The documentation for prepared statements makes zero mention of types. There is no indication that types are intended to be constrained by the prepared statement itself, further evidence that this is an unintended bug.
[29 Nov 2023 11:14]
MySQL Verification Team
Hi, If we compare just the results which we get with two, we get: id val val = ? 1 1 0 2 "two" 0 id val val = ? 1 1 0 2 "two" 0 that is , identical results. You declare a column to be JSON, then you do not enter proper JSON, but mix integer and strings, get proper warnings and you report it as MySQL bug ???? Not a bug.
[29 Nov 2023 12:24]
MySQL Verification Team
HI, We have also consulted with our Development Team in charge of the prepared statements. They concur 100 % with us that this is not a bug. You could also read our Manual, where it is all explained. For example, the manual that says (ch. 13.5.1): " Using a different data type for a given parameter or user variable within a prepared statement for executions of the statement subsequent to the first execution causes the statement to be reprepared. This is less efficient; it may also lead to the parameter’s (or variable’s) actual type to vary, and thus for results to be inconsistent, with subsequent executions of the prepared statement. For these reasons, it is advisable to use the same data type for a given parameter when re-executing a prepared statement. " A possible workaround is also found in the manual: " Repreparation does not occur if: A parameter is an operand of a CAST(). (Instead, a cast to the derived type is attempted, and an exception raised if the cast fails.) " It is recommended to read all relevant chapters in the Manual, before filing a bug report.
[29 Nov 2023 18:16]
Kent Ross
> For example, the manual that says (ch. 13.5.1): > >> Using a different data type for a given parameter or user variable within a prepared statement for executions of the statement subsequent to the first execution causes the statement to be reprepared. This is less efficient; it may also lead to the parameter’s (or variable’s) actual type to vary, and thus for results to be inconsistent, with subsequent executions of the prepared statement. For these reasons, it is advisable to use the same data type for a given parameter when re-executing a prepared statement. This indicates that the current behavior is a bug, because the statement is not being reprepared even though the parameter in question is not an operand of a CAST(). > You declare a column to be JSON, then you do not enter proper JSON, but mix integer and strings, get proper warnings and you report it as MySQL bug ???? Yes, because the database is not behaving according to convention, common sense, or its own documentation (as you yourself quoted above). This seems like a clear regression. Under normal circumstances (in unprepared queries and during the first execution of a prepared statement) both numeric and string values are comparable to JSON values with `=`, as there are suitable conversion rules for both. It is possible to execute the same prepared statement multiple times with different bound argument data types if the `val` column does not have the "json" data type, for example if it has the "text" data type. What is the chapter in the documentation that says that this is not allowed in mysql 8, but only when the value being compared is of the "json" data type? I do not believe there is one.
[30 Nov 2023 11:06]
MySQL Verification Team
Hi , Our documentation answers all your questions and it is a conclusion of the team in charge of prepared statements that there is no bug here.
[31 Jan 14:10]
John Scott
Simpler test case: drop database if exists string_prepare_test; create database string_prepare_test; use string_prepare_test; create table preparetest (id int not null primary key auto_increment, teststring varchar(255) not null) default charset utf8mb4 engine=InnoDB; prepare sample from 'insert into preparetest (teststring) values (?)'; set @a = 1234; execute sample using @a; set @a = '23e835ea-b110-4deb-b2d9-bd83fdca678f'; execute sample using @a; select * from preparetest; Results on 5.7.44: -------------- drop database if exists string_prepare_test -------------- -------------- create database string_prepare_test -------------- -------------- create table preparetest (id int not null primary key auto_increment, teststring varchar(255) not null) default charset utf8mb4 engine=InnoDB -------------- -------------- prepare sample from 'insert into preparetest (teststring) values (?)' -------------- -------------- set @a = 1234 -------------- -------------- execute sample using @a -------------- -------------- set @a = '23e835ea-b110-4deb-b2d9-bd83fdca678f' -------------- -------------- execute sample using @a -------------- -------------- select * from preparetest -------------- id teststring 1 1234 2 23e835ea-b110-4deb-b2d9-bd83fdca678f Results on 8.0.35: Enter password: -------------- drop database if exists string_prepare_test -------------- -------------- create database string_prepare_test -------------- -------------- create table preparetest (id int not null primary key auto_increment, teststring varchar(255) not null) default charset utf8mb4 engine=InnoDB -------------- -------------- prepare sample from 'insert into preparetest (teststring) values (?)' -------------- -------------- set @a = 1234 -------------- -------------- execute sample using @a -------------- -------------- set @a = '23e835ea-b110-4deb-b2d9-bd83fdca678f' -------------- -------------- execute sample using @a -------------- -------------- select * from preparetest -------------- id teststring 1 1234 2 99999999999999999999999999999999999999999999999999999999999999999 I'm confused and unsure whether the documentation explains this. Bug or not, the behavior has changed between versions. Which could be construed as a regression issue / bug.
[31 Jan 14:28]
MySQL Verification Team
Hi Mr. Ross, No, there is no bug there ....... You are simply mixing data types, which is NOT allowed. We get fantastic results with 8.0.40: ------------------------------------------------ ysql> drop table if exists preparetest; Query OK, 0 rows affected, 1 warning (0.04 sec) mysql> mysql> create table preparetest (id int not null primary key auto_increment, teststring varchar(255) not null) default charset utf8mb4 engine=InnoDB; Query OK, 0 rows affected (0.10 sec) mysql> mysql> prepare sample from 'insert into preparetest (teststring) values (?)'; Query OK, 0 rows affected (0.06 sec) Statement prepared mysql> mysql> set @a = '1234'; Query OK, 0 rows affected (0.00 sec) mysql> mysql> execute sample using @a; Query OK, 1 row affected (0.01 sec) mysql> mysql> set @a = '23e835ea-b110-4deb-b2d9-bd83fdca678f'; Query OK, 0 rows affected (0.00 sec) mysql> mysql> execute sample using @a; Query OK, 1 row affected (0.00 sec) mysql> mysql> select * from preparetest; +----+--------------------------------------+ | id | teststring | +----+--------------------------------------+ | 1 | 1234 | | 2 | 23e835ea-b110-4deb-b2d9-bd83fdca678f | +----+--------------------------------------+ 2 rows in set (0.00 sec) mysql> mysql> drop table if exists preparetest; Query OK, 0 rows affected (0.00 sec) mysql> mysql> show tables; Empty set (0.00 sec) ------------------------------------------------ Not a bug !!!!!!
[31 Jan 14:50]
John Scott
This isn't Mr. Ross, this is a different responder. I'm from AWS and I have a customer issue in RDS based on the behavior of Ruby which uses scalar variables. Does not quote integers but quotes strings. (and I've verified the same in Perl), where prepared statements in these languages do not behave the same between MySQL versions. It may not be a bug but can you explain why MySQL 8 behaves differently than 5.7, just for our information? There is probably a code work around like using to_s in ruby to force the string encapsulation - we will look at that. But this is an issue of an actual application breaking after upgrade to MySQL 8.0.
[31 Jan 15:01]
MySQL Verification Team
HI Mr. Scott, According to SQL Standards 2011 and 2016, mixing data types in a single expression or inserting one data type, like integer into a CHAR or VARCHAR columns (and several others) is STRICTLY forbidden. MySQL is not yet fully compliant with SQL standard in that respect, but we are slowly changing the behaviour in order to be SQL compliant. Hence, please do not ask us to be less compliant with SQL standard. Thanks for your kind words .....
[31 Jan 15:19]
John Scott
I'm not trying to be flippant and appreciate your responses. The conversion of the numeric to the string is working (no more strict than it was in 5.7). It is the behavior of the subsequent properly quoted string that is being incorrectly converted. Personally I still feel this should be examined. We have a work-around, so we're good. Thank you for your time.
[31 Jan 15:21]
Roy Lyseng
If you have a test case, please add it and we can evaluate whether we consider it a bug or not. As already specified, mixing different data types in different calls to prepared statements may cause repreparations that cause change in behavior. This is not a bug, it is documented in the user manual. If you need to use varying data types in various executions, we recommend allocating one prepared statement for each type of parameter.
[31 Jan 15:30]
John Scott
If the statement were being successfully re-prepared, then the properly quoted string should be treated as a string in the subsequent statement. Please check to be sure that the re-preparation you describe is actually occurring in this scenario.
[31 Jan 15:37]
Roy Lyseng
The query was indeed reprepared in the second execution. But the repreparation means it is ready to accept more executions with the same type of parameter. The repreparation is there in case the user supplies a parameter with different type than what the SQL standard prescribes. We accept that, but not an arbitrary variation of data types in subsequent execution.
[31 Jan 15:54]
John Scott
I see. So if I understand correctly: - statement is prepared when I do "prepare..." - expects a string - 1st statement is executed with integer value - cast is done, statement is re-prepared, now expecting integer on the next one. - 2nd statement is run and the string is incorrectly attempted CAST to integer. If you run the specific UUID through cast, you'll see it gets cast to 9999999 because of the "23e" at the beginning. (after 2nd statement is run, it is detected that now we have a string again, and we re-prepare). So the next one (if we ran one) would have a correct string value. And indeed that is the case if I add a 3rd insert with the same value as the 2nd: Enter password: -------------- drop database if exists string_prepare_test -------------- -------------- create database string_prepare_test -------------- -------------- create table preparetest (id int not null primary key auto_increment, teststring varchar(255) not null) default charset utf8mb4 engine=InnoDB -------------- -------------- prepare sample from 'insert into preparetest (teststring) values (?)' -------------- -------------- set @a = 1234 -------------- -------------- execute sample using @a -------------- -------------- set @a = '23e835ea-b110-4deb-b2d9-bd83fdca678f' -------------- -------------- execute sample using @a -------------- -------------- set @a = '23e835ea-b110-4deb-b2d9-bd83fdca678f' -------------- -------------- execute sample using @a -------------- -------------- select * from preparetest -------------- id teststring 1 1234 2 99999999999999999999999999999999999999999999999999999999999999999 3 23e835ea-b110-4deb-b2d9-bd83fdca678f So what you're saying is - this is the "new" expected behavior in 8.0 forward. Whereas in 5.7 - all values would be correct. I understand that even in the previous world you can't "guarantee" that mixing types across a prepared statement is going to result in correct data. BUT - the behavior is different. That's the reality and applications that use scalar data types may have to add code to force string behavior in MySQL prepared statements, where they previously did not have to.
[31 Jan 19:05]
John Scott
Humble request for a warning when a data type change is detected in a bind variable.
[31 Jan 20:18]
Oleg Khozyayinov
Some more information on this. When we got 999999 instead of UUID, binlog did not match the data in table. binlog (and data in replica) had correct UUID, while table data had 999999. So, it looks like broken consistency between binlog and actual data in the InnoDB table
[1 Feb 14:28]
John Scott
Unfortunately I don't think the difference in binlog is a bug based on additional testing. In MIXED mode, a prepared statement on the primary server just ends up being a statement in the binlog. And the statement that was run was with the string uuid, not with the '999...' The issue comes only with a server side prepared statement after a type change and in MIXED and STATEMENT modes, these come across as normal statements to the replica: # at 1184 #250201 14:20:42 server id 999 end_log_pos 1184 CRC32 0xef50d57c Intvar SET INSERT_ID=1/*!*/; #250201 14:20:42 server id 999 end_log_pos 1341 CRC32 0xe36779d0 Query thread_id=8 exec_time=0 error_code=0 SET TIMESTAMP=1738419642/*!*/; insert into preparetest (teststring) values (1234) /*!*/; # at 1341 #250201 14:20:42 server id 999 end_log_pos 1372 CRC32 0xb22fafbe Xid = 11 COMMIT/*!*/; # at 1372 #250201 14:20:42 server id 999 end_log_pos 1451 CRC32 0x32db7cd5 Anonymous_GTID last_committed=4 sequence_number=5 rbr_only=no original_committed_timestamp=1738419642446089 immediate_commit_timestamp=1738419642446089 transaction_length=445 # original_commit_timestamp=1738419642446089 (2025-02-01 14:20:42.446089 UTC) # immediate_commit_timestamp=1738419642446089 (2025-02-01 14:20:42.446089 UTC) /*!80001 SET @@session.original_commit_timestamp=1738419642446089*//*!*/; /*!80014 SET @@session.original_server_version=80034*//*!*/; /*!80014 SET @@session.immediate_server_version=80034*//*!*/; SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/; # at 1451 #250201 14:20:42 server id 999 end_log_pos 1563 CRC32 0xc7beff87 Query thread_id=8 exec_time=0 error_code=0 SET TIMESTAMP=1738419642/*!*/; BEGIN /*!*/; # at 1563 # at 1595 #250201 14:20:42 server id 999 end_log_pos 1595 CRC32 0x54a0df38 Intvar SET INSERT_ID=2/*!*/; #250201 14:20:42 server id 999 end_log_pos 1786 CRC32 0xb85b850b Query thread_id=8 exec_time=0 error_code=0 SET TIMESTAMP=1738419642/*!*/; insert into preparetest (teststring) values ('23e835ea-b110-4deb-b2d9-bd83fdca678f') /*!*/; # at 1786 Contrasted with - In binlog_format=ROW, it's correct. ### INSERT INTO `string_prepare_test`.`preparetest` ### SET ### @1=2 /* INT meta=0 nullable=0 is_null=0 */ ### @2='99999999999999999999999999999999999999999999999999999999999999999' /* VARSTRING(1020) meta=1020 nullable=0 is_null=0 */ # at 1816 Bottom line is - this is probably expected behavior. If primary server had been in ROW format, all would be equal.
[1 Feb 14:50]
John Scott
Just so it doesn't get lost in the scrollback, I want to reiterate the request: > Humble request for a warning when a data type change is detected.
[1 Feb 16:23]
Roy Lyseng
Warnings tend to be ignored. It is possible to check the status variable Com_stmt_reprepare. It is usually incremented when a statement is reprepared due to metadata changes, but also when supplied parameter values do not match the anticipated type, and a reprepare is forced.
[3 Feb 11:50]
MySQL Verification Team
Thank you, Roy.
[3 Feb 18:31]
John Scott
Appreciate your time on this.