Bug #55397 cannot select from innodb_trx when trx_query contains blobs that aren't strings
Submitted: 20 Jul 2010 12:47 Modified: 11 Jan 2011 1:51
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: InnoDB Plugin storage engine Severity:S2 (Serious)
Version:5.5.5/5.1 OS:Any (windows, linux)
Assigned to: Jimmy Yang CPU Architecture:Any

[20 Jul 2010 12:47] Shane Bester
when other connections are running queries containing blobs, we cannot select from innodb_trx table:

mysql> select trx_query from information_schema.innodb_trx;
Empty set, 1 warning (0.00 sec)

mysql> show warnings;
| Level   | Code | Message
| Warning | 1366 | Incorrect string value: '\x96\xF0\x84P\xD6*...
1 row in set (0.01 sec)

running set names utf8 or set names binary in the connection
didn't help either.

How to repeat:
in one connection run a long query with some blob data. then in other connection run: select trx_query from information_schema.innodb_trx;

you'll get no data back sometimes, and have a warning instead.

Suggested fix:
make the column type BLOB instead of VAR_STRING, the same as processlist table.
[20 Jul 2010 13:02] MySQL Verification Team

start transaction with consistent snapshot;
select sleep(100),'═══';


select * from information_schema.innodb_trx;#fails
show warnings;
select * from information_schema.processlist;#works
[20 Jul 2010 13:11] MySQL Verification Team
C:\DBS>c:\dbs\5.1\bin\mysql -uroot --port=3306 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.49-Win X64-log Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.1 >select * from information_schema.innodb_trx;#fails
Empty set, 1 warning (0.03 sec)

mysql 5.1 >show warnings;
| Level   | Code | Message                                                                 |
| Warning | 1366 | Incorrect string value: '\xCD\xCD\xCD'' for column 'trx_query' at row 1 |
1 row in set (0.00 sec)
[20 Jul 2010 13:14] MySQL Verification Team
Thank you for the bug report.

mysql 5.1 >show variables like "%version%";
| Variable_name           | Value               |
| innodb_version          | 1.0.9               |
| protocol_version        | 10                  |
| version                 | 5.1.49-Win X64-log  |
| version_comment         | Source distribution |
| version_compile_machine | unknown             |
| version_compile_os      | Win64               |
6 rows in set (0.00 sec)
[28 Sep 2010 9:13] Vasil Dimov
Cannot reproduce here:

12:06:23 mysql> select trx_query from information_schema.innodb_trx;
| trx_query               |
| select sleep(100),'???' |

12:06:51 mysql> show variables like "%version%";
| Variable_name           | Value               |
| innodb_version          | 1.0.12              |
| protocol_version        | 10                  |
| version                 | 5.1.51              |
| version_comment         | Source distribution |
| version_compile_machine | amd64               |
| version_compile_os      | unknown-freebsd8.1  |
[28 Sep 2010 12:38] Marko Mäkelä
I believe that we should encode the thd_query_string(thd) from innobase_get_charset(thd) to system_charset_info.
[21 Dec 2010 11:19] Marko Mäkelä
I think that we need to convert the query to system_charset_info (UTF-8). From there, MySQL should convert it to the connection charset. Declaring the column as BLOB is cheating, and it can produce unexpected results when different clients use different character encodings.

I would test this as follows:

Initiate three connections to mysqld. One in latin1, others in utf8 and latin2. For proper interpretation of the results, the terminal should output UTF-8 encoding:

echo -e "set character_set_client=latin1;set character_set_connection=latin1;set character_set_results=latin1;select sleep(100),'how are you?\xbfC\xf3mo est\xe1s?'"|mysql|recode latin1..utf8

echo -e "set character_set_client=utf8;set character_set_connection=utf8;set character_set_results=utf8;select * from information_schema.processlist;"|mysql

echo -e "set character_set_client=latin2;set character_set_connection=latin2;set character_set_results=latin2;select * from information_schema.processlist;"|mysql|recode latin2..utf8

Note that in both cases, the INFO is truncated at the first non-ASCII character, ¿. This looks wrong; we should try to display the whole SQL statement, like this:

$ echo -e 'how are you?\xbfC\xf3mo est\xe1s?'|recode latin1..utf8
how are you?¿Cómo estás?

Note that ¿ does not exist in latin2 and is likely converted to ? by mysql. Below, it is converted to ? by GNU Recode. This is because code point 0xbf in latin1 is U+00BF INVERTED QUESTION MARK, and in latin2 it is U+017C LATIN SMALL LETTER Z WITH DOT ABOVE.

$ echo -e 'how are you?\xbfC\xf3mo est\xe1s?'|recode latin1..latin2|recode latin2..utf8
how are you??Cómo estás?

These tests are for the INFORMATION_SCHEMA.PROCESSLIST. Similar tests should be conducted for INFORMATION_SCHEMA.innodb_trx.
[5 Jan 2011 10:15] Jimmy Yang
There are two issues to be fixed in the case:

1) We always use system_charset_info to display the stmt query string (trx_query), however the code did not do appropriate conversion from stmt charset to system_charset_info before push to store(), so the error display cannot be fixed by user adjusting client charset etc. The correct way is to remember the stmt string charset, and use it for the store(), and let MySQL to convert the string to the client charset before push to user.

2) fields[IDX_TRX_QUERY]->store() calls MySQL's well_formed_copy_nchars() and  check_string_copy_error() on designated charset. So if there are character cannot be recognized by well_formed_copy_nchars for the charset, it will stop and check_string_copy_error() will report an error. In the case information_schema.processlist, it will continue reporting the row, while in our case, we will not report the row. I think if this is due to charset issue, then we should continue report the row. Ideally, well_formed_copy_nchars() could replace those unrecognized character with some common symbols.
[7 Jan 2011 3:30] Jimmy Yang
Note function well_formed_copy_nchars() always convert the string stored from input string charset to utf8, and then convert to the output session charset. So we need to clearly tell store() what is the charset for the input string:

gdb) c

Breakpoint 8, well_formed_copy_nchars (to_cs=0x87c1d40, to=0x904177e "", 
    to_length=3072, from_cs=0x88133f8, 
    from=0x8c40aa4 "select sleep(100),'═══'", from_length=29, 
    nchars=1024, well_formed_error_pos=0xaa19ca08, 
    cannot_convert_error_pos=0xaa19ca04, from_end_pos=0xaa19ca00)
    at sql_string.cc:954

gdb) p to_cs->csname
$11 = 0x881c428 "utf8"
(gdb) p from_cs->csname
$12 = 0x881cee0 "cp850"

A few tests show the effective of this fix:

Let's use session 1 as the query session (select sleep(100),'═══')
Use session 2 as session issues "select * from information_schema.innodb_trx"

session 1: latin1, session 2: utf8

mysql> select  trx_query from information_schema.innodb_trx;
| trx_query                                 |
| select sleep(100),'═══' |
1 row in set (4.33 sec)

Change session 2 to latin1:
mysql> select  trx_query from information_schema.innodb_trx;
| trx_query                     |
| select sleep(100),'═══' |
1 row in set (4.35 sec)

session 1: cp850, session 2: latin1
mysql> select  trx_query from information_schema.innodb_trx;
| trx_query                     |
| select sleep(100),'���������' |
1 row in set (4.55 sec)

Change session 2 to cp850:
mysql> select  trx_query from information_schema.innodb_trx;
| trx_query                     |
| select sleep(100),'═══' |
1 row in set (2.32 sec)
[8 Jan 2011 15:08] Bugs System
Pushed into mysql-trunk 5.6.2 (revid:vasil.dimov@oracle.com-20110108150732-8qygun7nuaqf9d3u) (version source revid:vasil.dimov@oracle.com-20110108150646-44c9j7ck64ocey31) (merge vers: 5.6.2) (pib:24)
[8 Jan 2011 15:09] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:vasil.dimov@oracle.com-20110108150508-gpanhz48z8069qot) (version source revid:vasil.dimov@oracle.com-20110108150048-b1y9m8xe72hay0ch) (merge vers: 5.5.9) (pib:24)
[8 Jan 2011 15:12] Bugs System
Pushed into mysql-5.1 5.1.55 (revid:vasil.dimov@oracle.com-20110108145923-0tjqdpa5w3d6tuwn) (version source revid:vasil.dimov@oracle.com-20110108145923-0tjqdpa5w3d6tuwn) (merge vers: 5.1.55) (pib:24)
[11 Jan 2011 1:49] John Russell
Adding to change log:

It was not possible to query the information_schema.innodb_trx
table while other connections were running queries involving BLOB types.
[2 Nov 2015 12:36] MySQL Verification Team
followup to this: