Bug #77473 | Truncated data with subquery & UTF8 | ||
---|---|---|---|
Submitted: | 24 Jun 2015 12:58 | Modified: | 20 Jul 2015 14:58 |
Reporter: | T. M. | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 5.5.39, 5.6.25 | OS: | Windows (x64) |
Assigned to: | Tor Didriksen | CPU Architecture: | Any |
Tags: | Subquery UTF8 truncated |
[24 Jun 2015 12:58]
T. M.
[24 Jun 2015 16:01]
Hartmut Holzgraefe
select length(data) as len from ( select rpad('a', 32767, 'é') as data ) as Sub; returns the correct result 65533, starting with a rpad argument of 32768 up to 65535 the result is 65535, then with a parameter of >=65536 the result is correct again looks as if a too small two-byte integer type is used for the subquery result so that the result value is truncatd by integer overflow ... Also reproduced on MySQL 5.6.14 and 5.7.4
[25 Jun 2015 7:08]
MySQL Verification Team
Hello! Thank you for the report and test case. Thanks, Umesh
[26 Jun 2015 14:11]
Hartmut Holzgraefe
The problem is not restricted to checking the length in the outer query, it also occurs when just passing through the inner result ... and it's not only RPAD() that is affected, I get the same with REPEAT(): $ mysql -Ne "set names utf8; select data from (select rpad('', 10000, 'é') as data ) as Sub;" | wc 1 1 20001 $ mysql -Ne "set names utf8; select data from (select rpad('', 40000, 'é') as data ) as Sub;" | wc 1 1 65535 $ mysql -Ne "set names utf8; select data from (select rpad('', 70000, 'é') as data ) as Sub;" | wc 1 1 140001 $ mysql -Ne "set names utf8; select data from (select repeat('é', 40000) as data ) as Sub;" | wc 1 1 65535
[30 Jun 2015 19:23]
Hartmut Holzgraefe
Truncated results happen even earlier when using a character with 3-byte UTF-8 sequence: select length(data) as len from ( select repeat('☃', ...) as data ) as Sub; parameter -> result 21844 -> 65532 21845 -> 65535 21846 -> 65535 ... 65535 -> 65535 65536 -> 196608
[20 Jul 2015 14:58]
Paul DuBois
Noted in 5.7.9, 5.8.0 changelogs. For some string functions, data was truncated when evaluated in subqueries due to incorrect space calculations when creating temporary tables to hold intermediate results.