Bug #96950 CONCAT() can generate corrupted output
Submitted: 19 Sep 2019 16:48 Modified: 6 Feb 2020 19:16
Reporter: Jay Edgar Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Data Types Severity:S2 (Serious)
Version:5.6, 5.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: concat, corruption

[19 Sep 2019 16:48] Jay Edgar
Description:
The Item_func_concat::val_str() function which is used by CONCAT() has a special case for when the res2 is a substring of tmp_value, but the check for this condition is incorrect.  The current check looks like this:

      else if (tmp_value.is_alloced() && res2->ptr() >= tmp_value.ptr() &&
	       res2->ptr() <= tmp_value.ptr() + tmp_value.alloced_length())

The last portion of this check is invalid as it will include the case where res2->ptr() is _immediately_ the data in tmp_value and attempt to treat res2 as a substring of tmp_value which it is not.

In the scenario that I saw, after going through this case incorrectly the length() of res was greater than the alloced_len() and when the string was copied later, only alloced_len() bytes were copied to the new location, though length() bytes were assumed which means that random bytes from the new location were incorrectly included in the result, i.e. corruption.

I checked the code in 5.6, 5.7, and 8.0.  The 5.6 and 5.7 versions were the same and had the bug.  8.0 has completely rewritten this function and is much simpler and doesn't appear to have the problem.  I have not checked versions prior to 5.6.

How to repeat:
Very difficult.  The example I had used a CONCAT() of about 48 or so different strings, some of which were generated via TO_BASE64().  However the corruption is extremely dependent on the way the allocator lays out memory allocations.  We were using the jemalloc allocator, so it would be unlikely that my test case could reproduce it on standard MySQL.

Suggested fix:
The second part of the check needs to check only for res2->ptr() being less than the calculation of the end of tmp_value, not less then _and_ equal.

      else if (tmp_value.is_alloced() && res2->ptr() >= tmp_value.ptr() &&
	       res2->ptr() < tmp_value.ptr() + tmp_value.alloced_length())
[23 Sep 2019 12:08] MySQL Verification Team
Hello Mr. Edgar,

Thank you for your bug report.

I have analysed carefully the code that you are referring to and I conclude that you are correct.

Verified as reported for MySQL 5.6 and 5.7.
[26 Sep 2019 5:50] Tor Didriksen
Posted by developer:
 
concat an concat_ws do indeed have multiple problems in 5.7 and below.
They were both completely rewritten in the patch for
    Bug#22893669 CONCAT: "SOURCE AND DESTINATION OVERLAP IN MEMCPY"
[26 Sep 2019 12:21] MySQL Verification Team
Thank you Tor ....
[6 Feb 2020 19:16] Paul DuBois
Posted by developer:
 
Fixed in 5.6.48, 5.7.30.

CONCAT() and CONCAT_WS() could produce incorrect results in rare
cases due to incorrect substring handling.
[7 Feb 2020 12:49] MySQL Verification Team
Thank you, Paul .....