Bug #111647 | Blob partial update in undo record may lead potential crash | ||
---|---|---|---|
Submitted: | 4 Jul 2023 2:41 | Modified: | 6 Jul 2023 2:36 |
Reporter: | andrew zhao | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: JSON | Severity: | S3 (Non-critical) |
Version: | 8.0.33 | OS: | Linux |
Assigned to: | CPU Architecture: | x86 |
[4 Jul 2023 2:41]
andrew zhao
[4 Jul 2023 2:51]
andrew zhao
During the update sql executing, two parts of binary diff is calculated. First binary diff is array meta. It is showed in the memory layout and marked as CHANGED. This binary diff's length is 2 bytes. Another binary diff is array data copy. However, this json type is array and we need to remove the last one element in the array. According to the current code logic, we don't need any array data copy. The remove data is marked as freed and the binary diff length is 0.
[4 Jul 2023 3:15]
andrew zhao
The potential crash happen if the select sql goes the row_search_mvcc process. In trx_undo_read_blob_update func, lob_diff.m_length is the parsed undo partial update length. Because server calculated blob binary diff and record it into undo record, this could be zero. lob_undo_data.copy_old_data may have different behavior on 8.0.25 and 8/0/33 when handling malloc array size is 0 condition. In 8.0.25, it returns nullptr and pass this nullptr to undo_ptr. That leads to a crash when execute mach_read_next_compressed(&undo_ptr); In 8.0.33, it returns array meta instead of nullptr. /* Copy the data only if the lob_undo is not null. */ if (lob_seq != nullptr) { undo_ptr = lob_undo_data.copy_old_data(undo_ptr, lob_diff.m_length); } else { undo_ptr += lob_diff.m_length; } ... /* Read the number of LOB index entries modified. */ ulint n_entry = mach_read_next_compressed(&undo_ptr); In some other day, if the return value of malloc array with length 0 changed to be nullptr. This crash may happen. The reproduce step of crash in 8.0.25 is: client1 create table t (a int primary key, b json); insert into t values (1, '[ "abc", "def" ]'); update t set b=JSON_SET(b, '$[0]', REPEAT('w', 10000)); client2 set SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ; begin; select * from t where a =1\G client1 UPDATE t SET b = JSON_REMOVE(b, '$[1]') where a = 1; client2 select * from t where a =1\G The stack is the same as https://bugs.mysql.com/bug.php?id=107953
[4 Jul 2023 12:29]
andrew zhao
To CORRECT the memory lay out after exec UPDATE t SET b = JSON_REMOVE(b, '$[1]') where a = 1; command. Memory layout should be: 0x02 - type: small JSON array CHANGED 0x01 - number of elements (low byte) CHANGED 0x00 - number of elements (high byte) 0x12 - number of bytes (low byte) 0x00 - number of bytes (high byte) 0x0C - type of element 0 (string) 0x0A - offset of element 0 (low byte) 0x00 - offset of element 0 (high byte) [Free] 0x0C - type of element 1 (string) [Free] 0x0E - offset of element 1 (low byte) [Free] 0x00 - offset of element 1 (high byte) 0x03 - length of element 0 'a' 'b' - content of element 0 'c' 0x03 - length of element 1 'd' 'e' - content of element 1 'f' Two binary diff is calculated. First part is number of elements. In this case this binary diff is calculated to be 2 bytes(marked as CHANGED). The second binary diff is array meta binary diff. In normal case, to ensure the array meta is continuous with no hollow, some binary meta need to be copied. However, removed element is the last one of this array. No binary meta data need to be copied. (marked as FREE)
[4 Jul 2023 12:38]
MySQL Verification Team
Hi Mr. zhao, Thank you for your bug report. However, 8.0.24 is a bug fix release, just like 8.0.33 is a bug fix release. That means that bugs reported for 8.0.24 are fixed in later release. So, 8.0.33 bugs will be fixed in 8.0.34 or some later release. Hence, it is expected behaviour that 8.0.33 has no problem. We repeated your test case and everything worked quite fine with 8.0.33. Can't repeat.
[5 Jul 2023 2:37]
andrew zhao
To simplify what I want to say. There are two parts of code in 8.0.33(current newest version) is not reasonable. JSON_REMOVE may produce a zero length binary diff. It is unnecessary and could cause a potential crash. Code in trx_undo_read_blob_update listed above could parse this zero length diff and could crash which depends on the different behavior of malloc array with 0 length. Different release acts different behavior. (Details lists above). If this behavior changes again, crash may happen.
[5 Jul 2023 12:56]
MySQL Verification Team
HI Mr. zhao, Yes, malloc(0) is a hard error. However, we test it in so many places that it is now very unlikely that it could happen. Also, many system malloc libraries test it as well. We can not verify a report based on "it might happen" .......
[6 Jul 2023 2:36]
andrew zhao
Thank you for your response.
[6 Jul 2023 12:14]
MySQL Verification Team
You are welcome .....