Bug #116657 Retrieving list of records that contain an empty blob causes undefined behaviour
Submitted: 14 Nov 2024 12:44 Modified: 11 Dec 2024 0:30
Reporter: Stuart Smith Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / ODBC Severity:S3 (Non-critical)
Version:8.0.40 OS:Windows (Windows11)
Assigned to: Rafal Somla CPU Architecture:x86

[14 Nov 2024 12:44] Stuart Smith
Description:
Retrieving a list of records that contain an empty blob causes undefined behavior or invalid value. I've been able to reproduce using a perl script and C++ application making use of the ODBC driver and confirm it fails in both implementations.

How to repeat:
1. Create a table that contains a blob field.
CREATE TABLE blob_test (data BLOB)

2. Insert a record - blob field with 1 or more bytes.

my $x = '12345';
INSERT INTO blob_test (data) VALUES (?)

3. Insert another record - empty blob field (zero bytes)
my $y = '';
INSERT INTO blob_test (data) VALUES (?)

4. Retrieve records and list values.

SELECT data FROM blob_test

Output for perl script:

BLOB field lengths:
Length: 5 bytes
Content: 12345

Length: 11 bytes
Content: 12345e=file

Output for mysql cli:
mysql> select * from blob_test;
+-------+
| data  |
+-------+
| 12345 |
|       |
+-------+

Suggested fix:
I've tried to go down that rabbit hole, but no luck. I can confirm it works up to 8.0.26. I've tried several versions all the way up to 8.0.40 with undefined behavior.
[14 Nov 2024 12:50] Stuart Smith
Prevents us from upgrading MySql 8.0.40 ODBC driver
[14 Nov 2024 13:37] MySQL Verification Team
Hi Mr. Smith,

Thank you for your bug report.

Can you let us know the exact releases of C/C++ C/Perl, C/ODBC, version of Windows and it's ODBC system, so that we can attempt to repeat the behaviour.

We are waiting on your feedback.
[14 Nov 2024 14:00] Stuart Smith
Perl details: 5, version 38, subversion 2 (v5.38.2)
C++ details: MSVC version 19.29.30151 (192930151)
ODBC Driver Version: 08.00.0040 ODBC Version: 03.80.0000 Driver ODBC Version: 03.80
ODBC Manager: Windows ODBC Manager 64-bit

FYI: All of the above is 64bit
[15 Nov 2024 3:31] Bogdan Degtyariov
Thanks for the additional details.
Since you mentioned it can you provide the C++ test case as well?
[15 Nov 2024 5:23] Stuart Smith
Unfortunately I can't provide the C++ code as it's part of a solution. I did however include the ODBC SQL trace, hopefully that can help. When I compared the sql logs they SQLExecute, SQLFetch and bindings had the same settings.
[15 Nov 2024 6:39] Bogdan Degtyariov
I was able to reproduce the problem with one of my standard test cases.
The bug is verified.
[15 Nov 2024 7:18] Stuart Smith
Brilliant news! Shout if you need anything more.
[3 Dec 2024 9:12] Bogdan Degtyariov
Posted by developer:
 
The issue affects getting results via columns bound through SQLBindCol() and
obtained directly from SQLGetData().

Each of these functions can return the length of written data via pointer passed
as the last parameter for both functions. The problem manifests itself by
returning the wrong length of written data when the data is not written
in case of the empty result. The data buffer is not modified and it can contain
remains of data left from retrieving the previous row, which is expected.

However, the expected returned length for the empty data column should be zero.
Instead it takes a random number. This can potentially cause an Access Violation
due to out of buffer access.

The error in the processing logic inside the ODBC driver was as follows:

 1. Empty (length is zero) column data is received from the server
 2. Driver checks lengths of the result in bound buffers (SSPS case)
    or lengths array (non-SSPS case).
 3. If length turns out to be zero it assumes the data arrived from the
    fake result set artificially created for some of the catalog functions
    without checking if it is really true.
 4. Driver attempts to measure the length of the data as string in the data
    buffer. Which can lead to undefined behavior since the data buffer was
    not modified and can contain anything.
    
The solution to the problem is to check for the fake result set before
attempting to interpret the buffer as char* and measuring its length.
The catalog functions produce data buffers acknowledging that this measurement
will be done later, but this is not always safe for any data buffer and
therefore zero data length should be interpreted literally if the result set
is coming from the server (not "fake" - generated by catalog functions).
[5 Dec 2024 2:53] Bogdan Degtyariov
Posted by developer:
 
A better solution is to remove checks for the "fake" result set in this logic and make catalog functions initialize lengths for IRD descriptor instead.
This will allow data processing to rely on the provided lengths rather than trying measuring them on the fly with potentially undefined behavior.
This is implemented in the patch.
[9 Dec 2024 0:05] Bogdan Degtyariov
Posted by developer:
 
The patch implementing the solution with initializing lengths in IRD descriptor (see previous comment) has been pushed into the trunk.
[11 Dec 2024 0:30] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Connector/ODBC 9.2.0 release, and here's the proposed changelog entry from the documentation team:

Retrieving a list of records that contained an empty BLOB field could
return an unexpected result.

Thank you for the bug report.