Bug #118299 | Multi-valued Index on Composite Key Returns Incomplete Results for Rows with Empty JSON Array('[]') | ||
---|---|---|---|
Submitted: | 29 May 6:57 | Modified: | 21 Aug 2:18 |
Reporter: | Yichang SONG (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 8.0.42, 8.4.5 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[29 May 6:57]
Yichang SONG
[29 May 9:46]
Yichang SONG
I examined WL#8955: Add support for multi-valued indexes, which sheds critical light on the handling of empty arrays and provides context for the observed behavior: " If multi-value key part's generation function returns empty array, nothing is inserted into the index, and the data record will not be accessible via index scan. " This statement explicitly confirms that the observed behavior—where no index entry is created for rows where CAST(c2 AS UNSIGNED ARRAY) evaluates to an empty array ('[]'), making them inaccessible via this index scan—is per design. However, WL#8955 also includes crucial expectations and warnings: " F6) Regular access method (index scan, ref, range (plain or MRR)) over multiple-valued indexes should be supported. ...... F8) Usage of multi-valued index in any other way, except described in F6 (e.g index scan, or using only prefix) should be prohibited, to avoid performance regressions/wrong results. " F8 clearly anticipates scenarios where queries might lead to "wrong results" if they don't fully conform to the "regular access methods" described in F6. However, our current query, SELECT c1 FROM t3 WHERE c1=1;, while seemingly a "regular access method" as per F6, paradoxically results in "wrong results" due to the specific handling of empty arrays. Given that, perhaps the fix for this bug could be implemented at the server layer by classifying SELECT c1 FROM t3 WHERE c1=1; as falling under F8 (prohibited usage), thereby preventing this query from automatically (or even with FORCE INDEX) selecting the multi-valued index.
[29 May 11:27]
MySQL Verification Team
Hello Yichang Song, Thank you for the report and feedback. regards, Umesh
[21 Aug 2:18]
Yichang SONG
I'd like to add a further observation to this report. The issue described also appears to affect aggregate functions like COUNT(*). For instance, using the same test case, the query: SELECT COUNT(*) FROM t3 WHERE c1 = 1; returns an incorrect result of 1, when the expected result is 2. This behavior seems quite similar to what is described in the S1 (Critical) Bug #104898 (https://bugs.mysql.com/bug.php?id=104898), which also reports incorrect COUNT(*) results when a multi-valued index is utilized. This suggests the problem isn't limited to failing to retrieve specific rows via SELECT *, but that the optimizer's use of this index can lead to incorrect outcomes for different types of queries, including aggregations. This could be particularly misleading for users who rely on such aggregate results for data analysis.