Description:
In the following code path:
row_vers_impl_x_locked_low()
└── row_vers_find_matching()
└── row_clust_vers_matches_sec()
Within row_clust_vers_matches_sec(), the following code appears problematic:
/** If the reconstructed values do not match the secondary index then we know
we should report no match. We compare the strings in binary mode to make it
more robust, because a thread which has changed "a" to "A" should prevent
concurrent transactions from peeking into the new binary representation,
say via CONVERT(column_name, binary). */
dtuple_set_types_binary(entry, dtuple_get_n_fields(entry));
return (0 == cmp_dtuple_rec(entry, sec_rec, sec_index, sec_offsets));
Here, it uses binary comparison to match entry with sec_rec. This can lead to different comparison results even when the actual values (considering the original charset) would be the same.
------
Analysis & Risk
1. If sec_rec is not delete-marked:
The function is expected to find the first version of the primary record that does not match sec_rec (expect cmp_dtuple_rec() to return 1 for mismatch).
Binary comparison might cause it to return an earlier mismatch than expected.
This results in a false positive result but still technically correct (no correctness impact).
2. If sec_rec has a delete mark:
The function should find the first primary record version that matches sec_rec (expect cmp_dtuple_rec() to return 0 for match).
Binary comparison may produce a mismatch for entries that would match under the original charset.
This can cause the real active transaction to be skipped, leading to the false belief that no active transaction exists.
Consequently, this can lead to an incorrect assumption that there is no implicit lock on sec_rec.
How to repeat:
This is something I noticed while reviewing the code, so I’m reporting it here for further discussion.
According to the comment in the code I pasted above, it looks like it was trying to handle charset comparisons by forcing binary comparison. But honestly, I haven’t figured out what the exact situation is that would require this.
However, if that situation really exists, then I believe this bug report I’m posting does make sense.