Bug #111290 No matching rows when joining multiple tables on binary/varbinary key
Submitted: 6 Jun 2023 7:48 Modified: 8 Aug 2023 22:10
Reporter: Tomasz Szlagowski Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.23, 8.0.33, 5.7 OS:Any
Assigned to: CPU Architecture:Any

[6 Jun 2023 7:48] Tomasz Szlagowski
Description:
I _think_ the problem I'm experiencing is happening because the ORM we're using created `binary(16)` primary keys, but `varbinary(16)` foreign keys.

Everything was mostly working fine, except recently we've upgraded to 8.0.23 and some complex queries are now not working as expected.

The problem is that some rows that should be matched are not.

What makes me think this is a bug:
- the problem did not happen on MySQL 5.6 as far as I can tell
- adding `AND parents.id = 0x93222796caba43ca979f5c96eb6898b7` to the query makes it work (return 1 result instead of 0)

The most reliable workaround is to use the same type everywhere (either `binary(16)` or `varbinary(16)`).

Otherwise doing less joins / nested subqueries at once (using CTE or splitting complex queries into multiple ones on the app side) you might be able to avoid this bug.

How to repeat:
DROP TABLE IF EXISTS `foos`;
DROP TABLE IF EXISTS `bars`;
DROP TABLE IF EXISTS `parents`;

CREATE TABLE `parents` (
  `id` binary(16) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

CREATE TABLE `foos` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `parent_id` varbinary(16) NOT NULL,
  `text` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_foos_on_parent_id` (`parent_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12345 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

CREATE TABLE `bars` (
  `id` binary(16) NOT NULL,
  `parent_id` varbinary(16) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_bars_on_parent_id` (`parent_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

-- Create some test data
INSERT INTO parents VALUES (0x93222796caba43ca979f5c96eb6898b7);
INSERT INTO foos (parent_id, text) SELECT parents.id, 'correct output' FROM parents;
INSERT INTO bars (id, parent_id) SELECT 0x79cea9ab6fe14a8ebdfed711a7727763, parents.id FROM parents;

-- Reproduce the problem:
SELECT foos.text
FROM foos
JOIN parents ON parents.id = foos.parent_id
JOIN bars ON bars.parent_id = parents.id
WHERE bars.parent_id = 0x93222796caba43ca979f5c96eb6898b7;
[6 Jun 2023 9:09] MySQL Verification Team
Hello Tomasz,

Thank you for the report and test case.

regards,
Umesh
[7 Jun 2023 12:24] huahua xu
Hi Tomasz,

It is incorrect that the method `Item_hex_string::clone_item` clone '0x93222796caba43ca979f5c96eb6898b7' to '0x8b3f9333488f5170' during the optimizer propagates constant values in the conditions.

Item_hex_string::make_hex_str(const char * str, unsigned __int64 str_length) 
Item_hex_string::hex_string_init(const char * str, unsigned int str_length) 
Item_hex_string::Item_hex_string(const char * str, unsigned int str_length) 
Item_hex_string::clone_item() 
change_cond_ref_to_const(THD * thd, I_List<COND_CMP> * save_list, Item * and_father, Item * cond, Item * field, Item * value) 
change_cond_ref_to_const(THD * thd, I_List<COND_CMP> * save_list, Item * and_father, Item * cond, Item * field, Item * value) 
propagate_cond_constants(THD * thd, I_List<COND_CMP> * save_list, Item * and_father, Item * cond) 
propagate_cond_constants(THD * thd, I_List<COND_CMP> * save_list, Item * and_father, Item * cond) 
optimize_cond(THD * thd, Item * * cond, COND_EQUAL * * cond_equal, mem_root_deque<TABLE_LIST *> * join_list, Item::cond_result * cond_value) 
JOIN::optimize()
Query_block::optimize(THD * thd) 
Query_expression::optimize(THD * thd, TABLE * materialize_destination, bool create_iterators)
Sql_cmd_dml::execute_inner(THD * thd) 
Sql_cmd_dml::execute(THD * thd)
mysql_execute_command(THD * thd, bool first_level) 
...
[7 Jun 2023 12:28] huahua xu
Additionally, you could see the detailed source code about the method `Item_hex_string::make_hex_str`
[7 Jun 2023 14:24] huahua xu
The patch would fix the bug issue.

Attachment: Item_hex_string_clone_item_function_with_error_data.patch (application/octet-stream, text), 3.13 KiB.

[8 Aug 2023 22:10] Jon Stephens
Documented fix as follows in the MySQL 8.2.0 changelog:

    A join on a BINARY column with a VARBINARY column of the same
    size, having matching values equal in size to that of the BINARY
    column in both columns, did not produce any matching rows.

Closed.
[3 Nov 2023 20:58] Jean-François Gagné
This bug, reported in 8.0 and 5.7, is marked as fixed in 8.2: will this be fixed in 8.0 ?  And in 5.7 ?