Bug #69666 | Strange result types for LPAD/RPAD and CONCAT when mixing binary and non-binary | ||
---|---|---|---|
Submitted: | 3 Jul 2013 19:09 | Modified: | 22 Jul 2014 19:40 |
Reporter: | Arthur O'Dwyer | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Data Types | Severity: | S3 (Non-critical) |
Version: | 5.5.31, 5.6.21 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[3 Jul 2013 19:09]
Arthur O'Dwyer
[21 Jul 2014 18:10]
Sveta Smirnova
Thank you for the report. In the concat part you have few rows similar to select lpad(concat("пирожки", cA), 16, "A") from foo; I mean all have "A", but not bA or _binary "A" like in simple LPAD example. And result from some of these rows do not correspond to result of simple LPAD. Is this a typo? If not please explain why results should be same?
[21 Jul 2014 21:51]
Arthur O'Dwyer
@Sveta: As described in the #comment above, I'm using lpad(X, 16, "A") as a proxy for collation(X). Originally this was because I didn't know the collation(X) builtin existed. Does this help clear up your confusion? mysql> select 1 as 'LHS constant',1 as 'LHS binary',1 as 'RHS constant',1 as 'RHS binary', lpad(_binary "пирожки", 15, _binary "A") as 'lpad', collation(lpad(_binary "пирожки", 15, _binary "A")) as 'collation.....', coercibility(lpad(_binary "пирожки", 15, _binary "A")) as 'coerc' from foo union all -> select 1,1,1,0, lpad(_binary "пирожки", 15, "A"), collation(lpad(_binary "пирожки", 15, "A")), coercibility(lpad(_binary "пирожки", 15, "A")) from foo union all -> select 1,1,0,1, lpad(_binary "пирожки", 15, bA), collation(lpad(_binary "пирожки", 15, bA)), coercibility(lpad(_binary "пирожки", 15, bA)) from foo union all -> select 1,1,0,0, lpad(_binary "пирожки", 15, cA), collation(lpad(_binary "пирожки", 15, cA)), coercibility(lpad(_binary "пирожки", 15, cA)) from foo union all -> select 1,0,1,1, lpad("пирожки", 15, _binary "A"), collation(lpad("пирожки", 15, _binary "A")), coercibility(lpad("пирожки", 15, _binary "A")) from foo union all -> select 1,0,1,0, lpad("пирожки", 15, "A"), collation(lpad("пирожки", 15, "A")), coercibility(lpad("пирожки", 15, "A")) from foo union all -> select 1,0,0,1, lpad("пирожки", 15, bA), collation(lpad("пирожки", 15, bA)), coercibility(lpad("пирожки", 15, bA)) from foo union all -> select 1,0,0,0, lpad("пирожки", 15, cA), collation(lpad("пирожки", 15, cA)), coercibility(lpad("пирожки", 15, cA)) from foo; +--------------+------------+--------------+------------+------------------------+-----------------+-------+ | LHS constant | LHS binary | RHS constant | RHS binary | lpad | collation..... | coerc | +--------------+------------+--------------+------------+------------------------+-----------------+-------+ | 1 | 1 | 1 | 1 | Aпирожки | binary | 4 | | 1 | 1 | 1 | 0 | Aпирожки | binary | 4 | | 1 | 1 | 0 | 1 | Aпирожки | binary | 2 | | 1 | 1 | 0 | 0 | Aпирожки | utf8_general_ci | 2 | | 1 | 0 | 1 | 1 | Aпирожки | binary | 4 | | 1 | 0 | 1 | 0 | AAAAAAAAпирожки | utf8_general_ci | 4 | | 1 | 0 | 0 | 1 | Aпирожки | binary | 2 | | 1 | 0 | 0 | 0 | AAAAAAAAпирожки | utf8_general_ci | 2 | +--------------+------------+--------------+------------+------------------------+-----------------+-------+ 8 rows in set (0.00 sec) mysql> select 0,1,1,1, lpad(bpiroshk, 15, _binary "A") as 'lpad', collation(lpad(bpiroshk, 15, _binary "A")) as 'collation.....', coercibility(lpad(bpiroshk, 15, _binary "A")) as 'coerc' from foo union all -> select 0,1,1,0, lpad(bpiroshk, 15, "A"), collation(lpad(bpiroshk, 15, "A")), coercibility(lpad(bpiroshk, 15, "A")) from foo union all -> select 0,1,0,1, lpad(bpiroshk, 15, bA), collation(lpad(bpiroshk, 15, bA)), coercibility(lpad(bpiroshk, 15, bA)) from foo union all -> select 0,1,0,0, lpad(bpiroshk, 15, cA), collation(lpad(bpiroshk, 15, cA)), coercibility(lpad(bpiroshk, 15, cA)) from foo union all -> select 0,0,1,1, lpad(cpiroshk, 15, _binary "A"), collation(lpad(cpiroshk, 15, _binary "A")), coercibility(lpad(cpiroshk, 15, _binary "A")) from foo union all -> select 0,0,1,0, lpad(cpiroshk, 15, "A"), collation(lpad(cpiroshk, 15, "A")), coercibility(lpad(cpiroshk, 15, "A")) from foo union all -> select 0,0,0,1, lpad(cpiroshk, 15, bA), collation(lpad(cpiroshk, 15, bA)), coercibility(lpad(cpiroshk, 15, bA)) from foo union all -> select 0,0,0,0, lpad(cpiroshk, 15, cA), collation(lpad(cpiroshk, 15, cA)), coercibility(lpad(cpiroshk, 15, cA)) from foo; +---+---+---+---+------------------------+-----------------+-------+ | 0 | 1 | 1 | 1 | lpad | collation..... | coerc | +---+---+---+---+------------------------+-----------------+-------+ | 0 | 1 | 1 | 1 | Aпирожки | binary | 2 | | 0 | 1 | 1 | 0 | Aпирожки | binary | 2 | | 0 | 1 | 0 | 1 | Aпирожки | binary | 2 | | 0 | 1 | 0 | 0 | Aпирожки | binary | 2 | | 0 | 0 | 1 | 1 | AAAAAAAAпирожки | utf8_general_ci | 2 | | 0 | 0 | 1 | 0 | AAAAAAAAпирожки | utf8_general_ci | 2 | | 0 | 0 | 0 | 1 | Aпирожки | binary | 2 | | 0 | 0 | 0 | 0 | AAAAAAAAпирожки | utf8_general_ci | 2 | +---+---+---+---+------------------------+-----------------+-------+ 8 rows in set (0.00 sec)
[21 Jul 2014 21:51]
Arthur O'Dwyer
(continued from above) mysql> select 1,1,1,1, lpad(concat(_binary "пирожки", _binary "A"), 16, "A") as 'lpad concat', collation(lpad(concat(_binary "пирожки", _binary "A"), 16, "A")) as 'collation.....', coercibility(lpad(concat(_binary "пирожки", _binary "A"), 16, "A")) as 'coerc' from foo union all -> select 1,1,1,0, lpad(concat(_binary "пирожки", "A"), 16, "A"), collation(lpad(concat(_binary "пирожки", "A"), 16, "A")), coercibility(lpad(concat(_binary "пирожки", "A"), 16, "A")) from foo union all -> select 1,1,0,1, lpad(concat(_binary "пирожки", bA), 16, "A"), collation(lpad(concat(_binary "пирожки", bA), 16, "A")), coercibility(lpad(concat(_binary "пирожки", bA), 16, "A")) from foo union all -> select 1,1,0,0, lpad(concat(_binary "пирожки", cA), 16, "A"), collation(lpad(concat(_binary "пирожки", cA), 16, "A")), coercibility(lpad(concat(_binary "пирожки", cA), 16, "A")) from foo union all -> select 1,0,1,1, lpad(concat("пирожки", _binary "A"), 16, "A"), collation(lpad(concat("пирожки", _binary "A"), 16, "A")), coercibility(lpad(concat("пирожки", _binary "A"), 16, "A")) from foo union all -> select 1,0,1,0, lpad(concat("пирожки", "A"), 16, "A"), collation(lpad(concat("пирожки", "A"), 16, "A")), coercibility(lpad(concat("пирожки", "A"), 16, "A")) from foo union all -> select 1,0,0,1, lpad(concat("пирожки", bA), 16, "A"), collation(lpad(concat("пирожки", bA), 16, "A")), coercibility(lpad(concat("пирожки", bA), 16, "A")) from foo union all -> select 1,0,0,0, lpad(concat("пирожки", cA), 16, "A"), collation(lpad(concat("пирожки", cA), 16, "A")), coercibility(lpad(concat("пирожки", cA), 16, "A")) from foo; +---+---+---+---+-------------------------+-----------------+-------+ | 1 | 1 | 1 | 1 | lpad concat | collation..... | coerc | +---+---+---+---+-------------------------+-----------------+-------+ | 1 | 1 | 1 | 1 | AпирожкиA | binary | 4 | | 1 | 1 | 1 | 0 | AпирожкиA | binary | 4 | | 1 | 1 | 0 | 1 | AпирожкиA | binary | 2 | | 1 | 1 | 0 | 0 | AAAAAAAAпирожкиA | utf8_general_ci | 2 | | 1 | 0 | 1 | 1 | AпирожкиA | binary | 4 | | 1 | 0 | 1 | 0 | AAAAAAAAпирожкиA | utf8_general_ci | 4 | | 1 | 0 | 0 | 1 | AпирожкиA | binary | 2 | | 1 | 0 | 0 | 0 | AAAAAAAAпирожкиA | utf8_general_ci | 2 | +---+---+---+---+-------------------------+-----------------+-------+ 8 rows in set (0.00 sec) mysql> select 0,1,1,1, lpad(concat(bpiroshk, _binary "A"), 16, "A") as 'lpad concat', collation(lpad(concat(bpiroshk, _binary "A"), 16, "A")) as 'collation.....', coercibility(lpad(concat(bpiroshk, _binary "A"), 16, "A")) as 'coerc' from foo union all -> select 0,1,1,0, lpad(concat(bpiroshk, "A"), 16, "A"), collation(lpad(concat(bpiroshk, "A"), 16, "A")), coercibility(lpad(concat(bpiroshk, "A"), 16, "A")) from foo union all -> select 0,1,0,1, lpad(concat(bpiroshk, bA), 16, "A"), collation(lpad(concat(bpiroshk, bA), 16, "A")), coercibility(lpad(concat(bpiroshk, bA), 16, "A")) from foo union all -> select 0,1,0,0, lpad(concat(bpiroshk, cA), 16, "A"), collation(lpad(concat(bpiroshk, cA), 16, "A")), coercibility(lpad(concat(bpiroshk, cA), 16, "A")) from foo union all -> select 0,0,1,1, lpad(concat(cpiroshk, _binary "A"), 16, "A"), collation(lpad(concat(cpiroshk, _binary "A"), 16, "A")), coercibility(lpad(concat(cpiroshk, _binary "A"), 16, "A")) from foo union all -> select 0,0,1,0, lpad(concat(cpiroshk, "A"), 16, "A"), collation(lpad(concat(cpiroshk, "A"), 16, "A")), coercibility(lpad(concat(cpiroshk, "A"), 16, "A")) from foo union all -> select 0,0,0,1, lpad(concat(cpiroshk, bA), 16, "A"), collation(lpad(concat(cpiroshk, bA), 16, "A")), coercibility(lpad(concat(cpiroshk, bA), 16, "A")) from foo union all -> select 0,0,0,0, lpad(concat(cpiroshk, cA), 16, "A"), collation(lpad(concat(cpiroshk, cA), 16, "A")), coercibility(lpad(concat(cpiroshk, cA), 16, "A")) from foo; +---+---+---+---+-------------------------+-----------------+-------+ | 0 | 1 | 1 | 1 | lpad concat | collation..... | coerc | +---+---+---+---+-------------------------+-----------------+-------+ | 0 | 1 | 1 | 1 | AпирожкиA | binary | 2 | | 0 | 1 | 1 | 0 | AпирожкиA | binary | 2 | | 0 | 1 | 0 | 1 | AпирожкиA | binary | 2 | | 0 | 1 | 0 | 0 | AпирожкиA | binary | 2 | | 0 | 0 | 1 | 1 | AAAAAAAAпирожкиA | utf8_general_ci | 2 | | 0 | 0 | 1 | 0 | AAAAAAAAпирожкиA | utf8_general_ci | 2 | | 0 | 0 | 0 | 1 | AпирожкиA | binary | 2 | | 0 | 0 | 0 | 0 | AAAAAAAAпирожкиA | utf8_general_ci | 2 | +---+---+---+---+-------------------------+-----------------+-------+ 8 rows in set (0.00 sec)
[22 Jul 2014 16:45]
Sveta Smirnova
Thank you for the feedback. > @Sveta: As described in the #comment above, I'm using lpad(X, 16, "A") as a proxy for collation(X). Originally this was because I didn't know the collation(X) builtin existed. Does this help clear up your confusion? I still don't understand why you compare different things and expect same result. Lets examine row #4. select lpad(_binary "пирожки", 15, cA) from foo; It returns: 1 | 1 | 0 | 0 | Aпирожки | utf8_general_ci | 2 | Corresponding row for CONCAT is: select lpad(concat(_binary "пирожки", cA), 16, "A") from foo; And it returns: | 1 | 1 | 0 | 0 | AAAAAAAAпирожкиA | utf8_general_ci | 2 | Issue here is that in the first case you call LPAD on following VAR_STRING with binary collation, number and VAR_STRING with collation utf8_general_ci: mysql> select _binary "пирожки", 15, cA from foo; Field 1: `` Catalog: `def` Database: `` Table: `` Org_table: `` Type: VAR_STRING Collation: binary (63) Length: 14 Max_length: 14 Decimals: 31 Flags: NOT_NULL BINARY Field 2: `15` Catalog: `def` Database: `` Table: `` Org_table: `` Type: LONGLONG Collation: binary (63) Length: 2 Max_length: 2 Decimals: 0 Flags: NOT_NULL BINARY NUM Field 3: `cA` Catalog: `def` Database: `test` Table: `foo` Org_table: `foo` Type: VAR_STRING Collation: utf8_general_ci (33) Length: 15 Max_length: 1 Decimals: 0 Flags: +----------------+----+------+ | | 15 | cA | +----------------+----+------+ | пирожки | 15 | A | +----------------+----+------+ 1 row in set (0.01 sec) In the case if you use CONCAT LPAD receives result of the CONCAT and it is called on VAR_STRING with collation utf8_general_ci, number and VAR_STRING with collation utf8_general_ci: mysql> select concat(_binary "пирожки", 15, cA), 16, "A" from foo; Field 1: `concat(_binary "пирожки", 15, cA)` Catalog: `def` Database: `` Table: `` Org_table: `` Type: VAR_STRING Collation: utf8_general_ci (33) Length: 63 Max_length: 17 Decimals: 31 Flags: Field 2: `16` Catalog: `def` Database: `` Table: `` Org_table: `` Type: LONGLONG Collation: binary (63) Length: 2 Max_length: 2 Decimals: 0 Flags: NOT_NULL BINARY NUM Field 3: `A` Catalog: `def` Database: `` Table: `` Org_table: `` Type: VAR_STRING Collation: utf8_general_ci (33) Length: 3 Max_length: 1 Decimals: 31 Flags: NOT_NULL +------------------------------------------+----+---+ | concat(_binary "пирожки", 15, cA) | 16 | A | +------------------------------------------+----+---+ | пирожки15A | 16 | A | +------------------------------------------+----+---+ 1 row in set (0.01 sec) So I don't get why do you expect different results here.
[22 Jul 2014 18:01]
Arthur O'Dwyer
@Sveta: Okay, let's forget about CONCAT for the time being, because it's confusing. Let's stick to *just* the LPAD function. mysql> select lpad(_binary "пирожки", 15, cA), collation(lpad(_binary "пирожки", 15, cA)) from foo; +----------------------------------------+---------------------------------------------------+ | lpad(_binary "пирожки", 15, cA) | collation(lpad(_binary "пирожки", 15, cA)) | +----------------------------------------+---------------------------------------------------+ | Aпирожки | utf8_general_ci | +----------------------------------------+---------------------------------------------------+ 1 row in set (0.00 sec) mysql> select lpad(_binary "пирожки", 15, 'A'), collation(lpad(_binary "пирожки", 15, 'A')) from foo; +-----------------------------------------+----------------------------------------------------+ | lpad(_binary "пирожки", 15, 'A') | collation(lpad(_binary "пирожки", 15, 'A')) | +-----------------------------------------+----------------------------------------------------+ | Aпирожки | binary | +-----------------------------------------+----------------------------------------------------+ Here 'A' and cA have the same type and collation, and so the customer naturally expects that the result of the LPAD will be the same in both cases. And it is, **in value.** We can see here that the LPAD was performed UTF8-wise, rather than byte-wise. (Do you understand what I mean by those terms? I can back up and explain those if you want.) **However,** as you can see, MySQL has secretly decided that one of those UTF8-wise LPAD results ought to get the binary (bytewise) collation from now on, whereas the other one remains UTF8. This is unexpected behavior, impossible to explain to the customer. We can return to CONCAT later, but for now let's stick to only LPAD.
[22 Jul 2014 19:03]
Sveta Smirnova
Arthur, I agree that behavior which you provide in last comment is buggy, but before I mark this report as "Verified" lets finish with CONCAT.
[22 Jul 2014 19:18]
Arthur O'Dwyer
Here's one of the self-inconsistent cases from CONCAT. However, the main issue here is that LPAD is inconsistent with CONCAT. If you're planning to accept that as a bug, you should probably just accept it now, and we can always come back to CONCAT later. mysql> select collation(concat(cpiroshk, _binary "A")) from foo; +------------------------------------------+ | collation(concat(cpiroshk, _binary "A")) | +------------------------------------------+ | utf8_general_ci | +------------------------------------------+ 1 row in set (0.00 sec) mysql> select collation(concat(cpiroshk, bA)) from foo; +---------------------------------+ | collation(concat(cpiroshk, bA)) | +---------------------------------+ | binary | +---------------------------------+
[22 Jul 2014 19:40]
Sveta Smirnova
Thank you for the feedback. Verified as described.