| Bug #95960 | The logical operators &, |, and ^ do not ignore newlines in TEXT | ||
|---|---|---|---|
| Submitted: | 24 Jun 2019 18:55 | Modified: | 22 Jul 2019 16:13 |
| Reporter: | Manuel Rigger | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 8.0.16, 5.7.26, 5.6.44 | OS: | Ubuntu |
| Assigned to: | CPU Architecture: | x86 | |
[24 Jun 2019 18:55]
Manuel Rigger
[25 Jun 2019 5:32]
MySQL Verification Team
Hello Manuel Rigger, Thank you for the report. regards, Umesh
[21 Jul 2019 15:36]
Oleksandr Peresypkin
A patch for version 8.0 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug-95960-v8.0.patch.txt (text/plain), 1.30 KiB.
[21 Jul 2019 15:36]
Oleksandr Peresypkin
A patch for version 5.7 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug-95960-v5.7.patch.txt (text/plain), 1.23 KiB.
[21 Jul 2019 15:37]
Oleksandr Peresypkin
A patch for version 5.6 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug-95960-v5.6.patch.txt (text/plain), 1.23 KiB.
[22 Jul 2019 16:13]
Manuel Rigger
Thanks a lot for fixing this!
[26 Sep 2019 12:36]
Tor Didriksen
Posted by developer:
Thanks for the suggestion and the patch. Unfortunately we have several similar functions for converting a sting to some integer type, and they do slightly different things.
Some will ignore ' \t\n\v\f\r' others will ignore only ' \t'. E.g. this should also work I presume:
SELECT ('\n1' & 1) as and_result,
('\n1' | 0) as or_result,
('\n1' ^ 1) as xor_result;
# hex( convert('\n1' using utf16)) == 0x000A0031
# hex( convert('\n1' using utf32)) == 0x0000000A00000031
SELECT (_utf16 0x000A0031 & 1) as and_result,
(_utf16 0x000A0031 | 0) as or_result,
(_utf16 0x000A0031 ^ 1) as xor_result;
SELECT (_utf32 0x0000000A00000031 & 1) as and_result,
(_utf32 0x0000000A00000031 | 0) as or_result,
(_utf32 0x0000000A00000031 ^ 1) as xor_result;
And, should we take this even further, and ignore things like unicode NO-BREAK SPACE and THREE-PER-EM SPACE etc. etc.
Anyways, we will consider making this slightly more consistent across functions and character sets.
