Bug #95960 The logical operators &, |, and ^ do not ignore newlines in TEXT
Submitted: 24 Jun 2019 18:55 Modified: 22 Jul 2019 16:13
Reporter: Manuel Rigger Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.16, 5.7.26, 5.6.44 OS:Ubuntu
Assigned to: CPU Architecture:x86

[24 Jun 2019 18:55] Manuel Rigger
Description:
The binary logical operators do not ignore newlines in TEXT values.

How to repeat:
SELECT '\n1' & 1; -- expected: 1, actual: 0
SELECT '\n1' | 0; -- expected: 1, actual: 0
SELECT '\n1' ^ 1; -- expected: 0, actual: 1

Other operators ignore the newline, for example:

SELECT '\n1' + 1; -- 2
SELECT '\n1' OR 0; -- 1
SELECT '\n1' IS TRUE; -- 1
SELECT NOT '\n1'; -- 0

The logical operators, like other operators, ignore other whitespace characters such as tabs.
[25 Jun 2019 5:32] MySQL Verification Team
Hello Manuel Rigger,

Thank you for the report.

regards,
Umesh
[21 Jul 2019 15:36] Oleksandr Peresypkin
A patch for version 8.0

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug-95960-v8.0.patch.txt (text/plain), 1.30 KiB.

[21 Jul 2019 15:36] Oleksandr Peresypkin
A patch for version 5.7

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug-95960-v5.7.patch.txt (text/plain), 1.23 KiB.

[21 Jul 2019 15:37] Oleksandr Peresypkin
A patch for version 5.6

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug-95960-v5.6.patch.txt (text/plain), 1.23 KiB.

[22 Jul 2019 16:13] Manuel Rigger
Thanks a lot for fixing this!
[26 Sep 2019 12:36] Tor Didriksen
Posted by developer:
 
Thanks for the suggestion and the patch. Unfortunately we have several similar functions for converting a sting to some integer type, and they do slightly different things.
Some will ignore ' \t\n\v\f\r' others will ignore only ' \t'. E.g. this should also work I presume:

SELECT ('\n1' & 1) as and_result,
       ('\n1' | 0) as or_result,
       ('\n1' ^ 1) as xor_result;

# hex( convert('\n1' using utf16)) == 0x000A0031
# hex( convert('\n1' using utf32)) == 0x0000000A00000031

SELECT (_utf16 0x000A0031 & 1) as and_result,
       (_utf16 0x000A0031 | 0) as or_result,
       (_utf16 0x000A0031 ^ 1) as xor_result;

SELECT (_utf32 0x0000000A00000031 & 1) as and_result,
       (_utf32 0x0000000A00000031 | 0) as or_result,
       (_utf32 0x0000000A00000031 ^ 1) as xor_result;

And, should we take this even further, and ignore things like unicode NO-BREAK SPACE and THREE-PER-EM SPACE etc. etc.
Anyways, we will consider making this slightly more consistent across functions and character sets.