Bug #68476 | Suboptimal code in my_strnxfrm_simple() | ||
---|---|---|---|
Submitted: | 24 Feb 2013 5:41 | Modified: | 3 Dec 2014 15:44 |
Reporter: | Alexey Kopytov | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 5.1, 5.5, 5.6 | OS: | Any |
Assigned to: | Tor Didriksen | CPU Architecture: | Any |
[24 Feb 2013 5:41]
Alexey Kopytov
[26 Feb 2013 16:30]
MySQL Verification Team
Thank you for the bug report.
[15 Apr 2013 8:22]
Tor Didriksen
It depends on the input data. For long strings (100 bytes, like sysbench?) the special handling of dst == src is actually much faster than the other branch. For short strings the extra if/else overhead becomes noticeable.
[15 Apr 2013 8:55]
Alexey Kopytov
Tor, Let's rewrite that code to an equivalent form for clarity: if (dest != src) { const uchar *end; for ( end=src+len; src < end ; dest++, src++ ) *dest= map[*src]; } else { const uchar *end; for ( end=src+len; src < end ; src++) *src= map[*src]; dest= src; } With this in mind, is there really any difference between the branches?
[15 Apr 2013 9:05]
Tor Didriksen
I agree that the code can be re-written, I'm just pointing out that this is much faster for long strings: const uchar *end; for (end= dst + frmlen; dst < end; ++dst) *dst= map[(uchar) *dst]; (and I'm assuming that this is actually the code you are executing running sysbench) For short strings, something like this is better on my machine uint ix; for (ix= 0; ix < frmlen; ++ix) dst[ix] = map[src[ix]];
[25 Nov 2014 10:21]
Laurynas Biveinis
Bug 68476 patch for 5.7 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: bug68476.patch (application/octet-stream, text), 1.14 KiB.
[25 Nov 2014 10:22]
Laurynas Biveinis
Patch author Alexey Kopytov
[27 Nov 2014 17:09]
Tor Didriksen
Posted by developer: Benchmark results (see unit test) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcDst/0 (147 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcDst/1 (239 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcDst/2 (1173 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcDst/3 (9310 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcDst/0 (150 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcDst/1 (231 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcDst/2 (753 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcDst/3 (5908 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcDst/0 (145 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcDst/1 (233 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcDst/2 (1165 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcDst/3 (9420 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcDst/0 (151 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcDst/1 (220 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcDst/2 (767 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcDst/3 (5948 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcSrc/0 (146 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcSrc/1 (225 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcSrc/2 (992 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalSrcSrc/3 (7384 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcSrc/0 (146 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcSrc/1 (221 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcSrc/2 (732 ms) [ OK ] Strnxfrm/StrnxfrmTest.OriginalUnrolledSrcSrc/3 (6043 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcSrc/0 (145 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcSrc/1 (247 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcSrc/2 (1170 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedSrcSrc/3 (9286 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcSrc/0 (151 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcSrc/1 (235 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcSrc/2 (757 ms) [ OK ] Strnxfrm/StrnxfrmTest.ModifiedUnrolledSrcSrc/3 (5900 ms)
[1 Dec 2014 10:06]
Tor Didriksen
Bug#16403708 SUBOPTIMAL CODE IN MY_STRNXFRM_SIMPLE() Based on a contribution from Alexey Kopytov Add benchmarking tests showing that my_strnxfrm_simple is not suboptimal compared to the proposed patch. However: if we take the proposed patch, and do some loop unrolling, we get improved results for both (src == dst) and (src != dst) transformations.
[3 Dec 2014 15:44]
Paul DuBois
Noted in 5.7.6 changelog. The code in my_strnxfrm_simple() was suboptimal and was improved. Thanks to Alexey Kopytov for the patch.