Bug #74776 InnoDB checksums (new or crc32) use too much CPU on POWER8
Submitted: 11 Nov 2014 5:08
Modified: 16 Dec 2015 7:15
Reporter: Stewart Smith
Status: Verified
Category: MySQL Server: InnoDB storage engine
Severity: S5 (Performance)
Version: 5.7.7
OS: Linux
Assigned to:
CPU Architecture: Any

[11 Nov 2014 5:08] Stewart Smith
Description:
When running sysbench r/w benchmarks on POWER8 with a working set larger than the buffer pool (but with all ibdata in the page cache), the InnoDB checksum calculation was using 35% of CPU. When I switched to CRC32 checksums, it went down to around 17% of total CPU.

Clearly, calculating checksums is a big overhead on POWER8.

For Intel CPUs, the CRC32 instruction is used, which is likely to be optimal.
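
For reference, the Intel CRC32 instruction (SSE4.2) computes CRC-32C, i.e. the Castagnoli polynomial 0x1EDC6F41. Below is a minimal, unoptimized sketch of the same checksum in portable C, showing the work that has to be done in software on a CPU without such an instruction. The function name `crc32c_sw` is hypothetical and this is not InnoDB's actual code, which I'd expect to use lookup tables rather than a bit-at-a-time loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the Castagnoli polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Bit-at-a-time CRC-32C: slow, but portable to any CPU.
 * Hypothetical helper for illustration only. */
uint32_t crc32c_sw(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;          /* conventional initial value */

    assert(buf != NULL || len == 0);

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;            /* conventional final XOR */
}
```

With this sketch, `crc32c_sw("123456789", 9)` yields 0xE3069283, the standard check value for CRC-32C. The inner loop runs eight iterations per input byte, which is why a hardware instruction or a vectorized implementation makes such a difference.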

The innodb checksum algorithm appears to be completely undocumented and is a bit incomprehensible to me, so I've abandoned efforts to improve it for now.

How to repeat:
Run a sysbench r/w benchmark on POWER8 with a working set larger than the buffer pool.

Suggested fix:
I will provide code for a faster CRC32 implementation on POWER8.

This bug is mainly a placeholder for that code.

That said, I would *LOVE* any explanation of the innodb checksum algorithm that might help in making it faster.
[18 Nov 2014 11:50] MySQL Verification Team
Hello Stewart,

Thank you for the report.

Thanks,
Umesh
[10 Apr 2015 7:57] Stewart Smith
Still applies to 5.7.7

I have a patch that fixes it. The new code is only compiled when building on POWER, so it should be safe to include.

Just polishing it up at the moment.

It gives a ~30% improvement in sysbench read/write workloads.
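
To illustrate what "conditional on building on POWER" can look like, here is a minimal sketch of compile-time selection using the compiler's predefined `__powerpc64__` macro. All function and macro names below are hypothetical, and the POWER8 routine is only a stand-in that forwards to the portable code; the real optimized implementation is in the patch and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t crc32c_generic(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

#if defined(__powerpc64__)
/* Stand-in for a POWER8-specific routine (hypothetical name). It forwards
 * to the portable version so that this sketch still builds and runs. */
static uint32_t crc32c_power8(const void *buf, size_t len)
{
    return crc32c_generic(buf, len);
}
#define CRC32C_IMPL      crc32c_power8
#define CRC32C_IMPL_NAME "POWER8"
#else
#define CRC32C_IMPL      crc32c_generic
#define CRC32C_IMPL_NAME "software"
#endif

/* Callers use one entry point; the build picks the implementation. */
uint32_t checksum_page(const void *page, size_t len)
{
    return CRC32C_IMPL(page, len);
}

/* Name of the implementation chosen at compile time. */
const char *checksum_impl_name(void)
{
    return CRC32C_IMPL_NAME;
}
```

Because the selection happens in the preprocessor, builds for other architectures never see the POWER-specific code, which is the property that makes such a patch low-risk for them.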
[15 Dec 2015 4:42] Daniel Black
Add CRC32 code for POWER8

Attachment: bug-74776.patch (text/x-patch), 58.58 KiB.

[15 Dec 2015 4:44] Daniel Black
unittest/gunit/innodb/ut0crc32-t

Before:

    1..2
    Using software crc32 implementation, CPU is little-endian
    ok 1
    Using software crc32 implementation, CPU is little-endian
        normal CRC32: real    0.148006 sec
        normal CRC32: user    0.148000 sec
        normal CRC32: sys     0.000000 sec
    big endian CRC32: real    0.144293 sec
    big endian CRC32: user    0.144000 sec
    big endian CRC32: sys     0.000000 sec
    ok 2

After:

    1..2

    Using POWER8 crc32 implementation, CPU is little-endian
    ok 1
    Using POWER8 crc32 implementation, CPU is little-endian
        normal CRC32: real    0.008097 sec
        normal CRC32: user    0.008000 sec
        normal CRC32: sys     0.000000 sec
    big endian CRC32: real    0.147043 sec
    big endian CRC32: user    0.144000 sec
    big endian CRC32: sys     0.000000 sec
    ok 2

Includes work of other IBM employees. Submitted under the OCA.
[16 Dec 2015 0:18] Daniel Black
Fix for compiling on non-POWER platforms

Attachment: portability.patch (text/x-patch), 1.81 KiB.

[16 Dec 2015 7:07] Daniel Black
Note that binlog checksums use the crc32 implementation from zlib. Would it be better to move the crc32 implementations into mysys so that both InnoDB and the binlog use the same codebase?
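
One thing a shared implementation would have to account for: zlib's crc32() uses the IEEE 802.3 polynomial, while InnoDB's crc32 checksum is, as far as I know, CRC-32C (Castagnoli), so the two are not interchangeable. A minimal sketch with the polynomial as a parameter shows the difference; `crc_bitwise` and the macro names are hypothetical and not taken from zlib or mysys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected polynomials. */
#define POLY_CRC32_IEEE 0xEDB88320u  /* reflected 0x04C11DB7, the zlib crc32() polynomial */
#define POLY_CRC32C     0x82F63B78u  /* reflected 0x1EDC6F41, Castagnoli */

/* Bit-at-a-time reflected CRC-32 for an arbitrary polynomial
 * (hypothetical helper, written for clarity rather than speed). */
uint32_t crc_bitwise(uint32_t poly, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (poly & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

For the input "123456789" this gives 0xCBF43926 with `POLY_CRC32_IEEE` and 0xE3069283 with `POLY_CRC32C`. A common codebase would therefore need either two sets of tables/constants or a polynomial parameter, rather than a single routine.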
[16 Dec 2015 7:10] Alexey Kopytov
See bug #79155 and bug #79325.
[16 Dec 2015 7:15] Stewart Smith
Note also that if the InnoDB logs used CRC32 instead of the custom checksum, recovery would be a bunch faster too (especially on POWER).
[22 Dec 2015 5:19] Daniel Black
https://github.com/mysql/mysql-server/pull/41 replaces the previous patches and hopefully resolves all outstanding crc32 optimization bugs.