Bug #99118 ARM CRC32 intrinsic call to accelerate table-checksum (not crc32c but crc32)
Submitted: 31 Mar 2020 5:20 Modified: 4 Nov 2020 15:43
Reporter: Krunal Bauskar
Status: Closed
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.19 OS:Linux
Assigned to: CPU Architecture:ARM
Tags: Contribution, contributions

[31 Mar 2020 5:20] Krunal Bauskar
Use ARM CRC32 intrinsic calls to calculate the checksum.
There are 2 types of checksum: crc32 and crc32c.
- crc32 is the traditional crc32 found in most zip utilities.
- crc32c (crc32 Castagnoli) uses a different polynomial, and new-generation
  platforms can compute a full 32-bit crc32c in 3 cycles.
- MySQL uses both of them. crc32 is used for calculating the table and binlog
  checksums, and crc32c is used by InnoDB for the page checksum.
- ARM ACLE added intrinsic support for both crc32 variants.
  (As per my reading I haven't seen the support for crc32 on x86-sse.
   x86-sse has crc32c support).
- Currently, MySQL calculates crc32 using zlib (a software-based approach).
  This patch helps optimize the use of crc32 (on ARM) by leveraging the
  corresponding ARM ACLE crc32 intrinsics (crc32[b|h|w|d]).
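
To illustrate the points above, here is a minimal sketch, not the code from the attached patch. It assumes a little-endian target and a compiler that defines __ARM_FEATURE_CRC32 (for example GCC or Clang with -march=armv8-a+crc). The names crc_sw, example_crc32 and example_self_check are made up for illustration. On builds without the CRC extension it falls back to a slow bit-at-a-time loop, which also shows that crc32 and crc32c differ only in the polynomial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h> /* __crc32b, __crc32h, __crc32w, __crc32d */
#endif

/* Bit-reflected polynomials of the two variants. */
#define POLY_CRC32 0xEDB88320u  /* traditional crc32 (zip, zlib) */
#define POLY_CRC32C 0x82F63B78u /* crc32c (Castagnoli) */

/* Portable bit-at-a-time reference. Slow; for illustration only. */
static uint32_t crc_sw(uint32_t poly, const unsigned char *buf, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++) {
    crc ^= buf[i];
    for (int k = 0; k < 8; k++) {
      /* If the low bit is set, shift and fold in the polynomial. */
      crc = (crc >> 1) ^ (poly & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical helper: crc32 (not crc32c) of a buffer. */
static uint32_t example_crc32(const unsigned char *buf, size_t len) {
#if defined(__ARM_FEATURE_CRC32)
  uint32_t crc = 0xFFFFFFFFu;
  while (len >= 8) {
    uint64_t v;
    memcpy(&v, buf, sizeof v); /* safe unaligned load; assumes little-endian */
    crc = __crc32d(crc, v);    /* consume 8 bytes per intrinsic call */
    buf += 8;
    len -= 8;
  }
  while (len > 0) {
    crc = __crc32b(crc, *buf++); /* remaining tail, one byte at a time */
    len--;
  }
  return crc ^ 0xFFFFFFFFu;
#else
  return crc_sw(POLY_CRC32, buf, len);
#endif
}

/* Checks both variants against the standard "123456789" check values. */
static void example_self_check(void) {
  const unsigned char *msg = (const unsigned char *)"123456789";
  assert(crc_sw(POLY_CRC32, msg, 9) == 0xCBF43926u);
  assert(crc_sw(POLY_CRC32C, msg, 9) == 0xE3069283u);
  assert(example_crc32(msg, 9) == 0xCBF43926u);
}
```

The hardware path and the software path should return the same value for any input, so the fallback can double as a cross-check when testing on ARM.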


This approach helps reduce the checksum time for a table.
Our local testing showed a gain of more than 50% (reduction in time).

table-size    table-checksum timing (in sec)
(in mil)      w/o patch    w/ patch    %reduction
0.1           0.15         0.06        60
1             1.5          0.69        54
5             7.55         3.55        53
10            14.94        6.92        54
50            75.25        34.5        54
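
The %reduction column follows from the two timing columns as (w/o patch - w/ patch) / w/o patch. A small sketch that recomputes it from the figures in the table; the function names are made up for illustration:

```c
#include <assert.h>

/* Percentage by which the checksum time dropped. */
static double percent_reduction(double without_patch, double with_patch) {
  return 100.0 * (without_patch - with_patch) / without_patch;
}

/* Round to the nearest whole percent, as the table does. */
static int rounded_percent(double without_patch, double with_patch) {
  return (int)(percent_reduction(without_patch, with_patch) + 0.5);
}

/* Recomputes every row of the table above. */
static void verify_reported_column(void) {
  assert(rounded_percent(0.15, 0.06) == 60);
  assert(rounded_percent(1.5, 0.69) == 54);
  assert(rounded_percent(7.55, 3.55) == 53);
  assert(rounded_percent(14.94, 6.92) == 54);
  assert(rounded_percent(75.25, 34.5) == 54);
}
```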


The patch is also available here for easy access.
(physical copy attached below with OCA confirmation)



Is testing done?

* checksum performance testing: gain observed.
* general performance testing (on ARM): no regression observed.
* mtr testing on ARM and x86: done (the patch doesn't change the approach for x86).

How to repeat:
* The patch is created against the mysql-8.0.19 tag.
* Apply the patch and load any table.
* Run a table checksum (CHECKSUM TABLE <table-name>).
* Try the same without the patch.
* You will observe a significant difference in checksum timing.
  [In our local testing on ARM, timing was reduced by more than 50%.]

Suggested fix:
* Apply the patch to use ARM crc32 intrinsic.
[31 Mar 2020 5:20] Krunal Bauskar
use arm crc32 intrinsic call to calculate table checksum

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: crc32.checksum.patch (text/x-patch), 7.03 KiB.

[31 Mar 2020 6:29] MySQL Verification Team
Hello Krunal,

Thank you for the report and contribution.

[4 Nov 2020 15:43] Paul DuBois
Posted by developer:
Fixed in 8.0.23.

CRC calculations for binlog checksums are faster on ARM platforms.

Thanks to Krunal Suresh Bauskar for the contribution.