Bug #114468 | Inconsistent table performance: fresh data import vs after optimization | |
---|---|---|---
Submitted: | 24 Mar 21:51 | Modified: | 25 Mar 16:44
Reporter: | Przemyslaw Malkowski | Email Updates: |
Status: | Verified | Impact on me: |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance)
Version: | 8.3.0, 8.0.36 | OS: | Any
Assigned to: | | CPU Architecture: | Any
[24 Mar 21:51]
Przemyslaw Malkowski
[25 Mar 12:32]
MySQL Verification Team
Hello Przemyslaw, Thank you for the report and feedback. I'm currently analyzing the issue; it would be very helpful if you could share the innodb_buffer_pool_size value, if it is not the default. Thank you. Sincerely, Umesh
[25 Mar 13:49]
Przemyslaw Malkowski
Hello Umesh, the buffer pool was at the default 128M, so much smaller than the table size. The issue is more visible when disk reads are involved :)
[25 Mar 14:10]
MySQL Verification Team
Hello Przemyslaw, Thank you for the details. I'm able to reproduce the issue with the provided steps. Sincerely, Umesh
[25 Mar 15:38]
MySQL Verification Team
Test results - 8.0.36
Attachment: 114468_8.0.36.results (application/octet-stream, text), 23.94 KiB.
[25 Mar 15:45]
MySQL Verification Team
Hello Przemyslaw, One more question please. Are you using an SSD or an HDD? Thank you. Sorry for asking too much :) Sincerely, Umesh
[25 Mar 16:44]
Przemyslaw Malkowski
No worries :) I tested on both an NVMe flash disk and a RAID array made of HDDs (spinning drives). In both cases, the impact was similar.
[9 Apr 4:22]
huahua xu
Hi all,

By analyzing the test result data, I think that the performance difference mainly comes from linear readahead. In addition, InnoDB does not specify the expected page number when allocating a new page while rebuilding the B-tree index for bulk load, which causes an interesting phenomenon.

Add special debugging code:

```
-dberr_t Page_load::init() noexcept {
+dberr_t Page_load::init() noexcept {
   page_t *new_page;
   page_no_t new_page_no;
   buf_block_t *new_block;
@@ -393,6 +393,8 @@ dberr_t Page_load::init() noexcept {
   m_cur_rec = page_get_infimum_rec(new_page);
   ut_ad(m_is_comp == page_is_comp(new_page));
   m_free_space = page_get_free_space_of_empty(m_is_comp);
+
+  printf("level: %ld, page_no: %u \n", m_level, m_page_no);
```

Output:

```
...
level: 0, page_no: 116
level: 0, page_no: 128
...
level: 0, page_no: 172
level: 0, page_no: 192
...
level: 0, page_no: 64440
level: 0, page_no: 73792
...
level: 0, page_no: 160316
level: 0, page_no: 160832
...
```

This is very bad: pages end up being accessed out of order, which makes linear readahead ineffective. The code below causes the discontinuity in page numbers, and is also the reason for the increase in the physical size of the tablespace (~12.5%):

```
page_no_t fseg_alloc_page_no(fil_space_t *space, const page_size_t &page_size,
                             fseg_inode_t *seg_inode, page_no_t hint,
                             byte direction,
                             mtr_t *mtr IF_DEBUG(, bool has_done_reservation)) {
  ...
  if (direction != FSP_NO_DIR &&
      reserved - used < reserved * (fseg_reserve_pct / 100) &&
      used >= FSEG_FRAG_LIMIT) {
    ret_descr = fseg_alloc_free_extent(seg_inode, space_id, page_size, mtr);

    if (ret_descr) {
      /* 3. We take any free extent (which was already assigned above
      ===============================================================
      in the if-condition to ret_descr) and take the lowest or
      ========================================================
      highest page in it, depending on the direction
      ==============================================*/
      ret_page = xdes_get_offset(ret_descr);

      if (direction == FSP_DOWN) {
        ret_page += FSP_EXTENT_SIZE - 1;
      } else if (xdes_get_state(ret_descr, mtr) == XDES_FSEG_FRAG) {
        ret_page += xdes_find_bit(ret_descr, XDES_FREE_BIT, true, 0, mtr);
      }

      ut_ad(!has_done_reservation || ret_page != FIL_NULL);
      /*-----------------------------------------------------------*/
    }
  }
  ...
}
```
[9 Apr 6:25]
huahua xu
The patch specifies the expected page number for each newly allocated page while rebuilding the B-tree index for bulk load.
Attachment: hint_page_no_for_page_load.patch (application/octet-stream, text), 2.83 KiB.
[10 Apr 8:34]
MySQL Verification Team
Hello huahua xu, Thank you for your contribution, please ensure to re-send the patch via "contribution" tab. Otherwise we would not be able to accept it. Thank you! Sincerely, Umesh