Description:
While an ALTER performing partitioning operations is in progress, the undo logs are growing significantly until the query finishes.
This happens even when there is no other workload on the system.
Interestingly, I don't see the same behavior for other ALTERs, including those using COPY algorithms.
I also cannot reproduce this problem on MySQL 8.0.18.
I don't see any mention in the documentation that the undo log may be used for any DDLs.
An example overhead for simple ~500MB table may be around 200MB growth in rollback segment size. For big tables, this may lead to significant issues with ibdata1 file size.
The undo size can be monitored with innodb metrics, like below.
Short before the ALTER is done, it looks like this:
mysql [localhost:5733] {msandbox} ((none)) > select name, subsystem, count, comment from information_schema.innodb_metrics where name like '%undo%' or name like '%rseg%';
+------------------------------+----------------+-------+-----------------------------------------------+
| name | subsystem | count | comment |
+------------------------------+----------------+-------+-----------------------------------------------+
| buffer_page_read_undo_log | buffer_page_io | 10 | Number of Undo Log Pages read |
| buffer_page_written_undo_log | buffer_page_io | 28643 | Number of Undo Log Pages written |
| trx_rseg_history_len | transaction | 58 | Length of the TRX_RSEG_HISTORY list |
| trx_undo_slots_used | transaction | 48 | Number of undo slots used |
| trx_undo_slots_cached | transaction | 47 | Number of undo slots cached |
| trx_rseg_current_size | transaction | 14258 | Current rollback segment size in pages |
| purge_undo_log_pages | purge | 58 | Number of undo log pages handled by the purge |
+------------------------------+----------------+-------+-----------------------------------------------+
7 rows in set (0.00 sec)
mysql [localhost:5733] {msandbox} ((none)) > SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM performance_schema.events_stages_current;
+-----------------------------+----------------+----------------+
| EVENT_NAME | WORK_COMPLETED | WORK_ESTIMATED |
+-----------------------------+----------------+----------------+
| stage/sql/copy to tmp table | 15676237 | 16355398 |
+-----------------------------+----------------+----------------+
1 row in set (0.00 sec)
And right after it's done:
mysql [localhost:5733] {msandbox} ((none)) > select name, subsystem, count, comment from information_schema.innodb_metrics where name like '%undo%' or name like '%rseg%';
+------------------------------+----------------+-------+-----------------------------------------------+
| name | subsystem | count | comment |
+------------------------------+----------------+-------+-----------------------------------------------+
| buffer_page_read_undo_log | buffer_page_io | 10 | Number of Undo Log Pages read |
| buffer_page_written_undo_log | buffer_page_io | 30254 | Number of Undo Log Pages written |
| trx_rseg_history_len | transaction | 75 | Length of the TRX_RSEG_HISTORY list |
| trx_undo_slots_used | transaction | 76 | Number of undo slots used |
| trx_undo_slots_cached | transaction | 76 | Number of undo slots cached |
| trx_rseg_current_size | transaction | 242 | Current rollback segment size in pages |
| purge_undo_log_pages | purge | 75 | Number of undo log pages handled by the purge |
+------------------------------+----------------+-------+-----------------------------------------------+
7 rows in set (0.00 sec)
How to repeat:
$ dbdeployer deploy single -c "innodb_monitor_enable=module_dml,module_trx,module_purge,module_buffer_page,module_undo" 5.7.28
Database installed in $HOME/sandboxes/msb_5_7_28
run 'dbdeployer usage single' for basic instructions'
. sandbox server started
$ msb_5_7_28/use test
mysql [localhost:8019] {msandbox} (test) > create table t1 (id int auto_increment primary key, a varchar(20), b int);
Query OK, 0 rows affected (0.06 sec)
mysql [localhost:8019] {msandbox} (test) > insert into t1(a) values ("foobar"),("foobar"),("foo"),("foo");
Query OK, 4 rows affected (0.02 sec)
Records: 4 Duplicates: 0 Warnings: 0
(...)
mysql [localhost:8019] {msandbox} (test) > insert into t1(a) select a from t1;
Query OK, 8388608 rows affected (1 min 19.76 sec)
Records: 8388608 Duplicates: 0 Warnings: 0
mysql [localhost:8019] {msandbox} (test) > ALTER TABLE t1 PARTITION BY HASH(id) partitions 4;
Query OK, 16777216 rows affected (1 min 26.78 sec)
Records: 16777216 Duplicates: 0 Warnings: 0
While ALTER is ongoing, monitor the rseg size using information_schema.innodb_metrics as shown above.
Suggested fix:
If that's unexpected behavior - fix it to avoid undo size problem.
If that's expected, please update the manual to explain it, also why it doesn't happen in version 8.0.