MySQL Bugs: #80969: After exceeding max undo space size the purge will not work(default 1G)

Bug #80969	After exceeding max undo space size the purge will not work(default 1G)
Submitted:	6 Apr 2016 7:58	Modified:	18 Jun 2017 9:28
Reporter:	Shahriyar Rzayev	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S1 (Critical)
Version:	5.7.11, 5.7.12	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
Reading documentation:

https://dev.mysql.com/doc/refman/5.7/en/truncate-undo-tablespace.html

"Undo tablespaces that exceed the innodb_max_undo_log_size setting are marked for truncation. Selection of an undo tablespace for truncation is performed in a round-robin fashion to avoid truncating the same undo tablespace each time.

Rollback segments residing in the selected undo tablespace are made inactive so that they are not allocated to new transactions. Existing transactions that are currently using rollback segments are allowed to complete"

Tested with following configuration on CentOS 7

#UNDO Log Truncation
innodb_undo_tablespaces=2
innodb_undo_log_truncate=1
innodb_undo_logs=128

With sysbench run:

sysbench --test=/usr/share/doc/sysbench/tests/db/update_index.lua   --report-interval=10 --oltp-table-size=1000000 --max-time=0 --oltp-read-only=off --max-requests=0 --num-threads=1000 --db-driver=mysql --mysql-password=Baku12345# --mysql-db=dbtest --mysql-user=root run

The result:

[root@mysql-57 mysql]# ls -lthr | grep undo
-rw-r-----. 1 mysql mysql 1,4G апр  5 06:38 undo001
-rw-r-----. 1 mysql mysql 3,5G апр  5 06:47 undo002

[root@mysql-57 mysql]# ls -lthr | grep undo
-rw-r-----. 1 mysql mysql 1,4G апр  5 06:38 undo001
-rw-r-----. 1 mysql mysql 3,6G апр  5 06:49 undo002

[root@mysql-57 mysql]# ls -lthr | grep undo
-rw-r-----. 1 mysql mysql 1,4G апр  5 06:56 undo001
-rw-r-----. 1 mysql mysql 3,8G апр  5 06:56 undo002

And purge_invoked is not increasing:

mysql> select name,count,status from innodb_metrics where name like "%purge%";
+-----------------------------------+---------+---------+
| name                              | count   | status  |
+-----------------------------------+---------+---------+
| purge_del_mark_records            |     103 | enabled |
| purge_upd_exist_or_extern_records | 1120188 | enabled |
| purge_invoked                     |    3734 | enabled |
| purge_undo_log_pages              | 1120200 | enabled |
| purge_dml_delay_usec              |       0 | enabled |
| purge_stop_count                  |       0 | enabled |
| purge_resume_count                |       0 | enabled |
| innodb_master_purge_usec          |       0 | enabled |
+-----------------------------------+---------+---------+
8 rows in set (0,00 sec)

mysql> select name,count,status from innodb_metrics where name like "%purge%";
+-----------------------------------+---------+---------+
| name                              | count   | status  |
+-----------------------------------+---------+---------+
| purge_del_mark_records            |     103 | enabled |
| purge_upd_exist_or_extern_records | 1134935 | enabled |
| purge_invoked                     |    3784 | enabled |
| purge_undo_log_pages              | 1135200 | enabled |
| purge_dml_delay_usec              |       0 | enabled |
| purge_stop_count                  |       0 | enabled |
| purge_resume_count                |       0 | enabled |
| innodb_master_purge_usec          |       0 | enabled |
+-----------------------------------+---------+---------+

How to repeat:
See description

Suggested fix:
If it is a known issue/limitation for some reason please provide more info.

Update To Thread.
BTW we assume that 10min wait is enough, open another session while running sysbench and run following:

begin;
select * from sbtest1 limit 1;
select sleep(600);
commit;

Please let me know if any other steps needed to reproduce.

Hello Shahriyar,

Thank you for the report.
Observed this with 5.7.12 build.

Thanks,
Umesh

-- 5.7.12

rm -rf 80969
bin/mysqld --initialize-insecure --basedir=/export/umesh/server/binaries/mysql-advanced-5.7.12 --datadir=/export/umesh/server/binaries/mysql-advanced-5.7.12/80969 --innodb_undo_tablespaces=2 --innodb_undo_log_truncate=1 --innodb_undo_logs=128 -v
bin/mysqld --innodb_undo_tablespaces=2 --innodb_undo_log_truncate=1 --innodb_undo_logs=128 --basedir=/export/umesh/server/binaries/mysql-advanced-5.7.12 --datadir=/export/umesh/server/binaries/mysql-advanced-5.7.12/80969 --core-file --socket=/tmp/mysql_ushastry.sock  --port=15000 --log-error=/export/umesh/server/binaries/mysql-advanced-5.7.12/80969/log.err 2>&1 &

sysbench/sysbench --test=sysbench/tests/db/update_index.lua   --report-interval=10 --oltp-table-size=1000000 --max-time=0 --oltp-read-only=off --max-requests=0 --num-threads=1000 --db-driver=mysql mysql-socket=/tmp/mysql_ushastry.sock --mysql-db=test --mysql-user=root prepare

sysbench/sysbench --test=sysbench/tests/db/update_index.lua   --report-interval=10 --oltp-table-size=1000000 --max-time=0 --oltp-read-only=off --max-requests=0 --num-threads=1000 --db-driver=mysql mysql-socket=/tmp/mysql_ushastry.sock --mysql-db=test --mysql-user=root run

mysql> SELECT @@innodb_max_undo_log_size;
+----------------------------+
| @@innodb_max_undo_log_size |
+----------------------------+
|                 1073741824 |
+----------------------------+
1 row in set (0.00 sec)

mysql> select @@innodb_undo_log_truncate;
+----------------------------+
| @@innodb_undo_log_truncate |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

mysql>

-- file system

[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12/80969: ls -lthr|grep undo
-rw-r----- 1 umshastr common 216M Apr 13 07:37 undo002
-rw-r----- 1 umshastr common 224M Apr 13 07:37 undo001

-- after 15-25 minutes

[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12/80969:  ls -lthr | grep undo
-rw-r----- 1 umshastr common 2.3G Apr 13 08:31 undo001
-rw-r----- 1 umshastr common 5.0G Apr 13 08:32 undo002

-- 

[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12/80969:  ls -lthr | grep undo
-rw-r----- 1 umshastr common 2.3G Apr 13 08:37 undo001
-rw-r----- 1 umshastr common 5.5G Apr 13 08:37 undo002

--

[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12/80969:  ls -lthr | grep undo
-rw-r----- 1 umshastr common 2.3G Apr 13 09:09 undo001
-rw-r----- 1 umshastr common 8.9G Apr 13 09:10 undo002

--

[umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.7.12/80969:  ls -lthr | grep undo
-rw-r----- 1 umshastr common 2.3G Apr 13 09:13 undo001
-rw-r----- 1 umshastr common 9.3G Apr 13 09:14 undo002

Seems to be known issue, pls see my colleague Shane's FR on this https://bugs.mysql.com/bug.php?id=54455

Hi Umesh,

This is a different issue which is observed here which ONLY happens when external undo space enabled with purging (I did not test it without purging)

What you can observe int this case is even when long running transactions have committed  purge still does not restart while the side load (sysbench) is continue running.

Also only one of undo slot files starts to grow at this point which makes me thing what it is "disabled" and all undo load goes to the second while while it for some reason is never drained to be reset

Hello Peter,

Thank you for your observation and valuable feedback.

Regards,
Umesh

Steps for InnoDB to truncate the undo tablespace:

1) Mark UNDO tablespace for truncate if the size exceeds
"innodb_max_undo_log_size".

2) It makes all the rollback segments present in the marked undo tablespace
as "Inactive". It doesn't allow to allocate any of the rollback
segments for new transaction.

3) Truncation will scan over each rollback segment in the marked undo
tablespace and ensure it doesn't hold any active undo records.
Purge system will continue to free rollback segments that are no
longer needed.

4) Once the purge frees the rollback segements resides inside within marked
undo tablespace. Actual truncate operation is carried out on the marked undo tablespace.

In this case, purge thread lags. Marked undo tablespace holds active undo
records. By increasing the purge threads, we can reduce the active undo
records in the marked undo tablespace.

Bug page also claims that purge_invoked is not increasing. I would like to
give the following explanation about purge innodb metrics.

purge_del_mark_records - Number of delete marked records purged.

purge_upd_exist_or_extern_records - Number of purges on updates of existing
records and updates on delete marked
record with externally stored field

purge_invoked - Number of times purge invoked. Each time, it process
300(default value for innodb_purge_batch_size) undo pages.
In the given scenario, it processed 300*50 undo log pages.

purge_undo_log_pages- How many undo log pages processed by purge threads. (It
should be equal to purge_invoked * innodb_purge_batch_size)

https://dev.mysql.com/doc/refman/5.7/en/truncate-undo-tablespace.html also
mentioned about purge should process all active undo records in the marked undo tablespace then only actual truncate process starts.

4. After all rollback segments in the undo tablespace are freed, the truncate
operation runs and the undo tablespace is truncated to its initial size. The
initial size of an undo tablespace file is 10MB.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".