Bug #88628 increased memory allocation querying IS.TABLES and FLUSH TABLES ... FOR EXPORT
Submitted: 23 Nov 2017 16:03 Modified: 10 Jan 2018 16:09
Reporter: Dirk LANGE
Status: Closed
Category: MySQL Server: Documentation    Severity: S2 (Serious)
Version: 5.7.17    OS: Solaris
Assigned to:    CPU Architecture: Any

[23 Nov 2017 16:03] Dirk LANGE
Description:
Situation:

I am running a backup scenario for MySQL tables, with data residing on ZFS.
The backup is taken while the database is online and operating.
After tables have been prepared for backup, I take a snapshot on ZFS, then send it to the backup machine.
Among the preparation steps, in the database I am
(1) using "FLUSH TABLES WITH READ LOCK" to establish a defined state in the database,
(2) querying INFORMATION_SCHEMA.TABLES for the schema and name of all tables of type InnoDB,
(3) using "FLUSH TABLES ... FOR EXPORT" to put all InnoDB tables into a transportable, and hence restorable, state.

I run (1), and then (2)+(3), in that order and in two different database connections, following the note in the FLUSH TABLES Syntax chapter:

"While any of these statements is in effect within the session, attempts to use FLUSH TABLES ... FOR EXPORT produce an error:
FLUSH TABLES ... WITH READ LOCK
FLUSH TABLES ... FOR EXPORT
LOCK TABLES ... READ
LOCK TABLES ... WRITE"

(1), (2) and (3) together form one cycle of a backup attempt.
The backup process may run up to a number of cycles, until one is successful.
A backup cycle may fail if the initial conditions are not fulfilled, or if its runtime exceeds a limit and it gets killed.
A failed cycle schedules a retry after a certain sleep time.
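To make the sequence concrete, here is a minimal sketch of one cycle across the two connections (the table names after FLUSH TABLES are hypothetical; the real statement lists every InnoDB table found in (2)):

	-- connection A: step (1), establish the defined state
	FLUSH TABLES WITH READ LOCK;

	-- connection B: step (2), collect the InnoDB tables
	SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE ENGINE = 'InnoDB';

	-- connection B: step (3), quiesce the tables returned by (2)
	FLUSH TABLES sn_1.t1, sn_1.t2 FOR EXPORT;

	-- the ZFS snapshot is taken here, outside of MySQL

	-- connection B, then connection A: release the locks
	UNLOCK TABLES;
	UNLOCK TABLES;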

System info:

* Solaris 11 on x86_64, 128 GB RAM
* MySQL 5.7.17, binary from MySQL
* [MySQL configuration settings can be supplied]

Symptom:

I run the backup once a day.
Each time, while querying INFORMATION_SCHEMA.TABLES and running "FLUSH TABLES ... FOR EXPORT",
I see a significant increase in mysqld memory consumption, in total in the range of about 0.5 to 1.5 GB.
This leads to a more or less predictable date at which the instance has to be restarted to avoid running out of resources.
I suspect a memory leak.

In the past the query on INFORMATION_SCHEMA.TABLES was done as:
	SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE ENGINE = 'InnoDB';

Hoping that a UNION of per-schema queries would have a smaller negative effect, I rewrote it as:
	SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE ENGINE = 'InnoDB' AND TABLE_SCHEMA = 'sn_1'
	UNION ALL
	SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE ENGINE = 'InnoDB' AND TABLE_SCHEMA = 'sn_2'
	...
	UNION ALL
	SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE ENGINE = 'InnoDB' AND TABLE_SCHEMA = 'sn_N';

The number of InnoDB tables is 4000+, including partitioned ones.
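For illustration only, statement (3) could be assembled from the result of (2) roughly as follows; this is just a sketch, not necessarily how my script builds it, and group_concat_max_len has to be raised, otherwise the generated statement gets truncated with 4000+ tables:

	-- hypothetical helper: generate the FLUSH ... FOR EXPORT statement from the I_S query
	SET SESSION group_concat_max_len = 1024 * 1024;
	SELECT CONCAT('FLUSH TABLES ',
	              GROUP_CONCAT(CONCAT('`', TABLE_SCHEMA, '`.`', TABLE_NAME, '`')),
	              ' FOR EXPORT')
	FROM information_schema.TABLES
	WHERE ENGINE = 'InnoDB';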

Example log:

The logged output is the virtual size (VSZ) and resident set size (RSS) of mysqld in kB.
Shown is the relevant part of an unsuccessful cycle, followed by a successful one.

hooks:
#0 ... before query (2)
#1 ... after query  (2)
#2 ... before query (3)
#3 ... after query  (3)

...
20171122_171431:#0: VSZ RSS 40455968 37679604
20171122_171916:#1: VSZ RSS 40902708 38108412		--> query finished
20171122_171916:#2: VSZ RSS 40902708 38108412
20171122_171931:#3: VSZ RSS 40902708 38108476		--> query killed regularly due to failing a runtime rule
...
20171122_172725:#0: VSZ RSS 40902432 38108232
20171122_173155:#1: VSZ RSS 41355804 38542884		--> query finished
20171122_173155:#2: VSZ RSS 41355804 38542884
20171122_173237:#3: VSZ RSS 41896532 38939620		--> query finished
...

Query (1) does not contribute to the increased memory allocation, so it is omitted in the sample above.
Query (2) from the first cycle allocates a lot of additional memory.
Query (3) from the first cycle was killed, so no further allocation.
Query (2) from the second cycle allocates a similar amount of additional memory, even though the same query had already run about 15 minutes earlier.
Query (3) from the second cycle allocates a further large amount of additional memory.

The increase in memory consumption appears to happen gradually while the statements are running.
The statements in one cycle typically run for between 5 and 10 minutes.

Note that while the backup process is running, regular user access is not prohibited, but it might be blocked.

I have used this backup technique under 5.6, too.
There I saw the same behaviour as now in 5.7.

The utilization of InnoDB has increased steadily.
Along with this, the interval between forced restarts has become noticeably shorter and is now almost down to a monthly schedule.
The increase in memory consumption has become steeper in turn.

How to repeat:
unknown, not consistently repeatable yet in the sandbox

To simulate this in a sandbox environment, I created 4000 InnoDB tables in 10 schemas, partitioned, though partitioning is probably not strictly necessary.
Second, I ran random INSERTs into all tables to simulate simple table changes.
Third, I ran FLUSH (1), the INFORMATION_SCHEMA.TABLES query (2), and FLUSH (3).
The increase does occur, but not every time.
I also saw the effect when running "SHOW TABLE STATUS" on all the schemas, or a DROP and CREATE.
Competing user activity was not present in the sandbox.
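A sketch of the kind of table used in the sandbox (schema, table and column names are placeholders; the actual layout differs):

	-- one of the 4000 sandbox tables (placeholder names and columns)
	CREATE TABLE sn_1.t0001 (
	    id  BIGINT NOT NULL,
	    val VARCHAR(64),
	    PRIMARY KEY (id)
	) ENGINE = InnoDB
	  PARTITION BY HASH (id) PARTITIONS 4;

	-- simple table changes were simulated with random INSERTs along these lines
	INSERT IGNORE INTO sn_1.t0001 (id, val) VALUES (FLOOR(RAND() * 1000000), 'x');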

Suggested fix:
unknown
[20 Dec 2017 15:11] MySQL Verification Team
Hi!

The administrative statements that you are using are intended for MyISAM storage engine tables. The InnoDB SE is not much affected by those statements. All that InnoDB does is prevent commits if any of those statements is run.

We recommend MySQL Enterprise Backup, or MEB, for backing up InnoDB.

However, this needs to be documented better, hence the new status of this report.
[29 Dec 2017 15:42] Dirk LANGE
Hello!

My database schema consists of tables of both MyISAM and InnoDB SE type.
In order to get a consistent database snapshot, the administrative statements handling the MyISAM SE need to be run in addition to the statements treating the InnoDB SE, as described.
The relevant handling steps are referred to as (1), (2) and (3) in the initial description, covering both MyISAM and InnoDB SE.

To recap the initial description:

(1) "FLUSH TABLES WITH READ LOCK"
(2) querying INFORMATION_SCHEMA.TABLES for schema and name of tables of type InnoDB
(3) "FLUSH TABLES ... FOR EXPORT" for all InnoDB tables found in (2)

Nevertheless, even setting aside the backup and snapshot context:

What I see, and tried to describe, is an increase in memory consumption of the mysqld process at two stages.

stage 1)
After a "FLUSH TABLES WITH READ LOCK", a simple query on INFORMATION_SCHEMA.TABLES, though from a separate session and very normal, causes increase of memory consumption.
Yet I have not run any special statement on any InnoDB table at this stage.

stage 2)
In my case I proceed with the steps described to finally obtain a database snapshot with MyISAM and InnoDB on the file system.
I use "FLUSH TABLES ... FOR EXPORT" on InnoDB tables, and there are other use cases for running "FLUSH TABLES ... FOR EXPORT" as well,
e.g. transferring a set of tables to another MySQL instance.
But running it is not supposed to cause continuously increasing memory consumption by the mysqld process while it is in progress.
In my case the increase is obvious because of the number of tables (my assumption);
in other cases it may be less obvious but still present.
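As a sketch of that transfer use case (hypothetical table name, following the transportable tablespace steps from the manual):

	-- on the source instance: quiesce the table and keep this session open
	FLUSH TABLES sn_1.t_example FOR EXPORT;
	-- copy t_example.ibd and t_example.cfg from the source datadir (outside of MySQL)
	UNLOCK TABLES;

	-- on the destination instance: the same table definition must already exist
	ALTER TABLE sn_1.t_example DISCARD TABLESPACE;
	-- place the copied .ibd (and .cfg) files into the destination datadir
	ALTER TABLE sn_1.t_example IMPORT TABLESPACE;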

My observation is: the more concurrency from active connections, the more likely the effect is to appear.
An "underloaded" database is less likely to show the effect.
The longer the query on INFORMATION_SCHEMA.TABLES takes to return a result, the larger the increase in memory consumption.
The longer the "FLUSH TABLES ... FOR EXPORT" is held up by other queries, the larger the increase in memory consumption.

I still think this is abnormal behavior and that memory is leaking.
[29 Dec 2017 16:48] MySQL Verification Team
Hi!

What you are describing is not a memory leak. It is simply more memory used than otherwise, because, as I wrote, all InnoDB commits are waiting for UNLOCK in order to COMMIT (which releases the memory), while new transactions keep arriving and being processed.

It is also irrelevant that you do not use only InnoDB, because those FLUSH statements can not be used for InnoDB tables. MEB backs up MyISAM as well.
[29 Dec 2017 17:56] Dirk LANGE
Hello!

Understood that "FLUSH TABLES WITH READ LOCK" is irrelevant for InnoDB; that was not my target anyway.
Nevertheless, querying INFORMATION_SCHEMA.TABLES adds more memory to mysqld, and does not release it afterwards.

Is a running query on INFORMATION_SCHEMA.TABLES also related to your statement: "all InnoDB commits are waiting for UNLOCK in order to COMMIT"?
In other words, does a running query on INFORMATION_SCHEMA.TABLES have a blocking influence on InnoDB commits?
Again, my observation is that the extra memory does not get released afterwards.

The "FLUSH TABLES ... FOR EXPORT" is specifically supposed to act on InnoDB tables for a "transportable" stage.

When "all InnoDB commits are waiting for UNLOCK in order to COMMIT" [cited from your answer], and memory is released afterwards,
what is the reason, that memory keeps increasing drastically over time, to give a number: from 10GB to 70GB?
If to be released, isn't it supposed to be kept but re-used at the next event, and not to be accumulated much more?

It is not clear to me yet what is using and occupying the daily additionally allocated memory after the statements of my backup process have finished.
I would expect a saturation of the memory allocation curve.
[29 Dec 2017 17:59] MySQL Verification Team
Hi!

No, InnoDB can not be backed up with any of the FLUSH statements.

Regarding the memory usage on some I_S tables: this is a known issue with Solaris and Linux malloc behaviour. The memory is held in reserve by the process for faster deallocation. Read your Solaris manuals on how to tune that.
[2 Jan 2018 18:01] Daniel Price
Posted by developer:
 
The documentation was updated to indicate that using MySQL Enterprise Backup is the recommended method for backing up
InnoDB data.
https://dev.mysql.com/doc/refman/5.7/en/innodb-backup.html

The updated content should appear online soon.

Thank you for the bug report.
[5 Jan 2018 14:43] Dirk LANGE
Hello,

can you please specify in more detail what the "known issue with Solaris and Linux malloc behaviour" "on some I_S tables" is?
Can you please point out what specifically to look at in terms of OS tuning?
Normally, allocated memory is kept by a process and reallocated for further use once it has been "freed" within the process.
You seem to refer to a different allocation behavior, which causes ongoing additional memory allocation.

You refer to I_S tables and a memory allocation issue.
(*)
It still needs to be clarified what causes FLUSH TABLES ... FOR EXPORT to add more memory to the process rather than using the already allocated (but "freed") memory within the process.
I doubt that the memory previously used (by an equivalent earlier FLUSH TABLES ... FOR EXPORT) is already occupied by other data, so it should be available for reuse.
Granted, a high number of waiting COMMITs can cause the increase, but the expectation is that this memory is "freed" afterwards and can be reallocated.
Please comment on that.

Primarily, my report is about increased memory consumption from regular MySQL functions.

Secondarily, there is the fundamental question about my method of backing up MySQL tables.

MEB is the proposed means of taking consistent backups of MySQL tables.
This might work for us in the future; so far, infrastructural constraints dictate the current method.

Again, the general observation is that memory consumption increases in a stepwise fashion over time.
When my backup method is not running, memory consumption barely increases over time.
This indicates that nothing else is trying to allocate more memory, so memory consumption would otherwise saturate.
Setting aside the queries on I_S tables until clarified (you referred to proposed OS tuning), it is the FLUSH TABLES ... FOR EXPORT statements that induce this.
Again, FLUSH TABLES ... FOR EXPORT is a regular function supported by MySQL, regardless of whether it is used in my backup method or not.
Refer to (*) above.
Please comment in more detail on what drives the increase here.

Can you please specify in detail why InnoDB cannot be handled by FLUSH TABLES ... FOR EXPORT in a backup scenario?
This is fundamental to me and to my method of backing up data tables as described initially.
The manual states under chapter "FLUSH TABLES Syntax", "FLUSH TABLES tbl_name [, tbl_name] ... FOR EXPORT":
  "This FLUSH TABLES variant applies to InnoDB tables.
   It ensures that changes to the named tables have been flushed to disk so that binary table copies can be made while the server is running."
As far as I can see, I comply with the constraints and additional information stated there.

Furthermore, I took a look at "FLUSH TABLES WITH READ LOCK":
The manual states under chapter "FLUSH TABLES Syntax", "FLUSH TABLES WITH READ LOCK":
  "Closes all open tables and locks all tables for all databases with a global read lock.
   This is a very convenient way to get backups if you have a file system such as Veritas or ZFS that can take snapshots in time.
   Use UNLOCK TABLES to release the lock."
We are in a ZFS environment.
The global read lock is applied, which prevents new tables from being created, whether MyISAM or InnoDB.
In terms of a consistent database snapshot, this is a fulfilled prerequisite.

I did some testing regarding LOCK effects on InnoDB.
To summarize:

1) Any open transaction with a data change blocks a FLUSH TABLES ... FOR EXPORT until COMMIT or ROLLBACK (see the sketch after this list).
2) Any data-changing query in a transaction is blocked, so in general the transaction waits, as long as a FLUSH TABLES ... FOR EXPORT is in place.
3) Any open transaction with a data change allows FLUSH TABLES WITH READ LOCK, but the transaction cannot COMMIT until UNLOCK TABLES releases the FLUSH lock.
4) Any data-changing query in a transaction is blocked, so in general the transaction waits, as long as a FLUSH TABLES WITH READ LOCK is in place.
   Additionally, DDL statements are also blocked until UNLOCK.
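A minimal two-session sketch of how points 1) and 3) can be reproduced (hypothetical table name):

	-- session 1: open transaction with a data change
	START TRANSACTION;
	UPDATE sn_1.t_example SET val = 'y' WHERE id = 1;

	-- session 2: point 1) -- this blocks until session 1 commits or rolls back
	FLUSH TABLES sn_1.t_example FOR EXPORT;

	-- session 1:
	COMMIT;          -- session 2's FLUSH now completes
	-- session 2:
	UNLOCK TABLES;

	-- point 3): FLUSH TABLES WITH READ LOCK in session 2 succeeds even while
	-- session 1 holds an open data-changing transaction, but session 1's COMMIT
	-- then waits until session 2 issues UNLOCK TABLES.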

With this as a prerequisite, the two-staged method (FLUSH TABLES WITH READ LOCK, FLUSH TABLES ... FOR EXPORT) provides a consistent state of the data tables from which a filesystem (ZFS) snapshot can be taken.
In practice, and in my experience, restored table files have re-integrated seamlessly, whether MyISAM or InnoDB, although with different restore methods.

Again, please comment in detail on where you identify the flaw in using these FLUSH statements,
and why both of them, in this combination, are not applicable for preparing a filesystem backup.

Thanks!
[10 Jan 2018 16:09] MySQL Verification Team
For the behaviour of malloc(), you can read the online documentation for the operating systems in question.

Regarding the way to back up MySQL tables from a mix of storage engines, it is fully explained in our manual. Now, we also have the explanation that no FLUSH ... statements can be used to back up InnoDB tables.

This bug is closed now and will not be dealt with any more.