Bug #90231 Auto-configuration with innodb_dedicated_server does not consider cgroup limits
Submitted: 27 Mar 2018 15:31 Modified: 5 Apr 2018 9:07
Reporter: Fernando Ipar (OCA) Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8 OS:Linux
Assigned to: CPU Architecture:Any

[27 Mar 2018 15:31] Fernando Ipar
Description:
When using innodb_dedicated_server to auto-configure InnoDB's settings in an environment where memory is limited by cgroups, the limit is ignored and the host's total memory is used for the calculation. 

How to repeat:
I can reproduce this with Docker, but any way of starting mysqld while limiting memory via cgroups should do. 

Start a docker container with memory limited to 256MB: 

telecaster:tmp fipar$ docker run --name mysql8 -e MYSQL_ROOT_PASSWORD=password -v ~/tmp/:/etc/mysql/conf.d -m '256m' -d mysql:8
97432ea4521c0f74d7500a4042c289801a4fde81183ab4a2b218d607196983cd

On ~/tmp/ I have this config file: 

telecaster:tmp fipar$ cat innodb_dedicated_server.cnf
[mysqld]
innodb_dedicated_server=1

Verify that the memory limit is applied to the container: 

telecaster:tmp fipar$ docker exec -it mysql8 bash

root@97432ea4521c:/# cat /sys/fs/cgroup/memory/memory.limit_in_bytes
268435456

Verify that the configuration is not applied as expected: 

mysql -ppassword -e 'select @@innodb_dedicated_server, @@innodb_buffer_pool_size'
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------------+---------------------------+
| @@innodb_dedicated_server | @@innodb_buffer_pool_size |
+---------------------------+---------------------------+
|                         1 |                1073741824 |
+---------------------------+---------------------------+

The auto-configuration only makes sense if we consider the total memory of the Docker host where the containers run: 
root@97432ea4521c:/# head -1 /proc/meminfo
MemTotal:        2046940 kB

It is applying the rule: if total RAM <= 4 GB, then set innodb_buffer_pool_size to 0.5 * total RAM. 
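For reference, the behavior observed above matches the sizing heuristic documented for innodb_dedicated_server in MySQL 8.0. A minimal sketch (the real server additionally rounds the result to buffer-pool chunk multiples, which this omits):

```python
def dedicated_buffer_pool_size(total_ram_bytes):
    """Approximate innodb_dedicated_server buffer pool sizing (MySQL 8.0 docs)."""
    GIB = 1024 ** 3
    if total_ram_bytes < 1 * GIB:
        return 128 * 1024 * 1024           # below 1 GiB: default 128 MiB
    if total_ram_bytes <= 4 * GIB:
        return int(total_ram_bytes * 0.5)  # 1-4 GiB: 50% of detected RAM
    return int(total_ram_bytes * 0.75)     # above 4 GiB: 75% of detected RAM

# The host in this report has ~2 GB RAM, so the server picks roughly 1 GiB,
# even though the container itself is limited to 256 MB:
print(dedicated_buffer_pool_size(2046940 * 1024))
```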

Suggested fix:
If memory is limited by cgroups, then use that limit to calculate the values for innodb_buffer_pool_size and innodb_log_file_size.
[2 Apr 2018 13:49] MySQL Verification Team
Thanks for the submission. Looks like a bug on our side - verified as described.

all best
Bogdan
[4 Apr 2018 13:02] Krzysztof Kapuscik
Posted by developer:
 
The current implementation uses sysconf() to get the available memory size. The cgroups feature is used by Docker to set the limits; cgroup settings do not affect sysconf() or /proc/meminfo, so the memory limit check would need to be extended.
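The sysconf() path the comment describes can be illustrated as follows (a sketch for Linux/glibc; cgroup limits do not change these values, which is exactly the problem):

```python
import os

# sysconf reports host-wide physical memory, not the cgroup allowance.
page_size = os.sysconf("SC_PAGE_SIZE")
phys_pages = os.sysconf("SC_PHYS_PAGES")
total_ram = page_size * phys_pages

# Inside a memory-limited container this still prints the host's total RAM.
print(total_ram)
```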
[5 Apr 2018 8:50] Krzysztof Kapuscik
Posted by developer:
 
Short:
There is no easy and portable way to get the memory limits for an application.

Long:
The amount of total memory available can be read from multiple places (/proc/meminfo, sysinfo, sysconf, ...).
The amount of memory available to an application can be limited by cgroups and rlimit.

Unfortunately, there is no simple API available to check these limits. More details on the topic can be found here:
https://fabiokung.com/2014/03/13/memory-inside-linux-containers/

The limit set using cgroups is available in memory/.../memory.limit_in_bytes.
Note that there may be multiple groups (a hierarchy), so the application should check the group to which it belongs.

To do that, the application must:
1. Parse /proc/self/mounts to find where the cgroup file systems are mounted.
> cat /proc/self/mounts | grep memory
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0

2. Parse /proc/self/cgroup to find the path for the controller. Note that for cgroups v2 the format is different.
> cat /proc/self/cgroup | grep memory
2:memory:/user.slice

3. Check the limit:
> cat /sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes 
9223372036854771712

The application must not assume any fixed paths, as these may differ (e.g. CentOS 6 mounts cgroups in /cgroups).
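Steps 1-3 above can be sketched as a small parser. This is an illustration of the cgroups v1 lookup only, not the server's implementation; it takes the file contents as strings so the two sample lines from this comment can be fed in directly:

```python
def memory_cgroup_limit_path(mounts_text, cgroup_text):
    """Combine the contents of /proc/self/mounts and /proc/self/cgroup
    (cgroups v1 format) to locate memory.limit_in_bytes for this process."""
    mount_point = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # v1 memory controller: fstype 'cgroup' with 'memory' in the options
        if len(fields) >= 4 and fields[2] == "cgroup" \
                and "memory" in fields[3].split(","):
            mount_point = fields[1]
            break
    if mount_point is None:
        return None
    for line in cgroup_text.splitlines():
        # Each line is hierarchy-id:controller-list:path
        _, _, rest = line.partition(":")
        controllers, _, path = rest.partition(":")
        if "memory" in controllers.split(","):
            return mount_point.rstrip("/") + path.rstrip("/") \
                + "/memory.limit_in_bytes"
    return None

MOUNTS = "cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0"
CGROUP = "2:memory:/user.slice"
print(memory_cgroup_limit_path(MOUNTS, CGROUP))
# -> /sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes
```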

But with Docker, things get even more complicated:

1. cat /proc/self/mounts | grep memory
cgroup /sys/fs/cgroup/memory cgroup ro,nosuid,nodev,noexec,relatime,memory 0 0

2. cat /proc/self/cgroup  | grep memory
2:memory:/docker/f5715f59bdf09716a61d4aa27c0db910148e33706f92b541cf7ae41d23d11706

3. cat /sys/fs/cgroup/memory/docker/f5715f59bdf09716a61d4aa27c0db910148e33706f92b541cf7ae41d23d11706/memory.limit_in_bytes 
cat: /sys/fs/cgroup/memory/docker/f5715f59bdf09716a61d4aa27c0db910148e33706f92b541cf7ae41d23d11706/memory.limit_in_bytes: No such file or directory

Step 3 fails because /proc/self/cgroup is taken from the host and points to the host's directory, not the container's directory.
On host (4GB is the limit set for docker during tests):
cat /sys/fs/cgroup/memory/docker/f5715f59bdf09716a61d4aa27c0db910148e33706f92b541cf7ae41d23d11706/memory.limit_in_bytes 
4294967296

Of course, inside the Docker container there is the file mentioned in the bug report:
> cat /sys/fs/cgroup/memory/memory.limit_in_bytes 
4294967296

But using it directly could lead to errors in non-Docker environments. And detecting whether the application is running within a container is also non-standard; it depends on certain environment variables or on 'docker' or 'lxc' appearing in /proc/self/cgroup - at least according to the solutions described on multiple web pages.
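The container-detection heuristic mentioned above can be sketched as follows. It is explicitly non-standard and fragile (which is the developer's point), and shown here only to make the fragility concrete:

```python
def looks_like_container(cgroup_text):
    """Heuristic only: scan /proc/self/cgroup contents for a 'docker' or
    'lxc' path segment. Not a reliable or standard interface."""
    for line in cgroup_text.splitlines():
        # Each line is hierarchy-id:controller-list:path; take the path.
        path = line.split(":", 2)[-1]
        if any(segment in ("docker", "lxc") for segment in path.split("/")):
            return True
    return False

print(looks_like_container("2:memory:/docker/f5715f59bdf0"))  # True
print(looks_like_container("2:memory:/user.slice"))           # False
```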
[5 Apr 2018 9:07] Krzysztof Kapuscik
Posted by developer:
 
It was decided this will not be fixed until a simple and portable fix is available. We are open to proposals for such changes.