Description:
Hello,
we had the following bug[1] filed in Ubuntu, even though the host was running Arch linux.
With RLIMIT_NOFILE set to a high value, mysqld makes an exaggerated memory allocation at service startup, which causes a box with 16 GB of RAM to swap or even hit allocation errors. The MySQL systemd service file does set LimitNOFILE=5000, but because it also sets PermissionsStartOnly=true, that limit is not applied to the ExecStartPre command:
[Service]
Type=forking
User=mysql
Group=mysql
PIDFile=/run/mysqld/mysqld.pid
PermissionsStartOnly=true
ExecStartPre=/usr/share/mysql/mysql-systemd-start pre
ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysqld.pid
TimeoutSec=600
Restart=on-failure
RuntimeDirectory=mysqld
RuntimeDirectoryMode=755
LimitNOFILE=5000
It turns out Arch Linux has a high limit of open files out of the box (RLIMIT_NOFILE, value set to 1073741816) and this triggers an exaggerated memory allocation by mysqld. The same happened in MariaDB, and was fixed[2] there. At a glance, the MySQL code in that area is still the same as what MariaDB had before its fix.
Ubuntu doesn't trigger this behavior out of the box because our default limit for NOFILE is 1024.
One could argue that this is a local configuration issue, and/or a Linux distribution packaging issue, but it does seem wrong that mysqld would allocate that much memory based on the limit of open files.
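To check which limit a process actually inherits on a given system, something like the following could be used. This is only a minimal sketch: print_nofile_limits is a hypothetical helper name, not part of MySQL, and it relies only on the standard getrlimit() call.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not from the MySQL source): prints the soft and hard
   RLIMIT_NOFILE values of the calling process. Returns 0 on success, -1 if
   getrlimit() fails. */
int print_nofile_limits(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    /* rlim_t width varies by platform, so cast before printing. */
    printf("soft=%llu hard=%llu\n",
           (unsigned long long) rl.rlim_cur,
           (unsigned long long) rl.rlim_max);
    return 0;
}
```

Calling print_nofile_limits() from a small main() started the same way as the service should show whether the process sees a value such as 1073741816 or a much smaller one.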
Others did the troubleshooting[3]; in short, set_max_open_files(max_file_limit) can return the current limit even when it is absurdly large, as long as it is not equal to RLIM_INFINITY:
    if (rlimit.rlim_cur == (rlim_t) RLIM_INFINITY)
      rlimit.rlim_cur = max_file_limit;
    if (rlimit.rlim_cur >= max_file_limit)
      DBUG_RETURN(rlimit.rlim_cur); /* purecov: inspected */
So if the current limit is greater than or equal to max_file_limit, but not identical to RLIM_INFINITY, the current limit is returned, and later used in a malloc which can become huge.
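The effect of that check can be modelled in isolation. The sketch below is not the MySQL function: quoted_check_result is a hypothetical name, the parameter types are assumptions, and returning 0 for "the early return did not fire" is a convention invented for this illustration.

```c
#include <sys/resource.h>

/* Hypothetical model of the quoted check only. Returns the value the early
   return would hand back, or 0 when the check does not fire (the real
   function would continue past this point; that part is not modelled). */
unsigned long quoted_check_result(rlim_t cur, unsigned long max_file_limit)
{
    /* Only the exact RLIM_INFINITY value is replaced by the request. */
    if (cur == (rlim_t) RLIM_INFINITY)
        cur = (rlim_t) max_file_limit;
    /* Any other value at or above the request is returned unchanged. */
    if (cur >= (rlim_t) max_file_limit)
        return (unsigned long) cur;
    return 0;
}
```

With cur set to 1073741816 and max_file_limit set to 5000, this model returns 1073741816 rather than 5000, which is the value that would then feed the later allocation.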
1. https://bugs.launchpad.net/ubuntu/+source/mysql-5.7/+bug/1839527
2. https://jira.mariadb.org/browse/MDEV-18360
3. https://github.com/systemd/systemd/issues/11510#issuecomment-456999084
How to repeat:
These instructions jump through some hoops to raise the NOFILE limit on Ubuntu, because by default Ubuntu takes a conservative approach and sets that limit to 1024.
Using an Ubuntu 18.04 VM as a base, with 2 GB of RAM, install mysql-server:
sudo apt update
sudo apt install mysql-server -y
Artificially increase the open file limit by editing /lib/systemd/system/mysql.service, replacing the ExecStartPre line with the one below, and commenting out the LimitNOFILE line:
ExecStartPre=/bin/sh -c 'ulimit -n 1073741816; /usr/share/mysql/mysql-systemd-start pre'
#LimitNOFILE=5000
Allow the new limit system-wide:
- edit /etc/systemd/system.conf and set:
DefaultLimitNOFILE=1073741816
Issue this command:
sudo systemctl daemon-reload
And now restart mysql:
sudo systemctl restart mysql
/var/log/syslog should have something like this:
Aug 13 21:51:48 ubuntu mysqld[8100]: mysqld: Out of memory (Needed 4294967200 bytes)
Suggested fix:
If the current limit is higher than max_file_limit, return max_file_limit.
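A minimal sketch of that clamping rule, under stated assumptions: clamp_open_files_limit is a hypothetical helper rather than the actual patch, the parameter types are guesses, and only the early-return decision is covered. The real function presumably goes on to try raising the limit when the current value is below the request; here that case simply returns the current value.

```c
#include <sys/resource.h>

/* Hypothetical helper (not the actual MySQL patch): never report more than
   max_file_limit, whether the current soft limit is RLIM_INFINITY or just a
   very large finite value. */
unsigned long clamp_open_files_limit(rlim_t cur, unsigned long max_file_limit)
{
    if (cur == (rlim_t) RLIM_INFINITY || cur >= (rlim_t) max_file_limit)
        return max_file_limit;
    /* Simplification: the attempt to raise the limit is omitted. */
    return (unsigned long) cur;
}
```

With this rule, a soft limit of 1073741816 and a request of 5000 yields 5000, so whatever is sized from the result stays proportional to what the server asked for.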