Bug #119422 Proposal: Reduce default lock_wait_timeout from 31536000 (1 year) to a safer value (1 day or 1 week)
Submitted: 18 Nov 16:28 Modified: 19 Nov 1:27
Reporter: Hongju Yang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Options Severity:S4 (Feature request)
Version:8.0.x / 8.4.x (latest) OS:Any
Assigned to: CPU Architecture:Any
Tags: default-value, feature-request, lock_wait_timeout, metadata-lock

[18 Nov 16:28] Hongju Yang
Description:
This is a feature request to adjust the default value of the system variable
`lock_wait_timeout`, which currently defaults to 31536000 seconds (1 year)
according to the MySQL Reference Manual.

In real operational environments, this extremely long default effectively
behaves as “no timeout,” which can cause severe hidden metadata lock stalls
(MDL). I have encountered this issue in an actual production system.

Example scenario:

[Session 1]
START TRANSACTION;
UPDATE mdl_test SET value = 'hold' WHERE id = 1;

[Session 2]
START TRANSACTION;
TRUNCATE TABLE mdl_test;

[Session 3]
START TRANSACTION;
SELECT * FROM mdl_test;

In this situation, Session 2 holds a metadata lock while waiting for Session 1’s
transaction to complete. Session 3 is also blocked. This is expected behavior.

However, in many web-based DB access systems, closing the browser tab does
not close the backend connection. The application server often maintains idle
connections without detecting that the frontend user has disappeared.

As a result:

- A user opens a page → server opens a DB connection
- User closes the browser tab → server does NOT close the DB connection
- The transaction remains open indefinitely
- MySQL believes the connection is still alive
- A metadata lock can remain held for an extremely long time

In my production environment, a single abandoned connection kept an open
transaction and caused DDL operations to hang silently. Because the default
lock_wait_timeout is *1 year*, this problem persisted far longer than expected
and was difficult to detect.

For this reason, the default timeout is not practical for modern systems.
A shorter default such as **1 day** or even **1 week** would prevent such
invisible long-term blocking while being conservative enough to avoid breaking
legitimate long-running operations.

As a relevant precedent:
MariaDB changed the default lock_wait_timeout value from 31536000 (1 year)
to 86400 (1 day) in MariaDB 10.2,
as documented here:
https://mariadb.com/docs/server/server-management/install-and-upgrade-mariadb/upgrading/ma...

After this change, no significant operational issues were reported in the
MariaDB community. This strongly suggests that a drastically shorter default
timeout is safe and appropriate.

How to repeat:
1. Start Transaction A and perform an UPDATE that leaves an open transaction.
2. Start Transaction B and issue a TRUNCATE TABLE (which requires an MDL EXCLUSIVE lock).
3. Start Transaction C and issue a SELECT (which also requires an MDL SHARED lock).
4. Close the client frontend but do not close the backend connection.
5. The server continues to consider the connection alive.
6. Transaction B and C remain blocked until lock_wait_timeout expires—
   which by default is 31536000 seconds (1 year).
7. DDL operations appear to "freeze" for extremely long periods.

Suggested fix:
Reduce the default value of lock_wait_timeout from 31536000 seconds
to a more practical default:

Option A: 86400 seconds (1 day) — matches MariaDB's choice.
Option B: 604800 seconds (1 week) — more conservative but still safe.
Option C: an intermediate default between 1 day and 1 year.

This reduces the chance of invisible long-term metadata lock stalls while still
allowing sufficient time for legitimate long-running operations.

MariaDB's change in version 10.2 demonstrates that this adjustment is safe and
does not introduce significant operational issues.
[18 Nov 16:30] Hongju Yang
For example, the QueryPie tool used in my company behaves this way: when a user
closes the browser tab, the frontend disappears, but the backend connection and
its open transaction remain active on the server.
[18 Nov 16:31] Hongju Yang
If it is helpful for evaluation, I can provide a prototype patch or
additional technical details. I understand that the MySQL engineering
team has their own internal processes, so please let me know what form
of contribution is appropriate.
[18 Nov 16:37] Hongju Yang
Additionally, I found historical context in this Percona article from more than
13 years ago:
https://www.percona.com/blog/mysql-5-5-lock_wait_timeout-patience-is-a-virtue-and-a-locked...

According to the article, the default value of 31536000 seconds (1 year) was
introduced primarily to maintain backward compatibility with older MySQL
versions, rather than because it was considered a safe or reasonable operational
default.

In the comments of the same article, experienced DBAs clearly state that a
1-year timeout is not a good default and that it is strongly recommended that
DBAs adjust this value to something more practical for real-life environments.

At that time, many users did not even know the `lock_wait_timeout` variable
existed, and it made sense to avoid surprising users by changing defaults.

However, the situation today is very different:

- DBAs and developers now widely understand metadata locks and the behavior of
  `lock_wait_timeout`.
- Modern database systems and tooling expose metadata lock wait issues much
  more clearly.
- Production environments often involve web-based tools where abandoned
  connections are common, making a 1-year timeout increasingly problematic.

Given this evolution, there is no longer a strong justification to keep such an
extremely large and unsafe default value.

MariaDB already changed this default to 1 day in version 10.2, and the
community has not reported major operational issues resulting from that change.
This indicates that modern systems can safely operate with a much shorter and
more sensible default timeout.

For these reasons, MySQL would benefit greatly from adopting a safer default
value—such as 1 day or 1 week—rather than the legacy 1-year value that primarily
exists for backward compatibility reasons that no longer apply today.
[18 Nov 19:10] MySQL Verification Team
Thanks for the request and motivation behind it.  

In production, before running a DDL, one could set lock_wait_timeout in the session to the max time that you'd want to wait to start the DDL if there are blocking transactions (e.g. 300 seconds), or give up and retry later.
[19 Nov 1:27] Hongju Yang
Thank you for the response.

I understand that users *can* manually set a shorter lock_wait_timeout before
running DDL. However, this feature request is not about whether a workaround
exists — it is about the safety of the *default* value.

The current default of 31536000 seconds (1 year) has repeatedly caused real
production issues in modern environments where abandoned connections are common
(e.g., web-based SQL tools such as QueryPie). In such environments, users often
do not notice that a long-running or abandoned transaction is blocking DDL, and
the 1-year default makes the problem almost invisible.

This is not simply a configuration issue. It is a product design issue:
the default value is no longer reasonable for modern workloads.

MariaDB changed this default to 1 day in version 10.2 with no major negative
impact reported by the community. Percona articles also point out that the
1-year default was originally kept only for backward compatibility many years
ago and is not considered a good operational value.

Modern systems and operators now fully understand metadata locks and
lock_wait_timeout. Therefore, the compatibility argument is no longer valid.

The request is to reconsider the *default* to a safer and more realistic value
(e.g., 1 day or 1 week), not to ask how users can override it manually.

I kindly ask that this request be forwarded to the development team for proper
evaluation rather than being closed as “can be set manually.”
[19 Nov 4:25] MySQL Verification Team
The FR is understood, and was not closed.  It is verified.  Thanks!