Bug #82774 Reasonably easy to generate mysqld semi-hang. Input file is 1K lines.
Submitted: 29 Aug 2016 3:13 Modified: 13 Apr 2017 22:01
Reporter: Roel Van de Paar Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.7.13 OS:Any
Assigned to: CPU Architecture:Any

[29 Aug 2016 3:13] Roel Van de Paar
Description:
mysql> show processlist;
+-----+------+-----------+------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
| Id  | User | Host      | db   | Command | Time | State                           | Info                                                                                                 |
+-----+------+-----------+------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
|   2 | root | localhost | test | Query   |    0 | starting                        | show processlist                                                                                     |
|   8 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | INSERT INTO t1 VALUES (1.e+10,1.e-1,10-10-2014)                                                      |
|   9 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | CREATE TABLE t2 (c1 NUMERIC(65) UNSIGNED ZEROFILL,c2 ENUM('a','b','c') CHARACTER SET "utf8" COLLATE  |
|  10 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | CREATE TABLE t1 (c1 BIT PRIMARY KEY,c2 DECIMAL(1) UNSIGNED,c3 MULTIPOLYGON) ENGINE=InnoDB            |
|  11 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | INSERT INTO t1 VALUES (0.01,1,1e3)                                                                   |
|  12 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | SELECT c1,c2 FROM t1 AS a1, t3 AS a2                                                                 |
|  13 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | CREATE TABLE t1 (c1 VARCHAR(256) PRIMARY KEY,c2 DOUBLE UNSIGNED,c3 BIGINT UNSIGNED ZEROFILL) ENGINE= |
[...]
| 138 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | SELECT c1,c2 FROM t1                                                                                 |
| 139 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | CREATE TABLE t1 (c1 TIME,c2 CHAR(255) CHARACTER SET "latin1" COLLATE "latin1_bin",c3 REAL(255,10) ZE |
| 140 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | CREATE TABLE t2 (c1 BINARY,c2 MULTIPOINT,c3 BLOB) ENGINE=InnoDB                                      |
| 141 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | INSERT INTO t1 VALUES (-1.e-20,'12-23-2340',1.e+10)                                                  |
| 142 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | CREATE TABLE t1 (c1 REAL UNSIGNED ZEROFILL,c2 NUMERIC(2) ZEROFILL,c3 TIME(2), PRIMARY KEY(c1)) ENGIN |
| 143 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | INSERT INTO t1 VALUES ('b','23-12-2340',1.e-10)                                                      |
| 144 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | INSERT INTO t2 VALUES ('12-23-2015',12-23-2015,-1.0)                                                 |
| 145 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | DROP TABLE t1                                                                                        |
| 146 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | SELECT c1,c2 FROM t1                                                                                 |
| 147 | root | localhost | test | Query   |   46 | Waiting for table flush         | FLUSH TABLES t2                                                                                      |
| 148 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | INSERT INTO t1 VALUES (1.e-20,'CC',-1.e+1)                                                           |
| 149 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | INSERT INTO t1 VALUES (-3,1e3,1.e+2)                                                                 |
| 150 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | DROP TABLE t1                                                                                        |
| 151 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | INSERT INTO t1 VALUES (1.e-3,'XYZ',3.1415927)                                                        |
| 152 | root | localhost | test | Query   |   47 | updating                        | DELETE FROM t2                                                                                       |
| 153 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | TRUNCATE t2                                                                                          |
| 154 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | UPDATE t3 SET c1='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'                  |
| 155 | root | localhost | test | Query   |   48 | update                          | INSERT INTO t2 VALUES (1.e+2,10-10-2014,2990)                                                        |
| 156 | root | localhost | test | Query   |   47 | Waiting for table metadata lock | CREATE TABLE t1 (c1 CHAR(254) BINARY CHARACTER SET "latin1" COLLATE "latin1_bin",c2 REAL(40,30) ZERO |
| 157 | root | localhost | test | Query   |   46 | Waiting for table metadata lock | SELECT c1 FROM t1                                                                                    |
+-----+------+-----------+------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
151 rows in set (0.00 sec)

How to repeat:
1) Use Centos 7 x64. Not sure if it matters, but this is where I can reproduce the issue (not tested on other OS'es)
2) Use mysql-5.7.13-linux-glibc2.5-x86_64 with no-defaults, start it. Make sure to use something like --socket=${PWD}/socket.sock 
   Just for reference, here is my full command line:
  /sda/mysql-5.7.13-linux-glibc2.5-x86_64/bin/mysqld --no-defaults --core-file --basedir=/sda/mysql-5.7.13-linux-glibc2.5-x86_64 --tmpdir=/sda/mysql-5.7.13-linux-glibc2.5-x86_64/data --datadir=/sda/mysql-5.7.13-linux-glibc2.5-x86_64/data  --socket=/sda/mysql-5.7.13-linux-glibc2.5-x86_64/socket.sock --port=18479 --log-error=/sda/mysql-5.7.13-linux-glibc2.5-x86_64/log/master.err 2>&1 &
3) Start pquery like this:
   $  ~/percona-qa/pquery/pquery_h1 --database=test --socket=socket.sock --infile=/home/roel/out3.sql --user=root  -threads=150
4) Check SHOW PROCESSLIST with extra CLI client every 10-20 seconds. If you do not see the occurrence above after about 10 minutes, stop mysqld, wipe/clean the data dir (--no-defaults --initialize-insecure) and restart with #1. You should it happen in less then 10 attempts, likely less then 3.
5) If some internal timeout is reached, the queries will start flowing again (which does not invalidate the finding)
[29 Aug 2016 3:17] Roel Van de Paar
out3.sql, pquery_h1, and full processlist

Attachment: 82774_bug_bundle.tar.gz (application/x-gzip, text), 2.00 MiB.

[29 Aug 2016 3:24] Roel Van de Paar
I've also been able to generate the lockup with 10 threads.
[29 Aug 2016 3:25] Roel Van de Paar
As DROP TABLE t1; / t2; / t3; is part of the grammar, it's extremely unlikely that enough data is compiled for it to be the cause of the lockup.
[29 Aug 2016 3:26] Roel Van de Paar
A GDB trace should be easy to grab.
[29 Aug 2016 4:48] MySQL Verification Team
for tests, to avoid hanging you can set:

--innodb-lock-wait-timeout=1
--lock-wait-timeout=1

i'm guessing the insert/update are getting locked for 51 seconds, 
then the flush/ddl follows that and waits for 1 year.  
but why is this a bug if the config allows it?
[30 Aug 2016 4:22] Roel Van de Paar
Because it takes so long? With 10 threads and a high end machine (i7/ssd) and small queries (some larger), one would not expect something to lockup mysqld for > 30-40 seconds?
[30 Aug 2016 5:28] MySQL Verification Team
I don't think it is a "lockup".  it's probably a timed wait, due to those timeout settings mentioned.  To troubleshoot exact cause you'd need to interrogate the sys schema or these :

https://dev.mysql.com/doc/refman/5.7/en/metadata-locks-table.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-lock-waits-table.html

if it was a real lockup then KILL sql command wouldn't work,  and the lockup would never resolve itself after timeouts expire!
[30 Aug 2016 5:32] Roel Van de Paar
Agreed that lockup is wrong word choice. Changed it to semi-hang, which I still believe it is. i.e. at the source of it there must be some interaction/step which goes wrong for it 'hang' as long as it does...
[30 Aug 2016 5:53] MySQL Verification Team
But what is the bug here?  Metadata locking is required since fix of Bug #989

Default timeouts are:
innodb-lock-wait-timeout              50
lock-wait-timeout                     31536000

to maintain backward compatibility.   So,  set those to 1 to reduce it.
https://dev.mysql.com/doc/refman/5.7/en/metadata-locking.html

Exact statements involved can be seen via I_S and P_S.
[30 Aug 2016 5:55] Roel Van de Paar
Aha, think I am starting to understand your point. So would removing all locks from the grammar achieve the same?
[30 Aug 2016 5:57] Roel Van de Paar
(inc transactions...)
[10 Apr 2017 15:35] MySQL Verification Team
Hi Roel,

I fail to understand your question. What exactly do you mean by:

"
So would removing all locks from the grammar achieve the same?

"

Thanks in advance.
[11 Apr 2017 13:54] MySQL Verification Team
Still waiting.
[12 Apr 2017 23:39] Roel Van de Paar
Did some further thinking on this bug.

1] First, to answer Sinisa's question; my thought was to remove locking from testing. Bad idea. Ignore my last comment.
2] So we know that at some point we run into a lock timeout, OK
3] But, the real question is; why do we run into a lock timeout when at the same time all of the transactions are long-running? i.e. should not one of them have been processing [quickly]? 
4] It seems that (in this particular case) these two transactions are live/processing;
DELETE FROM t2           
INSERT INTO t2 VALUES (1.e+2,10-10-2014,2990)                       
and both of those would presumably complete in a few microsec.

Is all in all this not some sort of race condition or incorrect locking effect?
[13 Apr 2017 6:53] MySQL Verification Team
IMHO this report is simply 'not a bug' in its current form..  
At least the report is not specific enough to be taken as a bug.

If we want to diagnose why two statements hang:

DELETE FROM t2           
INSERT INTO t2 VALUES (1.e+2,10-10-2014,2990)       

There are proper methods to do that.  They start with checking:
https://dev.mysql.com/doc/refman/5.7/en/innodb-lock-waits-table.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-locks-table.html
[13 Apr 2017 14:18] MySQL Verification Team
Roel,

Sorry, but I have to agree with Shane on this issue.

With one small addition. Beside InnoDB locks, you seem to be hitting MDL locks. These locks also require waiting in certain situations.

Not a bug, until provided with some reproducible test case, which would showcase some, totally, unnecessary or inappropriate locking.
[13 Apr 2017 14:21] MySQL Verification Team
Roel,

One recommended reading for you. Will not solve these problems, but might give you some ideas:

http://mysqlserverteam.com/mysql-8-0-1-using-skip-locked-and-nowait-to-handle-hot-rows/
[13 Apr 2017 22:01] Roel Van de Paar
Fair enough, thanks Shane and Sinisa!