Bug #85706 | Rowid already allocated under heavy load | ||
---|---|---|---|
Submitted: | 30 Mar 2017 11:19 | Modified: | 11 May 2017 5:34 |
Reporter: | Сергей Кукуев | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql-5.7.17 ndb-7.5.5 | OS: | Oracle Linux (6.8 (Santiago) kernel 4.1.12-37.4.1.el6uek.x86_64) |
Assigned to: | | CPU Architecture: | Any |
Tags: | 899, Error 899, heavy load, Rowid already allocated |
[30 Mar 2017 11:19]
Сергей Кукуев
[30 Mar 2017 11:24]
Сергей Кукуев
mysql-bug-data-85706.tar.gz uploaded to //support/incoming
[30 Mar 2017 13:22]
Сергей Кукуев
Additionally, we ran tests with fewer data nodes. On a cluster with 1 or 2 data nodes the error was not reproduced. On 4 and 8 nodes it was reproduced, and with less data: approximately 30 million rows in total across 6 tables.
[31 Mar 2017 15:30]
Сергей Кукуев
Test which reproduces the error
Attachment: main.cpp (text/plain), 15.52 KiB.
[31 Mar 2017 15:33]
Сергей Кукуев
DB schema for the test
Attachment: DB.sql (application/octet-stream, text), 1016 bytes.
[31 Mar 2017 15:36]
Сергей Кукуев
We reproduced this error with a small test. Run it with 80 threads (command-line parameter). The test and the test database files are attached in the previous comments. No initial table population is needed for the run.
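For readers without access to the attachment, the general shape of such a multi-threaded reproduction can be sketched as below. This is not the attached main.cpp: `run_workers` and `InsertOp` are hypothetical names, and the callback stands in for the real NDB API insert, which is not shown. It only illustrates how a fixed number of threads hammer the same operation while failures are counted.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for one insert attempt; returns true on success.
// In a real test this would wrap an NDB API transaction (not shown here).
// The callable must be safe to invoke from several threads at once.
using InsertOp = std::function<bool(std::size_t thread_id, std::size_t row)>;

// Starts `threads` workers, each calling `op` `rows_per_thread` times,
// waits for all of them, and returns the total number of failed attempts.
std::size_t run_workers(std::size_t threads,
                        std::size_t rows_per_thread,
                        const InsertOp& op) {
    assert(threads > 0);
    std::atomic<std::size_t> failures{0};
    std::vector<std::thread> pool;
    pool.reserve(threads);
    for (std::size_t t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            for (std::size_t r = 0; r < rows_per_thread; ++r) {
                if (!op(t, r)) {
                    failures.fetch_add(1, std::memory_order_relaxed);
                }
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return failures.load();
}
```

With a thread count of 80, as used in this report, a call would look like `run_workers(80, rows, op)`, where `op` performs one insert and reports whether it succeeded.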
[5 Apr 2017 11:43]
Michael Prokopiv
Reproduced OEL MySQL cluster gpl libndbclient 7.2 7.5.5 7.4.12 7.2 7.5.5 7.5.5 7.2 7.4.14 7.4.12 7.2 7.4.10 7.4.12 6.8 7.5.5 7.5.5
[7 Apr 2017 3:42]
MySQL Verification Team
Hi Сергей, I tried your test case, and on a "normal, small" cluster I was not able to reproduce this. I will retry on a larger system, but before that I need to know: since you wrote "We reproduced this error within small test", do you mean this small test case run on the big "16 nodes x 48 CPU" cluster, or did you manage to reproduce this on a smaller cluster as well (one on which you said you could not reproduce the problem)? Best regards, Bogdan
[7 Apr 2017 7:29]
Сергей Кукуев
Hi, Bogdan! We reproduced this bug on 8x32cpu and on 4x32cpu clusters. On 4 nodes it takes a bit more time than on 8 nodes.
[7 Apr 2017 7:36]
MySQL Verification Team
Hi, and on 16-CPU nodes? Can you reproduce it there or not? Thanks, Bogdan
[7 Apr 2017 7:39]
Сергей Кукуев
We don't have any 16-CPU nodes, so we didn't test on such a configuration.
[7 Apr 2017 7:47]
MySQL Verification Team
Hi, sorry, my mistake: so you reproduced it on both 48-CPU and 32-CPU boxes. This bug has been fixed a number of times already, but it looks like it reappears under some circumstances. Our dev team is on it. I'm waiting to see if there's any info they need you to provide from your system, as we ourselves have huge problems reproducing this. All best, Bogdan
[11 May 2017 5:34]
MySQL Verification Team
Hi, I'm running the test now on 192 cores and I'm reproducing it :(

mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64

config.ini:

[mysql@supra10 mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64]$ cat config.ini
[ndbd default]
NoOfReplicas= 2
DataDir= /export/home/mysql/mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/clusterdata
DataMemory = 1024M
IndexMemory = 256M
MaxNoOfConcurrentOperations=500000
MaxNoOfExecutionThreads=32
NoOfFragmentLogParts=32

[ndb_mgmd]
Hostname= localhost
DataDir= /export/home/mysql/mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/clusterdata

[ndbd]
HostName= localhost
[ndbd]
HostName= localhost
[ndbd]
HostName= localhost
[ndbd]
HostName= localhost

[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]

[mysql@supra10 mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64]$ bin/ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=2 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15, Nodegroup: 0, *)
id=3 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15, Nodegroup: 0)
id=4 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15, Nodegroup: 1)
id=5 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15)

[mysqld(API)] 5 node(s)
id=6 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15)
id=7 @127.0.0.1 (mysql-5.6.36 ndb-7.4.15)
id=8 (not connected, accepting connect from any host)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)

[mysql@supra10 mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64]$ cat mysql.cnf
[mysqld]
log-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON
log-slave-updates = ON
master-info-repository = TABLE
relay-log-info-repository = TABLE
binlog-checksum = NONE
datadir=/export/home/mysql/mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/mysqldata/
basedir=/export/home/mysql/mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/
socket=/export/home/mysql/mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/mysqldata/mysql.sock
ndbcluster
skip-networking
[mysql@supra10 mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64]$

Running your test reproduces the problem:

g++ -o testcase testcase.c -I mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/include/ -L mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/lib/ -I ./mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/include/storage/ndb/ndbapi -I./mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/include/storage/ndb/ -I ./mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/include/storage/ -lmysqlclient -lndbclient -std=gnu++11

[mysql@supra10 ~]$ LD_LIBRARY_PATH=mysql-cluster-gpl-7.4.15-linux-glibc2.5-x86_64/lib/ ./testcase localhost BS 1000000 80
execute:222: line: 144: Rowid already allocated
execute:222: line: 144: Rowid already allocated
execute:222: line: 144: Rowid already allocated

Setting the bug to verified. Thanks for the test case!

All best, Bogdan