Bug #70435 | All data node shutdown by any trigger | ||
---|---|---|---|
Submitted: | 26 Sep 2013 12:15 | Modified: | 21 Nov 2013 12:15 |
Reporter: | 李 源錫 | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql cluster 7.3, mysql cluster 7.1, 7.3.3 | OS: | Linux (CentOS 6.4) |
Assigned to: | CPU Architecture: | Any |
[26 Sep 2013 12:15]
李 源錫
[27 Sep 2013 7:09]
MySQL Verification Team
Hello, Thank you for the report. I can not repeat described behavior on my local box. Could you please attach the cluster logs? Preferably using the ndb_error_reporter utility: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-programs-ndb-error-reporter.html Thanks, Umesh
[27 Sep 2013 8:44]
李 源錫
To. Umesh thanks for your fast confirm. but you must test with over 2 data node. please one more test at enviroment with 2 data node.
[27 Sep 2013 8:49]
MySQL Verification Team
Hello, Thank you for the feedback. I tried it on my local box with 2,4 storage nodes and cannot reproduce. Could you please attach the cluster logs? Preferably using the ndb_error_reporter utility: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-programs-ndb-error-reporter.html Thanks, Umesh
[27 Sep 2013 9:14]
李 源錫
To. Umesh ok. I will Send error trace file after a few hours. if your local box is not physical 2 data node, can do you test with physical 2 data node?
[27 Sep 2013 12:55]
李 源錫
file by ndb_err_report
Attachment: ndb_error_report_20130927123949.tar.bz2 (application/octet-stream, text), 142.45 KiB.
[27 Sep 2013 12:57]
李 源錫
To. umesh I attached file produced by ndb_error_report. Please confirm.
[27 Sep 2013 13:16]
李 源錫
To. umesh > I can not repeat described behavior on my local box. Do you have any error message when you run "UPDATE a SET a2=2;", although data node don't downed?
[27 Sep 2013 13:53]
李 源錫
To. umesh > I can not repeat described behavior on my local box. I more tested at one physical machine, also. At one physical machine, i occured problem also. Follow next procedure. 1. see my.cnf cat /usr/local/mysql5.6.11-ndb7.3.2/my.cnf [MYSQL_CLUSTER] ndb-connectstring="host=localhost" [MYSQLD] basedir=/usr/local/mysql5.6.11-ndb7.3.2 datadir=/data/mysql5.6.11-ndb7.3.2 ndbcluster default_storage_engine=ndb 2. confer config.ini cat /usr/local/mysql5.6.11-ndb7.3.2/config.ini [NDB_MGMD DEFAULT] datadir=/data/mysql-cluster5.6.11-ndb7.3.2 [NDBD DEFAULT] datadir=/data/mysql-cluster5.6.11-ndb7.3.2 NoOfReplicas=2 DataMemory=2G IndexMemory=1G [NDB_MGMD] hostname=localhost NodeID=1 [NDBD] hostname=localhost NodeID=2 [NDBD] hostname=localhost NodeID=3 [MYSQLD] hostname=localhost NodeID=4 3. run mgmd /usr/bin/sudo -u mysql /usr/local/mysql5.6.11-ndb7.3.2/bin/ndb_mgmd -f /usr/local/mysql5.6.11-ndb7.3.2/config.ini --configdir=/usr/local/mysql5.6.11-ndb7.3.2 4. run ndbmtd /usr/bin/sudo -u mysql /usr/local/mysql5.6.11-ndb7.3.2/bin/ndbmtd --defaults-file=/usr/local/mysql5.6.11-ndb7.3.2/my.cnf /usr/bin/sudo -u mysql /usr/local/mysql5.6.11-ndb7.3.2/bin/ndbmtd --defaults-file=/usr/local/mysql5.6.11-ndb7.3.2/my.cnf 5. run mysqld /usr/local/mysql5.6.11-ndb7.3.2/bin/mysqld_safe --defaults-file=/usr/local/mysql5.6.11-ndb7.3.2/my.cnf --datadir=/data/mysql5.6.11-ndb7.3.2 & 6. login mysql /usr/local/mysql5.6.11-ndb7.3.2/bin/mysql -uroot -p test 7. run next command create table a ( a1 varchar(100) NOT NULL DEFAULT '', a2 int(6) DEFAULT '0', UNIQUE KEY idx_b1 (a1) ); CREATE TABLE b ( b1 char(20) NOT NULL DEFAULT '', b2 enum('N','Y') NOT NULL DEFAULT 'N' , UNIQUE KEY idx_b1 (b1) ); delimiter // CREATE TRIGGER a_update BEFORE UPDATE ON a FOR EACH ROW UPDATE b SET b2='Y' WHERE b1=SUBSTRING(old.a1, 1 + INSTR(old.a1, '.')) // delimiter ; delimiter // drop procedure a_ins// CREATE PROCEDURE a_ins() BEGIN DECLARE i INT DEFAULT 1; WHILE (i<=20000) DO INSERT INTO a VALUES(concat('a',i,'.aa'), 1); SET i=i+1; END WHILE; END // delimiter ; call a_ins; insert into b values('aa','N'); 8. confirm record count table a,b mysql> select count(*) from a; +----------+ | count(*) | +----------+ | 20000 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from b; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.00 sec) 9. confirm trigger mysql> show triggers; +----------+--------+-------+------------------------------------------------------------------------+--------+---------+------------------------+----------------+----------------------+----------------------+--------------------+ | Trigger | Event | Table | Statement | Timing | Created | sql_mode | Definer | character_set_client | collation_connection | Database Collation | +----------+--------+-------+------------------------------------------------------------------------+--------+---------+------------------------+----------------+----------------------+----------------------+--------------------+ | a_update | UPDATE | a | UPDATE b SET b2='Y' WHERE b1=SUBSTRING(old.a1, 1 + INSTR(old.a1, '.')) | BEFORE | NULL | NO_ENGINE_SUBSTITUTION | root@localhost | utf8 | utf8_general_ci | latin1_swedish_ci | +----------+--------+-------+------------------------------------------------------------------------+--------+---------+------------------------+----------------+----------------------+----------------------+--------------------+ 1 row in set (0.00 sec) 10. at last, run problem query. mysql> UPDATE a SET a2=2; ERROR 1297 (HY000): Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER --you will see error like upper. -- at once, count record mysql> select count(*) from a where a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER --you will see error like upper, -- rerun , count record mysql> select count(*) from a where a2=2; +----------+ | count(*) | +----------+ | 0 | +----------+ -- you will see count 0. a2 column is not updated normally. --rerun problem then query mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER -- you will see error again. --rerun problem then query mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER -- you will see error again. --rerun problem then query mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER -- you will see error again. --rerun problem then query mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER -- you will see error again. -- repeat this query about 20 times, or more times. ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER mysql> UPDATE a SET a2=2; ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER -- see table status +----------------+----------------------------------------------------+ | Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment | +------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------+----------+----------------+----------------------------------------------------+ | a | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Got error 157 'Unknown error code' from NDBCLUSTER | | b | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Got error 157 'Unknown error code' from NDBCLUSTER | +------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-----------+---------- -- you will confirm data node down.
[1 Oct 2013 15:46]
李 源錫
To. umesh Did you retest in your local box according to my new procedure in one physical machine ? Please check A.S.A.P. This problem is serious in my production service using mysql cluster.
[10 Oct 2013 2:56]
李 源錫
To. support staff Why don't reply my question?
[10 Oct 2013 9:25]
MySQL Verification Team
Hi, I tried couple of times with the provided test case and but it never crashed even with 2 separate physical boxes.. // mysql-5.6.11 ndb-7.3.2 --------------------- [ndbd(NDB)] 2 node(s) id=2 @x.x.x.1 (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0, Master) id=3 @x.x.x.2 (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0) [ndb_mgmd(MGM)] 1 node(s) id=1 @x.x.x.100 (mysql-5.6.11 ndb-7.3.2) [mysqld(API)] 12 node(s) id=4 @x.x.x.100 (mysql-5.6.11 ndb-7.3.2) id=5 (not connected, accepting connect from any host) ------------ mysql> drop table if exists a; Query OK, 0 rows affected (0.03 sec) mysql> drop table if exists b; Query OK, 0 rows affected (0.01 sec) mysql> mysql> mysql> create table a ( -> a1 varchar(250) NOT NULL DEFAULT '', -> a2 int(6) DEFAULT '0', UNIQUE KEY idx_b1 (a1) -> ); CREATE TABLE b ( b1 char(20) NOT NULL DEFAULT '', b2 enum('N','Y') NOT NULL DEFAULT 'N' , UNIQUE KEY idx_b1 (b1) Query OK, 0 rows affected (0.04 sec) mysql> mysql> CREATE TABLE b ( -> b1 char(20) NOT NULL DEFAULT '', -> b2 enum('N','Y') NOT NULL DEFAULT 'N' , -> UNIQUE KEY idx_b1 (b1) -> ); delimiter // CREATE TRIGGER a_update BEFORE UPDATE ON a FOR EACH ROW UPDATE b SET b2='Y' WHERE b1=SUBSTRING(old.a1, 1 + INSTR(old.a1, '.')) Query OK, 0 rows affected (0.03 sec) mysql> mysql> delimiter // mysql> CREATE TRIGGER a_update BEFORE UPDATE ON a -> FOR EACH ROW -> UPDATE b SET b2='Y' WHERE b1=SUBSTRING(old.a1, 1 + INSTR(old.a1, '.')) -> // delimiter ; delimiter // drop procedure a_ins// CREATE PROCEDURE a_ins() BEGIN Query OK, 0 rows affected (0.03 sec) mysql> delimiter ; mysql> mysql> delimiter // mysql> drop procedure a_ins// Query OK, 0 rows affected (0.00 sec) mysql> CREATE PROCEDURE a_ins() -> BEGIN -> DECLARE i INT DEFAULT 1; -> WHILE (i<=200000) DO INSERT INTO a VALUES(concat('a',i,'.aa'), 1); -> SET i=i+1; -> END WHILE; -> END -> // Query OK, 0 rows affected (0.00 sec) // i<=200000 (increased to check if it fails for > 20000) mysql> delimiter ; mysql> mysql> call a_ins; Query OK, 1 row affected (10 min 33.99 sec) mysql> insert into b values('aa','N'); Query OK, 1 row affected (0.00 sec) mysql> UPDATE a SET a2=2; Query OK, 200000 rows affected (9.83 sec) Rows matched: 200000 Changed: 200000 Warnings: 0 Btw, looking at your config file - are you running cluster on default settings? See http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-config-starting.html
[10 Oct 2013 12:02]
李 源錫
thanks mesh your reply. i unbelievable data node is normal in your test. it maybe in your test trigger not work. 1. please could you upload config.ini and my.cnf? 2. please could you upliad next result file? mysqldump -uroot -p -R -d targetdb > targetdb.dmp 3. ans please teach me your os version. uname -a
[10 Oct 2013 17:44]
李 源錫
To. umesh thanks mesh your reply. >i unbelievable data node is normal in your test. >it maybe in your test trigger not work. >1. >please could you upload config.ini and my.cnf? >2. >please could you upliad next result file? >mysqldump -uroot -p -R -d targetdb > targetdb.dmp >3. >ans please teach me your os version. >uname -a I want to reply about upper request in today. I immediately will retest considering your config. I have tested with three other hardware(supermicro, ibm, hp). At all lserver, all data node downed in this test.
[11 Oct 2013 3:47]
李 源錫
To. umesh I am very sorry that I promotes your confirm. If you can permit time and possible, I would like the following confirmation. 1. please could you upload your config.ini and my.cnf 2. please could you upload next result file. mysqldump -uroot -p -R -d targetdb > targetdb.dmp 3. please teach os version. uname -a thanks. p.s) >Btw, looking at your config file - are you running cluster on default settings? >See http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-config-starting.html I agin tested after same config set like http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-config-starting.html. but, all data node shutdowned. error message is below. ndb_mgm> Node 3: Forced node shutdown completed. Caused by error 6050: 'WatchDog terminate, internal error or massive overload on the machine running this node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'. Node 2: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
[21 Nov 2013 12:13]
MySQL Verification Team
Thank you for the feedback. Verified as reported. Marking this as duplicate of Bug #70955 Thanks, Umesh
[21 Nov 2013 12:35]
MySQL Verification Team
Duplicate of Bug #56929