Bug #54199 Cluster+DiskData, Querying INFORMATION_SCHEMA.FILES slow when data node is down
Submitted: 3 Jun 2010 12:10 Modified: 16 Nov 2010 15:24
Reporter: Joffrey MICHAIE Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Disk Data Severity:S3 (Non-critical)
Version:mysql-5.1-telco-7.1 OS:Linux
Assigned to: Maitrayi Sabaratnam CPU Architecture:Any
Tags: 7.1.3, cluster, information_schema.FILES

[3 Jun 2010 12:10] Joffrey MICHAIE
Description:
When not all the data nodes are running, querying the INFORMATION_SCHEMA.FILES table takes quite a long time ....

+-+-+- Normal behaviour :

mysql> SELECT TABLESPACE_NAME,LOGFILE_GROUP_NAME,FILE_TYPE,SUM(FREE_EXTENTS*EXTENT_SIZE)/1024/1024 AS Avail,SUM(FREE_EXTENTS),SUM(TOTAL_EXTENTS),EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE='DATAFILE' GROUP BY EXTRA,TABLESPACE_NAME;
+-----------------+--------------------+-----------+---------------+-------------------+--------------------+----------------+
| TABLESPACE_NAME | LOGFILE_GROUP_NAME | FILE_TYPE | Avail         | SUM(FREE_EXTENTS) | SUM(TOTAL_EXTENTS) | EXTRA          |
+-----------------+--------------------+-----------+---------------+-------------------+--------------------+----------------+
| cbtu_ts1        | cbtu_log_group     | DATAFILE  | 3386.00000000 |              3386 |              20480 | CLUSTER_NODE=3 |
| cbtu_ts1        | cbtu_log_group     | DATAFILE  | 3386.00000000 |              3386 |              20480 | CLUSTER_NODE=4 |
+-----------------+--------------------+-----------+---------------+-------------------+--------------------+----------------+
2 rows in set (0.11 sec) 

+-+-+- Abnormal (slow) behaviour :

ndb_mgm> all status
Node 3: started (mysql-5.1.44 ndb-7.1.3)
Node 4: starting (Last completed phase 1767987005) (mysql-5.1.44 ndb-7.1.3)

ndb_mgm>
Running while node is starting :
mysql> SELECT TABLESPACE_NAME,LOGFILE_GROUP_NAME,FILE_TYPE,SUM(FREE_EXTENTS*EXTENT_SIZE)/1024/1024 AS Avail,SUM(FREE_EXTENTS),SUM(TOTAL_EXTENTS),EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE='DATAFILE' GROUP BY EXTRA,TABLESPACE_NAME;
+-----------------+--------------------+-----------+---------------+-------------------+--------------------+----------------+
| TABLESPACE_NAME | LOGFILE_GROUP_NAME | FILE_TYPE | Avail         | SUM(FREE_EXTENTS) | SUM(TOTAL_EXTENTS) | EXTRA          |
+-----------------+--------------------+-----------+---------------+-------------------+--------------------+----------------+
| cbtu_ts1        | cbtu_log_group     | DATAFILE  | 3386.00000000 |              3386 |              20480 | CLUSTER_NODE=3 |
+-----------------+--------------------+-----------+---------------+-------------------+--------------------+----------------+
1 row in set (3 min 35.88 sec)

mysql>

If one of the data node is down, it is no longer possible to monitor the space used on the disk data tablespaces (takes seconds to fetch information ...)

Tested on Linux RedHat 5.4 & Debian latest, with 7.1.3, and ndb(mt)d, 1 and 2 ndb_mgmd

How to repeat:
Setup Cluster with : 

////my.cnf (debian default + ndbcluster)/////

[client]
port        = 3306
socket        = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket        = /var/run/mysqld/mysqld.sock
nice        = 0
[mysqld]
ndbcluster
ndb-connectstring=192.168.56.101
ndb-nodeid=8
user        = mysql
pid-file    = /var/run/mysqld/mysqld.pid
socket        = /var/run/mysqld/mysqld.sock
port        = 3306
basedir        = /usr/local/mysql
datadir        = /var/lib/mysql
tmpdir        = /tmp
language    = /usr/local/mysql/share/english
skip-external-locking
bind-address        = 127.0.0.1
key_buffer        = 16M
max_allowed_packet    = 16M
thread_cache_size    = 8
myisam-recover        = BACKUP
query_cache_limit       = 1M
query_cache_size        = 16M
expire_logs_days    = 10
max_binlog_size         = 100M
[mysqldump]
quick
quote-names
max_allowed_packet    = 16M
[mysql]
[isamchk]
key_buffer        = 16M
ndbcluster
ndb-connectstring=127.0.0.1
!includedir /etc/mysql/conf.d/

////config.ini////
[ndbd default]
NoOfReplicas=2
MaxNoOfExecutionThreads=4

[ndbd]
nodeid=3
hostname=192.168.56.101
DataDir=/usr/local/data/dn1

[ndbd]
nodeid=4
hostname=192.168.56.101
DataDir=/usr/local/data/dn2

[ndb_mgmd]
nodeid=1
hostname=192.168.56.101
DataDir=/home/mysql/mgmd

[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]

- Start Cluster (All nodes)
- Setup Disk Data
    * Create logfile group
    * Create tablespace(s)
- Run mysql query
- Stop 1 data node
- Run mysql query

Suggested fix:
MySQL should be aware of non running / not ready data nodes (with ndbinfo or mysql status) and not wait for no reply / timeout of the missing / not ready data node.

Thanks in advance,
BR,
Joffrey
[3 Nov 2010 11:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/122657

3915 Maitrayi Sabaratnam	2010-11-03
      BUG#54199 Cluster+DiskData, Querying INFORMATION_SCHEMA.FILES slow when data node is down
[8 Nov 2010 10:59] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:msabaratnam@mysql.com-20101108104109-9k3zwgi9a11b7hf7) (version source revid:msabaratnam@mysql.com-20101108104109-9k3zwgi9a11b7hf7) (merge vers: 5.1.51-ndb-7.0.21) (pib:21)
[10 Nov 2010 14:11] Jon Stephens
Documented in the NDB-7.0.21 and 7.1.10 changelogs as follows:

        When one or more data nodes was not running, queries on the
        INFORMATION_SCHEMA.FILES table took an excessive length of time
        to complete because the MySQL server waited for responses from
        the stopped node or nodes to time out. Now in such cases, MySQL
        does not attempt to contact nodes which are not known to be
        running.

Closed.
[11 Nov 2010 16:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/123649

3339 Maitrayi Sabaratnam	2010-11-11
      Merge patch for bug#54199 to 6.3
[11 Nov 2010 16:51] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.40 (revid:msabaratnam@mysql.com-20101111165016-6ivq5gcr3qgw4vlt) (version source revid:msabaratnam@mysql.com-20101111165016-6ivq5gcr3qgw4vlt) (merge vers: 5.1.51-ndb-6.3.40) (pib:21)
[11 Nov 2010 22:35] Jon Stephens
Also documented in the NDB-6.3.40 changelog. Closed.
[15 Nov 2010 9:36] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:pekka@mysql.com-20101115092310-yr9m4d3s4gvg7zdf) (version source revid:pekka@mysql.com-20101115092310-yr9m4d3s4gvg7zdf) (merge vers: 5.1.51-ndb-7.0.21) (pib:21)
[16 Nov 2010 15:24] Jon Stephens
Already documented for NDB-6.3/7.0/7.1 series. Setting back to Closed.