Bug #35607 Setting clock further 'confuses' ndb_mgm_get_status()
Submitted: 27 Mar 2008 13:59 Modified: 2 Jul 2008 11:23
Reporter: Geert Vanderkelen
Status: Closed
Category: MySQL Cluster: NDB API Severity: S3 (Non-critical)
Version: 5.2.2_drop6p15 OS: Any
Assigned to: CPU Architecture: Any

[27 Mar 2008 13:59] Geert Vanderkelen
Description:
Attached to this bug is a small MGM API application (minimgm.cpp) which continuously checks
the state of all nodes. When the clock of the system running this application is set back
in time, it seems to work OK.
Setting the clock a few minutes forward, however, seems to confuse it: status calls fail for a while, then it recovers after a few tries.

How to repeat:
A simple cluster setup; 1 data node should do.
Run the attached mini MGM application (which uses the MGM API), then set the clock a few minutes forward:
 shell> date
 Thu Mar 27 14:57:45 CET 2008
 shell> date -s 15:00
 Thu Mar 27 15:00:00 CET 2008

You should see output like this:

2008-03-27 14:57:10 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 14:57:10 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 14:57:10 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
..
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status: 1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
[27 Mar 2008 14:00] Geert Vanderkelen
Simple code..

Attachment: minimgm.cpp (, text), 1.27 KiB.

[16 Apr 2008 10:07] Jonas Oreland
Suggestion:
use times() for mgmapi
and CLOCK_MONOTONIC for ndbd (real-time scheduling).

Also suggest disabling the real-time extension if CLOCK_MONOTONIC is not present.

To be fixed in 6.3.14 (ndbd)
and in a future release (mgmapi).
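
For illustration only, here is a minimal sketch of the CLOCK_MONOTONIC part of this suggestion. It is not the NDB code: `current_millis()` and `timed_out()` are hypothetical helpers, and the fallback to gettimeofday() is an assumption about how a build without CLOCK_MONOTONIC might behave. The point is that elapsed time measured this way does not jump when someone runs `date -s`.

```cpp
#include <cstdint>
#include <ctime>
#include <sys/time.h>

// Hypothetical helper (not from the NDB sources): current time in
// milliseconds, taken from CLOCK_MONOTONIC when the platform offers it.
static uint64_t current_millis()
{
#ifdef CLOCK_MONOTONIC
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
    return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
#endif
  // Fallback: wall-clock time, which jumps when the system clock is changed.
  struct timeval tv;
  gettimeofday(&tv, nullptr);
  return (uint64_t)tv.tv_sec * 1000 + (uint64_t)tv.tv_usec / 1000;
}

// Hypothetical timeout check: true once timeout_ms has elapsed since start_ms.
static bool timed_out(uint64_t start_ms, uint64_t timeout_ms)
{
  return current_millis() - start_ms >= timeout_ms;
}
```

A caller would record `current_millis()` before sending a request and poll `timed_out()` while waiting for the reply, instead of comparing wall-clock timestamps.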
[9 Jun 2008 11:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47595

2617 jonas@mysql.com	2008-06-09
      ndb - bug#35607
        use CLOCK_MONOTONIC if present/possible
        both for NdbTick_Current* and NdbCondition_WaitTimeout
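
As a rough sketch of what a monotonic timed wait can look like with plain POSIX threads (this is not the committed patch, and `MonoCond`, `mono_cond_init()` and `mono_cond_wait_timeout()` are hypothetical names), the condition variable is bound to CLOCK_MONOTONIC at creation and the absolute deadline is computed from the same clock. It assumes a platform that provides pthread_condattr_setclock(), such as Linux.

```cpp
#include <cerrno>
#include <ctime>
#include <pthread.h>

// Hypothetical wrapper, not the actual NdbCondition implementation.
struct MonoCond {
  pthread_mutex_t mutex;
  pthread_cond_t cond;
};

// Initialise the condition so that its timeouts use CLOCK_MONOTONIC.
static int mono_cond_init(MonoCond *c)
{
  pthread_condattr_t attr;
  int rc = pthread_condattr_init(&attr);
  if (rc != 0)
    return rc;
  rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
  if (rc == 0)
    rc = pthread_cond_init(&c->cond, &attr);
  pthread_condattr_destroy(&attr);
  if (rc == 0)
    rc = pthread_mutex_init(&c->mutex, nullptr);
  return rc;
}

// Wait up to msecs milliseconds. The caller must hold c->mutex.
// Returns 0 if woken, ETIMEDOUT if the deadline passed.
static int mono_cond_wait_timeout(MonoCond *c, long msecs)
{
  struct timespec abstime;
  clock_gettime(CLOCK_MONOTONIC, &abstime);  // same clock as the condattr
  abstime.tv_sec += msecs / 1000;
  abstime.tv_nsec += (msecs % 1000) * 1000000L;
  if (abstime.tv_nsec >= 1000000000L) {      // carry nanoseconds into seconds
    abstime.tv_sec += 1;
    abstime.tv_nsec -= 1000000000L;
  }
  return pthread_cond_timedwait(&c->cond, &c->mutex, &abstime);
}
```

Because the deadline and the wait use the same monotonic clock, moving the system time forward or backward should neither cut the wait short nor extend it.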
[12 Jun 2008 15:01] Geert Vanderkelen
Another case, which the above patch also fixes:

Changing the system clock causes the master node to crash. This happens with MySQL Cluster 6.3.10 and 6.3.15.

To reproduce: start a cluster with 2 data nodes on 2 separate machines (running both on the same machine apparently does not reproduce it). Then do the following on the machine which runs the master node:

  shell> date && sudo date `date -d 'now + 12 sec' "+%m%d%H%M%Y.%S"`
  shell> sudo /usr/sbin/ntpdate -b clock.redhat.com

(This only works with the -b option for ntpdate)

Node 4: Forced node shutdown completed. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

Time: Thursday 12 June 2008 - 16:58:33
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Node 4 killed this node because GCP stop was detected
Error object: NDBCNTR (Line: 249) 0x0000000e
Program: ./libexec/ndbd
Pid: 18429
Trace: /opt/mysql/data/cluster63/ndb_4_trace.log.5
Version: mysql-5.1.24 ndb-6.3.15-RC

Applying the patch seems to fix the problem.
[15 Jun 2008 15:42] Hartmut Holzgraefe
Is bug #35280 a duplicate of this one?
[27 Jun 2008 13:02] Tomas Ulin
pushed to 6.2.16 and 6.3.16
[2 Jul 2008 11:23] Jon Stephens
Documented in the NDB 6.2.16 and 6.3.16 changelogs as follows:

        Changing the system time on data nodes could cause MGM API applications 
        to hang and the data nodes to crash.
[12 Dec 2008 23:26] Bugs System
Pushed into 6.0.6-alpha  (revid:jonas@mysql.com-20080609115717-r1gwdm55n12kv7qk) (version source revid:jonas@mysql.com-20080812185642-1nevjb94zj621dqx) (pib:5)