Bug #35607 Setting clock further 'confuses' ndb_mgm_get_status()
Submitted: 27 Mar 2008 14:59 Modified: 2 Jul 2008 13:23
Reporter: Geert Vanderkelen
Status: Closed
Category: Server: NDBAPI Severity: S3 (Non-critical)
Version: 5.2.2_drop6p15 OS: Any
Assigned to: Target Version:
Triage: D3 (Medium)

[27 Mar 2008 14:59] Geert Vanderkelen
Description:
Attached to this bug is a small MGM API application (minimgm.cpp) which continuously
checks the state of all nodes. When the clock of the system running this application is
set back in time, it seems to work OK.
But setting the clock a few minutes forward seems to confuse it: ndb_mgm_get_status()
fails for a number of calls, then recovers after a few tries.

How to repeat:
Simple cluster setup; 1 data node should do.
Run the attached mini MGM application (which uses the MGM API), then set the clock a few
minutes forward:
 shell> date
 Thu Mar 27 14:57:45 CET 2008
 shell> date -s 15:00
 Thu Mar 27 15:00:00 CET 2008

You should see output like this:

2008-03-27 14:57:10 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 14:57:10 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 14:57:10 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
..
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] Could not get status
2008-03-27 15:02:00 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
2008-03-27 15:02:00 [-] All status:
1=OK,2=KO,3=OK,4=OK,5=OK,6=OK,7=KO,8=KO,9=KO,10=KO,11=KO,
[27 Mar 2008 15:00] Geert Vanderkelen
Simple code..

Attachment: minimgm.cpp (, text), 1.27 KiB.

[16 Apr 2008 12:07] Jonas Oreland
Suggestion:
use times() for mgmapi
and CLOCK_MONOTONIC for ndbd (real-time scheduling)

Also suggest disabling the real-time extension if CLOCK_MONOTONIC is not present.

To be fixed in 6.3.14 (ndbd)
and in a future release (mgmapi)
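
For illustration, here is a minimal sketch of measuring elapsed time with times() instead of the wall clock. The helper names current_ticks() and ticks_to_millis() are hypothetical and are not taken from the MGM API or the eventual patch. It assumes a POSIX system where the value returned by times() counts clock ticks from an arbitrary fixed point and does not follow changes made with date or ntpdate.

```cpp
#include <sys/times.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper: current tick count as reported by times().
// The return value counts clock ticks since an arbitrary point in the
// past, so only differences between two readings are meaningful.
static clock_t current_ticks() {
    struct tms unused;  // CPU-time fields are not needed here
    return times(&unused);
}

// Hypothetical helper: convert a tick interval to milliseconds.
// Returns -1 if the tick rate cannot be determined.
static std::int64_t ticks_to_millis(clock_t start, clock_t end) {
    const long hz = sysconf(_SC_CLK_TCK);  // ticks per second
    if (hz <= 0) {
        return -1;
    }
    return static_cast<std::int64_t>(end - start) * 1000 / hz;
}
```

A timeout check built on these helpers compares ticks_to_millis(start, current_ticks()) against the allowed interval, rather than comparing two wall-clock timestamps. The resolution is limited to one clock tick, which is often 10 ms.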
[9 Jun 2008 13:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47595

2617 jonas@mysql.com	2008-06-09
      ndb - bug#35607
        use CLOCK_MONOTONIC if present/possible
        both for NdbTick_Current* and NdbCondition_WaitTimeout
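
As a rough sketch of the idea in this commit message (not the actual patch), the code below reads a millisecond tick from CLOCK_MONOTONIC and waits on a condition variable whose timeout is measured against the same clock. The names monotonic_millis(), MonotonicCond, monotonic_cond_init() and monotonic_cond_wait_ms() are hypothetical stand-ins, not the real NdbTick_Current* or NdbCondition_WaitTimeout code. It assumes a platform that provides clock_gettime() with CLOCK_MONOTONIC and pthread_condattr_setclock(), such as Linux.

```cpp
#include <pthread.h>
#include <time.h>
#include <cerrno>
#include <cstdint>

// Hypothetical tick source: milliseconds on the monotonic clock,
// which is not moved by date -s or ntpdate -b.
static std::uint64_t monotonic_millis() {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000 + ts.tv_nsec / 1000000;
}

// Hypothetical wrapper pairing a mutex with a condition variable
// that measures its timeouts on CLOCK_MONOTONIC.
struct MonotonicCond {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
};

// Returns 0 on success, otherwise a pthread error code.
static int monotonic_cond_init(MonotonicCond* c) {
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);
    if (rc != 0) return rc;
    // Without this call the condition variable uses CLOCK_REALTIME.
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0) rc = pthread_cond_init(&c->cond, &attr);
    pthread_condattr_destroy(&attr);
    if (rc != 0) return rc;
    return pthread_mutex_init(&c->mutex, nullptr);
}

// Waits up to timeout_ms; the caller must hold c->mutex.
// Returns 0 when woken (possibly spuriously) or ETIMEDOUT on timeout.
static int monotonic_cond_wait_ms(MonotonicCond* c, long timeout_ms) {
    struct timespec deadline;
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {  // carry into seconds
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&c->cond, &c->mutex, &deadline);
}
```

The deadline has to be computed from the same clock that the condition variable was configured with. Mixing a CLOCK_REALTIME deadline with a CLOCK_MONOTONIC condition variable, or the reverse, would produce wrong wait times.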
[12 Jun 2008 17:01] Geert Vanderkelen
Another case, which the above patch fixes:

When the system clock is changed, the master node crashes. This happens with MySQL
Cluster 6.3.10 and 6.3.15.

To reproduce: start a cluster with 2 data nodes on 2 separate machines (running both on
the same machine apparently does not trigger it). Then do the following on the machine
which runs the master node:

  shell> date && sudo date `date -d 'now + 12 sec' "+%m%d%H%M%Y.%S"`
  shell> sudo /usr/sbin/ntpdate -b clock.redhat.com

(This only works with the -b option for ntpdate)

Node 4: Forced node shutdown completed. Caused by error 2303: 'System error, node killed
during node restart by other node(Internal error, programming error or missing error
message, please report a bug). Temporary error, restart node'.

Time: Thursday 12 June 2008 - 16:58:33
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal error,
programming error or missing error message, please report a bug)
Error: 2303
Error data: Node 4 killed this node because GCP stop was detected
Error object: NDBCNTR (Line: 249) 0x0000000e
Program: ./libexec/ndbd
Pid: 18429
Trace: /opt/mysql/data/cluster63/ndb_4_trace.log.5
Version: mysql-5.1.24 ndb-6.3.15-RC

Applying the patch seems to fix the problem.
[15 Jun 2008 17:42] Hartmut Holzgraefe
Is bug #35280 a duplicate of this one?
[27 Jun 2008 15:02] Tomas Ulin
pushed to 6.2.16 and 6.3.16
[2 Jul 2008 13:23] Jon Stephens
Documented in the NDB 6.2.16 and 6.3.16 changelogs as follows:

        Changing the system time on data nodes could cause MGM API applications 
        to hang and the data nodes to crash.
[13 Dec 2008 0:26] Bugs System
Pushed into 6.0.6-alpha  (revid:jonas@mysql.com-20080609115717-r1gwdm55n12kv7qk) (version
source revid:jonas@mysql.com-20080812185642-1nevjb94zj621dqx) (pib:5)