Description:
Since replication dies with the Sabre test, I decided to let the master cluster run the Sabre tests without replication. Shortly after starting script go9, the ndbd nodes start dying off without leaving core files.
2005-07-04 19:43:10 [MgmSrvr] INFO -- Node 2: Local checkpoint 93 started. Keep GCI = 126953 oldest restorable GCI = 126955
2005-07-04 19:43:21 [MgmSrvr] INFO -- Node 6: Event buffer status: used=15804KB(92%) alloc=17MB(0%) max=0B apply_gci=126960 latest_gci=126960
2005-07-04 19:43:23 [MgmSrvr] INFO -- Node 2: Local checkpoint 94 started. Keep GCI = 126956 oldest restorable GCI = 126959
2005-07-04 19:43:36 [MgmSrvr] INFO -- Node 2: Local checkpoint 95 started. Keep GCI = 126960 oldest restorable GCI = 126964
2005-07-04 19:43:43 [MgmSrvr] INFO -- Node 6: Event buffer status: used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=126968 latest_gci=126968
2005-07-04 19:43:53 [MgmSrvr] INFO -- Node 2: Local checkpoint 96 started. Keep GCI = 126966 oldest restorable GCI = 126969
2005-07-04 19:44:09 [MgmSrvr] WARNING -- Node 5: Transporter to node 6 reported error 0x16
2005-07-04 19:44:14 [MgmSrvr] INFO -- Node 2: Local checkpoint 97 started. Keep GCI = 126971 oldest restorable GCI = 126976
2005-07-04 19:44:21 [MgmSrvr] INFO -- Node 6: Event buffer status: used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=126982 latest_gci=126982
2005-07-04 19:44:38 [MgmSrvr] INFO -- Node 2: Local checkpoint 98 started. Keep GCI = 126980 oldest restorable GCI = 126984
2005-07-04 19:44:43 [MgmSrvr] INFO -- Node 6: Event buffer status: used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=126990 latest_gci=126990
2005-07-04 19:44:46 [MgmSrvr] INFO -- Node 6: Event buffer status: used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=126991 latest_gci=126991
2005-07-04 19:44:56 [MgmSrvr] WARNING -- Node 2: Transporter to node 3 reported error 0x16
2005-07-04 19:44:56 [MgmSrvr] WARNING -- Node 2: Transporter to node 3 reported error 0x16
2005-07-04 19:44:57 [MgmSrvr] INFO -- Node 6: Event buffer status: used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=126995 latest_gci=126995
2005-07-04 19:45:10 [MgmSrvr] INFO -- Node 2: Local checkpoint 99 started. Keep GCI = 126989 oldest restorable GCI = 126995
2005-07-04 19:45:15 [MgmSrvr] INFO -- Node 6: Event buffer status: used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=127000 latest_gci=127000
2005-07-04 19:45:15 [MgmSrvr] INFO -- Node 6: Event buffer status: used=18MB(100%) alloc=18MB(0%) max=0B apply_gci=127000 latest_gci=127000
2005-07-04 19:45:15 [MgmSrvr] INFO -- Node 6: Event buffer status: used=20MB(100%) alloc=20MB(0%) max=0B apply_gci=127000 latest_gci=127000
2005-07-04 19:45:15 [MgmSrvr] INFO -- Node 6: Event buffer status: used=20MB(100%) alloc=20MB(0%) max=0B apply_gci=127001 latest_gci=127001
2005-07-04 19:45:17 [MgmSrvr] WARNING -- Node 5: Transporter to node 6 reported error 0x16
2005-07-04 19:45:17 [MgmSrvr] WARNING -- Node 3: Transporter to node 6 reported error 0x16
2005-07-04 19:45:18 [MgmSrvr] INFO -- Node 6: Event buffer status: used=20MB(100%) alloc=20MB(0%) max=0B apply_gci=127002 latest_gci=127002
2005-07-04 19:45:28 [MgmSrvr] WARNING -- Node 3: Transporter to node 6 reported error 0x16
2005-07-04 19:45:33 [MgmSrvr] INFO -- Node 6: Event buffer status: used=21MB(100%) alloc=21MB(0%) max=0B apply_gci=127006 latest_gci=127006
2005-07-04 19:45:33 [MgmSrvr] INFO -- Node 6: Event buffer status: used=22MB(100%) alloc=22MB(0%) max=0B apply_gci=127006 latest_gci=127006
2005-07-04 19:45:34 [MgmSrvr] INFO -- Node 6: Event buffer status: used=23MB(100%) alloc=23MB(0%) max=0B apply_gci=127007 latest_gci=127007
2005-07-04 19:45:48 [MgmSrvr] INFO -- Node 2: Local checkpoint 100 started. Keep GCI = 126999 oldest restorable GCI = 127007
2005-07-04 19:45:55 [MgmSrvr] INFO -- Node 6: Event buffer status: used=23MB(100%) alloc=23MB(0%) max=0B apply_gci=127014 latest_gci=127014
2005-07-04 19:46:16 [MgmSrvr] INFO -- Node 6: Event buffer status: used=22MB(94%) alloc=23MB(0%) max=0B apply_gci=127021 latest_gci=127021
2005-07-04 19:46:28 [MgmSrvr] WARNING -- Node 4: Transporter to node 5 reported error 0x16
2005-07-04 19:46:29 [MgmSrvr] WARNING -- Node 4: Transporter to node 5 reported error 0x16 - Repeated 5 times
2005-07-04 19:46:36 [MgmSrvr] INFO -- Node 2: Local checkpoint 101 started. Keep GCI = 127012 oldest restorable GCI = 127021
2005-07-04 19:46:39 [MgmSrvr] WARNING -- Node 3: Transporter to node 6 reported error 0x16
2005-07-04 19:47:09 [MgmSrvr] INFO -- Node 6: Event buffer status: used=23MB(100%) alloc=23MB(0%) max=0B apply_gci=127039 latest_gci=127039
2005-07-04 19:47:12 [MgmSrvr] INFO -- Node 6: Event buffer status: used=23MB(100%) alloc=23MB(0%) max=0B apply_gci=127039 latest_gci=127039
2005-07-04 19:47:12 [MgmSrvr] INFO -- Node 6: Event buffer status: used=24MB(100%) alloc=24MB(0%) max=0B apply_gci=127039 latest_gci=127039
2005-07-04 19:47:12 [MgmSrvr] INFO -- Node 6: Event buffer status: used=24MB(100%) alloc=24MB(0%) max=0B apply_gci=127040 latest_gci=127040
2005-07-04 19:47:27 [MgmSrvr] WARNING -- Node 3: Transporter to node 6 reported error 0x16
2005-07-04 19:47:27 [MgmSrvr] WARNING -- Node 5: Transporter to node 6 reported error 0x16
2005-07-04 19:47:34 [MgmSrvr] WARNING -- Node 5: Transporter to node 6 reported error 0x16
2005-07-04 19:47:35 [MgmSrvr] INFO -- Node 2: Local checkpoint 102 started. Keep GCI = 127028 oldest restorable GCI = 127039
2005-07-04 19:48:59 [MgmSrvr] INFO -- Node 6: Event buffer status: used=25MB(100%) alloc=25MB(0%) max=0B apply_gci=127076 latest_gci=127076
2005-07-04 19:49:00 [MgmSrvr] INFO -- Node 6: Event buffer status: used=26MB(100%) alloc=26MB(0%) max=0B apply_gci=127076 latest_gci=127076
2005-07-04 19:49:00 [MgmSrvr] INFO -- Node 6: Event buffer status: used=27MB(100%) alloc=27MB(0%) max=0B apply_gci=127076 latest_gci=127076
2005-07-04 19:49:01 [MgmSrvr] INFO -- Node 6: Event buffer status: used=28MB(100%) alloc=28MB(0%) max=0B apply_gci=127076 latest_gci=127076
2005-07-04 19:49:01 [MgmSrvr] INFO -- Node 6: Event buffer status: used=30MB(100%) alloc=30MB(0%) max=0B apply_gci=127076 latest_gci=127076
2005-07-04 19:49:01 [MgmSrvr] INFO -- Node 6: Event buffer status: used=31MB(100%) alloc=31MB(0%) max=0B apply_gci=127076 latest_gci=127076
2005-07-04 19:49:01 [MgmSrvr] INFO -- Node 6: Event buffer status: used=31MB(100%) alloc=31MB(0%) max=0B apply_gci=127077 latest_gci=127077
2005-07-04 19:49:04 [MgmSrvr] INFO -- Node 6: Event buffer status: used=31MB(100%) alloc=31MB(0%) max=0B apply_gci=127078 latest_gci=127078
2005-07-04 19:49:18 [MgmSrvr] ALERT -- Node 2: Node 7 Disconnected
2005-07-04 19:49:18 [MgmSrvr] INFO -- Node 2: Communication to Node 7 closed
2005-07-04 19:49:18 [MgmSrvr] ALERT -- Node 4: Node 7 Disconnected
2005-07-04 19:49:18 [MgmSrvr] INFO -- Node 4: Communication to Node 7 closed
2005-07-04 19:49:18 [MgmSrvr] ALERT -- Node 3: Node 7 Disconnected
2005-07-04 19:49:18 [MgmSrvr] INFO -- Node 3: Communication to Node 7 closed
2005-07-04 19:49:18 [MgmSrvr] ALERT -- Node 5: Node 7 Disconnected
(excerpt from ndb_1_cluster.log)
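For anyone triaging from the saved logs, here is a minimal sketch (not part of the test suite) that pulls the "Event buffer status" samples and the "Transporter ... reported error" warnings out of a cluster log, so that buffer growth and the affected node pairs are easier to see. The function name summarize and the SAMPLE list are made up for illustration, and the regular expressions assume only the line format shown in the excerpt above.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this report, e.g.:
#   2005-07-04 19:43:21 [MgmSrvr] INFO -- Node 6: Event buffer status: used=15804KB(92%) alloc=17MB(0%) ...
BUFFER_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[MgmSrvr\] INFO -- "
    r"Node (?P<node>\d+): Event buffer status: "
    r"used=(?P<used>\d+)(?P<unit>KB|MB)\((?P<pct>\d+)%\)"
)
#   2005-07-04 19:44:09 [MgmSrvr] WARNING -- Node 5: Transporter to node 6 reported error 0x16
TRANSPORTER_RE = re.compile(
    r"Node (?P<src>\d+): Transporter to node (?P<dst>\d+) "
    r"reported error (?P<code>0x[0-9a-fA-F]+)"
)


def summarize(lines):
    """Return (buffer_samples, transporter_error_counts) for the given log lines.

    buffer_samples is a list of (timestamp, node, used_kb, used_percent).
    transporter_error_counts maps (source_node, dest_node) to the number of
    log lines seen; a "Repeated N times" suffix is counted as a single line.
    """
    samples = []
    errors = Counter()
    for line in lines:
        m = BUFFER_RE.search(line)
        if m:
            used = int(m["used"])
            # Normalise to KB so KB and MB samples are comparable.
            used_kb = used * 1024 if m["unit"] == "MB" else used
            samples.append((m["ts"], int(m["node"]), used_kb, int(m["pct"])))
            continue
        m = TRANSPORTER_RE.search(line)
        if m:
            errors[(int(m["src"]), int(m["dst"]))] += 1
    return samples, errors


# Three lines copied from the excerpt above, used as a smoke test.
SAMPLE = [
    "2005-07-04 19:43:21 [MgmSrvr] INFO -- Node 6: Event buffer status: "
    "used=15804KB(92%) alloc=17MB(0%) max=0B apply_gci=126960 latest_gci=126960",
    "2005-07-04 19:43:43 [MgmSrvr] INFO -- Node 6: Event buffer status: "
    "used=17MB(100%) alloc=17MB(0%) max=0B apply_gci=126968 latest_gci=126968",
    "2005-07-04 19:44:09 [MgmSrvr] WARNING -- Node 5: Transporter to node 6 "
    "reported error 0x16",
]

buffer_samples, transporter_errors = summarize(SAMPLE)
for ts, node, used_kb, pct in buffer_samples:
    print(f"{ts} node {node}: {used_kb} KB used ({pct}%)")
for (src, dst), count in sorted(transporter_errors.items()):
    print(f"node {src} -> node {dst}: {count} transporter warning line(s)")
```

To run it against the full log, replace SAMPLE with the lines read from ndb_1_cluster.log.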
ndb_2_error.log
Current byte-offset of file-pointer is: 468
Date/Time: Monday 4 July 2005 - 20:00:14
Type of error: error
Message: Internal program error
Fault ID: 2307
Problem data: Signal lost, send buffer full
Object of reference: TransporterCallback.cpp
ProgramName: /home/ndbdev/jmiller/builds/libexec/ndbd
ProcessID: 18132
TraceFile: /space/run/ndb_2_trace.log.1
Version 5.1.0 (a_drop5p3)
***EOM***
ndb_3_error.log
Current byte-offset of file-pointer is: 468
Date/Time: Monday 4 July 2005 - 20:19:10
Type of error: error
Message: Internal program error
Fault ID: 2307
Problem data: Signal lost, send buffer full
Object of reference: TransporterCallback.cpp
ProgramName: /home/ndbdev/jmiller/builds/libexec/ndbd
ProcessID: 5524
TraceFile: /space/run/ndb_3_trace.log.1
Version 5.1.0 (a_drop5p3)
***EOM***
All files have been saved to ndb08:/space/bug####, where #### is the number of this report.
How to repeat:
Run the Sabre test (script go9) on the master cluster with replication turned off.
Suggested fix:
The cluster should handle the Sabre test without the ndbd nodes failing with "Signal lost, send buffer full".