| Bug #20892 | dbtup/DbtupPageMap.cpp Error: 2306 Pointer too large | ||
|---|---|---|---|
| Submitted: | 6 Jul 2006 22:12 | Modified: | 6 Sep 2006 11:00 | 
| Reporter: | Jonathan Miller | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) | 
| Version: | 4.1,5.0,5.1 | OS: | Linux (Linux 32 Bit OS) | 
| Assigned to: | Jonas Oreland | CPU Architecture: | Any | 
   [6 Jul 2006 22:48]
   Jonathan Miller        
  I paste from wrong trace log. here is correction DBTUP 024156 024232 021082 016212 016212 016214 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 255699984 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 255699983 length: 3 trace: 0 #sec: 0 fragInf: 0 H'0000000d H'00000005 H'00000010 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 255699983 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 255699982 length: 3 trace: 0 #sec: 0 fragInf: 0 H'0000000d H'00000005 H'00000010 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 255699982 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 255699981 length: 3 trace: 0 #sec: 0 fragInf: 0 H'0000000d H'00000005 H'00000010 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 255699981 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 255699980 length: 3 trace: 0 #sec: 0 fragInf: 0 H'0000000d H'00000005 H'00000010 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 255699980 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 255699979 length: 3 trace: 0 #sec: 0 fragInf: 0 H'0000000d H'00000005 H'00000010 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 255699979 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 255699978 length: 3 trace: 0 #sec: 0 fragInf: 0 H'0000000d H'00000005 H'00000010
   [7 Jul 2006 13:12]
   Jonathan Miller        
  Hi, Hit this again on a different table and a different line of code. Time: Friday 7 July 2006 - 14:48:33 Status: Temporary error, restart node Message: Pointer too large (Internal error, programming error or missing error message, please report a bug) Error: 2306 Error data: DbtupPageMap.cpp Error object: DBTUP (Line: 297) 0x0000000e Program: /home/ndbdev/jmiller/builds/libexec/ndbd Pid: 6170 Trace: /space/run/ndb_3_trace.log.3 Version: Version 5.0.23 There seems to be no recovery from this unless I take down the cluster and restart with --initial DBTC 000172 000278 012381 012416 012418 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 538495503 gsn: 164 "CONTINUEB" prio: 1 s.bn: 249 "DBTUP", s.proc: 3, s.sigId: 538495501 length: 2 trace: 0 #sec: 0 fragInf: 0 H'00000007 H'00000003 --------------- Signal ---------------- r.bn: 245 "DBTC", r.proc: 3, r.sigId: 538495502 gsn: 164 "CONTINUEB" prio: 1 s.bn: 245 "DBTC", s.proc: 3, s.sigId: 538495500 length: 3 trace: 1 #sec: 0 fragInf: 0 H'00000011 H'00000021 H'00f70003 --------------- Signal ---------------- r.bn: 249 "DBTUP", r.proc: 3, r.sigId: 538495501 gsn: 202 "DROP_TAB_REQ" prio: 1 s.bn: 250 "DBDICT", s.proc: 3, s.sigId: 538495499 length: 4 trace: 0 #sec: 0 fragInf: 0 senderRef: fa0003 senderData: 36 TableId: 3 requestType: 0 --------------- Signal ---------------- r.bn: 245 "DBTC", r.proc: 3, r.sigId: 538495500 gsn: 164 "CONTINUEB" prio: 1 s.bn: 245 "DBTC", s.proc: 3, s.sigId: 538495498 length: 3 trace: 1 #sec: 0 fragInf: 0 H'00000011 H'00000021 H'00f70003 --------------- Signal ---------------- r.bn: 250 "DBDICT", r.proc: 3, r.sigId: 538495499 gsn: 204 "DROP_TAB_CONF" prio: 1 s.bn: 248 "DBACC", s.proc: 3, s.sigId: 538495495 length: 3 trace: 0 #sec: 0 fragInf: 0 senderRef: f80003 senderData: 36 TableId: 3 --------------- Signal ----------------
   [7 Jul 2006 13:23]
   Jonathan Miller        
  When Trying to restart data node 3, data node 2 failes
       ?016217 014255 014262 014267 014255 000299 016215 016215
r.bn: 249 "DBTUP", r.proc: 2, r.sigId: 550288783 gsn: 164 "CONTINUEB" prio: 1
s.bn: 249 "DBTUP", s.proc: 2, s.sigId: 550288782 length: 2 trace: 0 #sec: 0 fragInf: 0
 H'00000007 H'00000002
--------------- Signal ----------------
r.bn: 249 "DBTUP", r.proc: 2, r.sigId: 550288782 gsn: 164 "CONTINUEB" prio: 1
s.bn: 249 "DBTUP", s.proc: 2, s.sigId: 550288781 length: 2 trace: 0 #sec: 0 fragInf: 0
 H'00000007 H'00000002
--------------- Signal ----------------
r.bn: 249 "DBTUP", r.proc: 2, r.sigId: 550288781 gsn: 202 "DROP_TAB_REQ" prio: 1
s.bn: 250 "DBDICT", s.proc: 2, s.sigId: 550288780 length: 4 trace: 0 #sec: 0 fragInf: 0
 senderRef: fa0002 senderData: 119 TableId: 2 requestType: 0
--------------- Signal ----------------
r.bn: 250 "DBDICT", r.proc: 2, r.sigId: 550288780 gsn: 204 "DROP_TAB_CONF" prio: 1
s.bn: 248 "DBACC", s.proc: 2, s.sigId: 550288776 length: 3 trace: 0 #sec: 0 fragInf: 0
 senderRef: f80002 senderData: 119 TableId: 2
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 2, r.sigId: 550288779 gsn: 409 "TIME_SIGNAL" prio: 1
s.bn: 252 "QMGR", s.proc: 2, s.sigId: 550288775 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 2, r.sigId: 550288778 gsn: 409 "TIME_SIGNAL" prio: 1
s.bn: 252 "QMGR", s.proc: 2, s.sigId: 550288775 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
Time: Friday 7 July 2006 - 14:04:44
Status: Temporary error, restart node
Message: Pointer too large (Internal error, programming error or missing error message, please report a bug)
Error: 2306
Error data: DbtupPageMap.cpp
Error object: DBTUP (Line: 297) 0x0000000e
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 9131
Trace: /space/run/ndb_2_trace.log.2
Version: Version 5.0.23
***EOM***
 
   [7 Jul 2006 14:29]
   Jonas Oreland        
  Hi, I tried all day to reproduce this...unfortunatly wo/ success. I did however find 2 bugs while searching, and fixed one... I found http://bugs.mysql.com/bug.php?id=20904 and http://bugs.mysql.com/bug.php?id=20905 Your most welcome to retest 20905 if you like..cause I think it's so weird that I'm uncertain to wheather i'm loosing it...
   [7 Jul 2006 14:39]
   Jonathan Miller        
  Here is the script that I used to produce this
Attachment: load.sh (application/x-shellscript, text), 1.98 KiB.
   [7 Jul 2006 15:59]
   Jonathan Miller        
  Time: Friday 7 July 2006 - 17:26:33 Status: Temporary error, restart node Message: Pointer too large (Internal error, programming error or missing error message, please report a bug) Error: 2306 Error data: DbtupPageMap.cpp Error object: DBTUP (Line: 297) 0x0000000a Program: /home/ndbdev/jmiller/builds/libexec/ndbd Pid: 17815 Trace: /space/run/ndb_2_trace.log.3 Version: Version 5.0.23 ***EOM*** I keep hitting it. I was able to recover last time. But the cluster will not allow and DDL operations with only one data node up. Please try the shell script I attached. Thanks
   [3 Aug 2006 7:37]
   Jonas Oreland        
  Hi, I finally manage to get this...but in 5.0 (still havent succeeded(?) in 5.1) while retesting bug#20905 /Jonas
   [3 Aug 2006 12:02]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10004 ChangeSet@1.2537, 2006-08-03 14:01:37+02:00, jonas@perch.ndb.mysql.com +1 -0 ndb - bug#20892 Fix bug in tup buddy allocator, which made it make invalid access to cfreepagelist[16] (which is not defined)
   [3 Aug 2006 12:50]
   Jonas Oreland        
  This patch is also needed in 5.0/5.1 http://lists.mysql.com/commits/10005
   [28 Aug 2006 9:27]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/10918 ChangeSet@1.2037, 2006-08-28 11:27:11+02:00, mskold@mysql.com +2 -0 ndb - bug#20892 Fix bug in tup buddy allocator, which made it make invalid access to cfreepagelist[16] (which is not defined)
   [1 Sep 2006 8:05]
   Jonas Oreland        
  pushed to 5.1.12
   [1 Sep 2006 19:33]
   Jonas Oreland        
  pushed to 5.0.25
   [6 Sep 2006 7:13]
   Jonas Oreland        
  pushed into 4.1.22
   [6 Sep 2006 11:00]
   Jon Stephens        
  Documented in 4.1.22, 5.0.25, and 5.1.12 changelogs.
   [4 Apr 2008 21:06]
   Sveta Smirnova        
  Bug #35795 was marked as duplicated of this one.


Description: In running hugoLoad on T8 using 1,000,000 (2GB table) I recieved: ERROR: 1221 REDO buffers overloaded, consult online manual (increase RedoBuffer) Status: Temporary error, Classification: Overload error File: HugoTransactions.cpp (Line: 609) Record 578773 could not be inserted, has retried 5 times ERROR: 1221 REDO buffers overloaded, consult online manual (increase RedoBuffer) Status: Temporary error, Classification: Overload error File: HugoTransactions.cpp (Line: 609) ERROR: 1221 REDO buffers overloaded, consult online manual (increase RedoBuffer) Status: Temporary error, Classification: Overload error File: HugoTransactions.cpp (Line: 609) ERROR: 827 Out of memory in Ndb Kernel, table data (increase DataMemory) Status: Permanent error, Classification: Insufficient space File: HugoTransactions.cpp (Line: 633) Which cause one of the data nodes to crash. Time: Thursday 6 July 2006 - 23:32:14 Status: Temporary error, restart node Message: Pointer too large (Internal error, programming error or missing error message, please report a bug) Error: 2306 Error data: dbtup/DbtupPageMap.cpp Error object: DBTUP (Line: 317) 0x0000000e Program: /home/ndbdev/jmiller/builds/libexec/ndbd Pid: 2816 Trace: /space/run/ndb_3_trace.log.2 Version: Version 5.1.12 (beta) ***EOM*** DBLQH 002479 002489 DBACC 008525 003208 003227 --------------- Signal ---------------- r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 1224823220 gsn: 465 "DUMP_STATE_ORD" prio: 1 s.bn: 247 "DBLQH", s.proc: 3, s.sigId: 1224823207 length: 2 trace: 1 #sec: 0 fragInf: 0 H'000008fe H'00000001 --------------- Signal ---------------- r.bn: 254 "CMVMI", r.proc: 3, r.sigId: 1224823219 gsn: 247 "EVENT_REP" prio: 1 s.bn: 247 "DBLQH", s.proc: 3, s.sigId: 0 length: 11 trace: 1 #sec: 0 fragInf: 0 H'00000031 H'3a48514c H'6d754420 H'63612070 H'65766974 H'61635320 H'6365526e H'2064726f H'6973202d H'203a657a H'00313032 --------------- Signal ---------------- r.bn: 261 "PGMAN", r.proc: 3, r.sigId: 1224823218 gsn: 465 "DUMP_STATE_ORD" prio: 1 s.bn: 254 "CMVMI", s.proc: 3, s.sigId: 1224823201 length: 1 trace: 1 #sec: 0 fragInf: 0 H'000008fe --------------- Signal ---------------- r.bn: 259 "TSMAN", r.proc: 3, r.sigId: 1224823217 gsn: 465 "DUMP_STATE_ORD" prio: 1 s.bn: 254 "CMVMI", s.proc: 3, s.sigId: 1224823201 length: 1 trace: 1 #sec: 0 fragInf: 0 H'000008fe How to repeat: create_tab T8 hugoLoad -r 1000000 T8