Bug #26487 Bug in extent/page allocation when *using* >1gb datafiles
Submitted: 20 Feb 2007 4:51 Modified: 9 Mar 2007 1:53
Reporter: Tomas Ulin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Disk Data Severity:S3 (Non-critical)
Version:5.1 OS:
Assigned to: Jonas Oreland CPU Architecture:Any

[20 Feb 2007 4:51] Tomas Ulin
Description:
freshly started 1-node cluster with mysqld
create tablespace and table
hugoLoad many rows...

Bam!

How to repeat:
see internal

DBLQH   004487 000436 
DBTUP   000587 000076 000076 000076 000076 001200 003758 003909 
        003758 004367 003758 004367 003758 004422 004367 003758 
        004422 004367 003758 004422 004367 003758 004431 003758 
        004422 004367 003758 004431 003758 004422 004367 003758 
        004422 004367 003758 004422 004367 003758 004422 004367 
        003758 004422 004367 003758 004422 004367 003758 004422 
        004367 003758 004422 004367 003758 004422 004367 003758 
        004422 004367 003758 004431 003758 004422 004367 001279 
        006105 
DBACC   002271 000427 
TSMAN   001828 000436 000455 
TSMAN   001828 000545 

--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 1, r.sigId: 94807407 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 1, s.sigId: 94807279 length: 15 trace: 1 #sec: 0 fragInf: 0
 H'00019a76 H'00003235 H'00000400 H'00120001 H'000000f1 H'00130000 H'00140020
 H'496cde5e H'85419542 H'7dddd91f H'e291bf6f H'26603c48 H'bef8cde4 H'5f1a8219
 H'dbae4de1
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 1, r.sigId: 94807406 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 1, s.sigId: 94807279 length: 25 trace: 1 #sec: 0 fragInf: 0
 H'00019a76 H'00003235 H'00000400 H'5bf15580 H'000000bc H'00080000 H'00090001
 H'00000007 H'000a0008 H'eecc5e3d H'ff616135 H'000b0001 H'00000074 H'000c0002
 H'000044aa H'000d0001 H'000000e1 H'000e0001 H'00000017 H'000f0001 H'0000004e
 H'00100001 H'00000084 H'00110001 H'000000bb
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 1, r.sigId: 94807405 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 1, s.sigId: 94807279 length: 25 trace: 1 #sec: 0 fragInf: 0
 H'00019a76 H'00003235 H'00000400 H'00010004 H'ffffffff H'0003000c H'628f37c0
 H'88113cc5 H'3702511d H'0004000b H'5205aaf6 H'e24141fa H'0087b377 H'00050015
 H'417c1e2d H'3c71472e H'f40d15d2 H'e4c4f66f H'bd0f1a7f H'00000018 H'00060000
 H'00070011 H'2069049a H'f0d15197 H'b117da87
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 1, r.sigId: 94807404 gsn: 316 "LQHKEYREQ" prio: 1
s.bn: 245 "DBTC", s.proc: 1, s.sigId: 94807279 length: 20 trace: 1 #sec: 0 fragInf: 0
 ClientPtr = H'00019a76 hashValue = H'47bebd25 tcBlockRef = H'00f50001
 transId1 = H'00003235 transId2 = H'00000400 savePointId = H'00000000
 Op: 2 Lock: 0 Flags: ScanInfo/noFiredTriggers: H'39be0
 AttrLen: 61 (5 in this) KeyLen: 2 TableId: 9 SchemaVer: 2
 FragId: 0 ReplicaNo: 0 LastReplica: 0 NextNodeId: 0
 ApiRef: H'80000004 ApiOpRef: H'00000030
 KeyInfo: H'628f37c0 H'88113cc5 
 AttrInfo: H'00020008 H'628f37c0 H'88113cc5 H'00000004 H'000ea827 

Time: Tuesday 20 February 2007 - 11:35:32
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a 
bug)
Error: 2341
Error data: dbtup/DbtupDiskAlloc.cpp
Error object: DBTUP (Line: 545) 0x0000000e
Program: /home/tomas/mysql-5.1-telco-6.1_2/storage/ndb/src/kernel/ndbd
Pid: 6226
Trace: /home/tomas/mysql-5.1-telco-6.1_2/mysql-test/ndbcluster-1186/ndb_1_trace.log.1
Version: mysql-5.1.15 ndb-

#0  0xffffe410 in __kernel_vsyscall ()
#1  0x400d3541 in raise () from /lib/tls/libc.so.6
#2  0x400d4dbb in abort () from /lib/tls/libc.so.6
#3  0x080e48b5 in childAbort (code=-1, currentStartPhase=255) at main.cpp:104
#4  0x082e447c in NdbShutdown (type=NST_ErrorHandler, restartType=NRT_Default) at Emulator.cpp:254
#5  0x082ee49f in ErrorReporter::handleError (messageID=2341, problemData=0x836d850 "dbtup/DbtupDiskAlloc.cpp", 
    objRef=0xbf8191b4 "DBTUP (Line: 545) 0x0000000e", nst=NST_ErrorHandler) at ErrorReporter.cpp:206
#6  0x082da4ec in SimulatedBlock::progError (this=0x40883008, line=545, err_code=2341, 
    extra=0x836d850 "dbtup/DbtupDiskAlloc.cpp") at SimulatedBlock.cpp:747
#7  0x08251cc9 in Dbtup::disk_page_prealloc (this=0x40883008, signal=0x84cc5dc, fragPtr={p = 0x4942a5a8, i = 5}, 
    key=0xbf8193bc, sz=1) at DbtupDiskAlloc.cpp:545
#8  0x082285f3 in Dbtup::handleInsertReq (this=0x40883008, signal=0x84cc5dc, regOperPtr={p = 0x4945e018, i = 114}, fragPtr=
      {p = 0x4942a5a8, i = 5}, regTabPtr=0x49c4c830, req_struct=0xbf819480) at DbtupExecQuery.cpp:1381
#9  0x08228f5c in Dbtup::execTUPKEYREQ (this=0x40883008, signal=0x84cc5dc) at DbtupExecQuery.cpp:736
#10 0x08109950 in SimulatedBlock::executeFunction (this=0x40883008, gsn=436, signal=0x84cc5dc) at SimulatedBlock.hpp:577
#11 0x08109abf in SimulatedBlock::EXECUTE_DIRECT (this=0x85ce378, block=249, gsn=436, signal=0x84cc5dc, len=18)
    at SimulatedBlock.hpp:752
#12 0x081b96d3 in Dblqh::acckeyconf_tupkeyreq (this=0x85ce378, signal=0x84cc5dc, regTcPtr=0x4f0a33e0, regFragptrP=0x4e6b756c, 
    local_key=4294967295, disk_page=1) at DblqhMain.cpp:4628
#13 0x081df427 in Dblqh::acckeyconf_load_diskpage (this=0x85ce378, signal=0x84cc5dc, regTcPtr={p = 0x4f0a33e0, i = 114}, 
    regFragptrP=0x4e6b756c, local_key=4294967295) at DblqhMain.cpp:4642
#14 0x081df7cd in Dblqh::execACCKEYCONF (this=0x85ce378, signal=0x84cc5dc) at DblqhMain.cpp:4546
#15 0x081df9f3 in Dblqh::exec_acckeyreq (this=0x85ce378, signal=0x84cc5dc, regTcPtr={p = 0x4f0a33e0, i = 114})
    at DblqhMain.cpp:3933
#16 0x081e0811 in Dblqh::prepareContinueAfterBlockedLab (this=0x85ce378, signal=0x84cc5dc) at DblqhMain.cpp:3861
#17 0x081e0a33 in Dblqh::endgettupkeyLab (this=0x85ce378, signal=0x84cc5dc) at DblqhMain.cpp:3748
#18 0x081e8312 in Dblqh::receive_attrinfo (this=0x85ce378, signal=0x84cc5dc, dataPtr=0x84cc628, length=12) at DblqhMain.cpp:3049
#19 0x081e86d8 in Dblqh::execATTRINFO (this=0x85ce378, signal=0x84cc5dc) at DblqhMain.cpp:3031
#20 0x08109950 in SimulatedBlock::executeFunction (this=0x85ce378, gsn=4, signal=0x84cc5dc) at SimulatedBlock.hpp:577
#21 0x082e0df3 in FastScheduler::doJob (this=0x84c8540) at FastScheduler.cpp:136
#22 0x082e1f8e in ThreadConfig::ipControlLoop (this=0x84dea60) at ThreadConfig.cpp:153
#23 0x080e59e5 in main (argc=4, argv=0xbf819f74) at main.cpp:473

540         }      
541         
542         alloc.m_curr_extent_info_ptr_i= ext.i;
543         ext.p->m_free_matrix_pos= RNIL;
544         pageBits= tsman.alloc_page_from_extent(&ext.p->m_key, bits);
545         ddassert(pageBits >= 0);
546       }
547       
548       /**
549        * We have a page from an extent
(gdb) p pageBits
$1 = -2
(gdb) p ext.p->m_key
$2 = {m_page_no = 5, m_page_idx = 2728, m_file_no = 0}
(gdb) p bits
$3 = 1
(gdb)
[20 Feb 2007 4:51] Tomas Ulin
config.ini

Attachment: config.ini (application/octet-stream, text), 773 bytes.

[20 Feb 2007 4:54] Tomas Ulin
truncated trace log

Attachment: ndb_1_trace.log.1.gz (application/x-tar, text), 5.26 KiB.

[20 Feb 2007 4:55] Tomas Ulin
cluster log looks norma

2007-02-20 11:34:25 [MgmSrvr] INFO     -- Node 1: Local checkpoint 82 started. Keep GCI = 4062 oldest restorable GCI = 4065
2007-02-20 11:34:44 [MgmSrvr] INFO     -- Node 1: Local checkpoint 83 started. Keep GCI = 4095 oldest restorable GCI = 4098
2007-02-20 11:34:59 [MgmSrvr] INFO     -- Node 1: Data usage is 61%(1573 32K pages of total 2560)
2007-02-20 11:34:59 [MgmSrvr] INFO     -- Node 1: Index usage is 55%(1711 8K pages of total 3104)
2007-02-20 11:34:59 [MgmSrvr] INFO     -- Node 1: Resource 0 min: 0 max: 639 curr: 268
2007-02-20 11:34:59 [MgmSrvr] INFO     -- Node 1: Resource 1 min: 0 max: 0 curr: 3
2007-02-20 11:34:59 [MgmSrvr] INFO     -- Node 1: Resource 2 min: 0 max: 0 curr: 9
2007-02-20 11:35:06 [MgmSrvr] INFO     -- Node 1: Local checkpoint 84 started. Keep GCI = 4121 oldest restorable GCI = 4124
2007-02-20 11:35:27 [MgmSrvr] INFO     -- Node 1: Local checkpoint 85 started. Keep GCI = 4156 oldest restorable GCI = 4159
2007-02-20 11:35:33 [MgmSrvr] ALERT    -- Node 2: Node 1 Disconnected
2007-02-20 11:35:35 [MgmSrvr] INFO     -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 00000000000000000000000000000000000
0000000000000000000000000000c.
2007-02-20 11:35:36 [MgmSrvr] ALERT    -- Node 1: Forced node shutdown completed. Initiated by signal 6. Caused by error 2341: 'I
nternal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Tempo
rary error, restart node'.
[20 Feb 2007 4:59] Tomas Ulin
cluster comes up alright after system restart

master> select count(*) from  TEST_DB.SUBS_DATA;
+----------+
| count(*) |
+----------+
|   959196 | 
+----------+
1 row in set (0.06 sec)
[20 Feb 2007 5:05] Tomas Ulin
master> select `FILES`.`FILE_NAME` AS `FILE_NAME`,(`FILES`.`TOTAL_EXTENTS` * `FILES`.`EXTENT_SIZE`) AS `Total`,(`FILES`.`FREE_EXTENTS` * `FILES`.`EXTENT_SIZE`) AS `Free`,(((`FILES`.`FREE_EXTENTS` * `FILES`.`EXTENT_SIZE`) * 100) / (`FILES`.`TOTAL_EXTENTS` * `FILES`.`EXTENT_SIZE`)) AS `% Free`,`FILES`.`EXTRA` AS `EXTRA` from `INFORMATION_SCHEMA`.`FILES` where ((`FILES`.`ENGINE` = _utf8'ndbcluster') and (`FILES`.`FILE_TYPE` = _utf8'DATAFILE')) order by `FILES`.`FILE_NAME`,`FILES`.`EXTRA`;
+-------------------+------------+------------+---------+----------------+
| FILE_NAME         | Total      | Free       | % Free  | EXTRA          |
+-------------------+------------+------------+---------+----------------+
| data_preprov1.dat | 4294967296 | 2865758208 | 66.7236 | CLUSTER_NODE=1 | 
+-------------------+------------+------------+---------+----------------+
[20 Feb 2007 5:07] Tomas Ulin
(after restart)

tomas@poseidon:~/mysql-5.1-telco-6.1_2/storage/ndb/test> ../tools/ndb_desc lg_preprov1
Type: LogfileGroup
Name: lg_preprov1
UndoBuffer size: 8388608
Version: 1
Free Words: 7320072
AutoGrow.min_free: 0
AutoGrow.max_size: 0
AutoGrow.file_size: 0
AutoGrow.filename_pattern: (null)

NDBT_ProgramExit: 0 - OK

tomas@poseidon:~/mysql-5.1-telco-6.1_2/storage/ndb/test> ../tools/ndb_desc  ts_prepov1
Type: Tablespace
Name: ts_prepov1
Object Version: 1
Extent Size: 1048576
Default Logfile Group: lg_preprov1

NDBT_ProgramExit: 0 - OK

tomas@poseidon:~/mysql-5.1-telco-6.1_2/storage/ndb/test> ../tools/ndb_desc  data_preprov1.dat
Type: Datafile
Name: data_preprov1.dat
Node: 1
Path: data_preprov1.dat
Size: 4294967296
Free: 2865758208
Tablespace: ts_prepov1

NDBT_ProgramExit: 0 - OK

tomas@poseidon:~/mysql-5.1-telco-6.1_2/storage/ndb/test> ../tools/ndb_desc  undo_preprov1.dat
Type: Undofile
Name: undo_preprov1.dat
Node: 1
Path: undo_preprov1.dat
Size: 33554432
Logfile Group: lg_preprov1

NDBT_ProgramExit: 0 - OK
[20 Feb 2007 5:11] Tomas Ulin
information schema select (after restart)

Attachment: information_schema.txt (text/plain), 3.34 KiB.

[20 Feb 2007 5:20] Tomas Ulin
ndb_1_out  lots of stuff written there....

Attachment: ndb_1_out.log.gz (application/x-tar, text), 65.38 KiB.

[21 Feb 2007 13:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/20263

ChangeSet@1.2417, 2007-02-21 14:40:15+01:00, jonas@eel.(none) +1 -0
  ndb - bug#26487
    fix bug in extent/page allocation when *using* >1gb datafiles
[21 Feb 2007 18:17] Jonas Oreland
pushed to 5.1-ndb, telco-6.1, 5.1-telco
[27 Feb 2007 6:27] Jon Stephens
Documented bugfix in telco-5.1.15/6.1.3 changelog; left bug status unchanged.
[8 Mar 2007 16:46] Tomas Ulin
5.1.17
[9 Mar 2007 1:53] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.1.17 changelog; closed.