Bug #40791 Inserting a large number of rows in Maria causes a hang
Submitted: 17 Nov 2008 16:45
Modified: 18 Dec 2008 8:56
Reporter: Vemund Østgaard
Status: Can't repeat
Category: MySQL Server: Maria storage engine
Severity: S3 (Non-critical)
Version: 6.0.8
OS: Linux (Linux siv35 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64 x86_64 x86_64 GNU/Linux)
Assigned to:
CPU Architecture: Any

[17 Nov 2008 16:45] Vemund Østgaard
Description:
Problem is observed when running suite/large_tests after changing engine in the .test file to maria.

The test inserts larger and larger chunks of data into the same table by repeatedly running "insert into t1 select * from t1". After about 15 minutes of this, the test has reached an insert of about 67 million records (the current size of the table is also about 67 million records). At around this point in the test, activity on the server seemed to slow down. I connected to mysqld with the mysql client and ran "select count(*) from t1;", which did not return an answer and has now been hanging for 2 hours. The test is also hanging at the same insert statement.
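
For reference, here is a rough sketch of the arithmetic behind the self-doubling insert. Each "insert into t1 select * from t1" inserts as many rows as the table already holds, so the table doubles per statement. The starting row count is not given in this report; the sketch assumes a hypothetical start of 1 row, and the function name is made up for illustration.

```python
def table_size_after(n_statements, start_rows=1):
    """Rows in t1 after n self-doubling INSERT ... SELECT statements.

    start_rows=1 is an assumption; the report does not state the
    initial size of the table.
    """
    return start_rows * 2 ** n_statements


# Under the 1-row assumption, 26 doublings give a table of 2**26 rows,
# which is in line with the "about 67 million" figure in the report.
rows_before_hang = table_size_after(26)

# The next statement would then insert that many rows again.
rows_inserted_by_next_statement = rows_before_hang
rows_after_next_statement = table_size_after(27)

print(rows_before_hang, rows_after_next_statement)
```

If the starting size really was a power of two, the statement that hangs would be copying roughly 67 million rows into a table that ends up with roughly 134 million.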

The problem has been observed repeatedly and seems 100% reproducible when running with the regular mysqld. When using mysqld-debug I was not able to reproduce the same problem (the test completed the 67 million record insert and proceeded to the next).

The stacktrace of all the threads will be attached after the bug has been created. 

How to repeat:
Run suite/large_tests after changing engine in the .test file to maria. Might not reproduce with a mysqld compiled with debug.
[17 Nov 2008 16:51] Vemund Østgaard
threaddump

Attachment: threaddump (application/octet-stream, text), 13.71 KiB.

[17 Nov 2008 16:59] Guilhem Bichot
Thank you for this bug report. Unfortunately, the threaddump is invalid, as it states that
#8  0x0000000000a1e968 in pagecache_unlock_by_link (pagecache=0x1aaf930, block=0x1aaf320, lock=PAGECACHE_LOCK_READ_UNLOCK, pin=28009498,
    first_REDO_LSN_for_page=3003604992, lsn=35, was_changed=1 '\001', any=0 '\0') at ma_pagecache.c:3013
#9  0x0000000000086745 in ?? ()
#10 0x0000000001aaf320 in ?? ()
#11 0x0000000001aaf930 in ?? ()
#12 0x00000000b3076000 in ?? ()
#13 0x0000000000a36dd8 in _ma_set_share_data_file_length (share=0x1abe5a0, new_length=0) at ma_state.c:550

which is impossible (_ma_set_share_data_file_length() is a very short function which does not call pagecache_unlock_by_link()).
[17 Nov 2008 19:47] Guilhem Bichot
Such a bad thread dump makes me wonder about memory corruption. Could you please re-run the test with mysqld under Valgrind?
[18 Dec 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[18 Dec 2008 8:56] Guilhem Bichot
Actually Vemund provided feedback, and the problem seems to be gone now (could be explained by some recent fixes by Monty and Serg for bugs corrupting memory).