Bug #43304 | falcon_limit fails due to deadlock with blocked Maria threads | ||
---|---|---|---|
Submitted: | 2 Mar 2009 10:38 | Modified: | 2 Mar 2009 14:51 |
Reporter: | Olav Sandstå | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Maria storage engine | Severity: | S3 (Non-critical) |
Version: | 6.0.10-alpha | OS: | Solaris |
Assigned to: | | CPU Architecture: | Any |
[2 Mar 2009 10:38]
Olav Sandstå
[2 Mar 2009 12:30]
Guilhem Bichot
I looked at each thread in the posted stack trace link. There are indeed some threads inside Maria code, because internally generated temporary tables (generated by GROUP BY or ORDER BY) are created as Maria tables in 6.0. For example, thread 32 builds an internal Maria table to execute this query:

SELECT * FROM E AS X LEFT JOIN E AS Y ON ( X . `varchar_key` = Y . `datetime_key` ) ORDER BY X . `date_key` LIMIT 8

But it is not clear whether such a thread is stalled. Another example:

Thread 32 (process 2645055 ):
#0 0x00a2fa2c in _db_keyword_ (cs=0x179b1b50, keyword=0xd9cf08 "mutex", strict=65538) at dbug.c:1829
#1 0x00a2e840 in _db_doprnt_ (format=0xd9cf18 "%s (0x%lx) locking") at dbug.c:1342
#2 0x00a23614 in safe_mutex_lock (mp=0x18bc060, my_flags=0, file=0xd855e0 "ma_pagecache.c", line=2949) at thr_mutex.c:171

This is a DBUG_PRINT(), which can never stall.

Note that many threads have their trace ending with: "Backtrace stopped: previous frame identical to this frame (corrupt stack?)" which is worrying.

I cannot see that the cause of the stall is in Maria. It could be elsewhere. For example, there are 4 threads which are in:

copy_fields (param=0x17f94fb8) at sql_select.cc:20371

This place in the code is a for() loop; there may be an infinite loop there.

I suggest two ways to debug this:
- change the Random Query Generator so that it reports the thread ids of the stalled threads
- build with --without-maria, and re-run; this will cause MyISAM to be used for internally generated tables (GROUP BY, ORDER BY) (for example: in 6.0-falcon, disable Maria in builds, see if it helps; if yes, contact the Maria team again; if not, re-enable Maria :)
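Checking by hand which threads sit in a given function gets tedious with many threads. A minimal sketch of how that triage could be scripted, assuming the trace is gdb-style text like the excerpt in this comment. The helper names (`functions_by_thread`, `threads_in`) are made up for illustration, and the "Thread 7" entry in the sample is invented to show a second thread; only the Thread 32 frames and the copy_fields frame come from this report.

```python
import re

# Thread headers such as "Thread 32 (process 2645055 ):"
THREAD_RE = re.compile(r"^Thread\s+(\d+)\s+\(")
# Frames such as "#0 0x00a2fa2c in _db_keyword_ (cs=...) at dbug.c:1829"
# or, without an address, "#0 copy_fields (param=...) at sql_select.cc:20371"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([\w:~<>]+)\s*\(")


def functions_by_thread(backtrace_text):
    """Map each thread id to its function names, innermost frame first."""
    threads = {}
    current = None
    for raw_line in backtrace_text.splitlines():
        line = raw_line.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(2))
    return threads


def threads_in(threads, function_name):
    """Return the sorted ids of threads whose trace contains function_name."""
    return sorted(tid for tid, frames in threads.items() if function_name in frames)


# Sample input: Thread 32 is copied from the trace above; Thread 7 is hypothetical.
SAMPLE = """\
Thread 32 (process 2645055 ):
#0 0x00a2fa2c in _db_keyword_ (cs=0x179b1b50, keyword=0xd9cf08 "mutex", strict=65538) at dbug.c:1829
#1 0x00a2e840 in _db_doprnt_ (format=0xd9cf18 "%s (0x%lx) locking") at dbug.c:1342
#2 0x00a23614 in safe_mutex_lock (mp=0x18bc060, my_flags=0, file=0xd855e0 "ma_pagecache.c", line=2949) at thr_mutex.c:171
Thread 7 (process 1):
#0 copy_fields (param=0x17f94fb8) at sql_select.cc:20371
"""

if __name__ == "__main__":
    parsed = functions_by_thread(SAMPLE)
    print("threads in copy_fields:", threads_in(parsed, "copy_fields"))
    print("threads in safe_mutex_lock:", threads_in(parsed, "safe_mutex_lock"))
```

Running the same script on two dumps taken a few seconds apart and comparing the results would hint at which threads are actually stuck, as opposed to merely passing through a function when the dump was taken.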
[2 Mar 2009 12:33]
Guilhem Bichot
Need feedback (more debug info or tests) from Falcon team
[2 Mar 2009 13:28]
Philip Stoev
Guilhem, Thanks for your feedback. I did not realise that --without-maria can be used to convert the temporary tables back to MyISAM. I have now requested from the build team that the Falcon RQG tests are run on a binary produced using --without-maria. I will make the RQG dump better output for this "deadlock" situation.
[2 Mar 2009 14:51]
Philip Stoev
After running this manually, it appears that there is no bug; it is just slow due to bad Sparc T2 performance and big joins with no index. To alleviate the issue, I have done the following for PB2:

* Run the test with joining a small to a big table, rather than two big tables
* Start the test with --loose-maria-pagecache-buffer-size=64M
* Request that Maria is not compiled in at all for Falcon tests

It is likely, though, that Maria performance is lower than MyISAM's; however, this is outside the scope of this bug.