Bug #40123 Falcon crash with falcon_serial_log_buffers=4000
Submitted: 18 Oct 2008 8:18 Modified: 9 Jan 2009 14:03
Reporter: Xuekun Hu Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:mysql-6.0-falcon OS:Linux (SLES10SP1(2.6.16.46-0.12-smp))
Assigned to: Kevin Lewis CPU Architecture:Any
Tags: F_MISCELLANY
Triage: Triaged: D1 (Critical)

[18 Oct 2008 8:18] Xuekun Hu
Description:
By setting falcon_serial_log_buffers=4000, starting falcon will crash. As the document, falcon_serial_log_buffers could be a value between 10 and 32768. My system total memory is 16GB. 

The core stack is: 

/home/mysql/mysql/libexec/mysqld(my_print_stacktrace+0x2e)[0x9b47ce]
/home/mysql/mysql/libexec/mysqld(handle_segfault+0x33b)[0x60b16b]
/lib64/libpthread.so.0[0x2b9117516e70]
/home/mysql/mysql/libexec/mysqld(_ZN9SerialLog13initializeLogEx+0x72)[0x873e42]
/home/mysql/mysql/libexec/mysqld(_ZN9SerialLog7recoverEv+0x2c8)[0x875378]
/home/mysql/mysql/libexec/mysqld(_ZN8Database12openDatabaseEPKc+0xf6)[0x821296]
/home/mysql/mysql/libexec/mysqld(_ZN10Connection11getDatabaseEPKcS1_P7Threads+0xcc)[0x815b3c]
/home/mysql/mysql/libexec/mysqld(_ZN10Connection12openDatabaseEPKcS1_S1_S1_S1_P7Threads+0xa7)[0x8184a7]
/home/mysql/mysql/libexec/mysqld(_ZN15StorageDatabase17getOpenConnectionEv+0x58)[0x7e2c08]
/home/mysql/mysql/libexec/mysqld(_ZN14StorageHandler10initializeEv+0x99)[0x7e5d19]
/home/mysql/mysql/libexec/mysqld(_ZN16StorageInterface11falcon_initEPv+0x159)[0x7ddb19]
/home/mysql/mysql/libexec/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x38)[0x6fe198]
/home/mysql/mysql/libexec/mysqld[0x7795fa]
/home/mysql/mysql/libexec/mysqld(_Z11plugin_initPiPPci+0x8c9)[0x77c0e9]
/home/mysql/mysql/libexec/mysqld[0x60f42f]
/home/mysql/mysql/libexec/mysqld(main+0x805)[0x610535]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2b91189458b4]
/home/mysql/mysql/libexec/mysqld(__gxx_personality_v0+0x271)[0x548c79]
081017 11:07:15 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=0
max_threads=1601
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4943199 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x40000
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[Falcon] Error: assertion (lockState == 0) failed at line 132 in file SyncObject.cpp

Fatal signal 6 while backtracing

How to repeat:
start mysql/falcon with --falcon_serial_log_buffers=4000
[19 Oct 2008 13:00] Sveta Smirnova
Thank you for the report.

On 32-bit Linux verified as described what is probably not a bug. On 64-bit Linux with 10G RAM server skips Falcon storage engine, but works. Error log file contains in both cases:

081019 13:59:49 [ERROR] Falcon : exception 'illegal memory allocate for -100662752 bytes'during initialization
081019 13:59:49 [ERROR] Plugin 'Falcon' init function returned error.
081019 13:59:49 [ERROR] Plugin 'Falcon' registration as a STORAGE ENGINE failed.
[20 Oct 2008 15:15] Ann Harrison
I suspect that this is a documentation error - or rather, an
oversight.  Each log buffer is a megabyte.  Allocating 32,768
1 MB buffers is a bad idea.   Maybe suggesting a number between
10 and 100 would be wiser.

If I ever get a working system again, I'll look at this.
[20 Oct 2008 16:49] Kevin Lewis
Ann,  The problem occurs in SerialLog::open() with this line;

	int len = windowBuffers * SRL_WINDOW_SIZE + sectorSize;
	bufferSpace = new UCHAR[len];

When windowBuffers == 4000, the integer goes negative on 32 bit machines.

The question is, should we allow that many windowBuffers?  Will it help on systems with massive amounts of memory?  It will not help consolidate writes since there is always only one writeBuffer.  Reading from these buffers happens during thaws and processing by the gopher threads.  

My hunch is that the gopher threads do not need that many buffers.  They should not be that far behind. Jim initially created this parameter because the gopher threads were getting behind.  But now that there is a pool of gophers, that does not happen.

Thawing should be a rare event.  Not something we reserve a lot of memory for.  

Based on that reasoning, I think a maximum of 100 or 200 is appropriate. (100Mb or 200 Mb of memory dedicated to serial log buffers)
[20 Oct 2008 17:46] Ann Harrison
The reason for allowing a variable number of window buffers is to
avoid I/O contention when the transaction mix includes some very
long transactions.  Lagging gophers also made the serial log grow
and caused I/O as buffers were reallocated between buffers, but
the major reason is long transactions.  Ideally, we would have
some performance reporting that indicated how often buffers were
remapped between windows - causing a section of the serial log
to disappear from memory, but at the moment we don't.  Using
that information, a system manager could allocate a few more
buffers to reduce that load.

However, 4000 buffers is way beyond what I would ever want to see -
though Olav has created situations where the serial log grew beyond
4GB. At that point, the log has to go to disk.

So, I don't recommend removing the parameter - there will be cases
where 20 or 40 serial log buffers will perform better than 10, but
I suggest we cap the number at 1000, and document that cap.  Beyond
there, the probability is that we'll get better performance by
allocating the memory to record cache and rereading the serial log
as necessary to move committed changes to the database.
[20 Oct 2008 23:09] Kevin Lewis
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56642

2871 Kevin Lewis	2008-10-20

       Bug#40123 - Make the upper limit of falcon-serial-log-buffer=1000
       Since the minimum value is referenced in a number of places,
       make a #define SRL_MIN_WINDOWS
[28 Oct 2008 8:10] Bugs System
Pushed into 6.0.8-alpha  (revid:klewis@mysql.com-20081020212811-t82q7vx7o5y6osd2) (version source revid:klewis@mysql.com-20081020212811-t82q7vx7o5y6osd2) (pib:5)
[9 Jan 2009 14:03] MC Brown
A note has been added to the 6.0.8 changelg: 

The maximum value for falcon-serial-log-buffer has been reduced to 1000.