Bug #15922 | Test case for bug#10100 fails with wrong error on Solaris 8 Sparc 64 bit | ||
---|---|---|---|
Submitted: | 22 Dec 2005 2:24 | Modified: | 12 Oct 2007 18:52 |
Reporter: | Kent Boortz | Email Updates: | |
Status: | Won't fix | Impact on me: | |
Category: | MySQL Server: Stored Routines | Severity: | S2 (Serious) |
Version: | 5.0.17 5.0.18-pre 5.1.4 | OS: | Solaris (Solaris 8 64-bit) |
Assigned to: | | CPU Architecture: | Any |
[22 Dec 2005 2:24]
Kent Boortz
[17 Jul 2006 18:21]
luojia chen
I got the same error when I compiled it on Solaris 10 x64 (amd64). Kent, which compiler did you use to build 5.0.18 when the "sp" test failed with the reported error, and how is this bug progressing in later versions of MySQL (after 5.0.18) on the Solaris platform? Thanks for your information!
[17 Jul 2006 20:52]
luojia chen
In addition, it would be great to know what the 2013 and 1436 error messages are, and why the test is supposed to fail with exactly 1436. Here is the source snippet which causes the error:

    set @@max_sp_recursion_depth=255|
    set @var=1|
    #disable log because error about stack overrun contains numbers which
    #depend on a system
    -- disable_result_log
    -- error ER_STACK_OVERRUN_NEED_MORE
    call bug10100p(255, @var)|
    -- error ER_STACK_OVERRUN_NEED_MORE
    call bug10100pt(1,255)|
    -- error ER_STACK_OVERRUN_NEED_MORE
    call bug10100pv(1,255)|
    -- error ER_STACK_OVERRUN_NEED_MORE
    call bug10100pd(1,255)|
    -- error ER_STACK_OVERRUN_NEED_MORE
    call bug10100pc(1,255)|
    -- enable_result_log

If I change the call bug10100p(255, @var)| to, say, call bug10100p(5, @var)|, the statement succeeds. The harness does not accept that either, since it expects the statement to fail with exactly errno 1436. Why?
[30 Aug 2006 19:25]
luojia chen
After further investigation, we found the reason for the "mysql" crash in this "sp" test case, as described below. The SQL query in the sp test is supposed to overflow the small stack allocated for the thread, and the test waits for the correct error, STACK_OVERRUN. Instead, mysqld drops a core file. Looking into how this is supposed to work, I found that MySQL uses the following function to check whether there is still enough stack:

    bool check_stack_overrun(THD *thd, long margin,
                             char *buf __attribute__((unused)))
    {
      long stack_used;
      DBUG_ASSERT(thd == current_thd);
      if ((stack_used=used_stack(thd->thread_stack,(char*) &stack_used)) >=
          (long) (thread_stack - margin))
      {
        sprintf(errbuff[0],ER(ER_STACK_OVERRUN_NEED_MORE),
                stack_used,thread_stack,margin);
        my_message(ER_STACK_OVERRUN_NEED_MORE,errbuff[0],MYF(0));
        thd->fatal_error();
        return 1;
      }
    #ifndef DBUG_OFF
      max_stack_used= max(max_stack_used, stack_used);
    #endif
      return 0;
    }

where the used_stack() macro is defined above this function as:

    #if STACK_DIRECTION < 0
    #define used_stack(A,B) (long) (A - B)
    #else
    #define used_stack(A,B) (long) (B - A)
    #endif

You can see that it depends on the STACK_DIRECTION value, which is set in config.h by the ./configure script. At optimization levels of -xO4 and above with Sun's compiler, STACK_DIRECTION is incorrectly set to 1 (above zero), meaning that the stack grows toward higher addresses, which is wrong. I looked for the source of the error and found the following test in ./configure, used to determine the direction of stack growth:

    int
    find_stack_direction ()
    {
      static char *addr = 0;
      auto char dummy;
      if (addr == 0)
        {
          addr = &dummy;
          return find_stack_direction ();
        }
      else
        return (&dummy > addr) ? 1 : -1;
    }

You can see that the test records the address of a local variable when the function is called for the first time and compares it with the address of the same local in the second, recursive call. In principle this should give the correct answer, but at higher optimization levels our compiler decides to inline the call to find_stack_direction(), because this should definitely improve performance.
It is strange to me that GCC doesn't inline this call, but I think it will probably inline it as well some day, and then it will run into the same problem: inlining causes both dummy variables (one for the first call of the function and another for the recursive call) to be allocated in the same stack frame. In that situation no one can be sure about the relative placement of these two variables. So, to resolve this problem completely, the way ./configure determines the stack growth direction needs to change.
[23 Nov 2006 13:36]
Peter O'Gorman
This is also an issue on hpux11.23/IA64 with aCC and optimization at +O2.
[25 Nov 2006 11:43]
Kent Boortz
In the Solaris case, a #pragma no_inline(find_stack_direction) placed after the function declaration and before main solves the problem. But it would be better to find a more permanent solution. The test used in Ruby's configure.in, on the page http://www.opensource.apple.com/darwinsource/Current/ruby-22.2.2/ruby/configure.in, is confirmed to work on Solaris 8 with "Sun C 5.6 2004/07/15", with or without the "volatile" declaration. Another solution is to mix static knowledge with a test, as in "libsigsegv": http://cl-debian.alioth.debian.org/repository/pvaneynd/libsigsegv-upstream/configure.ac
[12 Oct 2007 18:52]
Konstantin Osipov
This is no longer relevant.