Bug #58528 Stack tracer erroneously reports the query pointer as invalid
Submitted: 26 Nov 2010 20:03
Modified: 26 Nov 2010 21:05
Reporter: Sasha Pachev
Status: Duplicate
Category: MySQL Server: Logging
Severity: S3 (Non-critical)
Version:
OS: Linux
Assigned to: Assigned Account
CPU Architecture: Any
Tags: invalid pointer, stack trace

[26 Nov 2010 20:03] Sasha Pachev
Description:
The stack tracer erroneously reports the query pointer as invalid on Linux when the query happens to reside in mmap()-allocated space as opposed to the brk()-extended heap.

The stack tracer has the following sanity check to see if the pointer is valid:

#define PTR_SANE(p) ((p) && (char*)(p) >= heap_start && (char*)(p) <= heap_end)

heap_start is initialized like this:

#ifdef HAVE_BSS_START
extern char *__bss_start;
#endif

void my_init_stacktrace()
{
#ifdef HAVE_BSS_START
heap_start = (char*) &__bss_start;
#endif
}

In other words, heap_start is not initialized if the linker does not define __bss_start. This is not too big of a deal, because the linker normally does define it. Still, the stack trace should include a warning that heap_start is not initialized instead of silently using an uninitialized value.

The bigger problem is in how heap_end is determined:

char *heap_end= (char*) sbrk(0);

sbrk(0) gives you the end of the heap, but when you malloc() a chunk of 128 K or more, with the default glibc behavior it ends up outside of the heap: malloc() allocates small chunks with brk() and big ones with mmap(). The query pointer is allocated via the MEM_ROOT system, so the length of the allocated chunk where the query resides depends on the prior MEM_ROOT activity, and the chunk could end up either in the heap or in the mmap() space. In the former case we get the query; in the latter we get a message about an invalid pointer.
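
A minimal sketch of the effect described above, not taken from the MySQL source. The helper names ptr_in_heap() and alloc_is_below_brk() are hypothetical. The first mirrors the shape of PTR_SANE with the bounds passed in explicitly; the second reports on which side of the current program break a fresh malloc() block lands.

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Same shape as PTR_SANE, but with explicit bounds (hypothetical helper). */
static int ptr_in_heap(const void *p, const void *heap_start,
                       const void *heap_end)
{
  uintptr_t a = (uintptr_t) p;
  return p != NULL && a >= (uintptr_t) heap_start && a <= (uintptr_t) heap_end;
}

/*
  Returns 1 if a freshly malloc()ed block of the given size starts at or
  below the current program break, 0 if it starts above it, and -1 if the
  allocation fails.
*/
static int alloc_is_below_brk(size_t size)
{
  void *p = malloc(size);
  int below;
  if (p == NULL)
    return -1;
  below = (uintptr_t) p <= (uintptr_t) sbrk(0);
  free(p);
  return below;
}
```

With default glibc settings on Linux, alloc_is_below_brk(64) would typically return 1 and alloc_is_below_brk(1024 * 1024) would typically return 0, which is exactly the case where a check against sbrk(0) rejects a valid pointer. Other allocators or tuned mmap thresholds may behave differently.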

How to repeat:
Execute a query exceeding 128 K in size that makes the server dump core. This will take some tricks. For example, you can use a UDF that follows a NULL pointer, or craft a corrupt .frm file.

Suggested fix:
Once handle_segfault() has been called, read the process memory map from /proc/$(pidof mysql)/maps (unless there is a better way, which I could not find) and parse it. While you have it, you may just as well print it to stderr along with the stats on total memory usage. Populate an address table and use it to validate pointers.

Important implementation details: since this is executed from a crash handler, the code will often run under memory shortage, so all of the memory for this should be pre-allocated. No libc calls that allocate memory underneath should be issued. The memory for the address table needs to be touched (e.g. with memset) immediately after the allocation, otherwise we may segfault on a page fault with no pages available when we actually need it. To save memory, mappings that do not have read+write permissions enabled should be ignored, as a valid query pointer will never point to them. The initialization of the address table should take place after the stack trace has been printed. If we crash for some reason at this point, we will at least have left a legacy of a good stack trace instead of just a crash. The amount of this memory should be user-configurable, with a default of 256K.
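
The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was eventually pushed: all names (addr_range, range_table, init_range_table, read_maps, parse_maps, ptr_in_table) and both size constants are hypothetical, and /proc/self/maps is used in place of the pidof form. The buffers are static and touched up front, the file is read with open()/read() only, and only mappings whose permissions start with "rw" are kept.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define MAX_RANGES 1024            /* hypothetical fixed table capacity */
#define MAPS_BUF_SIZE (64 * 1024)  /* hypothetical read buffer size */

struct addr_range { uintptr_t start, end; };  /* half-open: [start, end) */

/* Pre-allocated so the crash handler never has to call malloc(). */
static struct addr_range range_table[MAX_RANGES];
static char maps_buf[MAPS_BUF_SIZE];

/* Call at startup: writing to the buffers should fault their pages in early. */
static void init_range_table(void)
{
  memset(range_table, 0, sizeof(range_table));
  memset(maps_buf, 0, sizeof(maps_buf));
}

/* Read /proc/self/maps into buf without stdio; returns bytes read. */
static size_t read_maps(char *buf, size_t cap)
{
  size_t total = 0;
  int fd = open("/proc/self/maps", O_RDONLY);
  if (fd < 0)
    return 0;
  while (total < cap) {
    ssize_t n = read(fd, buf + total, cap - total);
    if (n <= 0)
      break;
    total += (size_t) n;
  }
  close(fd);
  return total;
}

/* Parse hex digits at *s, advancing *s; returns 0 if there were none. */
static int parse_hex(const char **s, const char *limit, uintptr_t *out)
{
  uintptr_t v = 0;
  int digits = 0;
  for (; *s < limit; (*s)++, digits++) {
    char c = **s;
    if (c >= '0' && c <= '9')      v = (v << 4) | (uintptr_t) (c - '0');
    else if (c >= 'a' && c <= 'f') v = (v << 4) | (uintptr_t) (c - 'a' + 10);
    else if (c >= 'A' && c <= 'F') v = (v << 4) | (uintptr_t) (c - 'A' + 10);
    else break;
  }
  *out = v;
  return digits > 0;
}

/* Fill table with the "start-end rw.." mappings found in buf; returns count. */
static size_t parse_maps(const char *buf, size_t len,
                         struct addr_range *table, size_t cap)
{
  const char *p = buf, *limit = buf + len;
  size_t n = 0;
  while (p < limit && n < cap) {
    uintptr_t start = 0, end = 0;
    int ok = parse_hex(&p, limit, &start) && p < limit && *p == '-';
    if (ok) {
      p++;
      ok = parse_hex(&p, limit, &end) && limit - p >= 3 &&
           p[0] == ' ' && p[1] == 'r' && p[2] == 'w';
    }
    if (ok) {
      table[n].start = start;
      table[n].end = end;
      n++;
    }
    while (p < limit && *p != '\n')  /* skip the rest of the line */
      p++;
    if (p < limit)
      p++;
  }
  return n;
}

/* Linear scan; returns 1 if p falls inside any recorded mapping. */
static int ptr_in_table(const void *p, const struct addr_range *table, size_t n)
{
  uintptr_t a = (uintptr_t) p;
  size_t i;
  for (i = 0; i < n; i++)
    if (a >= table[i].start && a < table[i].end)
      return 1;
  return 0;
}
```

In a crash handler this would be used as parse_maps(maps_buf, read_maps(maps_buf, sizeof(maps_buf)), range_table, MAX_RANGES), followed by ptr_in_table() on the query pointer. A linear scan keeps the sketch short; the table is already sorted by address, so a binary search would also work.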
[26 Nov 2010 21:05] Davi Arnaut
Closed as a duplicate of Bug#51817.
[26 Nov 2010 21:16] Davi Arnaut
Hi Sasha,

Very good explanation of the overall problem; it's a shame that the other bug report already existed. I ended up using /proc/self/task/<tid>/mem, which saves the trouble of having to parse /proc/self/maps. Testing is appreciated :-)
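
For illustration only, the general idea of reading through a /proc mem file can be sketched as below. This is not the code from the actual patch. It assumes 64-bit Linux (so that off_t can hold a user-space address) and uses /proc/self/mem rather than the per-thread path as a simplification; open_self_mem(), safe_read() and safe_print() are hypothetical names. Instead of dereferencing a possibly bad pointer, the handler asks the kernel to copy the bytes, and a failed read means the address is not readable.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Open once, e.g. at startup, so the crash handler only needs pread(). */
static int open_self_mem(void)
{
  return open("/proc/self/mem", O_RDONLY);
}

/*
  Copy up to len bytes located at addr in our own address space into out.
  Returns the number of bytes copied, or a value <= 0 if addr is unreadable.
*/
static ssize_t safe_read(int mem_fd, const void *addr, char *out, size_t len)
{
  return pread(mem_fd, out, len, (off_t) (uintptr_t) addr);
}

/*
  Write len bytes starting at addr to out_fd in small chunks.
  Returns 0 on success, -1 as soon as part of the range cannot be read.
*/
static int safe_print(int mem_fd, const char *addr, size_t len, int out_fd)
{
  char chunk[256];
  while (len > 0) {
    size_t want = len < sizeof(chunk) ? len : sizeof(chunk);
    ssize_t got = safe_read(mem_fd, addr, chunk, want);
    if (got <= 0)
      return -1;
    if (write(out_fd, chunk, (size_t) got) < 0)
      return -1;
    addr += got;
    len -= (size_t) got;
  }
  return 0;
}
```

A handler could call safe_print(mem_fd, query, query_length, STDERR_FILENO) and fall back to an "invalid pointer" message when it returns -1. No address table is needed, since validation happens as a side effect of the read.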
[5 Dec 2010 12:44] Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[17 Dec 2010 12:49] Bugs System
Pushed into mysql-5.1 5.1.55 (revid:georgi.kodinov@oracle.com-20101217124435-9imm43geck5u55qw) (version source revid:mats.kindahl@oracle.com-20101201193331-1c07sjno2g7m46ix) (merge vers: 5.1.55) (pib:24)
[17 Dec 2010 12:53] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:georgi.kodinov@oracle.com-20101217124733-p1ivu6higouawv8l) (version source revid:davi.arnaut@oracle.com-20101130190653-n889okigpt36igxv) (merge vers: 5.5.8) (pib:24)