Bug #51817 incorrect assumption: thd->query at 0x2ab2a8360360 is an invalid pointer
Submitted: 8 Mar 2010 6:05 Modified: 10 Jan 2011 3:48
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Errors Severity:S3 (Non-critical)
Version:5.1.40, 5.5.1 OS:Linux (FC13 x64)
Assigned to: Davi Arnaut
Triage: Triaged: D3 (Medium)

[8 Mar 2010 6:05] Shane Bester
Description:
when mysqld crashes, a message is written to the log containing this:

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x2ab2a8360360 is an invalid pointer
thd->thread_id=36
thd->killed=KILL_QUERY

However, 0x2ab2a8360360 is a perfectly valid pointer, and I have
many corefiles to prove this.

(gdb) x/1s 0x2ab2a8360360
0x2ab2a8360360:  "show create table t1"

How to repeat:
not easily repeatable.  crash the server using one of many known crashing testcases and see if the query is always printed. it's not.

Suggested fix:
fix the logic that checks if a pointer is valid.  it's very important we do everything we can to print the query that caused a crash.
[17 Jun 2010 12:21] Shane Bester
very inconvenient for folks debugging customer's crashes... workaround is to get a corefile and wait for another crash.
[13 Oct 2010 9:44] Konstantin Osipov
Please provide hardware, OS and compiler information.
[13 Oct 2010 9:57] Shane Bester
This now happens 100% of the time on my FC13 x64 machine at home.

Official 5.x .tar.gz linux builds from dev.mysql.com and self-made binaries
all have the same issue, no matter if debug or release versions.

I'll add more info from a real case, and print all values in PTR_SANE macro.
[13 Oct 2010 11:12] Davi Arnaut
AFAIK, the only way to fix it on Linux is to open /proc/self/maps and verify whether the address is within one of the mapped memory regions and the access permission. Not complicate, added to my list as a back burner.
[19 Nov 2010 6:42] James Day
Thanks Davi. The query is pretty much the first thing that the Support team wants to see when we get a crash report because we can correlate that with known bugs and possibly produce a very rapid solution.
[22 Nov 2010 20:31] Davi Arnaut
Actually, there is a better way that does not involve parsing /proc/self/maps. There is /proc/self/mem, which can be used to access the process memory through a file descriptor. The offset is equivalent to the memory address, and reading fails if the address is not mapped, that is, if the offset is not within a mapped page.
[22 Nov 2010 23:19] Davi Arnaut
Meh, does not work because of some draconian permission rules. Only the main thread may read into the process memory...
[23 Nov 2010 11:13] Davi Arnaut
.. and there is a workaround, which is to use /proc/self/task/<tid>/mem.
[23 Nov 2010 11:58] Davi Arnaut
Output will look like the below on Linux:

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x6b8b4567): Can't read from address 0x6b8b4567: Input/output error.
Connection ID (thread ID): 2
Status: NOT_KILLED
[23 Nov 2010 12:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/124733

3516 Davi Arnaut	2010-11-23
      Bug#51817: incorrect assumption: thd->query at 0x2ab2a8360360 is an invalid pointer
      
      The problem is that the logic which checks if a pointer is
      valid relies on a poor heuristic based on the start and end
      addresses of the heap. Apart from miscalculating the heap
      bounds, this approach also suffers from the fact that memory
      can come from other places than the heap.
      
      On Linux, the solution is to access the process's memory
      through /proc/self/task/<tid>/mem, which allows for retrieving
      the contents of pages within the virtual address space of
      the calling process. If a address range is not mapped, a
      input/output error is returned.
     @ client/mysqltest.cc
        Use new interface to my_safe_print_str.
     @ include/my_stacktrace.h
        Drop name from my_safe_print_str.
     @ mysys/stacktrace.c
        Access the process's memory through a file descriptor and
        dump the contents of a memory range. The file descriptor
        offset is equivalent to a offset into the address space.
        
        Do not print the name of the variable associated with the
        address. It can be better accomplished at a higher level.
     @ sql/mysqld.cc
        Put the variable dumping information within its own newline block.
        Use symbolic names which better convey information to the user.
[26 Nov 2010 21:23] Jonas Oreland
isn't it missing a "continue;" just after
+      if (errno != EINTR)
+        break;
[26 Nov 2010 21:26] Davi Arnaut
Only if we expect it to be interrupted by a signal. Which I think can't happen because we enter handle_segfault, the whole thing will exit(1). Need to test this to be sure..
[26 Nov 2010 21:28] Davi Arnaut
But, yes.. the construct is weird.. will fix :-)
[26 Nov 2010 21:29] Davi Arnaut
Bug#58528 has been closed as a duplicate of this one.
[26 Nov 2010 21:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/125206

3521 Davi Arnaut	2010-11-26
      Bug#51817: incorrect assumption: thd->query at 0x2ab2a8360360 is an invalid pointer
      
      The problem is that the logic which checks if a pointer is
      valid relies on a poor heuristic based on the start and end
      addresses of the data segment and heap.
      
      Apart from miscalculating the heap bounds, this approach also
      suffers from the fact that memory can come from places other
      than the heap. See Bug#58528 for a more detailed explanation.
      
      On Linux, the solution is to access the process's memory
      through /proc/self/task/<tid>/mem, which allows for retrieving
      the contents of pages within the virtual address space of
      the calling process. If a address range is not mapped, a
      input/output error is returned.
     @ client/mysqltest.cc
        Use new interface to my_safe_print_str.
     @ include/my_stacktrace.h
        Drop name from my_safe_print_str.
     @ mysys/stacktrace.c
        Access the process's memory through a file descriptor and
        dump the contents of the memory range. The file descriptor
        offset is equivalent to a offset into the address space.
        
        Do not print the name of the variable associated with the
        address. It can be better accomplished at a higher level.
     @ sql/mysqld.cc
        Put the variable dumping information within its own newline block.
        Use symbolic names which better convey information to the user.
[26 Nov 2010 22:03] Sasha Pachev
Davi:

Thanks for fixing the bug. I have two comments on the patch:

a) Why limit the query to 1024 bytes? There are many queries that are longer than that. In fact, most queries that crash the server would fit in that group.

b) It would be very helpful to dump /proc/self/maps to stderr and count up total memory footprint of the process. Most of the segfault crashes we see at Percona working with our customers are related to running the server out of memory, but it takes some wild goose chasing to determine that post-crash.

c) Print the unprintable characters anyway. We want to record the query that crashed the server in its full beauty. Put a warning at the start that the log should be examined with od or some other hex viewer if there are odd characters in the output.
[26 Nov 2010 22:12] Sasha Pachev
In addition it would also be good to dump /proc/self/stat and /proc/self/smaps if available.
[26 Nov 2010 22:14] Davi Arnaut
Sasha,

1) It is the behavior already established. I guess that at 1024 characters the user can already identify the query from some other source. But I guess we could increase it, but I think we need a limit anyway as the query string length variable might have been corrupted. Suggestions?

2) I don't want to have to parse /proc/self/maps as it is a bit hard to be done properly -- I don't want to increase the use of functions which aren't asynchronous-safe. OTOH, just dumping it is viable.

3) I don't know, I don't want to complicate this too much. I guess for these cases a core file is more useful.
[26 Nov 2010 22:17] Davi Arnaut
FWIW, we would gladly take patches improving the overall stacktrace code ;-)
[26 Nov 2010 22:19] Davi Arnaut
BTW, smaps and such can be quite large. I think it might be worth to aggregate those files in separate crash log file. From the usual log we print a reference (pathname) of the crash log. What do you think?
[27 Nov 2010 19:35] Sasha Pachev
Davi:

Parsing out /proc/self/maps may be an overkill after all if you are dumping /proc/self/status. But dumping it is a good idea. It will show you what versions of libc mysqld was using, if it was linked statically or dynamically, and if it loaded some funny plugins/UDFs. Believe it or not, we have seen things as odd as a Java VM loaded inside mysqld memory space in our consulting work. If you are dumping /proc/self/smaps you will not need /proc/self/maps, but /proc/self/smaps is not always available.

I think it is simpler to just dump it into the error log both in terms of development and user interaction. Note that a segfault on a busy server often means 30 minutes of downtime as InnoDB needs to do the recovery. So a lot of times you are dealing with a busy DBA in the state of crisis management. It is easier to tell him "send me the error log" than "read what it says in the error log and do what it tells you".

The query limit could be something like 1 MB and should be user configurable. 1 K is way too small. Note that in practice tracking down problem queries to their source in the application could be quite an adventure. Also, the query could have the critical parts that make a difference for crashing and not crashing at the end, and on a busy site those parts could vary in a multitude of ways. So you really want to have the ability to print the entire query of arbitrary size if desired, while at the same time protecting against the corruption of query_length variable and filling up the error log with garbage. A good way to do this is set the default to something like 1 MB, and in the rare case that is not enough the user will be able to change it in my.cnf.

Regarding the core files. We are currently working with a customer that runs mysqld with a memory footprint of 39G. The core dumping is disabled because it would take good 20 minutes to complete it. Thus the post-mortem stack tracing capability becomes critical.
[29 Nov 2010 9:34] Shane Bester
i have to take this opportunity to mention bug 35661 \o/
[29 Nov 2010 14:52] Davi Arnaut
Sasha,

Agree with your comments, although I disagree with the finer details on how to implement them. Anyway, the scope is to wide for this bug report, it would be nice if you could report a feature request on this -- having a more complete crash log. Lastly, the size of core files can be mitigated by controlling what is dumped. Take a look at the documentation for coredump_filter.
[29 Nov 2010 17:28] Sasha Pachev
Davi:

coredump_filter does not have enough granularity to distinguish between InnoDB buffer pool and MEM_ROOT allocations of THD. On most sites with a large memory footprint the primary source of the size is InnoDB buffer pool. So even with the coredump_filter if you want the query, you will unfortunately have to dump a very large buffer pool.
[29 Nov 2010 17:28] Sasha Pachev
Davi:

coredump_filter does not have enough granularity to distinguish between InnoDB buffer pool and MEM_ROOT allocations of THD. On most sites with a large memory footprint the primary source of the size is InnoDB buffer pool. So even with the coredump_filter if you want the query, you will unfortunately have to dump a very large buffer pool.
[30 Nov 2010 19:16] Davi Arnaut
Queued to mysql-5.1-bugteam and up.
[30 Nov 2010 20:37] James Day
Davi, Sasha's thought about RAM is good, it's a common frustration, no strong opinion on implementation, though error log is what we ask for and if it's not there people will be much more likely to miss it. But I'd take what's easy. :)

I agree that a feature request for anything more than you originally thought of is the way to go. Best to get this one done and closed rapidly. It's already delivering a lot of benefit.
[5 Dec 2010 12:43] Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[15 Dec 2010 0:47] Paul Dubois
Bug does not appear in any released 5.6.x version.

Setting report to Need Merge pending push to 5.1.x, 5.5.x.
[17 Dec 2010 12:48] Bugs System
Pushed into mysql-5.1 5.1.55 (revid:georgi.kodinov@oracle.com-20101217124435-9imm43geck5u55qw) (version source revid:mats.kindahl@oracle.com-20101201193331-1c07sjno2g7m46ix) (merge vers: 5.1.55) (pib:24)
[17 Dec 2010 12:51] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:georgi.kodinov@oracle.com-20101217124733-p1ivu6higouawv8l) (version source revid:davi.arnaut@oracle.com-20101130190653-n889okigpt36igxv) (merge vers: 5.5.8) (pib:24)
[10 Jan 2011 3:48] Paul Dubois
Noted in 5.1.55, 5.5.9 changelogs.

When mysqld printed crash dump information, it incorrectly indicated
that some valid pointers were invalid.
[11 Sep 2012 7:39] Shane Bester
printing 1024 characters of the crashing query is not enough.  i will open a feature request to increase this to 4M or so.