Bug #111432 | c++ Undefined Behavior with clang-16 | ||
---|---|---|---|
Submitted: | 15 Jun 2023 8:56 | Modified: | 16 Jun 2023 11:10 |
Reporter: | Przemysław Skibiński (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Compiling | Severity: | S3 (Non-critical) |
Version: | mysql-8.0.33 | OS: | Ubuntu (22.04.2) |
Assigned to: | CPU Architecture: | x86 |
[15 Jun 2023 8:56]
Przemysław Skibiński
[15 Jun 2023 12:52]
MySQL Verification Team
Hi Mr. Skibinski, Thank you for your bug report. Please, use our official binary from dev.mysql.com. In order to build properly MySQL binary to be of the same quality as our production binaries, you have to use the exact CMAKE script file and exact compiler versions as we do for your Linux variant. The entire building process changes from release to release, so we do not publish the details of the entire building process. Can't repeat.
[15 Jun 2023 14:26]
Tor Didriksen
Posted by developer: I haven't tried clang 16 yet, but this looks very familiar. Each new clang version will allocate larger stack frames in debug mode. Could you try cmake -DOPTIMIZE_DEBUG_BUILDS=1 I assume non-debug builds do not have the same symptoms? The failing tests are most likely stress-tests of our internal check_stack_overflow code. We monitor stack usage, and try to abort queries before they run out of stack. Seems that we need to tweak/tune this code again for clang 16 Debug builds.
[16 Jun 2023 8:57]
Tor Didriksen
Posted by developer: Fedora 38 has Clang 16.0.5 cmake .. -DWITH_DEBUG=1 -DWITH_SYSTEM_LIBS=1 ./mtr --mem main.execution_constants main.sp main.sp-error ============================================================================== TEST NAME RESULT TIME (ms) COMMENT ------------------------------------------------------------------------------ [ 25%] main.execution_constants [ pass ] 1761 [ 50%] main.sp [ pass ] 41629 [ 75%] main.sp-error [ pass ] 3629 [100%] shutdown_report [ pass ]
[16 Jun 2023 10:13]
Tor Didriksen
Posted by developer: Previous comment was for current head of trunk. 8.0.33-release does indeed crash, and OPTIMIZE_DEBUG_BUILDS did *not* help.
[16 Jun 2023 11:06]
Tor Didriksen
Posted by developer: Fixed by internal patch: Bug#35181008 Determine STACK_DIRECTION at runtime rather than configure time The STACK_DIRECTION variable computed by cmake at configure time was wrong for clang 16 So you will see the fix in an up-coming release.
[16 Jun 2023 11:10]
Przemysław Skibiński
The issue with clang-16 is probably related to: TRY_RUN(STACKDIR_RUN_RESULT STACKDIR_COMPILE_RESULT ${CMAKE_BINARY_DIR} ${CMAKE_SOURCE_DIR}/cmake/stack_direction.c ) clang-16 -o stack_detection stack_direction.c stack_direction.c:33:16: error: incompatible function pointer types initializing 'volatile int (*)(int *)' with an expression of type 'int (int *)' [-Wincompatible-function-pointer-types] volatile int (*ptr_f)(int *) = f; ^ ~ 1 error generated What leads to #define STACK_DIRECTION 1 in config.h instead of #define STACK_DIRECTION -1
[16 Jun 2023 12:08]
Yura Sorokin
Tor, one additional comment here. There is another problem with stack growth detection under Address Sanitizer on modern gcc and clang compilers (possibly under other sanitizers as well but this hasn't been tested). Starting from clang version 15 and gcc version 13, both of these compilers are using custom technique for allocating stack frames when Asan is enabled (-fsanitize=address compiler flag is specified). https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterReturn They rely on a custom function __asan_stack_malloc() that allocates a block of memory from an internal ASan memory pool. The problem is that this __asan_stack_malloc() may return absolutely random (or, I would at least say, unordered memory addresses) even for consequent calls. In other words, there is no such thing as "stack growth direction" in this case and we just simply cannot rely on this concept in the MySQL code. As for the concrete fixes, I would suggest to rework "check_stack_overrun()" logic so that it would detect on preprocessor level that the code is being compiled under Address Sanitizer of these modern compilers and always return "false" in this case. For instance, inside "configure.cmake" STACK_DIRECTION variable in addition to "1" and "-1" may be set a new value "0" meaning "stack growth direction concept does not exist". Moreover, the following 3 MTR test cases must be marked "not_asan.inc": main.execution_constants main.sp main.sp-error
[16 Jun 2023 12:12]
Yura Sorokin
Here is also "stack_direction.c" put into Compiler Explorer which helped me to identify this issue https://godbolt.org/z/3fjzqhavd Basically you can change compilers and add/remove "-fsanitize=address" and look at the "Program returned:" code.
[16 Jun 2023 12:56]
Tor Didriksen
Yes, that's when I re-wrote the stack direction stuff, when running ASAN with clang 15. So there's another patch pushed together with the one referenced earlier which fixes the ASAN problems. From the commit message: For ASAN_OPTIONS=detect_stack_use_after_return=true we simply ignore the stack check in check_stack_overrun(). The "fake stack" is not contiguous, and our stack usage calculations are simply wrong.
[16 Jun 2023 13:42]
Yura Sorokin
Thanks for the details, Tor.