Bug #111432 c++ Undefined Behavior with clang-16
Submitted: 15 Jun 2023 8:56 Modified: 16 Jun 2023 11:10
Reporter: Przemysław Skibiński (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Compiling Severity:S3 (Non-critical)
Version:mysql-8.0.33 OS:Ubuntu (22.04.2)
Assigned to: CPU Architecture:x86

[15 Jun 2023 8:56] Przemysław Skibiński
Description:
I compiled mysql-8.0.33 with clang-16.0.3 and clang-16.0.6 on Ubuntu 22.04.2.

For both clang-16.0.x the following MTR tests crash with SEGFAULT:
Failing test(s): main.execution_constants main.sp main.sp-error

I also tried gcc-12.1.0 and clang 15.0.7. There is no issue with main.execution_constants main.sp main.sp-error

For clang-16.0.x the stack looks like an infinite recursions to me. gdb shows:

Thread 48 "connection" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe42e8640 (LWP 360637)]

#0  0x000055555a7b6f57 in MYSQLparse (YYTHD=0x7fff44011930, parse_tree=0x7fffe41f59f8) at /data/mysql-server/mysql-8.0/sql/sql_yacc.yy:11782
#1  0x000055555a7743e5 in THD::sql_parser (this=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_class.cc:3057
#2  0x000055555a960868 in parse_sql (thd=0x7fff44011930, parser_state=0x7fffe41f5b90, creation_ctx=0x7fff441d41e8) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:7127
#3  0x000055555a6c04e8 in sp_compile (thd=0x7fff44011930, defstr=0x7fffe41f5e50, sql_mode=1073741824, creation_ctx=0x7fff441d41e8) at /data/mysql-server/mysql-8.0/sql/sp.cc:453
#4  0x000055555a6bfd95 in db_load_routine (thd=0x7fff44011930, type=enum_sp_type::PROCEDURE, sp_db=0x7fff455dd7e0 "test", sp_db_len=4, sp_name=0x7fff455e3b08 "bug10100p", sp_name_len=9, sphp=0x7fffe41f6a88, 
    sql_mode=1073741824, params=0x7fff441d4030 "IN `prm` int, INOUT `res` int", returns=0x555559018562 "", 
    body=0x7fff441d4050 "begin\nset res = res * prm;\nif prm > 1 then\ncall bug10100p(prm - 1, res);\nend if;\nend", sp_chistics=0x7fff441d41c8, definer_user=0x7fff441d41b0 "root", 
    definer_host=0x7fff441d41b8 "localhost", created=20230427190213, modified=20230427190213, creation_ctx=0x7fff441d41e8) at /data/mysql-server/mysql-8.0/sql/sp.cc:549
#5  0x000055555a6c69ad in sp_setup_routine (thd=0x7fff44011930, type=enum_sp_type::PROCEDURE, name=0x7fff455e3b28, cp=0x7fff44013f28) at /data/mysql-server/mysql-8.0/sql/sp.cc:1622
#6  0x000055555a767ff5 in Sql_cmd_call::execute_inner (this=0x7fff455e4930, thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_call.cc:187
#7  0x000055555a9fa327 in Sql_cmd_dml::execute (this=0x7fff455e4930, thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_select.cc:578
#8  0x000055555a95d8c7 in mysql_execute_command (thd=0x7fff44011930, first_level=false) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:4714
#9  0x000055555a6e7246 in sp_instr_stmt::exec_core (this=0x7fff455e4998, thd=0x7fff44011930, nextp=0x7fffe41fa178) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:971
#10 0x000055555a6e529a in sp_lex_instr::reset_lex_and_exec_core (this=0x7fff455e4998, thd=0x7fff44011930, nextp=0x7fffe41fa178, open_tables=false) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:441
#11 0x000055555a6e5ff4 in sp_lex_instr::validate_lex_and_execute_core (this=0x7fff455e4998, thd=0x7fff44011930, nextp=0x7fffe41fa178, open_tables=false) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:725
#12 0x000055555a6e6612 in sp_instr_stmt::execute (this=0x7fff455e4998, thd=0x7fff44011930, nextp=0x7fffe41fa178) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:897
#13 0x000055555a6d7519 in sp_head::execute (this=0x7fff455dcdc0, thd=0x7fff44011930, merge_da_on_success=true) at /data/mysql-server/mysql-8.0/sql/sp_head.cc:2221
#14 0x000055555a6d9fdb in sp_head::execute_procedure (this=0x7fff455dcdc0, thd=0x7fff44011930, args=0x7fff455d7cb0) at /data/mysql-server/mysql-8.0/sql/sp_head.cc:2866
#15 0x000055555a768183 in Sql_cmd_call::execute_inner (this=0x7fff455d87f0, thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_call.cc:235
#16 0x000055555a9fa327 in Sql_cmd_dml::execute (this=0x7fff455d87f0, thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_select.cc:578
#17 0x000055555a95d8c7 in mysql_execute_command (thd=0x7fff44011930, first_level=false) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:4714
....
#584 0x000055555a95d8c7 in mysql_execute_command (thd=0x7fff44011930, first_level=false) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:4714
#585 0x000055555a6e7246 in sp_instr_stmt::exec_core (this=0x7fff441d3eb8, thd=0x7fff44011930, nextp=0x7fffe42e2178) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:971
#586 0x000055555a6e529a in sp_lex_instr::reset_lex_and_exec_core (this=0x7fff441d3eb8, thd=0x7fff44011930, nextp=0x7fffe42e2178, open_tables=false) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:441
#587 0x000055555a6e5ff4 in sp_lex_instr::validate_lex_and_execute_core (this=0x7fff441d3eb8, thd=0x7fff44011930, nextp=0x7fffe42e2178, open_tables=false) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:725
#588 0x000055555a6e6612 in sp_instr_stmt::execute (this=0x7fff441d3eb8, thd=0x7fff44011930, nextp=0x7fffe42e2178) at /data/mysql-server/mysql-8.0/sql/sp_instr.cc:897
#589 0x000055555a6d7519 in sp_head::execute (this=0x7fff441b7840, thd=0x7fff44011930, merge_da_on_success=true) at /data/mysql-server/mysql-8.0/sql/sp_head.cc:2221
#590 0x000055555a6d9fdb in sp_head::execute_procedure (this=0x7fff441b7840, thd=0x7fff44011930, args=0x7fff44429700) at /data/mysql-server/mysql-8.0/sql/sp_head.cc:2866
--Type <RET> for more, q to quit, c to continue without paging--
#591 0x000055555a768183 in Sql_cmd_call::execute_inner (this=0x7fff4442a0c8, thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_call.cc:235
#592 0x000055555a9fa327 in Sql_cmd_dml::execute (this=0x7fff4442a0c8, thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_select.cc:578
#593 0x000055555a95d8c7 in mysql_execute_command (thd=0x7fff44011930, first_level=true) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:4714
#594 0x000055555a954a60 in dispatch_sql_command (thd=0x7fff44011930, parser_state=0x7fffe42e6a68) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:5363
#595 0x000055555a95054c in dispatch_command (thd=0x7fff44011930, com_data=0x7fffe42e7a28, command=COM_QUERY) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:2050
#596 0x000055555a9531b8 in do_command (thd=0x7fff44011930) at /data/mysql-server/mysql-8.0/sql/sql_parse.cc:1439

#597 0x000055555abdfde4 in handle_connection (arg=0x55555f9d0480) at /data/mysql-server/mysql-8.0/sql/conn_handler/connection_handler_per_thread.cc:302
#598 0x000055555c9f4174 in pfs_spawn_thread (arg=0x55555faa28c0) at /data/mysql-server/mysql-8.0/storage/perfschema/pfs.cc:3042
#599 0x00007ffff4e94b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#600 0x00007ffff4f26a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

How to repeat:
1. Add "deb http://apt.llvm.org/jammy/ llvm-toolchain-jammy-16 main" to /etc/apt/sources.list
2. sudo apt update; sudo apt install clang-16 llvm-16-dev
3. Compile the server with clang-16 with -DCMAKE_BUILD_TYPE=Debug.
4. Run MTR tests with --suite=main or "main.execution_constants main.sp main.sp-error".
[15 Jun 2023 12:52] MySQL Verification Team
Hi Mr. Skibinski,

Thank you for your bug report.

Please, use our official  binary from dev.mysql.com.

In order to build properly MySQL binary to be of the same quality as our production binaries, you have to use the exact CMAKE script file and exact compiler versions as we do for your Linux variant. The entire building process changes from release to release, so we do not publish the details of the entire building process.

Can't repeat.
[15 Jun 2023 14:26] Tor Didriksen
Posted by developer:
 
I haven't tried clang 16 yet, but this looks very familiar.
Each new clang version will allocate larger stack frames in debug mode.
Could you try cmake -DOPTIMIZE_DEBUG_BUILDS=1

I assume non-debug builds do not have the same symptoms?

The failing tests are most likely stress-tests of our internal check_stack_overflow code.
We monitor stack usage, and try to abort queries before they run out of stack.
Seems that we need to tweak/tune this code again for clang 16 Debug builds.
[16 Jun 2023 8:57] Tor Didriksen
Posted by developer:
 
Fedora 38 has Clang 16.0.5

cmake .. -DWITH_DEBUG=1 -DWITH_SYSTEM_LIBS=1

./mtr --mem main.execution_constants main.sp main.sp-error
==============================================================================
                  TEST NAME                       RESULT  TIME (ms) COMMENT
------------------------------------------------------------------------------
[ 25%] main.execution_constants                  [ pass ]   1761
[ 50%] main.sp                                   [ pass ]  41629
[ 75%] main.sp-error                             [ pass ]   3629
[100%] shutdown_report                           [ pass ]
[16 Jun 2023 10:13] Tor Didriksen
Posted by developer:
 
Previous comment was for current head of trunk.
8.0.33-release does indeed crash, and OPTIMIZE_DEBUG_BUILDS did *not* help.
[16 Jun 2023 11:06] Tor Didriksen
Posted by developer:
 
Fixed by internal patch:
Bug#35181008 	Determine STACK_DIRECTION at runtime rather than configure time

The STACK_DIRECTION variable computed by cmake at configure time was wrong for clang 16

So you will see the fix in an up-coming release.
[16 Jun 2023 11:10] Przemysław Skibiński
The issue with clang-16 is probably related to:
    TRY_RUN(STACKDIR_RUN_RESULT STACKDIR_COMPILE_RESULT    
     ${CMAKE_BINARY_DIR} 
     ${CMAKE_SOURCE_DIR}/cmake/stack_direction.c
     )

clang-16 -o stack_detection stack_direction.c
stack_direction.c:33:16: error: incompatible function pointer types initializing 'volatile int (*)(int *)' with an expression of type 'int (int *)' [-Wincompatible-function-pointer-types]
volatile int (*ptr_f)(int *) = f;
               ^               ~
1 error generated

What leads to 
#define STACK_DIRECTION 1
in config.h
instead of 
#define STACK_DIRECTION -1
[16 Jun 2023 12:08] Yura Sorokin
Tor, one additional comment here.
There is another problem with stack growth detection under Address Sanitizer on modern gcc and clang compilers (possibly under other sanitizers as well but this hasn't been tested).
Starting from clang version 15 and gcc version 13, both of these compilers are using custom technique for allocating stack frames when Asan is enabled (-fsanitize=address compiler flag is specified).
https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterReturn
They rely on a custom function __asan_stack_malloc() that allocates a block of memory from an internal ASan memory pool.
The problem is that this __asan_stack_malloc() may return absolutely random (or, I would at least say, unordered memory addresses) even for consequent calls.
In other words, there is no such thing as "stack growth direction" in this case and we just simply cannot rely on this concept in the MySQL code.

As for the concrete fixes, I would suggest to rework "check_stack_overrun()" logic so that it would detect on preprocessor level that the code is being compiled under Address Sanitizer of these modern compilers and always return "false" in this case. For instance, inside "configure.cmake" STACK_DIRECTION variable in addition to "1" and "-1" may be set a new value "0" meaning "stack growth direction concept does not exist".

Moreover, the following 3 MTR test cases must be marked "not_asan.inc":
main.execution_constants
main.sp
main.sp-error
[16 Jun 2023 12:12] Yura Sorokin
Here is also "stack_direction.c" put into Compiler Explorer which helped me to identify this issue
https://godbolt.org/z/3fjzqhavd
Basically you can change compilers and add/remove "-fsanitize=address" and look at the "Program returned:" code.
[16 Jun 2023 12:56] Tor Didriksen
Yes, that's when I re-wrote the stack direction stuff, when running ASAN with clang 15. So there's another patch pushed together with the one referenced earlier which fixes the ASAN problems. From the commit message:

    For ASAN_OPTIONS=detect_stack_use_after_return=true we simply ignore
    the stack check in check_stack_overrun(). The "fake stack" is not
    contiguous, and our stack usage calculations are simply wrong.
[16 Jun 2023 13:42] Yura Sorokin
Thanks for the details, Tor.