Bug #65711 Server crashes in intern_plugin_lock on concurrent install plugin / set variable
Submitted: 22 Jun 2012 13:30 Modified: 22 Jun 2012 14:04
Reporter: Elena Stepanova Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Options Severity:S3 (Non-critical)
Version:5.5.25, 5.5.26, 5.6.5 OS:Any
Assigned to: CPU Architecture:Any

[22 Jun 2012 13:30] Elena Stepanova
Description:
#3  <signal handler called>
#4  0x00000000005f41be in intern_plugin_lock (lex=0x1822cd0, rc=0x139ed30) at sql/sql_plugin.cc:731
#5  0x00000000005f83cc in find_sys_var (thd=0x1821380, str=0x18e5e98 "rpl_semi_sync_master_enabled", length=28) at sql/sql_plugin.cc:2337
#6  0x00000000006ef1a8 in find_sys_var_null_base (thd=0x1821380, tmp=0x7fcb16ab2590) at sql/sql_yacc.yy:413
#7  0x000000000070fc80 in MYSQLparse (yythd=0x1821380) at sql/sql_yacc.yy:13070
#8  0x00000000005f0de7 in parse_sql (thd=0x1821380, parser_state=0x7fcb16ab4150, creation_ctx=0x0) at sql/sql_parse.cc:7353
#9  0x00000000005ed515 in mysql_parse (thd=0x1821380, rawbuf=0x18e5dc0 "SET GLOBAL rpl_semi_sync_master_enabled = ON", length=44, parser_state=0x7fcb16ab4150) at sql/sql_parse.cc:5581
#10 0x00000000005e1af9 in dispatch_command (command=COM_QUERY, thd=0x1821380, packet=0x18a3c61 "SET GLOBAL rpl_semi_sync_master_enabled = ON", packet_length=44) at sql/sql_parse.cc:1036
#11 0x00000000005e0e50 in do_command (thd=0x1821380) at sql/sql_parse.cc:773
#12 0x00000000006c590a in do_handle_one_connection (thd_arg=0x1821380) at sql/sql_connect.cc:814
#13 0x00000000006c535f in handle_one_connection (arg=0x1821380) at sql/sql_connect.cc:733
#14 0x0000000000aafcc1 in pfs_spawn_thread (arg=0x18e1bc0) at storage/perfschema/pfs.cc:1015
#15 0x00007fcb1d1d1efc in start_thread (arg=0x7fcb16ab5700) at pthread_create.c:304
#16 0x00007fcb1c54459d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Query (2dfce40): SET GLOBAL rpl_semi_sync_master_enabled = ON
Connection ID (thread ID): 2
Status: NOT_KILLED

How to repeat:
# MTR test case 
# (you might try to run it with --repeat=N 
# if you are not getting the crash right away)

--connect (con1,127.0.0.1,root,,test)
--error 0,ER_UNKNOWN_SYSTEM_VARIABLE
SET GLOBAL rpl_semi_sync_master_enabled = OFF;
--error 0,ER_SP_DOES_NOT_EXIST
UNINSTALL PLUGIN rpl_semi_sync_master;
--send
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
--connection default
--error 0,ER_UNKNOWN_SYSTEM_VARIABLE
SET GLOBAL rpl_semi_sync_master_enabled = ON;
--connection con1
--reap

# End of MTR test case

# If you are still not getting the crash, try this RQG grammar
# cat semi.yy

thread1:
	SET GLOBAL rpl_semi_sync_master_enabled = on_off ;

query:
	INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so' | UNINSTALL PLUGIN rpl_semi_sync_master ;

on_off:
	ON | OFF ;

# End of RQG grammar
# Run it as

perl runall.pl \
--grammar=semi.yy \
--queries=100M \
--threads=2 \
--duration=300 \
--skip-gendata \
--basedir=<your basedir> --vardir=<your vardir>

# Make sure the tests findsthe library all right 
# and if not, additionally provide plugin-dir on the command line
[22 Jun 2012 14:04] Valeriy Kravchuk
Thank you for the bug report. Verified on Mac OS X, crashed on second run:

macbook-pro:mysql-test openxs$ ./mtr bug65711
Logging: ./mtr  bug65711
120622 17:02:14 [Warning] Setting lower_case_table_names=2 because file system for /var/folders/dX/dXCzvuSlHX4Op1g-o1jIWk+++TI/-Tmp-/ijkGNSZUzL/ is case insensitive
120622 17:02:14 [Note] Plugin 'FEDERATED' is disabled.
MySQL Version 5.5.26
Checking supported features...
 - skipping ndbcluster
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /Users/openxs/dbs/5.5/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/Users/openxs/dbs/5.5/mysql-test/var'...
Installing system database...
Using server port 58190

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
main.bug65711                            [ fail ]
        Test ended at 2012-06-22 17:02:37

CURRENT_TEST: main.bug65711
mysqltest: At line 10: query 'SET GLOBAL rpl_semi_sync_master_enabled = ON' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 0...

The result from queries just before the failure was:
SET GLOBAL rpl_semi_sync_master_enabled = OFF;
UNINSTALL PLUGIN rpl_semi_sync_master;
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

Server [mysqld.1 - pid: 36110, winpid: 36110, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
120622 17:02:24 [Warning] Setting lower_case_table_names=2 because file system for /Users/openxs/dbs/5.5/mysql-test/var/mysqld.1/data/ is case insensitive
120622 17:02:24 [Note] Plugin 'FEDERATED' is disabled.
120622 17:02:24 InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
120622 17:02:24 InnoDB: The InnoDB memory heap is disabled
120622 17:02:24 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120622 17:02:24 InnoDB: Compressed tables use zlib 1.2.3
120622 17:02:25 InnoDB: Initializing buffer pool, size = 8.0M
120622 17:02:25 InnoDB: Completed initialization of buffer pool
120622 17:02:25 InnoDB: highest supported file format is Barracuda.
120622 17:02:25  InnoDB: Waiting for the background threads to start
120622 17:02:26 InnoDB: 1.1.8 started; log sequence number 1595675
120622 17:02:26 [Note] Server hostname (bind-address): '0.0.0.0'; port: 13000
120622 17:02:26 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
120622 17:02:26 [Note] Server socket created on IP: '0.0.0.0'.
120622 17:02:26 [Note] Event Scheduler: Loaded 0 events
120622 17:02:26 [Note] /Users/openxs/dbs/5.5/bin/mysqld: ready for connections.
Version: '5.5.26-debug-log'  socket: '/Users/openxs/dbs/5.5/mysql-test/var/tmp/mysqld.1.sock'  port: 13000  Source distribution
14:02:26 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=2
max_threads=151
thread_count=2
connection_count=2
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60768 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x101835000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 10517de68 thread_stack 0x40000
0   mysqld                              0x0000000000387643 my_print_stacktrace + 50
1   mysqld                              0x00000000000355ee handle_fatal_signal + 1148
2   libSystem.B.dylib                   0x00000000805ed3aa _sigtramp + 26
3   mysqld                              0x0000000000565a22 _db_end_ + 416
4   mysqld                              0x000000000023485a _Z12find_sys_varP3THDPKcj + 248
5   mysqld                              0x000000000032cadf _Z21case_stmt_action_caseP3LEX + 119
6   mysqld                              0x00000000003525bb _Z10MYSQLparsePv + 150191
7   mysqld                              0x000000000020bf75 _Z9parse_sqlP3THDP12Parser_stateP19Object_creation_ctx + 209
8   mysqld                              0x000000000021b3e8 _Z11mysql_parseP3THDPcjP12Parser_state + 174
9   mysqld                              0x000000000021c2ce _Z16dispatch_command19enum_server_commandP3THDPcj + 3026
10  mysqld                              0x000000000021d959 _Z10do_commandP3THD + 633
11  mysqld                              0x000000000030aced _Z24do_handle_one_connectionP3THD + 397
12  mysqld                              0x000000000030adb3 handle_one_connection + 45
13  mysqld                              0x0000000000542dab pfs_spawn_thread + 147
14  libSystem.B.dylib                   0x00000000805b5dcb _pthread_start + 316
15  libSystem.B.dylib                   0x00000000805b5c8d thread_start + 13

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (1018e8410): SET GLOBAL rpl_semi_sync_master_enabled = ON
Connection ID (thread ID): 2
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
----------SERVER LOG END-------------
[15 Aug 2012 2:52] liu hickey
For mysql_install_plugin, the logic is first plugin_add, and then plugin_initialize.
plugin->state is set to PLUGIN_IS_READY until plugin->init() succeeded in plugin_initialize(), while the plugin might already registed to system_variable_hash.

Therefore, for such piece code below, it might be problematic: the var is found in system_variable_hash, but it's plugin in still uninitialized.

intern_find_sys_var

  if ((var= intern_find_sys_var(str, length)) &&
      (pi= var->cast_pluginvar()))
  {
    mysql_rwlock_unlock(&LOCK_system_variables_hash);
    LEX *lex= thd ? thd->lex : 0; 
    if (!(plugin= my_intern_plugin_lock(lex, plugin_int_to_ref(pi->plugin))))
      var= NULL; /* failed to lock it, it must be uninstalling */
    else 
    if (!(plugin_state(plugin) & PLUGIN_IS_READY))
    {    
      /* initialization not completed */
      var= NULL;
      intern_plugin_unlock(lex, plugin);
    }    
  }

The workaround is to void the plugin still be NULL case in my_intern_plugin_lock:
--- sql/sql_plugin.cc   (revision 88)
+++ sql/sql_plugin.cc   (working copy)
@@ -713,6 +713,9 @@
 static plugin_ref intern_plugin_lock(LEX *lex, plugin_ref rc)
 {
   st_plugin_int *pi= plugin_ref_to_int(rc);
+  if (!pi)
+    DBUG_RETURN(NULL);
+
   DBUG_ENTER("intern_plugin_lock");
 
   mysql_mutex_assert_owner(&LOCK_plugin);

Also, we could adjust the logic mentioned above, adding a flag to indicate the plugin_initialize is successful after plugin_add, such as a bool var of plugin_initized for sys_var_pluginvar, and it's set to true after plugin->init() successful, then check the flag before calling the intern_plugin_lock(). But it's more simple and clear with the workaround patch above.