Bug #102052 RACE CONDITION ON SECURITY_CONTEXT::M_USER
Submitted: 22 Dec 2020 16:16 Modified: 24 Dec 2020 5:29
Reporter: songlei wang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Security: Privileges Severity:S2 (Serious)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: string

[22 Dec 2020 16:16] songlei wang
Description:
Class String defined in sql_string,h is not thread_safe, it will raise a crash in following condition:
1. A thread executes 'thd->rewritten_query.mem_free()' in function 'dispatch_command' before returning.
2. Another thread executes 'thd->rewritten_query.c_ptr_safe()' with the thd object which belong to step one thread.

The code of 'c_ptr_safe()' is as follows:
  char *c_ptr_safe() {
    if (m_ptr && m_length < m_alloced_length) {

      /* here thd->rewritten_query.mem_free() begin and finish */

      m_ptr[m_length] = 0; /*here will raise a crash*/
    } else
      (void)mem_realloc(m_length);
    return m_ptr;
  }

How to repeat:
It is a multi thread coordination problem. We add some DEBUG code in source code as follows:

file sql_parse.cc:
bool dispatch_command (
..
..
done:
..
..
  thd->m_digest = NULL;

  /* Added by me begin*/
  if(thd->rewritten_query.length())
    DBUG_EXECUTE_IF("free_rewritten", {DBUG_SET("d, free_s");});
  /* Added by me end*/

  /* Prevent rewritten query from getting "stuck" in SHOW PROCESSLIST. */
  thd->rewritten_query.mem_free();

  thd_manager->dec_thread_running();

file sql_show.cc
class Fill_process_list : public Do_THD_Impl {
...
  virtual void operator()(THD *inspect_thd) {
  ...
  ...
    mysql_mutex_lock(&inspect_thd->LOCK_thd_query);
    {
      const char *query_str;
      size_t query_length;

      if (inspect_thd->rewritten_query.length()) {
        /* Added by me begin*/
        DBUG_EXECUTE_IF("show_rewritten", { DBUG_SET("d, safe_s");});
        /* Added by me end*/

        query_str = inspect_thd->rewritten_query.c_ptr_safe();
        query_length = inspect_thd->rewritten_query.length();

Then running follow command in debug version:
1 thread one:
mysql> set debug="d, free_rewritten";
Query OK, 0 rows affected (10.00 sec)

mysql> set password='canal';
Query OK, 0 rows affected (0.00 sec)

2 thread two
mysql> set debug="d, show_rewritten";

mysql> select * from PROCESSLIST;
mysql> select * from PROCESSLIST;

It will raise a coredump file:

15:57:24 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7f8440000cc0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f8630176040 thread_stack 0x46000
.../mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x43) [0x4bebcab]
.../mysqld(handle_fatal_signal+0x299) [0x38ef802]
/lib64/libpthread.so.0(+0xf620) [0x7f864bb8a620]
.../mysqld(String::c_ptr_safe()+0x69) [0x354e513]
.../mysqld(Fill_process_list::operator()(THD*)+0x7a8) [0x374a520]
.../mysqld(Do_THD::operator()(THD*)+0x34) [0x352082a]
.../mysqld(Do_THD std::for_each<THD**, Do_THD>(THD**, THD**, Do_THD)+0x34) [0x3520f88]
.../mysqld(Global_THD_manager::do_for_all_thd_copy(Do_THD_Impl*)+0x157) [0x351fd87]
.../mysqld() [0x3741333]
.../mysqld(do_fill_information_schema_table(THD*, TABLE_LIST*, QEP_TAB*)+0x104) [0x374607f]
.../mysqld(MaterializeInformationSchemaTableIterator::Init()+0x9f) [0x3a274ff]
.../mysqld(SELECT_LEX_UNIT::ExecuteIteratorQuery(THD*)+0x569) [0x37c9257]
.../mysqld(SELECT_LEX_UNIT::execute(THD*)+0x21a) [0x37c973c]
.../mysqld(Sql_cmd_dml::execute_inner(THD*)+0xc6) [0x3724326]
.../mysqld(Sql_cmd_dml::execute(THD*)+0x3fa) [0x3723a46]
.../mysqld(mysql_execute_command(THD*, bool)+0x5b71) [0x36c0f90]
.../mysqld(mysql_parse(THD*, Parser_state*)+0x6bb) [0x36c3a70]
.../mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1461) [0x36b8a00]
.../mysqld(do_command(THD*, std::function<bool (THD*, COM_DATA const*, enum_server_command)>*)+0x4c4) [0x36b6e3f]
.../mysqld(do_command(THD*)+0x1d) [0x36b6f30]
.../mysqld() [0x38da7b6]
.../mysqld() [0x539b62e]
/lib64/libpthread.so.0(+0x7e25) [0x7f864bb82e25]
/lib64/libc.so.6(clone+0x6d) [0x7f864a58ef1d]
[22 Dec 2020 16:31] songlei wang
Adding another modified file:

sql_string.h

class String {
...

  char *c_ptr_safe() {
    if (m_ptr && m_length < m_alloced_length) {
      /* Added by me */
      DBUG_EXECUTE_IF("safe_s", { sleep(10); });
      /* Added by me */

      m_ptr[m_length] = 0;
    } else
      (void)mem_realloc(m_length);
    return m_ptr;
  }

...
...
  void mem_free() {
    if (m_is_alloced) {
      m_is_alloced = false;
      m_alloced_length = 0;
      my_free(m_ptr);

      /* Added by me */
      DBUG_EXECUTE_IF("free_s",{ sleep(10); });
      /* Added by me */

      m_ptr = NULL;
      m_length = 0; /* Safety */
    }
  }
[23 Dec 2020 13:05] songlei wang
REFACTORING OF REWRITTEN_QUERY has be fixed in commit c69368d07b10b9b491714e24940bc1745a648d4c. 

But inspect_sctx->user()、inspect_sctx->host() also have the same problem. 

The command COM_CHANGE_USER will realloc the inspect_sctx->user()、inspect_sctx->host(), it will access a freed point in follow condition:

Thread one

/*change user*/
acl_authenticate  
|server_mpvio_update_thd(thd, &mpvio);
||thd->security_context()->assign_user(
|||m_user.copy(user_arg, user_arg_length, system_charset_info);
||||string::alloc      
|||||String::real_alloc
||||||mem_free();               /*here will free the older m_user*/

Thread two
Fill_process_list::operator() {
  LEX_CSTRING inspect_sctx_user = inspect_sctx->user();
  ...
   const char *val = nullptr;
    if (inspect_sctx_user.str)
      val = inspect_sctx_user.str;
    else if (inspect_thd->system_thread)
      val = "system user";
    else
      val = "unauthenticated user";

    /*here m_user of thread one is freed, the variable 
      val will access a address witch already be freed*/

    table->field[1]->store(val, strlen(val), system_charset_info);
[23 Dec 2020 14:00] MySQL Verification Team
Hi Mr. wang,

Thank you for your bug report.

Regarding our class String, we do not think that there is any problem.

However, we fully agree with your analysis of  inspect_sctx->user() and inspect_sctx->host() methods. Thank you for your contribution.

Verified as reported.
[24 Dec 2020 2:02] songlei wang
Thanks for replying.

I agree with that class String has no problem, But the method String::copy and String::mem_free is non atomic operation. Multiple threads perform non atomic operations on the same String object at the same time,it may raise unexpected memory error.
[24 Dec 2020 5:29] songlei wang
repeat easier as follows:

1. modify file ../sql/sql_parse.cc. Adding 'sleep(10)' for COM_CHANGE_USER

bool dispatch_command(THD *thd, const COM_DATA *com_data, ... {
..
..
    case COM_CHANGE_USER: {
      int auth_rc;
      thd->status_var.com_other++;

      thd->cleanup_connection();
      USER_CONN *save_user_connect =
          const_cast<USER_CONN *>(thd->get_user_connect());
      LEX_CSTRING save_db = thd->db();
      Security_context save_security_ctx(*(thd->security_context()));
      
      sleep(10);

      auth_rc = acl_authenticate(thd, COM_CHANGE_USER);
      auth_rc |= mysql_audit_notify(
...

2.  modify file sql/sql_show.cc. Adding 'sleep(10)' for special user 'u01'.

class Fill_process_list : public Do_THD_Impl {
  void operator()(THD *inspect_thd) override {
    Security_context *inspect_sctx = inspect_thd->security_context();
    LEX_CSTRING inspect_sctx_user = inspect_sctx->user();
    LEX_CSTRING inspect_sctx_host = inspect_sctx->host();
    LEX_CSTRING inspect_sctx_host_or_ip = inspect_sctx->host_or_ip();

    if(strcmp(inspect_sctx_user.str, "u01") == 0)
      sleep(10);

    const char *client_priv_user =
      m_client_thd->security_context()->priv_user().str;

3. Adding two user 'u01' and 'u0200000000'

4. Running capi method 'mysql_change_user()', change user 'u01' to 'u0200000000'. It will finish in ten seconds due to 'sleep(10)' adding by us.

5. Running command 'select * from information_schema.PROCESSLIST' before 'mysql_change_user()' finished.

It show the result as follow:

mysql> select * from PROCESSLIST;
+----+-----------------+--------------------+--------------------+---------+------+------------------------+---------------------------+
| ID | USER            | HOST               | DB                 | COMMAND | TIME | STATE                  | INFO                      |
+----+-----------------+--------------------+--------------------+---------+------+------------------------+---------------------------+
| 10 | root            | localhost          | information_schema | Query   |    0 | executing              | select * from PROCESSLIST |
| 11 | SER             | localhost          | NULL               | Sleep   |    2 |                        | NULL                      |
|  6 | event_scheduler | localhost          | NULL               | Daemon  |   27 | Waiting on empty queue | NULL                      |
+----+-----------------+--------------------+--------------------+---------+------+------------------------+---------------------------+

The USER column of second record should be 'u01' or 'u0200000000', but it show a unexpected value even raise a cordump file due to it accessed freed memory.
[11 Jan 13:30] MySQL Verification Team
Hi Mr. Wang,

Thank you for the analysis.

It is added to our internal bugs database.
[19 Mar 14:52] MySQL Verification Team
Hi,

We have one additional question for you........

Have you tried starting server release 8.0.23 with --performance-schema-show-processlist and do the same.

That would be a good workaround.

Also, this code is thoroughly changed in the latest release, so please let us know if you still see a problem.

Thanks in advance.
[23 Mar 13:29] MySQL Verification Team
Hi Mr. wang,

We are still waiting on your feedback, whether this workaround is usable to you.

Thanks in advance.