Bug #58785 Two threads may end up with the same query_id
Submitted: 7 Dec 2010 13:57 Modified: 5 Jun 2012 16:08
Reporter: Davi Arnaut (OCA) Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.5 OS:Any
Assigned to: CPU Architecture:Any

[7 Dec 2010 13:57] Davi Arnaut
In sql_parse.cc there is a race between the time a query_id is retrieved and when it's incremented:

 903   thd->set_query_id(get_query_id());
 904   if (!(server_command_flags[command] & CF_SKIP_QUERY_ID))
 905     next_query_id();

Two threads having the same query_id could pose some problems. One is that the query_id is used to initialize a XA transaction id. Discussing with Serg, he said that two transactions could end up having the same xid, which could cause storage engines to get confused about what change belong to what transaction, recovery would be broken, etc.

How to repeat:
Add a debug sync point after set_query_id and observe in a debugger.

Suggested fix:
Discussing with Serg, the following suggestions are worth exploring:

- Add a warn_unused_result attribute to next_query_id() and probably get rid of get_query_id().
- There is no reason to skip query_id for the PING and STATISTICS command.
- THD::set_query_id shouldn't need to take THD::LOCK_thd_data.
[5 Jun 2012 16:08] Paul Dubois
Noted in 5.6.6 changelog.

Due to a race condition, it was possible for two threads to end up
with the same query ID for different queries.
[13 Jan 2014 13:04] Paul Dubois
Also backported to 5.5.37.
[27 Mar 2014 13:48] Laurynas Biveinis
5.5$ bzr log -r 4582 -n0
revno: 4582
committer: Thayumanavar <thayumanavar.x.sachithanantha@oracle.com>
branch nick: mysql-5.5
timestamp: Mon 2014-01-13 12:04:16 +0530
  BUG#18054998 - BACKPORT FIX FOR BUG#11765785 to 5.5
  This is a backport of the patch of bug#11765785. Commit message
  by Prabakaran Thirumalai from bug#11765785 is reproduced below:
  Global Query ID (global_query_id ) is not incremented for PING and 
  statistics command. These two query types are filtered before 
  incrementing the global query id. This causes race condition and 
  results in duplicate query id for different queries originating from 
  different connections.
  sqlparse.cc::dispath_command() is the only place in code which sets 
  thd->query_ id to global_query_id and then increments it based on the 
  query type. In all other places it is incremented first and then 
  assigned to thd->query_id.
  This is done such that global_query_id is not incremented for PING 
  and statistics commands in dispatch_command() function.
  As per suggestion from Serg, "There is no reason to skip query_id for 
  the PING and STATISTICS command.", removing the check which filters 
  PING and statistics commands.
  Instead of using get_query_id() and next_query_id() which can still 
  cause race condition if context switch happens soon after executing 
  get_query_id(), changing the code to use next_query_id() instead of 
  get_query_id() as it is done in other parts of code which deals with 
  Removed get_query_id() function and forced next_query_id() caller 
  to use the return value by specifying warn_unused_result attribute.