MySQL Bugs: #35321: Falcon performance issue with concurrent updates

Bug #35321	Falcon performance issue with concurrent updates
Submitted:	16 Mar 2008 16:39	Modified:	29 Jul 2008 14:07
Reporter:	Philip Stoev	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Documentation	Severity:	S2 (Serious)
Version:	6.0-falcon-team	OS:	Any
Assigned to:	MC Brown	CPU Architecture:	Any

Description:
When running a workload that involves heavy updates, one time-consuming UPDATE query will block all other UPDATE queries on the same table from progressing until it has completed its work. Once the offending query completes, the other ones will swiftly move through until a new offending query is issued.

All queries operate on non-overlapping records. The CPU usage is 100% at most, even though the machine has two cores.

How to repeat:
Test case will be uploaded shortly.

Test case for bug #35321

Attachment: bug35321.zip (application/x-zip-compressed, text), 30.48 KiB.

Please unpack the attached test case so that the .txt files are in mysql-test and the .test files are in mysql-test/t. Then run:

$ perl ./mysql-test-run.pl --stress --stress-init-file=bug35321_init.txt \
 --stress-test-file=bug35321_run.txt    --stress-test-duration=7200 \
 --stress-threads=10 --skip-ndb

Immediately after starting the test, you will see that the test progresses in fits and starts (along with numerous "record changed since last update") errors, which are the subject of another bug). When you run SHOW PROCESSLIST using a separate client, you will see:

mysql> show processlist\G
*************************** 1. row ***************************
     Id: 3
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 9
  State: Sending data
   Info: UPDATE view1 SET t1_blob = CURDATE() WHERE view1.t1_uuid = (SELECT t1_uuid FROM view2 WHERE t1_uuid
*************************** 3. row ***************************
     Id: 5
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 8
  State: Updating
   Info: UPDATE view1 SET t1_blob = (SELECT t2_date FROM view2 WHERE view2.t1_uuid = view1.t1_uuid) WHERE vie
3 rows in set (0.00 sec)

The first query is the one blocking the second one from exectuting. If you have more threads, there will still be one running query and several blocked. If you kill the first query, the other ones will complete until a new blocking query is issued.

What is special about this query is that has a very inefficient execution plan. However, it still updates a single row which never overlaps with any row updated by any other queries, meaning that we should not observe such a serialization.

This serialization also happens on Innodb. One thread is "sending data" and the others are in "table lock". Please reassign or close this bug if you feel that Falcon is unable to offer any more concurrency. Thank you.

The table is pretty heavy indexed -- I was wondering if it admissible for a bad optimizer plan to impact concurrency in such a way. Ideally a bad query plan should only increase CPU and I/O usage, but not block other queries.

Assigning this to the Optimizer team since this is similar to the problem that is shown in falcon.falcon_bug_34351_A.test and falcon.falcon_bug_34351_A.test.
The optimizer does not use the index to select the correct records WHERE t1_uuid = @uuid.  Instead, it uses sequential access.  On the delete statement, this causes record locking while the record is evaluated and unnecessary contention between clients.  

Note that when you used a quoted string in the query instead of the variable @uuid, it will use the index.

Igor, Valery:

Can you please provide some feedback for this bug. What is going on in here is that update expressions in the form of indexed_column = @variable trigger a full table scan rather than an index lookup.

Apart from the fact that this is genuinely slower, it causes locking and concurrency issues in transactional storage engines. Instead of being able to lock on a specific record and update it, additional records are being touched and locked.

Philip,

Are you sure your @uuid server variable value is a string with the same character set/collation as that indexed column?

I hope the following results I've got (on normal 6.0.4) while trying to create a simpler test case will explain what I mean:

mysql> explain select * from ti where c2 = '1'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ti
         type: ref
possible_keys: i1
          key: i1
      key_len: 123
          ref: const
         rows: 55
        Extra: Using index condition
1 row in set (0.00 sec)

Index is used with explicit string. Now:

mysql> set @id='1';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from ti where c2 = @id\G
ERROR 1267 (HY000): Illegal mix of collations (utf8_general_ci,IMPLICIT) and (ut
f8mb3_general_ci,IMPLICIT) for operation '='

Hm, let's try explicit number:

mysql> set @id=1;
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from ti where c2 = @id\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ti
         type: ALL
possible_keys: i1
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 524928
        Extra: Using where
1 row in set (0.00 sec)

Index is NOT used. But look:

mysql> set names utf8;
Query OK, 0 rows affected (0.02 sec)

mysql> set @id='1';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from ti where c2 = @id\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ti
         type: ref
possible_keys: i1
          key: i1
      key_len: 123
          ref: const
         rows: 55
        Extra: Using index condition
1 row in set (0.00 sec)

Now it works, and index is used.

Ok, here is how it works:

mysql> set @z = convert(UUID() using latin1);
mysql> select * from t1 where f1 = @z;

mysql> show variables like '%character%';
+--------------------------+--------------------------------------------+
| Variable_name            | Value                                      |
+--------------------------+--------------------------------------------+
| character_set_client     | latin1                                     |
| character_set_connection | latin1                                     |
| character_set_database   | latin1                                     |
| character_set_filesystem | binary                                     |
| character_set_results    | latin1                                     |
| character_set_server     | latin1                                     |
| character_set_system     | utf8                                       |
| character_sets_dir       | /build/6.0-falcon-team/sql/share/charsets/ |
+--------------------------+--------------------------------------------+

Users may find this behavoir unintuitive and have trouble seeing the logic in it. Can we at least have some form or warning/documentation that UUID()'s return value is in a character set that is not shared with any of the columns created using CREATE TABLE on the same server installation?

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

I am reopening this bug. Can we rationalize or otherwise document the fact that the character set of the value returned by UUID() silently prevents indexes from being used?

I've added the following paragraphs to the UUID section of the manual: 

              The UUID() function
              returns a string using the character set defined by the
              character_set_server parameter. If you
              are using UUID values in your tables and these columns are
              indexed the character set of your column or table should
              match the character set used when the
              UUID() was called. If you
              do not use the same character set for the column and the
              UUID value, then the indexes on those columns will not be
              used, which may lead to a reduction in performance and
              locked tables during operations as the table is searched
              sequentially for the value.

              You can convert between different character sets when
              using UUID-based strings using the
             CONVERT() function.