Bug #83912 Time spent sleeping before entering InnoDB is not measured/reported separately
Submitted: 21 Nov 2016 15:28 Modified: 23 Nov 2016 8:07
Reporter: Valeriy Kravchuk Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S3 (Non-critical)
Version:5.6, 5.7, 5.6.34 OS:Any
Assigned to: CPU Architecture:Any
Tags: performance_schema, sleeping before entering InnoDB, statistics, wait

[21 Nov 2016 15:28] Valeriy Kravchuk
Description:
Performance Schema does not tarck time spent "sleeping before entering InnoDB", like this:

---TRANSACTION 11254044, not started sleeping before entering InnoDB
mysql tables in use 1, locked 0
MySQL thread id 3, OS thread handle 0x7fb53aaa3b00, query id 156 localhost root statistics
select * from t0 where id = 13

In some cases (see details below) wait may take many seconds and even minutes, and it is NOT clear what we were waiting for. Time is reported at "statistics" stage without any way to get more details.

How to repeat:
With MySQL server started with --no-defaults create big enough table like this:

CREATE TABLE `t0` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`c1` int(11) DEFAULT NULL,
`c2` char(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Put some data there (insert into t0(c1, c2) select rand()*1000, rand() from t0;) to end up like this:

mysql> show table status like 't0'\G
*************************** 1. row ***************************
Name: t0
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 4065835
Avg_row_length: 129
Data_length: 528465920
Max_data_length: 0
Index_length: 0
Data_free: 5242880
Auto_increment: 4652971
Create_time: 2016-11-09 10:23:17
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.02 sec)

Then create N+2 tables, ti, as copies of this one:

create table t1 like t0;
insert into t1 select * from t0; 

N is the number of cores on the system used.

Then limit innodb_thread_concurrency to N and enable all instruments and consumers in performance_schema.

Start N+2 concurrent updates for large enough portions of the tables ti:

for i in `seq 1 6`; do bin/mysqlslap -uroot --no-drop --create-schema=test --number-of-queries=1000 iternations=10 --concurrency=1 --query="update t$i set c2 = rand() where id between 2000 and 300000;" & done

In my case N was 4.

Now, while that updates are running and using all InnoDB threads, in a separate session run simple SELECT like this from the unrelated table, t0:

MariaDB [test]> select * from t0 where id = 13;
+----+------+--------------------+
| id | c1   | c2                 |
+----+------+--------------------+
| 13 |   75 | 0.9712750250206096 |
+----+------+--------------------+
1 row in set (13.07 sec)

Depending on the hardware and time to run it you may see even longer execution:

MariaDB [test]> select * from t0 where id = 13;
+----+------+--------------------+
| id | c1   | c2                 |
+----+------+--------------------+
| 13 |   75 | 0.9712750250206096 |
+----+------+--------------------+
1 row in set (2 min 1.56 sec)

The time is mostly spent waiting for a free InnoDB thread to appear, but in performance_schema it is reported as spent on "statistcis" stage without any specific wait. You can check with the query like this:

SELECT thread_id, event_id, nesting_event_id, CONCAT( CASE WHEN event_name LIKE 'stage%' THEN
CONCAT(' ', event_name) WHEN event_name LIKE 'wait%' AND
nesting_event_id IS NOT NULL THEN CONCAT(' ', event_name) ELSE
IF(digest_text IS NOT NULL, SUBSTR(digest_text, 1, 64), event_name) END,
' (',ROUND(timer_wait/1000000000, 2),'ms) ') event
FROM (
(SELECT thread_id,
event_id, event_name, timer_wait, timer_start, nesting_event_id,
digest_text FROM events_statements_history_long)
UNION
(SELECT
thread_id, event_id, event_name, timer_wait, timer_start,
nesting_event_id, NULL FROM events_stages_history_long)
UNION
(SELECT
thread_id, event_id, event_name, timer_wait, timer_start,
nesting_event_id, NULL FROM events_waits_history_long)
) events
ORDER BY thread_id, event_id;

to get something like the following:

...
|        26 |      379 |             NULL | SELECT * FROM `t0` WHERE ID = ?  (13
072.50ms)                               |
|        26 |      380 |              379 |  stage/sql/init (0.05ms)
                                        |
|        26 |      383 |              379 |  stage/sql/checking permissions (0.0
0ms)                                    |
|        26 |      384 |              379 |  stage/sql/Opening tables (0.02ms)
                                        |
|        26 |      386 |              379 |  stage/sql/After opening tables (0.0
0ms)                                    |
|        26 |      387 |              379 |  stage/sql/System lock (0.00ms)
                                        |
|        26 |      389 |              379 |  stage/sql/Table lock (0.00ms)
                                        |
|        26 |      391 |              379 |  stage/sql/init (0.02ms)
                                        |
|        26 |      392 |              379 |  stage/sql/optimizing (0.01ms)
                                        |
|        26 |      393 |              379 |  stage/sql/statistics (13072.32ms)
                                        |
|        26 |      396 |              379 |  stage/sql/preparing (0.00ms)
                                        |
|        26 |      397 |              379 |  stage/sql/Unlocking tables (0.02ms)
                                        |
|        26 |      398 |              379 |  stage/sql/executing (0.00ms)
                                        |
|        26 |      399 |              379 |  stage/sql/Sending data (0.01ms)
                                        |
|        26 |      400 |              379 |  stage/sql/end (0.00ms)
                                        |
|        26 |      401 |              379 |  stage/sql/query end (0.00ms)

|        26 |      402 |              379 |  stage/sql/closing tables (0.00ms)
                                        |
|        26 |      403 |              379 |  stage/sql/Unlocking tables (0.01ms)
                                        |
|        26 |      405 |              379 |  stage/sql/freeing items (0.00ms)
                                        |
|        26 |      408 |              379 |  stage/sql/cleaning up (0.00ms)
                                        |
...

That is, no details on any waits that caused "statistics" stage to take 13 seconds.

Suggested fix:
Please, measure/count time spent "sleeping before entering InnoDB" separately at any stage of any query execution. That would help a lot in troubleshooting performance problems like the one outlined above via Performance Schema queries.
[21 Nov 2016 15:32] Valeriy Kravchuk
In gdb thread doing SELECT has the following backtrace while waiting:

Thread 14 (Thread 0x7f6523b71b00 (LWP 3603)):
#0 0x00007f65212a3903 in select () from /lib64/libc.so.6
#1 0x00005579d21e4deb in os_thread_sleep (tm=tm@entry=23582)
at /home/openxs/git/server/storage/xtradb/os/os0thread.cc:307
#2 0x00005579d2241db9 in srv_conc_enter_innodb_with_atomics (
trx=trx@entry=0x7f64f40e2668)
at /home/openxs/git/server/storage/xtradb/srv/srv0conc.cc:298
#3 srv_conc_enter_innodb (trx=trx@entry=0x7f64f40e2668)
at /home/openxs/git/server/storage/xtradb/srv/srv0conc.cc:597
#4 0x00005579d21893b5 in innobase_srv_conc_enter_innodb (trx=0x7f64f40e2668)
at /home/openxs/git/server/storage/xtradb/handler/ha_innodb.cc:1936
#5 ha_innobase::index_read (this=0x7f64f406f820, buf=0x7f64f4032820 "\377\r",
key_ptr=<optimized out>, key_len=<optimized out>,
find_flag=<optimized out>)
at /home/openxs/git/server/storage/xtradb/handler/ha_innodb.cc:10123
#6 0x00005579d20431dd in handler::index_read_idx_map (this=0x7f64f406f820,
buf=0x7f64f4032820 "\377\r", index=0, key=0x7f64f413b848 "\r",
keypart_map=1, find_flag=HA_READ_KEY_EXACT)
at /home/openxs/git/server/sql/handler.cc:5441
#7 0x00005579d204751c in handler::ha_index_read_idx_map (this=0x7f64f406f820,
buf=0x7f64f4032820 "\377\r", index=<optimized out>,
key=0x7f64f413b848 "\r", keypart_map=1,
find_flag=find_flag@entry=HA_READ_KEY_EXACT)
at /home/openxs/git/server/sql/handler.cc:2646
#8 0x00005579d1f17c87 in join_read_const (tab=tab@entry=0x7f64f413a780)
at /home/openxs/git/server/sql/sql_select.cc:18896
#9 0x00005579d1f17df4 in join_read_const_table (thd=0x7f651f213008,
tab=tab@entry=0x7f64f413a780, pos=<optimized out>, pos=<optimized out>)
at /home/openxs/git/server/sql/sql_select.cc:18771
#10 0x00005579d1f1ced1 in make_join_statistics (
join=join@entry=0x7f64f4139be0, tables_list=...,
keyuse_array=keyuse_array@entry=0x7f64f4139f10)
at /home/openxs/git/server/sql/sql_select.cc:3929
#11 0x00005579d1f24727 in JOIN::optimize_inner (this=0x7f64f4139be0)
at /home/openxs/git/server/sql/sql_select.cc:1361
#12 0x00005579d1f279c8 in JOIN::optimize (this=0x7f64f4139be0)
at /home/openxs/git/server/sql/sql_select.cc:1040
#13 0x00005579d1f283af in mysql_select (thd=thd@entry=0x7f651f213008,
rref_pointer_array=rref_pointer_array@entry=0x7f651f217288,
tables=0x7f64f4139208, wild_num=<optimized out>, fields=..., conds=
0x7f64f4139998, og_num=0, order=0x0, group=0x0, having=0x0,
proc_param=0x0, select_options=2147748608, result=0x7f64f4139bc0, unit=
0x7f651f216910, select_lex=0x7f651f217010)
at /home/openxs/git/server/sql/sql_select.cc:3424
#14 0x00005579d1f28570 in handle_select (thd=thd@entry=0x7f651f213008,
lex=lex@entry=0x7f651f216848, result=result@entry=0x7f64f4139bc0,
setup_tables_done_option=setup_tables_done_option@entry=0)
at /home/openxs/git/server/sql/sql_select.cc:384
#15 0x00005579d1ecf948 in execute_sqlcom_select (thd=thd@entry=0x7f651f213008,
all_tables=0x7f64f4139208) at /home/openxs/git/server/sql/sql_parse.cc:5893
#16 0x00005579d1edc28d in mysql_execute_command (thd=thd@entry=0x7f651f213008)
at /home/openxs/git/server/sql/sql_parse.cc:2965
#17 0x00005579d1edef2b in mysql_parse (thd=0x7f651f213008,
rawbuf=<optimized out>, length=<optimized out>,
parser_state=<optimized out>)
at /home/openxs/git/server/sql/sql_parse.cc:7316
#18 0x00005579d1ee1e80 in dispatch_command (command=command@entry=COM_QUERY,
thd=thd@entry=0x7f651f213008,
packet=packet@entry=0x7f650d3c8009 "select * from t0 where id=13",
packet_length=packet_length@entry=28)
at /home/openxs/git/server/sql/sql_parse.cc:1487
#19 0x00005579d1ee25d3 in do_command (thd=0x7f651f213008)
at /home/openxs/git/server/sql/sql_parse.cc:1108
#20 0x00005579d1f9998c in do_handle_one_connection (
thd_arg=thd_arg@entry=0x7f651f213008)
at /home/openxs/git/server/sql/sql_connect.cc:1350
#21 0x00005579d1f99b57 in handle_one_connection (arg=0x7f651f213008)
at /home/openxs/git/server/sql/sql_connect.cc:1262
#22 0x00007f652381b61a in start_thread () from /lib64/libpthread.so.0
#23 0x00007f65212ad5fd in clone () from /lib64/libc.so.6
[23 Nov 2016 8:07] Umesh Shastry
Hello Valerii,

Thank you for the report and test case.
Observed this with 5.6.34.

Thanks,
Umesh