MySQL Bugs: #84858: Time is spent at "statistics" stage for primary key lookups

Bug #84858	Time is spent at "statistics" stage for primary key lookups
Submitted:	7 Feb 2017 13:09	Modified:	9 Feb 2017 6:32
Reporter:	Valeriy Kravchuk	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.6, 5.7, 5.6.35	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	const optimization, join_read_const, statistics

Description:
Manual (https://dev.mysql.com/doc/refman/5.7/en/general-thread-states.html) says:

"statistics

The server is calculating statistics to develop a query execution plan. If a thread is in this state for a long time, the server is probably disk-bound performing other work."

There are cases (see below for details) when "statistics" may be a misleading name, as time is spent NOT on calculating or accessing statistics (as the plan is obvious, say, in case of single row lookup by primary key), but somewhere inside the storage engine while reading actual data for the row.

Additional comment in the manual helps to guess this may be the case, but "probably disk-bound" part may still not apply and is not really clear. 

How to repeat:
Check https://bugs.mysql.com/bug.php?id=83912 on how to see the query like this:

select * from t0 where id = 13;

(where t0 is the InnoDB table and id is the primary key) spending many seconds at "statistics" stage. 

In reality, based on profiler outputs:

                        |          |--71.70%-- srv_conc_enter_innodb(trx_t*)
                        |          |          ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)
                        |          |          handler::index_read_idx_map(unsigned char*, unsigned int, unsigned char const*, unsigned long, ha_rkey_function)
                        |          |          handler::ha_index_read_idx_map(unsigned char*, unsigned int, unsigned char const*, unsigned long, ha_rkey_function)
                        |          |          join_read_const(st_join_table*)
                        |          |          join_read_const_table(THD*, st_join_table*, st_position*) [clone .isra.301]
                        |          |          make_join_statistics(JOIN*, List<TABLE_LIST>&, st_dynamic_array*)
                        |          |          JOIN::optimize_inner()
                        |          |          JOIN::optimize()
                        |          |          mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_o                        |          |          handle_select(THD*, LEX*, select_result*, unsigned long)
                        |          |          execute_sqlcom_select(THD*, TABLE_LIST*)
                        |          |          mysql_execute_command(THD*)
                        |          |          mysql_parse(THD*, char*, unsigned int, Parser_state*)
                        |          |          dispatch_command(enum_server_command, THD*, char*, unsigned int)
                        |          |          do_command(THD*)
                        |          |          do_handle_one_connection(THD*)
                        |          |          handle_one_connection
                        |          |          start_thread

(see http://mysqlentomologist.blogspot.com/2017/01/perf-basics-for-mysql-profiling.html) and source code study it is clear that the time is spent on the actual row access (aka const optimization).

So, stage name is misleading (at best) for this case.

Suggested fix:
Introduce new stage ("Const optimization" or anything but "statistics") to show properly where the time is spent in this case. 

Alternatively, attribute it to the "Sending data" or some other existing stage that is more appropriate.

Hello Valerii,

Thank you for the report and feedback.

Thanks,
Umesh