Bug #67736 default P_S configuration wastes CPU on a busy server
Submitted: 28 Nov 2012 3:55 Modified: 28 Nov 2012 18:24
Reporter: Domas Mituzas Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S1 (Critical)
Version:5.6-head OS:Any
Assigned to: Marc ALFF CPU Architecture:Any

[28 Nov 2012 3:55] Domas Mituzas
Description:
On a busy server with lots of tables and table_open_cache set to proper value (e.g. 50000) default PERFORMANCE_SCHEMA configuration will result in server wasting insane amount of CPU on P_S table setup, as it has to loop over a 20k-sized array, and the code of create_table() looks like this:

  while (++attempts <= table_max)
  {
    /* See create_mutex() */
    index= PFS_atomic::add_u32(& table_monotonic_index, 1) % table_max;
    pfs= table_array + index;
...

table_max is 20k, and conditions that break out of the loop don't seem to happen much.

It seems that we have lots of P_S drops: 

mysql> show status like '%perf%table%';
+-----------------------------------------+--------+
| Variable_name                           | Value  |
+-----------------------------------------+--------+
| Performance_schema_table_handles_lost   | 403850 |
| Performance_schema_table_instances_lost | 1074   |
+-----------------------------------------+--------+
2 rows in set (0.07 sec)

mysql> show status like '%perf%table%';
+-----------------------------------------+--------+
| Variable_name                           | Value  |
+-----------------------------------------+--------+
| Performance_schema_table_handles_lost   | 404031 |
| Performance_schema_table_instances_lost | 1074   |
+-----------------------------------------+--------+
2 rows in set (0.07 sec)

Example stack:

Breakpoint 1, create_table (share=0x7fb047c2be40, opening_thread=0x7fb038e89080, identity=0x7fa0d1478700) at /mysqlbuild/storage/perfschema/pfs_instr.cc:1306
1306	/mysqlbuild/storage/perfschema/pfs_instr.cc: No such file or directory.
(gdb) bt
#0  create_table (share=0x7fb047c2be40, opening_thread=0x7fb038e89080, identity=0x7fa0d1478700) at /mysqlbuild/storage/perfschema/pfs_instr.cc:1306
#1  0x0000000000596432 in handler::rebind_psi (this=0x7fa0d1478700) at /mysqlbuild/sql/handler.cc:2444
#2  0x00000000006709fb in open_table (thd=0x7fa1b00008c0, table_list=0x7fa1b1060a10, ot_ctx=0x7fb05c687150) at /mysqlbuild/sql/sql_base.cc:2880
#3  0x0000000000679536 in open_and_process_table (ot_ctx=0x7fb05c687150, has_prelocking_list=false, prelocking_strategy=0x7fb05c687210, flags=0, counter=0x7fa1b0002b00, tables=0x7fa1b1060a10, lex=0x7fa1b0002a58, thd=0x7fa1b00008c0)
    at /mysqlbuild/sql/sql_base.cc:4532
#4  open_tables (thd=0x7fa1b00008c0, start=0x7fb05c6871f8, counter=0x7fa1b0002b00, flags=0, prelocking_strategy=0x7fb05c687210) at /mysqlbuild/sql/sql_base.cc:4967
#5  0x00000000006796b1 in open_normal_and_derived_tables (thd=0x7fa1b00008c0, tables=0x7fa1b1060a10, flags=<optimized out>) at /mysqlbuild/sql/sql_base.cc:5655
#6  0x00000000006a0fa4 in mysql_insert (thd=0x7fa1b00008c0, table_list=0x7fa1b1060a10, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_UPDATE, ignore=false)
    at /mysqlbuild/sql/sql_insert.cc:712
#7  0x00000000006b55ed in mysql_execute_command (thd=0x7fa1b00008c0, statement_start_time=0x7fb05c688a10, post_parse=<optimized out>) at /mysqlbuild/sql/sql_parse.cc:3425
#8  0x00000000006bae1f in mysql_parse (thd=0x7fa1b00008c0, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, last_timer=0x7fb05c689338, async_commit=<optimized out>)
    at /mysqlbuild/sql/sql_parse.cc:6266
c#9  0x000000000081e98c in Query_log_event::do_apply_event (this=0x7fa1b1c512b0, rli=0x5e586340, 
    query_arg=0x7fa1b2087e8d "INSERT INTO"..., 
    q_len_arg=274) at /mysqlbuild/sql/log_event.cc:4688
o#10 0x000000000081d181 in Log_event::apply_event (this=0x7fa1b1c512b0, rli=0x5e586340) at /mysqlbuild/sql/log_event.cc:2932
#11 0x000000000084e619 in apply_event_and_update_pos (ptr_ev=0x7fb05c6897c0, thd=0x7fa1b00008c0, rli=0x5e586340) at /mysqlbuild/sql/rpl_slave.cc:3345
#12 0x00000000008516b2 in exec_relay_log_event (rli=0x5e586340, thd=0x7fa1b00008c0) at /mysqlbuild/sql/rpl_slave.cc:3756
#13 handle_slave_sql (arg=<optimized out>) at /mysqlbuild/sql/rpl_slave.cc:5574
#14 0x0000000000a69991 in pfs_spawn_thread (arg=0x7fa1086b9f70) at /mysqlbuild/storage/perfschema/pfs.cc:1853
#15 0x00007fb05f64ef81 in start_thread (arg=0x7fb05c68a700) at pthread_create.c:301
#16 0x00007fb05e38e85d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

How to repeat:
run 5.6 with lots of tables and large table cache and default P_S configuration

Suggested fix:
loops are bad
[28 Nov 2012 10:35] Marc ALFF
Thanks for the bug report.

This is definitively a bug, the heuristic to compute automated sizing is not working here.

For a table open cache of 50K, a reasonable value for the number of table handles should be 50K / 0.50 = 100K, not 20K.

That is, 100K table handles in table_array[] with an average load factor of 50 percent is expected to scale well.

With 20K, the table is full, and attempting to allocate a entry is definitively going to fail, causing the problem seen, as reported by the "lost" counters.

The code at fault is:

PFS_sizing_data *estimate_hints(PFS_global_param *param)
{
  /*
    Sanitize hints, to avoid returning extremely high or low estimates.
    If the real configuration used is outside of these bounds,
    manual tuning will be preferable.
  */

  enforce_range_long(& param->m_hints.m_max_connections, 10, 65535);
  enforce_range_long(& param->m_hints.m_table_definition_cache, 100, 10000);
  enforce_range_long(& param->m_hints.m_table_open_cache, 100, 10000);

---

Suggested fix:
Remove the call to enforce_range_long entirely, imposing limits like this turns out to be a bad idea.

Suggested work around:
In the my.cnf file, instead of relying to automated sizing (-1), use explicitly a value for the number of table handles, as in:

performance-schema-max-table-handles=100000

Also, the same problem seem to affect the number of table instances

Use
performance-schema-max-table-instances=N
where N is computed based on a 75 percent load factor, as follows:

table definition cache / 0.75 = table instances sizing (N)

Setting this bug to need feedback, please indicate:
- the value of the table_definition_cache used
- the value of max_connection used
- the result of show status and show variables for every performance schema variables

Thank you.
[28 Nov 2012 10:44] Marc ALFF
Correction:

table instance sizing = total number of tables / 0.75,

assuming you have even more tables than what can fit in the table definition cache.

Please indicate the total number of tables in the DB also.
[28 Nov 2012 14:46] Domas Mituzas
In this instance it was:

mysql> show variables like 'table_%cache%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| table_definition_cache     | 60000 |
| table_open_cache           | 50000 |
| table_open_cache_instances | 1     |
+----------------------------+-------+

max_connections is at 5000

mysql> show variables like '%perf%';
+--------------------------------------------------------+--------+
| Variable_name                                          | Value  |
+--------------------------------------------------------+--------+
| performance_schema                                     | ON     |
| performance_schema_accounts_size                       | 100    |
| performance_schema_digests_size                        | 10000  |
| performance_schema_events_stages_history_long_size     | 10000  |
| performance_schema_events_stages_history_size          | 10     |
| performance_schema_events_statements_history_long_size | 10000  |
| performance_schema_events_statements_history_size      | 10     |
| performance_schema_events_waits_history_long_size      | 10000  |
| performance_schema_events_waits_history_size           | 10     |
| performance_schema_hosts_size                          | 100    |
| performance_schema_max_cond_classes                    | 80     |
| performance_schema_max_cond_instances                  | 40100  |
| performance_schema_max_file_classes                    | 50     |
| performance_schema_max_file_handles                    | 32768  |
| performance_schema_max_file_instances                  | 50411  |
| performance_schema_max_mutex_classes                   | 200    |
| performance_schema_max_mutex_instances                 | 131000 |
| performance_schema_max_rwlock_classes                  | 30     |
| performance_schema_max_rwlock_instances                | 70400  |
| performance_schema_max_socket_classes                  | 10     |
| performance_schema_max_socket_instances                | 10020  |
| performance_schema_max_stage_classes                   | 150    |
| performance_schema_max_statement_classes               | 170    |
| performance_schema_max_table_handles                   | 20000  |
| performance_schema_max_table_instances                 | 12500  |
| performance_schema_max_thread_classes                  | 50     |
| performance_schema_max_thread_instances                | 10100  |
| performance_schema_session_connect_attrs_size          | 512    |
| performance_schema_setup_actors_size                   | 100    |
| performance_schema_setup_objects_size                  | 100    |
| performance_schema_users_size                          | 100    |
+--------------------------------------------------------+--------+
31 rows in set (0.08 sec)

mysql> show status like '%perf%';
+-----------------------------------------------+---------+
| Variable_name                                 | Value   |
+-----------------------------------------------+---------+
| Performance_schema_accounts_lost              | 0       |
| Performance_schema_cond_classes_lost          | 0       |
| Performance_schema_cond_instances_lost        | 0       |
| Performance_schema_digest_lost                | 0       |
| Performance_schema_file_classes_lost          | 0       |
| Performance_schema_file_handles_lost          | 0       |
| Performance_schema_file_instances_lost        | 0       |
| Performance_schema_hosts_lost                 | 0       |
| Performance_schema_locker_lost                | 0       |
| Performance_schema_mutex_classes_lost         | 0       |
| Performance_schema_mutex_instances_lost       | 0       |
| Performance_schema_rwlock_classes_lost        | 0       |
| Performance_schema_rwlock_instances_lost      | 0       |
| Performance_schema_session_connect_attrs_lost | 0       |
| Performance_schema_socket_classes_lost        | 0       |
| Performance_schema_socket_instances_lost      | 0       |
| Performance_schema_stage_classes_lost         | 0       |
| Performance_schema_statement_classes_lost     | 0       |
| Performance_schema_table_handles_lost         | 9910394 |
| Performance_schema_table_instances_lost       | 1074    |
| Performance_schema_thread_classes_lost        | 0       |
| Performance_schema_thread_instances_lost      | 0       |
| Performance_schema_users_lost                 | 0       |
+-----------------------------------------------+---------+
23 rows in set (0.07 sec)
[28 Nov 2012 15:38] Marc ALFF
Hi Domas, thanks for the feedback.

So, based on table_open_cache = 50K,
I suggest to use:
performance_schema_max_table_handles = 100K (load factor 50 percent)

Based on table_definition_cache = 60K,
I suggest to use:
performance_schema_max_table_instances = 80K (load factor 75 percent)

80K table instances is probably very generous, considering that your workload only lost 1074 tables with 12500 accounted for, which means the server workload has touched "only" 14K (12500+1074) tables so far.

Please let us know if that resolves the CPU waste itself.

In the mean time, I am fixing the bug that caused the estimates to be so off.

Regards,
-- Marc Alff, Oracle
[28 Nov 2012 18:24] Paul DuBois
Noted in 5.6.10, 5.7.1 changelogs.

Autosizing of Performance Schema parameters could result in settings
that caused excessive CPU use.