Bug #71278 NUMBER OF ROWS IN PERFORMANCE SCHEMA TABLES
Submitted: 2 Jan 2014 17:22 Modified: 17 Feb 2014 18:07
Reporter: Valeriy Kravchuk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S3 (Non-critical)
Version:5.6 OS:Any
Assigned to: Marc ALFF CPU Architecture:Any
Tags: missing manual, performance_schema, threads

[2 Jan 2014 17:22] Valeriy Kravchuk
Description:
I've noted the following in EXPLAIN output:

mysql> explain select * from performance_schema.threads\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: threads
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000
        Extra: NULL
1 row in set (0.00 sec)

So, optimizer should assume that there are 1000 rows in this table (for checking plans involving joins with some other tables this is important). At the same time I do not understand where 1000 comes from. Look:

mysql> show variables like 'max_conn%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| max_connect_errors | 100   |
| max_connections    | 100   |
+--------------------+-------+
2 rows in set (0.00 sec)

mysql> show variables like 'max_user%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| max_user_connections | 0     |
+----------------------+-------+
1 row in set (0.04 sec)

mysql> show variables like 'perf%';
+--------------------------------------------------------+-------+
| Variable_name                                          | Value |
+--------------------------------------------------------+-------+
| performance_schema                                     | ON    |
| performance_schema_accounts_size                       | 100   |
| performance_schema_digests_size                        | 10000 |
| performance_schema_events_stages_history_long_size     | 10000 |
| performance_schema_events_stages_history_size          | 10    |
| performance_schema_events_statements_history_long_size | 10000 |
| performance_schema_events_statements_history_size      | 10    |
| performance_schema_events_waits_history_long_size      | 10000 |
| performance_schema_events_waits_history_size           | 10    |
| performance_schema_hosts_size                          | 100   |
| performance_schema_max_cond_classes                    | 80    |
| performance_schema_max_cond_instances                  | 3300  |
| performance_schema_max_file_classes                    | 50    |
| performance_schema_max_file_handles                    | 32768 |
| performance_schema_max_file_instances                  | 6770  |
| performance_schema_max_mutex_classes                   | 200   |
| performance_schema_max_mutex_instances                 | 15600 |
| performance_schema_max_rwlock_classes                  | 30    |
| performance_schema_max_rwlock_instances                | 9000  |
| performance_schema_max_socket_classes                  | 10    |
| performance_schema_max_socket_instances                | 220   |
| performance_schema_max_stage_classes                   | 150   |
| performance_schema_max_statement_classes               | 168   |
| performance_schema_max_table_handles                   | 4000  |
| performance_schema_max_table_instances                 | 12500 |
| performance_schema_max_thread_classes                  | 50    |
| performance_schema_max_thread_instances                | 300   |
| performance_schema_session_connect_attrs_size          | 512   |
| performance_schema_setup_actors_size                   | 100   |
| performance_schema_setup_objects_size                  | 100   |
| performance_schema_users_size                          | 100   |
+--------------------------------------------------------+-------+
31 rows in set (0.00 sec)

None of the values above are related to 1000 (as a number of threads available in P_S.THREADS table) in any obvious way. Moreover, this great blog post:

http://marcalff.blogspot.com/2013/04/on-configuring-performance-schema.html

also does NOT explain this, even in a separate section titled "Provide sizing data".

How to repeat:
Try to find out in the manual why EXPLAIN report the number of rows as 1000 for PERFORMANCE_SCHEMA.THREADS table and is there any way to change that (and internal size of whatever array or structure is used).

Suggested fix:
Explain how this "size" is determined.
[2 Jan 2014 18:23] MySQL Verification Team
Hello Valeriy,

Thank you for the bug report.
Verified as described.

Thanks,
Umesh
[6 Jan 2014 14:13] Marc ALFF
Taking this as a server bug in the performance schema, instead of a doc bug.

Currently, the performance schema only supports full table scans,
and has *no* indexes and/or condition push down exposed to the optimizer,
so that the only thing the optimizer can possible care about is:
- whether the table is empty (0 rows)
- whether the table is a constant (exactly 1 row)
- whether the table has many rows (2 rows or more).
In other words, 3 different values are used : 0, 1, and "MANY"

All the optimizer heuristics are the same for N >= 2 rows, so returning a hard coded "1000" as of today, or a better and more accurate estimate, has no visible effect in any case.
The output of EXPLAIN SELECT is confusing, but there are no change of behavior what-so-ever caused by the inaccurate estimate returned for the number of rows.

Now, considering that the code (improved since then) already supports a better way to compute a number of rows for each table (see PFS_engine_table_share::get_row_count()),
fixing this bug to implement a m_get_row_count() callback for every table is better, and therefore needs to be done to improve the code base, and avoid confusion with EXPLAIN.

For the record, the size of table performance_schema.threads is controlled by the parameter named performance_schema_max_thread_instances.

While EXPLAIN SELECT does not show it, SHOW ENGINE PERFORMANCE_SCHEMA STATUS output is accurate and does show the correct sizes.
[17 Feb 2014 18:07] Paul DuBois
Noted in 5.7.4 changelog.

Previously, for EXPLAIN output, the rows-examined estimate for
Performance Schema tables always displayed as 1000. Now a more 
accurate estimate is displayed based on sizing parameters used when
allocating memory for each table. This results in no change of
behavior because Performance Schema tables have no indexes.