Bug #79533 events_statements_summary_by_digest: no unique rows by DIGEST/SCHEMA_NAME
Submitted: 5 Dec 2015 8:45 Modified: 28 Sep 2016 14:44
Reporter: Simon Mudd (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S3 (Non-critical)
Version:5.6.27,5.7.9, 5.6.30, 5.7.12 OS:Any
Assigned to: Marc Alff CPU Architecture:Any
Tags: duplicate rows, events_statements_summary_by_digest, performance_schema

[5 Dec 2015 8:45] Simon Mudd
Description:
https://dev.mysql.com/doc/refman/5.7/en/statement-summary-tables.html indicates that the events_statements_summary_by_digest table indicates: "Each row summarizes events for given schema/digest values."

I would therefore expect to see one row for each DIGEST, SCHEMA_NAME combination.

However, when checking on some servers I see this is not the case.

While not critical if you want people to process P_S tables (and this one is an interesting one) then making sure it's clear what information should be expected to be found in these tables is most important, especially when applications are retrieving data from them.

How to repeat:
On MySQL 5.7.9 I ran the following on my laptop which has basically no activity:

root@localhost [performance_schema]> select DIGEST, SCHEMA_NAME, COUNT(*), DIGEST_TEXT FROM events_statements_summary_by_digest group by DIGEST, SCHEMA_NAME HAVING COUNT(*) > 1;
+----------------------------------+--------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| DIGEST                           | SCHEMA_NAME        | COUNT(*) | DIGEST_TEXT                                                                                                                                     |
+----------------------------------+--------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| eca71dab2b353011064b7097129aeac0 | performance_schema |        2 | SELECT `MD5` ( `DIGEST_TEXT` ) AS ? , SCHEMA_NAME , `DIGEST_TEXT` , `COUNT_STAR` , `SUM_TIMER_WAIT` FROM `events_statements_summary_by_digest`  |
+----------------------------------+--------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

root@localhost [performance_schema]> SELECT @@VERSION;
+-----------+
| @@VERSION |
+-----------+
| 5.7.9-log |
+-----------+
1 row in set (0.00 sec)

root@localhost [performance_schema]> 

Expectation that there would be no results fails.

I also see this in 5.6.17. (not shown as it's an older version)

And also seen in 5.6.27:

root@myhost [performance_schema]> select DIGEST, SCHEMA_NAME, COUNT(*), DIGEST_TEXT FROM events_statements_summary_by_digest group by DIGEST, SCHEMA_NAME HAVING COUNT(*) > 1;
+----------------------------------+-------------+----------+----------------------------------------------------------------------------------+
| DIGEST                           | SCHEMA_NAME | COUNT(*) | DIGEST_TEXT                                                                      |
+----------------------------------+-------------+----------+----------------------------------------------------------------------------------+
| 64babe82fcc4ce6669911698e9688fd6 | xxx         |        2 | SELECT `xxxxxxxxx` FROM `xxxxxxxxxxxx` WHERE yyyyyyyy = ? AND `zzzzzzzz` IN (?)  |
+----------------------------------+-------------+----------+----------------------------------------------------------------------------------+
1 row in set (0.01 sec)

root@myhost [performance_schema]> select @@version;
+------------+
| @@version  |
+------------+
| 5.6.27-log |
+------------+
1 row in set (0.00 sec)

(I've had to anonymise the query above.)

Suggested fix:
(1) Make documentation more explicit about expected behaviour.
(2) If this is a bug then please fix it.
[5 Dec 2015 8:47] Simon Mudd
Adjust synopsis to reflect uniqueness must be by DIGEST/SCHEMA_NAME
[5 Dec 2015 11:58] Simon Mudd
Sample output I see (on 5.7.9) is:

root@localhost [performance_schema]> select * from events_statements_summary_by_digest where digest = 'eca71dab2b353011064b7097129aeac0'\G
*************************** 1. row ***************************
                SCHEMA_NAME: performance_schema
                     DIGEST: eca71dab2b353011064b7097129aeac0
                DIGEST_TEXT: SELECT `MD5` ( `DIGEST_TEXT` ) AS ? , SCHEMA_NAME , `DIGEST_TEXT` , `COUNT_STAR` , `SUM_TIMER_WAIT` FROM `events_statements_summary_by_digest` 
                 COUNT_STAR: 0
             SUM_TIMER_WAIT: 0
             MIN_TIMER_WAIT: 0
             AVG_TIMER_WAIT: 0
             MAX_TIMER_WAIT: 0
              SUM_LOCK_TIME: 0
                 SUM_ERRORS: 0
               SUM_WARNINGS: 0
          SUM_ROWS_AFFECTED: 0
              SUM_ROWS_SENT: 0
          SUM_ROWS_EXAMINED: 0
SUM_CREATED_TMP_DISK_TABLES: 0
     SUM_CREATED_TMP_TABLES: 0
       SUM_SELECT_FULL_JOIN: 0
 SUM_SELECT_FULL_RANGE_JOIN: 0
           SUM_SELECT_RANGE: 0
     SUM_SELECT_RANGE_CHECK: 0
            SUM_SELECT_SCAN: 0
      SUM_SORT_MERGE_PASSES: 0
             SUM_SORT_RANGE: 0
              SUM_SORT_ROWS: 0
              SUM_SORT_SCAN: 0
          SUM_NO_INDEX_USED: 0
     SUM_NO_GOOD_INDEX_USED: 0
                 FIRST_SEEN: 2015-12-05 00:56:17
                  LAST_SEEN: 2015-12-05 00:56:17
*************************** 2. row ***************************
                SCHEMA_NAME: performance_schema
                     DIGEST: eca71dab2b353011064b7097129aeac0
                DIGEST_TEXT: SELECT `MD5` ( `DIGEST_TEXT` ) AS ? , SCHEMA_NAME , `DIGEST_TEXT` , `COUNT_STAR` , `SUM_TIMER_WAIT` FROM `events_statements_summary_by_digest` 
                 COUNT_STAR: 149
             SUM_TIMER_WAIT: 79366886000
             MIN_TIMER_WAIT: 279540000
             AVG_TIMER_WAIT: 532663000
             MAX_TIMER_WAIT: 8023227000
              SUM_LOCK_TIME: 26480000000
                 SUM_ERRORS: 0
               SUM_WARNINGS: 0
          SUM_ROWS_AFFECTED: 0
              SUM_ROWS_SENT: 2896
          SUM_ROWS_EXAMINED: 2896
SUM_CREATED_TMP_DISK_TABLES: 0
     SUM_CREATED_TMP_TABLES: 0
       SUM_SELECT_FULL_JOIN: 0
 SUM_SELECT_FULL_RANGE_JOIN: 0
           SUM_SELECT_RANGE: 0
     SUM_SELECT_RANGE_CHECK: 0
            SUM_SELECT_SCAN: 150
      SUM_SORT_MERGE_PASSES: 0
             SUM_SORT_RANGE: 0
              SUM_SORT_ROWS: 0
              SUM_SORT_SCAN: 0
          SUM_NO_INDEX_USED: 150
     SUM_NO_GOOD_INDEX_USED: 0
                 FIRST_SEEN: 2015-12-05 00:56:17
                  LAST_SEEN: 2015-12-05 11:32:12
2 rows in set (0.00 sec)

root@localhost [performance_schema]> 

In this particular case it seems there there is an empty row at the FIRST_SEEN time, the other row has been updated later and has real values.
[5 Dec 2015 12:06] Simon Mudd
Note the column values in the second row:

                 COUNT_STAR: 149     <<  the query is the same and COUNT_STAR < SUM_SELECT_SCAN or SUM_NO_INDEX_USED
            SUM_SELECT_SCAN: 150
          SUM_NO_INDEX_USED: 150

and all zeros in the first row. It looks to me like there have been 2 updates happening at the same time.
So while P_S tables are supposed to be non-[b]locking it seems like there's some "corruption" happening here.
I guess the empty row represents the missing 'COUNT_STAR' value.
[5 Dec 2015 15:56] Simon Mudd
Further investigation shows that there may be values in either row.
[7 Dec 2015 9:37] Umesh Shastry
Hello Simon,

Thank you for the report.
Observed this behavior with 5.6.27 build.

Thanks,
Umesh
[10 May 2016 7:10] Umesh Shastry
5.6.30 test results

Attachment: 79533_5.6.30.results (application/octet-stream, text), 11.29 KiB.

[10 May 2016 7:23] Umesh Shastry
5.7.12 test results

Attachment: 79533_5.7.12.results (application/octet-stream, text), 9.84 KiB.

[17 May 2016 21:18] Benjamin Kochie
I can also confirm this bug on MySQL (Percona 5.6.25-73.1)

We only started seeing this after deploying a daily TRUNCATE TABLE script to reset counters for this metrics table.
[19 May 2016 6:14] Umesh Shastry
Community build - 5.6.30

Attachment: 79533_community.results (application/octet-stream, text), 13.99 KiB.

[28 Sep 2016 14:44] Paul Dubois
Posted by developer:
 
Noted in 5.6.35, 5.7.17, 8.0.1 changelogs.

The Performance Schema events_statements_summary_by_digest table
could contain multiple rows for the same statement digest and schema
combination, rather than the expected single (unique) row.