Bug #89559 P_S recording wrong digest/digest_text for select statements using views
Submitted: 6 Feb 20:45 Modified: 14 Feb 16:14
Reporter: dob bod Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S4 (Feature request)
Version:5.7.21 OS:Any
Assigned to: CPU Architecture:Any
Tags: digest, DIGEST_TEXT, events_statements_summary_by_digest, performance_schema

[6 Feb 20:45] dob bod
Description:
The table performance_schema.events_statements_summary_by_digest is recording the same digest/digest_text for very different select statements that use views. Actually, the digest/digest_text of the view is being recorded. 
The bug is seen in MySQL 8.0 too.
I suspect that MySQL Enterprise Monitor is affected by this bug too, grouping several different statements under the same digest.

How to repeat:
1) Execute any select statement that uses a view. 
2) Find the view's digest/digest_text in the table events_statements_summary_by_digest:
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 3;
3) Add a where condition or a join to the statement. Execute again.
4) Observe the difference in the COUNT_STAR column:
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 3;
*) In MySQL 8.0.4, observe the difference in the fields DIGEST_TEXT and QUERY_SAMPLE_TEXT.

Suggested fix:
The digest should be calculated based on the text of the select statement instead the text of the view.
[7 Feb 16:36] Sinisa Milivojevic
Hi!

Thank you for your bug report. 

What is the algorithm used for the views that you were testing ???

If your views were TEMPTABLE, then your report should be looked at. However, if you use MERGE views, then there is no way of separating queries on those views with queries on the tables used by those vuews.

Thank you .....
[7 Feb 16:37] Sinisa Milivojevic
If you use TEMPTABLE views, then please provide the exact P_S queries that you have used ....
[7 Feb 19:27] dob bod
Hi!
In my tests, It seems that the results are the same independently of the algorithm.
The script I am using is below:

create database tempor;

use tempor;

create table temp_a (  id integer primary key, nome varchar(50) );

create table temp_b (  id integer primary key, id_a integer, nome varchar(50), foreign key (id_a) references temp_a(id) );

insert into temp_a values ( 1, 'casa');

insert into temp_a values ( 2, 'barco');

insert into temp_a values ( 3, 'predio');

insert into temp_a values ( 4, 'navio');

insert into temp_b values ( 1 , 1 , 'triplex');

insert into temp_b values ( 2 , 2 , 'lancha model B');

insert into temp_b values ( 3 , 1 , 'sobrado');

insert into temp_b values ( 4 , 1 , 'mansao');

#create ALGORITHM = MERGE view ab_view as ( select a.id as id_a, a.nome as nome_a, b.id as id_b, b.nome as nome_b from temp_a a, temp_b b where a.id = b.id_a );
create ALGORITHM = TEMPTABLE view ab_view as ( select a.id as id_a, a.nome as nome_a, b.id as id_b, b.nome as nome_b from temp_a a, temp_b b where a.id = b.id_a );

select sleep(5);

select * from ab_view;

#the previous select statement statistics are recorded under the DIGEST / DIGEST_TEXT of the view
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 5\G

select sleep(1);

select * from ab_view where nome_b like '%tri%';

#the previous select statement statistics are recorded under the DIGEST / DIGEST_TEXT of the view
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 5\G

select sleep(1);

select a.* from temp_a a , ab_view b where a.id = b.id_a;

#the previous select statement statistics are recorded under the DIGEST / DIGEST_TEXT of the view
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 5\G

select sleep(1);

select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '%bar%';

#the previous select statement statistics are recorded under the DIGEST / DIGEST_TEXT of the view
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 5\G

select sleep(1);

select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '%bar%' and a.id = 2;

#the previous select statement statistics are recorded under the DIGEST / DIGEST_TEXT of the view
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 5\G

select sleep(1);

select * from ab_view where id_a = 1;

#the previous select statement statistics are recorded under the DIGEST / DIGEST_TEXT of the view
select * from performance_schema.events_statements_summary_by_digest order by LAST_SEEN desc limit 5\G

drop database tempor;
[7 Feb 20:02] dob bod
In other words, when I execute "select * from ab_view where nome_b like '%tri%'", I expect to see something like "select * from ab_view where nome_b like '?'" as a DIGEST_TEXT, and when I execute "select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '%bar%'", I expect to see "select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '%?%'". However, in both cases, I am seeing the view's digest:
"( SELECT `a` . `id` AS `id_a` , `a` . `nome` AS `nome_a` , `b` . `id` AS `id_b` , `b` . `nome` AS `nome_b` FROM `tempor` . `temp_a` `a` JOIN `tempor` . `temp_b` `b` WHERE ( `a` . `id` = `b` . `id_a` ) )"
[8 Feb 13:24] Sinisa Milivojevic
Hi!

Specifying algorithm in CREATE VIEW does not need necessarily that MySQL will use that algorithm. Sometimes it can't use it and will choose, by itself, the other one. So, the only thing missing after CREATE VIEW , in both cases, is that you send us view attributes, so that we can see what was the actual algorithm used.

Thank you in advance.
[8 Feb 19:32] dob bod
script output with view merge

Attachment: MERGE.txt (text/plain), 38.57 KiB.

[8 Feb 19:33] dob bod
script output with view temptable

Attachment: TEMPTABLE.txt (text/plain), 37.66 KiB.

[8 Feb 19:35] dob bod
Hello!

What command do I need to execute to get the view attributes?
I suppose MySQL is choosing the specified algorithm. When using "ALGORITHM = TEMPTABLE", SUM_CREATED_TMP_TABLES is incremented. When using "ALGORITHM = MERGE", it's not.

Thank you.
[9 Feb 13:03] Sinisa Milivojevic
Hi!

You have not provided any info that we asked from you in your two latest attachments.

Also, our Reference Manual has all the answers to the questions that you asked. Read entire 23.5 chapter and mostly sections 23.5.2 and 23.5.5.
[9 Feb 15:14] dob bod
Hi! 

Sorry. I think I know what you want now:

show create table ab_view
--------------
*************************** 1. row ***************************
                View: ab_view
         Create View: CREATE ALGORITHM=TEMPTABLE DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `ab_view` AS (select `a`.`id` AS `id_a`,`a`.`nome` AS `nome_a`,`b`.`id` AS `id_b`,`b`.`nome` AS `nome_b` from (`temp_a` `a` join `temp_b` `b`) where (`a`.`id` = `b`.`id_a`))
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.00 sec)

select * from views where table_name like 'ab_view'
--------------
*************************** 1. row ***************************
       TABLE_CATALOG: def
        TABLE_SCHEMA: tempor
          TABLE_NAME: ab_view
     VIEW_DEFINITION: (select `a`.`id` AS `id_a`,`a`.`nome` AS `nome_a`,`b`.`id` AS `id_b`,`b`.`nome` AS `nome_b` from `tempor`.`temp_a` `a` join `tempor`.`temp_b` `b` where (`a`.`id` = `b`.`id_a`))
        CHECK_OPTION: NONE
        IS_UPDATABLE: NO
             DEFINER: root@localhost
       SECURITY_TYPE: DEFINER
CHARACTER_SET_CLIENT: utf8
COLLATION_CONNECTION: utf8_general_ci
1 row in set (0.00 sec)

Thank you.
[9 Feb 16:19] Sinisa Milivojevic
Hi!

How have you setup statement digest instrumentation ???

Can you also let us know the setup of the P_S consumers and P_S events ???
[9 Feb 16:59] dob bod
performance_schema setup

Attachment: SHOW_SETUP.txt (text/plain), 127.86 KiB.

[9 Feb 16:59] dob bod
Hello!

The case is just an example. It was ran in a fresh install with default configuration. No setup changes were done.
That being said, the configurations I have here are attached above.

Thanks
[12 Feb 17:15] Sinisa Milivojevic
The output from the test case.

Attachment: output.txt (text/plain), 43.04 KiB.

[12 Feb 17:16] Sinisa Milivojevic
Hi!

I have got a result that is uploaded in the file named output.txt.

Is it the same result as yours ???

If it is the same result, what is the problem ??? That query's text is changed and does not include view name ???
[12 Feb 22:16] dob bod
Hi!!
The last select is not the same, but I can explain the problem anyway.
The problem is that every query should have their own unique DIGEST and DIGEST_TEXT, since they are different. For instance, 
"select * from ab_view where nome_b like '%tri%'", 
should have a DIGEST_TEXT like 
"select * from ab_view where nome_b like '?'", 
and 
"select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '%bar%' and a.id = 2"
should have a DIGEST_TEXT like
"select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '?' and a.id = ?".
However, every execution that has a view involved is recorded under the DIGEST 89c4e1026cee35d60dd22568af23905f. All the statistics for different queries are grouped. For instance, SUM_SELECT_FULL_JOIN has a value of 2 in the end. How could I know which of the queries is doing SUM_SELECT_FULL_JOIN?

Thank you.
[13 Feb 16:24] dob bod
The last select is not the same, but I can explain the problem anyway.
The problem is that every query should have their own unique DIGEST and DIGEST_TEXT, since they are different. For instance, 
"select * from ab_view where nome_b like '%tri%'", 
should have a DIGEST_TEXT like 
"select * from ab_view where nome_b like '?'", 
and 
"select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '%bar%' and a.id = 2"
should have a DIGEST_TEXT like
"select a.* from temp_a a , ab_view b where a.id = b.id_a and a.nome like '?' and a.id = ?".
However, every execution that has a view involved is recorded under the DIGEST 89c4e1026cee35d60dd22568af23905f. All the statistics for different queries are grouped. For instance, SUM_SELECT_FULL_JOIN has a value of 2 in the end. How could I know which of the queries is doing SELECT_FULL_JOIN?

Thank you.
[13 Feb 16:37] dob bod
Sorry for duplicating the answer, I was trying to correct the text.
[13 Feb 17:01] Sinisa Milivojevic
Hi!

Turns out this is intended behaviour. However, I do agree with you so I am verifying this report as the feature request.
[14 Feb 16:14] dob bod
Hi!!

It would be great if that feature could be implemented. 
Tools based on the performance_schema (like MySQL Enterprise Monitor, Percona PMM, etc ) can be almost useless if a lot of view are used (is my case).

Thank you very much!