Description:
In table_esms_by_digest::index_next we iterate through statements_digest_stat_array without checking digest_stat->m_lock.is_populated() like we do in table_esms_by_digest::rnd_next
The only check we do in index_next, is checking if m_first_seen is 0 but that is not sufficient because in find_or_create_digest, we will set m_first_seen for entries that end up being unused/deallocated.
This leads to situations where we return back invalid rows.
Also, I'm giving an example in the events_statements_summary_by_digest table, but by code inspection, I think this problem probably affects other tables as well.
How to repeat:
Apply following changes (compile with -DWITH_UNIT_TESTS=0 since adding DEBUG_SYNC doesn't build for some unit tests, but we only care about building mysqld)
diff --git a/storage/perfschema/pfs_digest.cc b/storage/perfschema/pfs_digest.cc
index 828172925dc..58a5fe3b53b 100644
--- a/storage/perfschema/pfs_digest.cc
+++ b/storage/perfschema/pfs_digest.cc
@@ -34,6 +34,7 @@
#include <assert.h>
#include <string.h>
+#include "sql/debug_sync.h"
#include "my_compiler.h"
#include "my_murmur3.h"
@@ -281,6 +282,7 @@ search:
/* Lookup LF_HASH using this new key. */
entry = reinterpret_cast<PFS_statements_digest_stat **>(
lf_hash_search(&digest_hash, pins, &hash_key, sizeof(PFS_digest_key)));
+ DEBUG_SYNC(current_thd, "after_lf_hash_search");
if (entry && (entry != MY_LF_ERRPTR)) {
/* If digest already exists, update stats and return. */
Run this mtr test:
use performance_schema;
truncate table events_statements_summary_by_digest;
connect (con1, localhost, root,,);
CONNECTION con1;
use test;
SET DEBUG_SYNC='after_lf_hash_search SIGNAL not_found WAIT_FOR continue';
connect (con2, localhost, root,,);
CONNECTION con2;
use test;
SET DEBUG_SYNC='now WAIT_FOR not_found';
SET DEBUG_SYNC='now SIGNAL continue';
connection default;
select schema_name, digest, digest_text, count_star from events_statements_summary_by_digest USE INDEX() where schema_name = 'test';
select schema_name, digest, digest_text, count_star from events_statements_summary_by_digest where schema_name = 'test';
Observe the following results:
use performance_schema;
truncate table events_statements_summary_by_digest;
use test;
SET DEBUG_SYNC='after_lf_hash_search SIGNAL not_found WAIT_FOR continue';
use test;
SET DEBUG_SYNC='now WAIT_FOR not_found';
SET DEBUG_SYNC='now SIGNAL continue';
select schema_name, digest, digest_text, count_star from events_statements_summary_by_digest USE INDEX() where schema_name = 'test';
schema_name digest digest_text count_star
test bdf269459ee1e562d4b251cd893cc077fa83a57af67cc7648615bb8b1dc2a298 USE `test` 2
test 38bf19041c8c8063d5d4dcd0b6cc0bad490baa443d56905804bf3af070fdd1e6 SET `DEBUG_SYNC` = ? 3
select schema_name, digest, digest_text, count_star from events_statements_summary_by_digest where schema_name = 'test';
schema_name digest digest_text count_star
test bdf269459ee1e562d4b251cd893cc077fa83a57af67cc7648615bb8b1dc2a298 USE `test` 2
test 38bf19041c8c8063d5d4dcd0b6cc0bad490baa443d56905804bf3af070fdd1e6 SET `DEBUG_SYNC` = ? 3
test 38bf19041c8c8063d5d4dcd0b6cc0bad490baa443d56905804bf3af070fdd1e6 SET `DEBUG_SYNC` = ? 0
Note that the last query is incorrect since we have 2 rows sharing the same (digest,schema_name) values, violating the unique key constraint on that table. Also it doesn't make sense that the two queries should return different number of results based on whether an index is used or not.
Suggested fix:
Check m_lock.is_populated() before returning back the row in index_next.
Also, I don't think we should be relying on the m_first_seen check in general if we already have the is_populated() check.