Bug #102292 ANALYZE TABLE capitalized table name, statistics results are not updated.
Submitted: 19 Jan 2021 9:54 Modified: 20 Jan 2021 10:14
Reporter: qichao zhou Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.22, 8.0.23 OS:Any (CentOS Linux release 8.0.1905 (Core))
Assigned to: CPU Architecture:Any (4.18.0-80.el8.x86_64 )

[19 Jan 2021 9:54] qichao zhou
Description:
 When setting lower_case_table_names to 1, ANALYZE TABLE capitalized table name, statistics return normal, but statistics results are not updated. ANALYZE TABLE lowercase table names to work properly. See the following example.

How to repeat:
mysql> select count(*) from test2;
+----------+
| count(*) |
+----------+
|      162 |
+----------+
1 row in set (0.01 sec)

mysql> select TABLE_SCHEMA,TABLE_NAME,TABLE_ROWS from information_schema.tables where TABLE_SCHEMA ='zhouqch' and TABLE_NAME='test2';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| zhouqch      | test2      |         62 |
+--------------+------------+------------+
1 row in set (0.00 sec)

mysql> analyze table TEST2;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| zhouqch.test2 | analyze | status   | OK       |
+---------------+---------+----------+----------+
1 row in set (0.01 sec)

mysql> select TABLE_SCHEMA,TABLE_NAME,TABLE_ROWS from information_schema.tables where TABLE_SCHEMA ='zhouqch' and TABLE_NAME='test2';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| zhouqch      | test2      |         62 |
+--------------+------------+------------+
1 row in set (0.01 sec)

mysql> analyze table test2;
+---------------+---------+----------+----------+
| Table         | Op      | Msg_type | Msg_text |
+---------------+---------+----------+----------+
| zhouqch.test2 | analyze | status   | OK       |
+---------------+---------+----------+----------+
1 row in set (0.01 sec)

mysql> select TABLE_SCHEMA,TABLE_NAME,TABLE_ROWS from information_schema.tables where TABLE_SCHEMA ='zhouqch' and TABLE_NAME='test2';
+--------------+------------+------------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS |
+--------------+------------+------------+
| zhouqch      | test2      |        162 |
+--------------+------------+------------+
1 row in set (0.00 sec)

mysql>

Suggested fix:
Optimize the performance of statistics in case lower_case_table_names is 1, ANALYZE TABLE uppercase table name update statistics.
[20 Jan 2021 10:14] MySQL Verification Team
Hello qichao zhou,

Thank you for the report.

regards,
Umesh
[3 Mar 2021 8:04] MySQL Verification Team
Bug #102792 marked as duplicate of this one
[23 Dec 2021 10:00] Tor Didriksen
Posted by developer:
 
The patch for this particular problem is quite simple:

   setup_table_stats_record(
       thd, ts_obj.get(), dd::String_type(table->db, strlen(table->db)),
-      dd::String_type(table->alias, strlen(table->alias)), file->stats,
-      file->checksum(), file->ha_table_flags() & (ulong)HA_HAS_CHECKSUM,
+      dd::String_type(table->table_name, strlen(table->table_name)),
+      file->stats, file->checksum(),
+      file->ha_table_flags() & static_cast<ulong>(HA_HAS_CHECKSUM),
       analyze_table->found_next_number_field);

However, as my reviewer pointed out:

While this change seems to fix issue for case of
--lower-case-table-names=1 it doesn't fix similar issues for
--lower-case-table-names=2. Moreover I think it will actually
break some previously working scenarios using --lctn=2
mode. E.g. if now in --lctn=2 one consistently
uses "TableNameWithSomeUpperCaseLetters" as table name in all
statements including ANALYZE TABLE, then the stats should be
properly updated and visible I_S.TABLES queries, but after your
fix this will be broken since underlying DD.TABLES and
DD.TABLE_STATS will use different versions/casing of table name
which will be compared using utf8_bin collation.

I think the real reason behind both issue reported in the bug and
similar issues for lctn=2 mode is that our DD mysql.table_stats
and mysql.index_stats tables use wrong collation for schema and
table name columns on systems with lctn = 1|2. IMO they should be
using the same collation as mysql.tables.name columns use on this
systems - i.e. utf8_tolower_ci.

I think that the proper fix for the problem is to change the
collation of schema and table_name columns in
mysql.table/index_stats tables (in case of lctn>0). Of course
this means that upgrade will become non-trivial (we either have
to remove potential duplicate rows during upgrade or wipe these
tables clean, but the latter might cause some screaming and
better to avoid, IMO).

Alternatively, if there is a need to provide fix for original
issue reported ASAP, we can probably adjust your fix to limit it
only to --lctn=1 case and report separate bug about --lctn=2
cases, which will require more work/bigger fix. But if there is
no pressure from Support or Customer I would prefer not to go
this way as it means more or less double work.