Bug #120318 InnoDB partition statistics not auto-updated after ALTER TABLE REORGANIZE PARTITION
Submitted: 22 Apr 6:31
Reporter: Ayush Pandey (OCA) Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:9.6.0 OS:Linux
Assigned to: CPU Architecture:Any

[22 Apr 6:31] Ayush Pandey
Description:
When executing ALTER TABLE ... REORGANIZE PARTITION (or COALESCE PARTITION,
or REBUILD PARTITION), the partition statistics are never recalculated. 
The statistics tables are left with stale entries:

  - mysql.innodb_table_stats — n_rows, clustered_index_size, and
    sum_of_other_index_sizes remain at 0 for new partitions even though
    the partitions physically contain the copied rows.

  - mysql.innodb_index_stats — cardinality (n_diff_pfx*), n_leaf_pages,
    and size are stale or missing for every index of the affected
    partitions.

Impact on query optimizer

- With stale index cardinality in mysql.innodb_index_stats (and n_rows=0
  in mysql.innodb_table_stats) for partitions that actually contain data,
  the optimizer picks suboptimal plans — wrong indexes, wrong join
  orders, even full table scans instead of index seeks. The only
  workaround is a manual ANALYZE TABLE after every REORGANIZE.

Affected operations

  - ALTER TABLE ... REORGANIZE PARTITION p_old INTO (...)   -- split
  - ALTER TABLE ... REORGANIZE PARTITION p1, p2 INTO (...)  -- merge
  - ALTER TABLE ... COALESCE PARTITION N
  - ALTER TABLE ... REBUILD PARTITION

Expected: After the inplace ALTER succeeds, both innodb_table_stats and
innodb_index_stats reflect actual row counts and cardinality for each
new/modified partition, without needing ANALYZE TABLE.

Actual: New partitions show n_rows=0 in innodb_table_stats and
stale/missing entries in innodb_index_stats until ANALYZE TABLE is run.

How to repeat:
DROP TABLE IF EXISTS sales_data;

-- Create a partitioned table
CREATE TABLE sales_data (
    id INT NOT NULL AUTO_INCREMENT,
    sale_date DATE NOT NULL,
    region VARCHAR(50),
    amount DECIMAL(10,2),
    status VARCHAR(20),
    PRIMARY KEY (id, sale_date),
    KEY idx_status (status),
    KEY idx_region (region)
)

PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025)
);

-- Insert data with skewed distribution
-- Most records have status 'completed', very few have 'pending'

INSERT INTO sales_data (sale_date, region, amount, status)
SELECT 
    DATE_ADD('2020-01-01', INTERVAL (a.n + b.n * 10 + c.n * 100 + d.n * 1000) MOD 1460 DAY) as sale_date,
    CONCAT('Region_', (a.n + b.n) MOD 10) as region,
    ROUND(RAND() * 10000, 2) as amount,
    -- Only ~0.5% will be 'pending'
    CASE WHEN (a.n + b.n * 10 + c.n * 100) MOD 200 = 0 THEN 'pending' ELSE 'completed' END as status
FROM 
    (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) a,
    (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) b,
    (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) c,
    (SELECT 0 as n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 
     UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) d
LIMIT 100000;

-- Verify data distribution
SELECT status, COUNT(*) as count, 
       ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM sales_data), 2) as percentage
FROM sales_data
GROUP BY status;

-- make sure stats up to date
analyze table sales_data;

-- check table stats
select * from mysql.innodb_table_stats where table_name like '%sales_data%';

-- this should be optimal execution plan
EXPLAIN SELECT * FROM sales_data WHERE status = 'pending'\G

-- use reorganize partition into the same partitions to reset the table stats
ALTER TABLE sales_data REORGANIZE PARTITION p2020, p2021, p2022, p2023, p2024 INTO (PARTITION p2020 VALUES LESS THAN (2021), PARTITION p2021 VALUES LESS THAN (2022), PARTITION p2022 VALUES LESS THAN (2023), PARTITION p2023 VALUES LESS THAN (2024), PARTITION p2024 VALUES LESS THAN (2025));

-- check table stats to ensure they are clear
select * from mysql.innodb_table_stats where table_name like '%sales_data%';

-- this should be full table scan execution plan
EXPLAIN SELECT * FROM sales_data WHERE status = 'pending'\G

--Even after waiting 60 seconds, the innodb stats table is not automatically updated.
select sleep(60);

-- check again
select * from mysql.innodb_table_stats where table_name like '%sales_data%';

Suggested fix:
After successfully copying the data to new partition, update stats for each new partition (persistent or transient based on the table setting) to refresh the statistics.