Description:
SHOW TABLE STATUS may show inconsistent numbers with big blobs and transactions.
E.g. with the script below, SHOW TABLE STATUS reports inconsistent stats after a parallel transaction is committed.
I understand the script shows an edge-case scenario, but in practice the problem still exists, and the size of the .ibd file often differs too much from data_length+index_length+data_free when long transactions exist / existed.
It looks like the workaround is to run ANALYZE TABLE twice in a row while no long active transactions exist.
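The .ibd-vs-stats mismatch described above can be checked mechanically. The sketch below (an illustration I am adding, not part of the original report) computes how much of the on-disk file size is unaccounted for by the values SHOW TABLE STATUS reports; the sample numbers are the ones from the repro that follows.

```python
# Sketch (hypothetical helper, not from the report): flag when the on-disk
# .ibd size diverges from what SHOW TABLE STATUS accounts for.

def stats_divergence(ibd_bytes: int, data_length: int,
                     index_length: int, data_free: int) -> float:
    """Relative share of the .ibd file not covered by
    data_length + index_length + data_free."""
    accounted = data_length + index_length + data_free
    return abs(ibd_bytes - accounted) / ibd_bytes

MB = 1024 * 1024
# Numbers from the repro below: after the parallel transaction the .ibd
# file is ~260M while SHOW TABLE STATUS accounts for only ~128M.
divergence = stats_divergence(260 * MB, 128 * MB, 0, 0)
print(f"{divergence:.2f}")  # roughly 0.51 - half the file is unaccounted for
```

A real check would read the file size with os.path.getsize() on the tablespace file and the stats from INFORMATION_SCHEMA.TABLES.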
How to repeat:
# make sure innodb_log_file_size is >1G
# open 2 connections and execute commands below:
conn#1:
set global max_allowed_packet=128*1024*1024;
\r
create table t(a longblob) engine=innodb;
# Prepare data:
insert into t select repeat('a',128*1024*1024);
update t set a = repeat('b',128*1024*1024);
analyze table t;
analyze table t;
show table status like 't';
# Actual: Avg_row_length=134M Data_length=134M Data_free=134M
# Expect: Avg_row_length=134M Data_length=134M Data_free=134M
-------------------------------------------
conn#2:
\r
set tx_isolation='repeatable-read';
start transaction;
select count(*) from t;
-------------------------------------------
conn#1:
update t set a = repeat('c',128*1024*1024);
show table status like 't';
# Actual: Avg_row_length=128M Data_length=128M Data_free=0M
# Expect: Avg_row_length=128M Data_length=270M Data_free=0M
# Data_length+Data_free should be ~260M, because the .ibd file grows to 260M
# now try analyze table
analyze table t;
show table status like 't';
# Actual: Avg_row_length=270M Data_length=270M Data_free=0M
# ^ now Data_length is correct but Avg_row_length is twice as large as it should be
-------------------------------------------
# let's commit the transaction from conn#2:
conn#2:
commit;
-------------------------------------------
conn#1:
# table status doesn't change:
show table status like 't';
# Actual: Avg_row_length=270M Data_length=270M Data_free=0M
# Expect: Avg_row_length=128M Data_length=270M Data_free=0M
# but the values become even more inconsistent after ANALYZE:
analyze table t;
show table status like 't';
# Actual: Avg_row_length=270M Data_length=270M Data_free=132M
# Expect: Avg_row_length=128M Data_length=128M Data_free=128M
# to work around, run ANALYZE TABLE once again:
analyze table t;
show table status like 't';
# Avg_row_length=135M Data_length=135M Data_free=135M
# ^ those values are correct
Suggested fix:
Make behavior consistent:
- Old versions of rows must be consistently counted in either Data_length or Data_free.
- Correctly calculate Avg_row_length when old versions of rows are present.
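The suggested fix can be illustrated with a toy accounting model (my sketch, not InnoDB source code): charge old row versions to Data_free and compute Avg_row_length from live rows only. With the repro's one 128M row plus its 128M old version, this yields the expected values instead of the doubled ones.

```python
# Sketch (hypothetical model of the suggested fix, not actual InnoDB code):
# old row versions go to data_free; avg_row_length uses live rows only.

MB = 1024 * 1024

def table_stats(live_row_bytes: list[int], old_version_bytes: int):
    data_length = sum(live_row_bytes)      # latest row versions only
    data_free = old_version_bytes          # old versions counted as free
    avg_row_length = data_length // max(len(live_row_bytes), 1)
    return data_length, data_free, avg_row_length

# One live 128M row plus a 128M old version kept for a long transaction:
dl, df, avg = table_stats([128 * MB], 128 * MB)
print(dl // MB, df // MB, avg // MB)  # 128 128 128
```

Under this model Avg_row_length stays at 128M regardless of how many old versions the long transaction keeps alive, matching the "Expect" lines in the repro.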