Bug #18413 Data usage for varsize columns is not correctly reported to mysqld
Submitted: 22 Mar 2006 9:08 Modified: 8 Jul 2006 7:17
Reporter: Chengyao Jiang Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:5.1.7 OS:Linux (Linux)
Assigned to: Martin Skold

[22 Mar 2006 9:08] Chengyao Jiang
Description:
I have a table with MyISAM store engine and its size is about 340M with 5,000,000 rows
When I convert it to NDBCLUSTER with 5.0.18 store engine, its size becomes 1.4G
With MySQL 5.1.7 beta, its size becomes 259.84M
Huh, much smaller than MySQL 5.0.18
But I confused about it, why its smaller than the original ?
As mentioned in MySQL Reference Manual, it has overheads in MySQL Cluster,so the size is theoretically bigger than original with MyISAM
My calculation is:
select table_name,engine,round(data_length/1024/1024,2) from Information_Schema.tables where table_name = 'user1'
+------------+------------+--------------------------------+
| table_name | engine | round(data_length/1024/1024,2) |
+------------+------------+--------------------------------+
| user1 | MyISAM | 332.81 |
| user1 | NDBCLUSTER | 259.84 |
+------------+------------+--------------------------------+

http://lists.mysql.com/cluster/3285

How to repeat:
none
[22 Mar 2006 9:11] Valerii Kravchuk
Thank you for a problem report. Please, send the SHOW CREATE TABLE and SHOW TABLE STATUS results for your table.
[28 Mar 2006 0:59] Chengyao Jiang
show create table
| user1 | CREATE TABLE `user1` (
  `id` int(11) NOT NULL,
  `username` varchar(18) NOT NULL default '',
  `nickname` varchar(20) NOT NULL default '',
  `password` varchar(32) NOT NULL default '',
  `sex` char(1) NOT NULL default '',
  `rdate` date NOT NULL default '0000-00-00',
  `rtime` time NOT NULL default '00:00:00',
  `securecode` varchar(32) NOT NULL default '',
  `vis` tinyint(1) NOT NULL default '0',
  `logdate` datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY (`id`),
  KEY `username` (`username`),
  KEY `nickname` (`nickname`)
) ENGINE=ndbcluster DEFAULT CHARSET=gb2312 PARTITION BY KEY ()  |

mysql> show table status;
+-------+------------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-------------------+----------+----------------+-----------------------+
| Name  | Engine     | Version | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation         | Checksum | Create_options | Comment               |
+-------+------------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-------------------+----------+----------------+-----------------------+
| user1 | NDBCLUSTER |      10 | Dynamic    | 4999253 |             36 |   273809408 |               0 |            0 |         0 |           NULL | NULL        | NULL        | NULL       | gb2312_chinese_ci |     NULL |                | number_of_replicas: 1 |
+-------+------------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+-------------+-------------+------------+-------------------+----------+----------------+-----------------------+
1 row in set (0.00 sec)
[12 May 2006 8:35] Valerii Kravchuk
Have you tried to perform OPTIMIZE TABLE for your MyISAM table before converting and check it's "optimized" size?

In case the above will not change the results you got, please, try to repeat with a newer version, 5.1.9-beta, and inform about the results.
[12 May 2006 11:22] Jonas Oreland
Hi,

5.1 introduces *true* varchar,
but the data_length column does currently only show the
  size for the *fixed* size columns...

Changing title therefore...

ps.
  you can use "all dump 1000" to see how much memory actually is allocated.
  it should be much smaller in 5.1, but not as little as MyIsam
ds.
[21 Jun 2006 11:50] Stewart Smith
Perhaps we should report "total" in the SHOW statement and create a WL for reporting fixed and varsized usage through I_S.FILES?

e.g. rows for
MEMORY#DataMemory:Fixed
MEMORY#DataMemory:varsized

or whatever somebody decides the names of the "files" should be.
[3 Jul 2006 15:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8650
[4 Jul 2006 8:40] Martin Skold
Note that 10 pages are allocated at a time for varsized data so the reported
data usage will jump in 320K increments.
[8 Jul 2006 7:17] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://www.mysql.com/doc/en/Installing_source_tree.html
[8 Jul 2006 7:18] Jon Stephens
Documented in 5.1.12 changelog and I_S.TABLES section of 5.1 Manual.