Bug #43683 OPTIMIZE TABLE doesn't return memory
Submitted: 16 Mar 2009 22:38 Modified: 5 Aug 2009 7:55
Reporter: Gustaf Thorslund Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:mysql-5.1-telco-6.3 OS:Linux
Assigned to: Jonas Oreland CPU Architecture:Any
Tags: 6.4 (from bzr today), memory usage, mysql-5.1-telco-6.3 -> cluster 6.3, optimize table

[16 Mar 2009 22:38] Gustaf Thorslund
Description:
Running OPTIMIZE TABLE on a table where half of the rows just have been removed doesn't return any DATA MEMORY.

How to repeat:
CREATE TABLE `foo` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `line` varchar(1024) /*!50120 COLUMN_FORMAT DYNAMIC */ DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1

# Did some filling and removing of the table before so lets do.
mysql> truncate table foo;

ndb_mgm> all report memory

ndb_mgm> Node 1: Data usage is 0%(29 32K pages of total 3200)
Node 1: Index usage is 0%(19 8K pages of total 12832)
Node 2: Data usage is 0%(29 32K pages of total 3200)
Node 2: Index usage is 0%(19 8K pages of total 12832)
Node 3: Data usage is 0%(18 32K pages of total 3200)
Node 3: Index usage is 0%(2 8K pages of total 12832)
Node 4: Data usage is 0%(18 32K pages of total 3200)
Node 4: Index usage is 0%(2 8K pages of total 12832)

mysql> load data infile '/tmp/big.txt' into table foo (line);
mysql> load data infile '/home/gustaf/issues/34420/big.txt' into table foo (line);

Query OK, 385230 rows affected, 18549 warnings (16.12 sec)
Records: 385230  Deleted: 0  Skipped: 0  Warnings: 18549
# Some lines where a bit too long, that's why the warnings are there.

ndb_mgm> all report memory

ndb_mgm> Node 1: Data usage is 18%(607 32K pages of total 3200)
Node 1: Index usage is 2%(349 8K pages of total 12832)
Node 2: Data usage is 18%(607 32K pages of total 3200)
Node 2: Index usage is 2%(349 8K pages of total 12832)
Node 3: Data usage is 18%(597 32K pages of total 3200)
Node 3: Index usage is 2%(332 8K pages of total 12832)
Node 4: Data usage is 18%(597 32K pages of total 3200)
Node 4: Index usage is 2%(332 8K pages of total 12832)

mysql> delete from foo where id % 2 = 0;
Query OK, 192615 rows affected (3.04 sec)

ndb_mgm> all report memory

Node 1: Data usage is 18%(607 32K pages of total 3200)
Node 1: Index usage is 1%(214 8K pages of total 12832)
Node 2: Data usage is 18%(607 32K pages of total 3200)
Node 2: Index usage is 1%(206 8K pages of total 12832)
Node 3: Data usage is 18%(597 32K pages of total 3200)
Node 3: Index usage is 1%(198 8K pages of total 12832)
Node 4: Data usage is 18%(597 32K pages of total 3200)
Node 4: Index usage is 1%(206 8K pages of total 12832)

mysql> optimize table foo;
+----------+----------+----------+----------+
| Table    | Op       | Msg_type | Msg_text |
+----------+----------+----------+----------+
| test.foo | optimize | status   | OK       | 
+----------+----------+----------+----------+
1 row in set (38.43 sec)

ndb_mgm> all report memory

Node 1: Data usage is 18%(607 32K pages of total 3200)
Node 1: Index usage is 1%(214 8K pages of total 12832)
Node 2: Data usage is 18%(607 32K pages of total 3200)
Node 2: Index usage is 1%(206 8K pages of total 12832)
Node 3: Data usage is 18%(597 32K pages of total 3200)
Node 3: Index usage is 1%(198 8K pages of total 12832)
Node 4: Data usage is 18%(597 32K pages of total 3200)
Node 4: Index usage is 1%(206 8K pages of total 12832)

The test was run on 6.4 from bzr, more specific:
hillbilly% bzr version-info
revision-id: pekka@mysql.com-20090316170507-kp3n7um1txenx487
date: 2009-03-16 19:05:07 +0200
build-date: 2009-03-16 23:19:55 +0100
revno: 2952
branch-nick: cluster-6.4

Also tested with 6.3 from bzr with same result.
hillbilly% bzr version-info 
revision-id: jonas@mysql.com-20090316151554-pm72q8jn4awmwpse
date: 2009-03-16 16:15:54 +0100
build-date: 2009-03-16 23:35:17 +0100
revno: 2913
branch-nick: cluster-6.3

Suggested fix:
Not sure, but it sure would be nice.
[16 Mar 2009 23:01] Gustaf Thorslund
[ndbd default]
DataMemory = 100M
IndexMemory = 100M
NoOfReplicas = 2

big.txt was generated by merging a number of files from the source code.
[1 Jul 2009 10:10] Joffrey MICHAIE
Hello,

please find next the results with real data 

DataMemory = 26.5G
2 DataNodes (Replicas = 2)

Starting from scratch (initial)

Adding 2.2M "items" (data in many tables)

Node 3: Data usage is 55%(469241 32K pages of total 848000)
Node 3: Index usage is 42%(136082 8K pages of total 320032)
Node 4: Data usage is 55%(469241 32K pages of total 848000)
Node 4: Index usage is 42%(136082 8K pages of total 320032)

Doing Alter table to convert *int and char to COLUMN_FORMAT=Dynamic (excepted Primary Key)

Node 3: Data usage is 51%(439941 32K pages of total 848000)
Node 3: Index usage is 39%(126183 8K pages of total 320032)
Node 4: Data usage is 51%(439941 32K pages of total 848000)
Node 4: Index usage is 39%(126183 8K pages of total 320032)

Adding 1M new "items"

Node 3: Data usage is 75%(638826 32K pages of total 848000)
Node 3: Index usage is 59%(189852 8K pages of total 320032)
Node 4: Data usage is 75%(638832 32K pages of total 848000)
Node 4: Index usage is 59%(189853 8K pages of total 320032)

Removing 1M items by range (from Id 1M to Id 2M)

Node 3: Data usage is 74%(628834 32K pages of total 848000)
Node 3: Index usage is 47%(150730 8K pages of total 320032)
Node 4: Data usage is 74%(628840 32K pages of total 848000)
Node 4: Index usage is 47%(150729 8K pages of total 320032)

Running Optimize table on ALL tables :

Node 3: Data usage is 74%(627988 32K pages of total 848000)
Node 3: Index usage is 47%(150730 8K pages of total 320032)
Node 4: Data usage is 74%(627994 32K pages of total 848000)
Node 4: Index usage is 47%(150729 8K pages of total 320032)

Rolling restart the nodes :
1st node ...

Node 3: Data usage is 57%(486128 32K pages of total 848000)
Node 3: Index usage is 42%(137613 8K pages of total 320032)
Node 4: Data usage is 74%(627994 32K pages of total 848000)
Node 4: Index usage is 47%(150729 8K pages of total 320032)

2nd node ...

Node 3: Data usage is 57%(486128 32K pages of total 848000)
Node 3: Index usage is 42%(137613 8K pages of total 320032)
Node 4: Data usage is 57%(486128 32K pages of total 848000)
Node 4: Index usage is 42%(137613 8K pages of total 320032)

I believe that the Optimize command should have freed more than ~900 pages of DataMemory, with Ms of lines deleted.

Thanks in advance,
Joffrey
[1 Jul 2009 12:54] Gustaf Thorslund
Joffrey,

Looks like you experienced the same problem as I described in the bug, right?

/Gustaf
[4 Aug 2009 10:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79981

3003 Jonas Oreland	2009-08-04
      ndb - bug#45971 - crash during optimize table, make sure free list are searched correctly
                        reuse get_alloc_page (with size + 1) since index might grow
            bug#43683 - optimize table does not return memory,
                        try to move to page with *least* amount of free space
[4 Aug 2009 12:41] Jon Stephens
Versions? 6.2+?
[5 Aug 2009 6:57] Jon Stephens
(See Bug#45971)
[5 Aug 2009 7:55] Jon Stephens
Documented bugfix in the NDB-6.3.26 and 7.0.7 changelogs as follows:

        In some cases, OPTIMIZE TABLE on an NDB table did not free any
        DataMemory.