Bug #40306 Replication of externally build tables (like "myisampack"ed ones)
Submitted: 24 Oct 2008 13:53 Modified: 11 Nov 2008 15:55
Reporter: Serdar S. Kacar Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version:5.1.x, 6.x OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Triage: Needs Triage: D5 (Feature request)

[24 Oct 2008 13:53] Serdar S. Kacar
Description:
Externally build tables cannot be replicated easily. The most affected kind is probably ones that are "myisampack"ed.

Why one should use "myisampack" is another issue but to give a hint :
- Consider batches (millions of rows) of data flowing that belong different 
  ranges (say days). 
- Those data should be indexed and served immediately.
- Highly fragmented recent batches in a range should be integrated into larger
  ones (myisampack --join) regularly that otherwise result in millions 
  of tables.
- Total data amount is tens of TBs.
In this case, 
- First of all, InnoDB eliminated as substancial growth 
  in disk space requirement and, 16K blocks, low cache hit in buffer pool, etc.
  (Even this storage engine could be allocated for tasks those have 
   low memory requirements but need all data fit in memory)
- In case of MyISAM, partition or merge table solutions would get the job done :
  - not crash safe !
  - %100 increase in disk space requirement relative to packed ones.
  - INSERT ... SELECT is not as efficient as "myisampack --join" 
    even all the tables involved are locked.
  - Did ever you tried "SELECT .. FROM MergedTableA JOIN MergedTableB" ? 
    Do not ! It is not designed for this purpose.
  - Partitioning is promising but is far from being mature yet. 
    One day partitioning over "Maria" (but not MyISAM) could be feasible 
    for this case.

Unfortunately, myisampack is an external tool and, logically, should be so. 
Consequently, it has nothing to do with binary log directly. As of today, one can at best do
- SELECT * INTO OUTFILE
- LOAD DATA INFILE
to make it replicated. But,
- table during build on the slave is not crash safe,
- build table is not packed,
- building index will steal from CPU and the need for myisam_sort_buffer_size
  forces decrease in memory allocation for other modules 
  (probably key_buffer_size).

Replication of externally build tables could be performed by external replicator. But that breaks replication order (serializability.)
There would be other statements in binary log that depend on the existance of the introduced file like
CREATE VIEW ... SELECT ... FROM xxx; 
  /* will not allow to do that if the base table is not found. */
So an external replication solution combined with MySQL replication, has to deal with complications - let alone their central management issues.

The feature request is essentially introducing a new replication statement like
INTRODUCED TABLE ... [FORCE BINARY] [OVERWRITE]
That will be written to binary log as a statement.

Encountering this statement while playing master bin log, replication agent on the slave would try to get the table.

Replication agent on the master :

- If table not found, return an error.

- Else if the table has its own tables (like MyISAM) 
  and (the file formats compatible with the that of slave 
       OR "FORCE BINARY" option specified)
  and is read-only (like packed MyISAM as of today) :
  * Simply send (.MYD, .MYI and frm) files in binary mode

- Else if the table has its own tables (like MyISAM)
  and (the file formats compatible with the that of slave 
       OR "FORCE BINARY" option specified) :
  * "FLUSH TABLES xxx WITH READ LOCK;" 
  * Send (.MYD, .MYI and frm) files in binary mode
  * "UNLOCK TABLES"

- (In the first implementation) Else return "cannot do" error 

- (When the implementation evolved) Else 
  * "FLUSH TABLES xxx WITH READ LOCK;" 
  * -- todo: something similar to RBR 
  * "UNLOCK TABLES"

Replication agent on the slave :

- In binary mode :
  Save .frm file to file .frm.trans. 
  When all files transferred, remove .trans extention from the .frm file.
  (To avoid storage engine intervention during the transfer.) 

- In data mode :
  Not yet implemented.
 

Acceptable behaivours :

- Slave storage engine cannot read data if storage engine file format is lower 
  than that of introduced file.
  Storage engine will probably emit "Can't open file: '%s'" on 
  statements using this table. 
  No problem as this situation is the result of [FORCE BINARY] option.

- .frm file (one that should copied last) already exists on the slave. Either 
  - skip introduction with a warning message, or, 
  - if it can manage to replace a table already openned by the storage engine 
    and [OVERWRITE] option specified, overwrite it.

How to repeat:
N/A
[11 Nov 2008 15:55] Susanne Ebrecht
Many thanks for writing a feature request. We will discuss this.