Description:
Externally build tables cannot be replicated easily. The most affected kind is probably ones that are "myisampack"ed.
Why one should use "myisampack" is another issue but to give a hint :
- Consider batches (millions of rows) of data flowing that belong different
ranges (say days).
- Those data should be indexed and served immediately.
- Highly fragmented recent batches in a range should be integrated into larger
ones (myisampack --join) regularly that otherwise result in millions
of tables.
- Total data amount is tens of TBs.
In this case,
- First of all, InnoDB eliminated as substancial growth
in disk space requirement and, 16K blocks, low cache hit in buffer pool, etc.
(Even this storage engine could be allocated for tasks those have
low memory requirements but need all data fit in memory)
- In case of MyISAM, partition or merge table solutions would get the job done :
- not crash safe !
- %100 increase in disk space requirement relative to packed ones.
- INSERT ... SELECT is not as efficient as "myisampack --join"
even all the tables involved are locked.
- Did ever you tried "SELECT .. FROM MergedTableA JOIN MergedTableB" ?
Do not ! It is not designed for this purpose.
- Partitioning is promising but is far from being mature yet.
One day partitioning over "Maria" (but not MyISAM) could be feasible
for this case.
Unfortunately, myisampack is an external tool and, logically, should be so.
Consequently, it has nothing to do with binary log directly. As of today, one can at best do
- SELECT * INTO OUTFILE
- LOAD DATA INFILE
to make it replicated. But,
- table during build on the slave is not crash safe,
- build table is not packed,
- building index will steal from CPU and the need for myisam_sort_buffer_size
forces decrease in memory allocation for other modules
(probably key_buffer_size).
Replication of externally build tables could be performed by external replicator. But that breaks replication order (serializability.)
There would be other statements in binary log that depend on the existance of the introduced file like
CREATE VIEW ... SELECT ... FROM xxx;
/* will not allow to do that if the base table is not found. */
So an external replication solution combined with MySQL replication, has to deal with complications - let alone their central management issues.
The feature request is essentially introducing a new replication statement like
INTRODUCED TABLE ... [FORCE BINARY] [OVERWRITE]
That will be written to binary log as a statement.
Encountering this statement while playing master bin log, replication agent on the slave would try to get the table.
Replication agent on the master :
- If table not found, return an error.
- Else if the table has its own tables (like MyISAM)
and (the file formats compatible with the that of slave
OR "FORCE BINARY" option specified)
and is read-only (like packed MyISAM as of today) :
* Simply send (.MYD, .MYI and frm) files in binary mode
- Else if the table has its own tables (like MyISAM)
and (the file formats compatible with the that of slave
OR "FORCE BINARY" option specified) :
* "FLUSH TABLES xxx WITH READ LOCK;"
* Send (.MYD, .MYI and frm) files in binary mode
* "UNLOCK TABLES"
- (In the first implementation) Else return "cannot do" error
- (When the implementation evolved) Else
* "FLUSH TABLES xxx WITH READ LOCK;"
* -- todo: something similar to RBR
* "UNLOCK TABLES"
Replication agent on the slave :
- In binary mode :
Save .frm file to file .frm.trans.
When all files transferred, remove .trans extention from the .frm file.
(To avoid storage engine intervention during the transfer.)
- In data mode :
Not yet implemented.
Acceptable behaivours :
- Slave storage engine cannot read data if storage engine file format is lower
than that of introduced file.
Storage engine will probably emit "Can't open file: '%s'" on
statements using this table.
No problem as this situation is the result of [FORCE BINARY] option.
- .frm file (one that should copied last) already exists on the slave. Either
- skip introduction with a warning message, or,
- if it can manage to replace a table already openned by the storage engine
and [OVERWRITE] option specified, overwrite it.
How to repeat:
N/A