MySQL Bugs: #30529: NDB does not use default values extra columns in new rows

Bug #30529	NDB does not use default values extra columns in new rows
Submitted:	21 Aug 2007 7:16	Modified:	26 May 2010 14:58
Reporter:	Rafal Somla	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Cluster: Replication	Severity:	S3 (Non-critical)
Version:	mysql-5.1	OS:	Any
Assigned to:	Martin Skold	CPU Architecture:	Any
Tags:	5.1.21

Description:
When calling write_row method of NDB handler, one fills table->write_set bitmap which tells which columns in the input record should be written into table and which should be ignored. 

This is used in replication, when rows are written in "update mode" so that if during a write a conflicting row is found then its columns which are present in write_set are updated with new values.

However, the same write operation (with some columns excluded from write_set) can be used to insert new rows into a table. In that case, if default values are defined for the missing columns, they should be used to initialized them. But, instead, the missing columns always get NULL value.

For example, consider inserting a row into empty table t(a int, b char default 'x'). Suppose write_set is "10" so that only column a is affected. After write_row(1,-) (value for column b is ignored), table will contain row (1,NULL)
instead of (1,'x').

How to repeat:
This problem manifests itself in rpl_ndb_extraCol test whose results are wrong (and wrong results are stored in the result file). A simplified version of the test could look like this:
-------------------------------------------------------------------------
--source include/have_ndb.inc
--source include/master-slave.inc

# create table on slave with extra column
connection slave;
stop slave;
reset slave;

create table t (a int, b char default 'x') engine=ndb;

# create table on master
connection master;

create table t (a int) engine=ndb;

# replicate write row to slave
reset master;
insert into t values (1);

connection slave;
sync_slave_with_master;
select * from t;
-------------------------------------------------------------------------

On slave t will contain (1,NULL) instead of (1,'x'). Replicataion setup is used
here to force write_row() with incomplete write_set which happens when slave applies Write_rows event. 

Suggested fix:
When executing write_row() set values of columns not present in write_set to their default values.

Opps, correction to the test script - I forgot to add "start slave"...
-------------------------------------------------------------------------
--source include/have_ndb.inc
--source include/master-slave.inc

# create table on slave with extra column
connection slave;
stop slave;
reset slave;

create table t (a int, b char default 'x') engine=ndb;

# create table on master
connection master;

create table t (a int) engine=ndb;

# replicate write row to slave
reset master;
insert into t values (1);

connection slave;
start slave;
sync_slave_with_master;
select * from t;
-------------------------------------------------------------------------

This is a behavior of NDB and is not related to how the server executes code. Since NDB does not store the default values internally, it will automatically write NULL values for all columns that are not included in the write_set. This behavior is going to change as soon as NDB has native support for handling the default values.

to be fixed with WL#4197 ndb native support for default values.
(I think)

David?

T

This problem appears to be even worse for extra columns that are NOT NULL (with some non-NULL default value).

In this case, it seems that rows inserted on the master are simply dropped on the slave and not replicated. Ie. this will lead to rows present on the master but missing on the slave.

I checked a signal trace, and what happens is that the slave thread does a
writeTuple() without value for the extra column. This tries to insert NULL for the
extra column (as described above), but as the column is non-NULLable, this fails
with contraint violation.

This constraint violation error is apparently ignored by the slave, replication
does not stop but simply carries on with wrong (missing) data in the table.

I think there is some confusion where the bug is here. Currently no storage
engines knows anything about default values, these are all filled in by the
layer calling the handler interface. The ndb handler is the only handler that
uses the write_set (to optimize communication to the data nodes). If the bit for
a field in the write_set is not set then the field is set to NULL (if the field
is defined as NOT NULL then a value of the field must ALWAYS be passed, thus
the bit in the write set must ALWAYS be set). When a field has a defined default
value the bit in the write_set must be set in this case so that the value is
passed properly. The reason why it works anyway in other storage engines is
because they ignore the write_set and always write complete rows.
The bug must be in the replication layer since it must supply the default values, not the storage engine. In the future Ndb might support adding default values internally, but this would just be an optimization (the default values
do not need to be sent over the network), NOT the fix for this bug.

The replication layer must set the bits in the write_set properly.

This is fixed in mysql-5.1-telco-7.1 and is verified
by test rpl_ndb_extraCol. Note that this test passes
in earlier releases, but with the above incorrect
NULL values on slave. (test is active in older releases
since it verifies other functionality as well).

BUG#47742 seems to be related to BUG#30529.

Lars, can you show up and explain what changes you're thinking about doing ?

Martin,

The consequences of you proposal is that every column should either have:

- the real value, and set the bit in the write_set, or
- the default value, and set the bit in the write_set.

This means that all bits of the write_set will always be set, which renders the idea of having a write_set pretty useless.

Furthermore, should Cluster later decide to support default values, how shall the engine detect if a column is a default value or not? It is of course possible to compare the actual value and see if it is a default value, which works fine for small values, but is a waste of cycles if the table contain, say, strings.

If, instead, the bits of the write_set indicate what columns the user provided (as it is now), it is easy to detect what columns hold default values since a clear bit means that the default was used, and a set bit means that the value in the record was given by the user.

However, if all the bits of the write_set is set (the consequence of your proposal), this will, as far as I can tell, actually force NDB to always transfer all columns to the nodes, or implement the expensive comparison mentioned above.

It will also force any other engines that only want to process the columns that were actually provided by the user to process all columns, even if not necessary (for example Falcon or some column-oriented engines).

We could of course add a few table flags to configure this behavior, but the set of table flags is already large enough so that it is problematic for storage engine implementors, and server developers as well, to understand what flags should be used where.

I guess we agree to that view: bug is not in write_set. Problem has to be solved on ndb side for now.

Since the slave side guarantees to have the default value filled in for all column that have a default value, the problem can be solved by reading the value from the record for each column that does not have the bit set but does have a default value.

Checking if a field has a default value can be done by checking the NO_DEFAULT_VALUE_FLAG in the Field::flags field, which seems to be well-defined at least for non-blobs.

Test file

Attachment: rpl_ndb_bug#30529.test (application/octet-stream, text), 781 bytes.

Result file

Attachment: rpl_ndb_bug#30529.result (application/octet-stream, text), 362 bytes.

The 'native default values' feature (WL4197) has been 'backported' to mysql-5.1-telco-7.0.

It is available from 7.0.15 and 7.1.4.

To clarify :
- Replication is correct to *not* set the write set bits for columns with no user/event-supplied value.
- The problem occurs because Ndb replication sometimes logs Update operations as WRITE_ROW events and sometimes logs only updated columns. When applying these at the slave we cannot tell whether the event was originally an INSERT from a Master with fewer columns, or an UPDATE where not all columns were affected.
- The Ndb slave code maps WRITE_ROW onto the NdbApi write operation which internally becomes an Insert if the row does not exist and an Update if it does.
- In cases where the NdbApi write operation becomes an Insert and there are no values for some columns, the native defaults will be used.
- Prior to WL4197 these could only ever be NULL.
- Where the column was defined as NOT NULL, the Insert (write) had to fail.

An additional problem, as described above, is that in some modes, there is no 'user specified' default for a column, but MySQL provides a 'type-specific' default, usually a variant of '0', meaning that it might be expected for a NOT NULL column to be set to this default when no value is supplied.
This functionality is also supported, as MySQLD tells Ndb about this 'type specific' default in the cases where it should be used (the MySQLD is not in a STRICT mode, and the column is not Nullable).

Tests and scenarios which failed previously should be retested against 7.0.15 or 7.1.4.

Note that the 7.1 branch referred to in previous comments is now known as telco-ng and is deprecated.

Per developer comments this was fixed by completion of WL#4197.

I've updated the relevant changelog entry to reference the correct Cluster versions, as well as this bug report.

Closed.