Bug #20252 NDB API: NdbIndexScanOperation::readTuples ignores the batch parameter
Submitted: 4 Jun 2006 0:04 Modified: 4 Jul 2006 15:07
Reporter: Anatoly Pidruchny (Candidate Quality Contributor)
Status: Closed
Category: Server: Cluster Severity: S3 (Non-critical)
Version: 5.0.21-max OS: Any (all)
Assigned to: Jonas Oreland Target Version:

[4 Jun 2006 0:04] Anatoly Pidruchny
Description:
As documented in Ndb.hpp, the batch parameter of NdbScanOperation::readTuples() and
NdbIndexScanOperation::readTuples() controls how many rows will be locked per fragment.
The same parameter specifies how many records will be returned to the client from the
server by NdbScanOperation::nextResult(true). Our testing, and then a look at the source
code, showed that the batch parameter is in fact ignored. The readTuples() function with
the batch parameter in NdbScanOperation.hpp is even effectively removed, because it is
defined under some macro "ndb_readtuples_impossible_overload", and its signature is the
same as the signature of another readTuples() function with no batch parameter.
NdbIndexScanOperation.hpp has a readTuples() function with the batch parameter, but the
parameter is also ignored there:

  inline int readTuples(LockMode lock_mode,
                        Uint32 batch, 
                        Uint32 parallel,
                        bool order_by,
                        bool order_desc = false,
                        bool read_range_no = false) {
    Uint32 scan_flags =
      (SF_OrderBy & -(Int32)order_by) |
      (SF_Descending & -(Int32)order_desc) |
      (SF_ReadRangeNo & -(Int32)read_range_no);
    return readTuples(lock_mode, scan_flags, parallel);
  }
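
To make the problem easier to see outside the NDB sources, here is a minimal standalone sketch of the same wrapper shape. It is not the NDB API: the SF_* bit values are placeholders, and bool_mask, make_scan_flags, ScanRequest and the two read_tuples_* functions are hypothetical names used only for illustration. The first wrapper mirrors the quoted code (batch is accepted and then dropped), while the second shows what forwarding the value might look like.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder bit values, for illustration only; the real SF_* constants
// are defined by the NDB API headers and may differ.
enum : std::uint32_t {
    SF_OrderBy     = 1u << 0,
    SF_Descending  = 1u << 1,
    SF_ReadRangeNo = 1u << 2
};

// -(int32_t)true is -1 (all bits set) and -(int32_t)false is 0, so
// `flag & bool_mask(b)` yields the flag when b is true and 0 otherwise.
constexpr std::uint32_t bool_mask(bool b) {
    return static_cast<std::uint32_t>(-static_cast<std::int32_t>(b));
}

constexpr std::uint32_t make_scan_flags(bool order_by, bool order_desc,
                                        bool read_range_no) {
    return (SF_OrderBy & bool_mask(order_by)) |
           (SF_Descending & bool_mask(order_desc)) |
           (SF_ReadRangeNo & bool_mask(read_range_no));
}

// Hypothetical record of what a wrapper hands to the underlying scan setup.
struct ScanRequest {
    std::uint32_t scan_flags;
    std::uint32_t parallel;
    std::uint32_t batch;  // 0 stands for "no batch size requested"
};

// Same shape as the quoted wrapper: batch is accepted but never passed on.
inline ScanRequest read_tuples_dropping_batch(std::uint32_t batch,
                                              std::uint32_t parallel,
                                              bool order_by,
                                              bool order_desc = false,
                                              bool read_range_no = false) {
    (void)batch;  // silently ignored, which is the behaviour reported here
    return ScanRequest{make_scan_flags(order_by, order_desc, read_range_no),
                       parallel, 0};
}

// Variant that forwards batch to the underlying request.
inline ScanRequest read_tuples_forwarding_batch(std::uint32_t batch,
                                                std::uint32_t parallel,
                                                bool order_by,
                                                bool order_desc = false,
                                                bool read_range_no = false) {
    return ScanRequest{make_scan_flags(order_by, order_desc, read_range_no),
                       parallel, batch};
}

// Small self-check that can be called from main().
inline void check_wrappers() {
    assert(read_tuples_dropping_batch(1, 0, true).batch == 0);
    assert(read_tuples_forwarding_batch(1, 0, true).batch == 1);
}
```

With the first wrapper, a caller asking for a batch of 1 ends up with no batch size at all in the request, so whatever default applies further down decides how many rows are fetched and locked.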

We need to use the batch parameter, because we want to avoid excessive locking of records
during scans with an exclusive lock. One of the very important features of our application
requires finding one record, doing a scan with exclusive lock, then updating this one
record. Several processes with multiple threads are going to do this operation in
parallel, so it is vital that the operation does not lock many records unnecessarily. It
is OK to lock 2 records instead of 1 (if there are 2 data fragments), but it is not OK to
lock more than a hundred records when only one record is actually going to be updated.

I think MySQL's NDBCLUSTER engine could also benefit from using the batch parameter for
SELECT ... FOR UPDATE LIMIT x statements.

How to repeat:
Please see the attached test program. During my testing on a test Cluster with 2 ndbd
nodes, this program prints:

populate: Success!
Real Batch Size is 123
Real Batch Size is 6
Real Batch Size is 20
Real Batch Size is 1
scan_print: Success!

If the batch parameter were not ignored, then the output would always have Real Batch
Size 1, because the test Cluster has only one data fragment.

Suggested fix:
1. In NdbScanOperation.hpp, add the batch parameter into the readTuples function in
NdbScanOperation class with default value 0. Value 0 means max performance.
2. Add an m_batchSize field to the NdbScanOperation class. The function
NdbScanOperation::readTuples should save the batch parameter into the m_batchSize field.
3. In NdbIndexScanOperation.hpp, add the batch parameter into the first readTuples
function in NdbIndexScanOperation class with default value 0. Make the second inline
readTuples function pass the batch parameter to the first readTuples function. In the
implementation, pass the batch parameter from the NdbIndexScanOperation::readTuples to
the NdbScanOperation::readTuples.
4. In the NdbScanOperation.cpp, make the function NdbScanOperation::prepareSendScan take
into account the value of the m_batchSize field. It should assign the m_batchSize to the
batch_size local variable. NdbScanOperation::prepareSendScan calls
NdbReceiver::calculate_batch_size function to calculate the batch size. The batch_size
parameter of the NdbReceiver::calculate_batch_size function should be used as in-out, not
as just an out parameter. The function NdbReceiver::calculate_batch_size should simply
accept the passed value of the batch_size parameter if it is greater than 0 and less than
the value that the function would otherwise calculate. If the value of the batch_size
parameter is 0 or greater than the maximum possible value, then the calculated maximum
possible value should be used.
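
The selection rule in step 4 can be sketched as a small standalone function. This is only an illustration of the rule described above, not the real NdbReceiver::calculate_batch_size: the name effective_batch_size and its two-argument form are hypothetical, and the sketch assumes the calculated maximum is at least 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the real NdbReceiver::calculate_batch_size.
// `requested` plays the role of the in-out batch_size argument, and
// `calculated_max` is the value the function would compute on its own.
inline std::uint32_t effective_batch_size(std::uint32_t requested,
                                          std::uint32_t calculated_max) {
    // Assumption for this sketch: the computed maximum is at least 1.
    assert(calculated_max > 0);

    // 0 means "max performance"; an oversized request is capped as well.
    if (requested == 0 || requested > calculated_max) {
        return calculated_max;
    }
    // Otherwise honour the caller's smaller batch size.
    return requested;
}
```

For example, with a calculated maximum of 64, a request of 1 stays 1, while a request of 0 or 500 becomes 64.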
[4 Jun 2006 0:05] Anatoly Pidruchny
The batch parameter of readTuples function is ignored.

Attachment: batch_param_ignored.cpp (application/octet-stream, text), 8.95 KiB.

[5 Jun 2006 14:54] Valeriy Kravchuk
Changed category to a more appropriate one.
[6 Jun 2006 9:58] Jonas Oreland
Hi,

This is something that has been on my todo list for ages now.
Great that there is now a bug report; hopefully we will get around to fixing it.

/Jonas
[27 Jun 2006 11:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8302
[27 Jun 2006 12:02] Jonas Oreland
Hi

Would you like to apply the patch and test it before I push it?

/Jonas
[27 Jun 2006 18:49] Anatoly Pidruchny
Hi, Jonas,

thank you very much for taking care of this issue. I am going to apply the patch and test
it today.

Anatoly.
[28 Jun 2006 2:01] Anatoly Pidruchny
Hi, Jonas,

I have applied and tested the patch today. It worked like a charm! Please push it to the
main and public repositories when you are ready. I hope the fix will also be propagated
to version 5.1.

Thanks,

Anatoly.
[28 Jun 2006 11:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8388
[29 Jun 2006 11:50] Tomas Ulin
pushed to 5.1.12
[4 Jul 2006 13:40] Jonas Oreland
pushed into 5.0.24
[4 Jul 2006 15:07] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of
that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available
version, including the bug fix. More information about accessing the source trees is
available at

    http://www.mysql.com/doc/en/Installing_source_tree.html
[4 Jul 2006 15:11] Jon Stephens
Documented bugfix in 5.0.23/5.1.12 changelogs; updated NDB API Guide method descriptions
with note about this parameter's behaviour.
[13 Jul 2006 5:29] Paul DuBois
5.0.x fix went to 5.0.25 instead.