Bug #19940 Valgrind warning in ps_7ndb test
Submitted: 19 May 2006 12:49 Modified: 20 Dec 2006 8:37
Reporter: Kristian Nielsen Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:5.0.22 OS:Linux (Linux/all)
Assigned to: justin he CPU Architecture:Any

[19 May 2006 12:49] Kristian Nielsen
Description:
There are valgrind warnings in test ps_7ndb. The problem is also exposed by the following minimal test case:

    use test;

    -- source include/have_ndb.inc
    drop table if exists t1 ;
    create table t1
    (
      a int, b varchar(10),
      primary key(a)
    ) engine = NDB;

    prepare stmt1 from 'insert into t1 values(?, ? )';
    set @arg01= 10002 ;
    set @NULL = null;
    execute stmt1 using @arg01, @NULL ;
    delete from t1 where a > 10000 ;
    set @arg01= 10003 ;
    execute stmt1 using @arg01, @arg01 ;

    drop table t1;

==18521== Syscall param socketcall.send(msg) points to uninitialised byte(s)
==18521==    at 0x4066428: send (socket.S:101)
==18521==    by 0x855725F: TransporterRegistry::performSend() (TransporterRegistry.cpp:1082)
==18521==    by 0x8557439: TransporterRegistry::forceSendCheck(int) (TransporterRegistry.cpp:1135)
==18521==    by 0x8548DB3: TransporterFacade::forceSend(unsigned) (TransporterFacade.cpp:770)
==18521==    by 0x853D358: Ndb::sendPrepTrans(int) (Ndbif.cpp:1133)
==18521==    by 0x853D493: Ndb::sendPollNdb(int, int, int) (Ndbif.cpp:1205)
==18521==    by 0x854CE2B: NdbTransaction::executeNoBlobs(NdbTransaction::ExecType, NdbTransaction::AbortOption, int) (NdbTransaction.cpp:442)
==18521==    by 0x854D118: NdbTransaction::execute(NdbTransaction::ExecType, NdbTransaction::AbortOption, int) (NdbTransaction.cpp:273)
==18521==    by 0x833307D: execute_no_commit(ha_ndbcluster*, NdbTransaction*) (ha_ndbcluster.cc:241)
==18521==    by 0x831E0FB: ha_ndbcluster::write_row(char*) (ha_ndbcluster.cc:2173)
==18521==    by 0x8294474: write_record(THD*, st_table*, st_copy_info*) (sql_insert.cc:1114)
==18521==    by 0x82961C8: mysql_insert(THD*, st_table_list*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (sql_insert.cc:505)
==18521==    by 0x8231FCA: mysql_execute_command(THD*) (sql_parse.cc:3297)
==18521==    by 0x82A00E0: Prepared_statement::execute(String*, bool) (sql_prepare.cc:2931)
==18521==    by 0x82A03D5: mysql_sql_stmt_execute(THD*) (sql_prepare.cc:2322)
==18521==    by 0x822F846: mysql_execute_command(THD*) (sql_parse.cc:2522)
==18521==  Address 0x5FA931E is 1,846 bytes inside a block of size 262,148 alloc'd
==18521==    at 0x401C41A: malloc (vg_replace_malloc.c:149)
==18521==    by 0x860499D: operator new[](unsigned) (my_new.cc:33)
==18521==    by 0x8590A9B: SendBuffer::initBuffer(unsigned) (SendBuffer.cpp:43)
==18521==    by 0x858D936: TCP_Transporter::initTransporter() (TCP_Transporter.cpp:148)
==18521==    by 0x85561C1: TransporterRegistry::createTCPTransporter(TransporterConfiguration*) (TransporterRegistry.cpp:316)
==18521==    by 0x856FAC4: IPCConfig::configureTransporters(unsigned, ndb_mgm_configuration const&, TransporterRegistry&) (IPCConfig.cpp:351)
==18521==    by 0x8547E5C: TransporterFacade::init(unsigned, ndb_mgm_configuration const*) (TransporterFacade.cpp:529)
==18521==    by 0x8548251: TransporterFacade::start_instance(int, ndb_mgm_configuration const*) (TransporterFacade.cpp:368)
==18521==    by 0x8538CF7: Ndb_cluster_connection::connect(int, int, int) (ndb_cluster_connection.cpp:530)
==18521==    by 0x832657B: ndbcluster_init() (ha_ndbcluster.cc:4972)
==18521==    by 0x82F8C2B: ha_init() (handler.cc:483)
==18521==    by 0x8214700: init_server_components() (mysqld.cc:3086)
==18521==    by 0x8218716: main (mysqld.cc:3412)

The problem is actually in ha_ndbcluster::set_ndb_value(), in the buffer field->ptr used to store the value for column 'b'. This buffer, which is of size 11 (1 + max. length of colum), is only initialized for the bytes actually used in the string (6 bytes in this case, 1 for length and 5 for character data). The rest of the buffer is not initialized.

But ha_ndbcluster::set_ndb_value() copies all of the buffer into the NDB send buffer, and later NDB sends the data out on the network. This does not cause maloperation, but does produce the Valgrind error as Valgrind can not see that the data will not be used at the other end of the socket.

Also sending uninitialized memory on the network could potentially expose sensitive server data, and overall does not feel right.

How to repeat:
See the Valgrind run in 5.0 pushbuild.

Or run

    mysql-test-run.pl --valgrind-all ps_7ndb

or the above minimal test.

Suggested fix:
I suppose we cannot change the NDB protocol to not send the unnecessary, uninitialized bytes. Certainly not in 5.0, which is GA.

I propose to make sure that the character buffers used internally are fully initialized. This seems to be the case anyway in most cases, since this problem occurs only in this particular test.

One attractive option would be to make sure that ha_ndbcluster::set_ndb_value() only accesses the valid part of field->ptr, ie. it should copy only the first 1+5 bytes in this case and zero-fill the rest of the send buffer by itself.

Another option would be to find where the field is originally created, and make sure it is fully initialized there.

If we can live with sending random (potentially sensitive?) data across the NDB network connections, and we do not want to pay the extra performance price of initializing the buffers in production releases, the initialization could be protected with #ifdef HAVE_purify. Then it will only happen for Valgrind runs.

It is in any case important to get a fix for this, since the policy is now for releases to have a clean Valgrind run, and errors such as these make it harder to catch new problems.
[15 Jun 2006 20:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7724