Bug #81624 reading a blob somtimes returns only a part of the value (disk table)
Submitted: 27 May 2016 7:31 Modified: 30 Sep 2016 10:54
Reporter: Hendrik Woltersdorf Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:7.4.10 OS:Any
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: BLOB, ndbcluster

[27 May 2016 7:31] Hendrik Woltersdorf
Description:
We use MySQL-Cluster in version 7.4.10 for over a month now in production. In this time it happened two times, that a client could not read/parse the content of a filed of type blob because of a IOException; one java client and one C client. What I see in the logs is, that the client gets only the first 256 bytes (content changed, to protect the innocent):

<obj><type>de.xcom.ha.xserver.omhandling.TradeConfirmation</type><data><MessageId>ETRBL201605260000000730_EXE_EVENT</MessageId><OrderIdent><NewExchangeOrderId>7777555777555</NewExchangeOrderId><UserOrderId>aaa555mm</UserOrderId><InternalBankOrderId>XY40000

MySQL-Workbench (version 6.2.3) is able to read and show the whole blob:
<obj><type>de.xcom.ha.xserver.omhandling.TradeConfirmation</type><data><MessageId>ETRBL201605260000000730_EXE_EVENT</MessageId><OrderIdent><NewExchangeOrderId>7777555777555</NewExchangeOrderId><UserOrderId>aaa555mm</UserOrderId><InternalBankOrderId>XY400000aaa555mm</InternalBankOrderId></OrderIdent><Execution><ExternExecutionId>XZ00aaa555mm</ExternExecutionId><Quantity><Value>1000.000000</Value><Currency></Currency></Quantity><Price><Value>60.800000</Value><Currency>EUR</Currency></Price><SttlCurrency>EUR</SttlCurrency><FxPrice>0.000000</FxPrice><Time>20160526190828</Time></Execution></data></obj>

Table definition:
CREATE TABLE `REQUEST_HISTORY` (
  `ID_REQUEST_HISTORY` bigint(20) NOT NULL AUTO_INCREMENT,
  `ID_SERVICE` int(11) DEFAULT NULL,
  `GLOBAL_ID` varchar(40) DEFAULT NULL,
  `MESSAGE_IDENT` varchar(40) NOT NULL,
  `MESSAGE_CONTEXT` varchar(40) NOT NULL,
  `MESSAGE_DATA` blob,
  `STATE` char(1) DEFAULT NULL,
  `DATE_INSERT` datetime DEFAULT NULL,
  `DATE_READY` datetime DEFAULT NULL,
  `DATE_UPDATE` datetime NOT NULL,
  `ID_SERVER_REPOSITORY` bigint(20) NOT NULL,
  `ERROR_CODE` int(11) DEFAULT NULL,
  `TYPE` char(1) DEFAULT NULL,
  PRIMARY KEY (`ID_REQUEST_HISTORY`),
  UNIQUE KEY `IDX_MS` (`MESSAGE_IDENT`,`ID_SERVICE`,`MESSAGE_CONTEXT`),
  KEY `IDX_CONTEXT` (`MESSAGE_CONTEXT`),
  KEY `IDX_GLOBAL_ID` (`GLOBAL_ID`),
  KEY `REFERENCE_6_FK` (`ID_SERVICE`),
  KEY `IDX_STATE` (`STATE`)
) /*!50100 TABLESPACE ts_1 STORAGE DISK */ ENGINE=ndbcluster AUTO_INCREMENT=426140 DEFAULT CHARSET=latin1;

Workaround:
We copied the value from MySQL-Workbench, made an update statement from it and applied it (again) to the database. After that, all clients where able, to read the value without errros.

How to repeat:
We where not able to repeat this in any test envronment.
[27 May 2016 7:36] Hendrik Woltersdorf
cluster config

Attachment: config.ini (application/octet-stream, text), 3.62 KiB.

[27 May 2016 13:08] MySQL Verification Team
Hi Hendrik,

I doubt this is a bug on our side just because if this bug exist there would be hundreds of reports by now, but I can't dismiss it. Anyways in order to see what's going on we first need to see "where" in your chain [datanode] ->...->[Application] this loss occurs and for that I need to know how exactly your "chain" looks like.

Are you using MySQL to access data from ndbcluster or NDB API, Java/NDB, ndbd for memcached or any other method?

After you retrieve data from ndbcluster, how do your application connect to that data (mysql-nd, mysql c api, mysql c++ api, mysql jdbc...)?

Finally how your app handles data?

You said you have logs, in what point of the chain you collect logs attm?

kind regards
Bogdan Kecman
[27 May 2016 13:28] Hendrik Woltersdorf
We access the cluster through SQL-nodes (MySQL server). Some clients use JDBC others the c/c++ libraries. The applications access the database servers directly and write the logs with the queries and the results, if appropriate.
[27 May 2016 13:44] MySQL Verification Team
Hi,

and the problem you are facing (the incomplete blob content) was retrieved from mysqld trough c/c++ or jdbc? (you said you can only see it from time to time on one system and can't reproduce so I doubt you have it both on jdbc and c and c++ api)

thanks
Bogdan
[27 May 2016 13:57] Hendrik Woltersdorf
Today I saw this in a c/c++ client. Last time, when it happened in a java app, the case got analyzed by a colleague, who is out of office for some time; so I can't ask him about the logs.
[27 May 2016 14:11] MySQL Verification Team
Hi,

So you seen this both with CAPI and JDBC.. weird. 

Every time you seen this it was 7.4.10 on the server side? 

We did have some blob issues with data on disk back in the day of 7.0/7.1 but all those are fixed. Note that BLOB in ndbcluster is a not straight forward type. Only first 256 bytes (what you get) are stored in the table itself, rest of the content is stored in the separate storage blocks. It can happen that those are not created but then every select from that record would have a same problem but you are saying that you can't reproduce and when you select you get a full message but only sometimes you get only first 256 bytes. Can you recheck this? That same record that you see in your log as only 256 bytes long, can you select that same record from the DB and see if you get the rest or only 256?

Lemme run some tests and I'll get back to you. In the meantime can you send me the ndb_desc for the table in question and also please check your table with ndb_blob_tool 
https://dev.mysql.com/doc/mysql-cluster-excerpt/5.6/en/mysql-cluster-programs-ndb-blob-too...

thanks
Bogdan Kecman
[30 May 2016 5:27] Hendrik Woltersdorf
The ndb_desc output:
-- REQUEST_HISTORY --
Version: 16777232
Fragment type: HashMapPartition
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 13
Number of primary keys: 1
Length of frm data: 625
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 12
ExtraRowGciBits: 0
ExtraRowAuthorBits: 0
TableStatus: Retrieved
HashMap: DEFAULT-HASHMAP-3840-12
-- Attributes --
ID_REQUEST_HISTORY Bigint PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
ID_SERVICE Int NULL AT=FIXED ST=MEMORY
GLOBAL_ID Varchar(40;latin1_swedish_ci) NULL AT=SHORT_VAR ST=MEMORY
MESSAGE_IDENT Varchar(40;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY
MESSAGE_CONTEXT Varchar(40;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY
MESSAGE_DATA Blob(256,2000,0) NULL AT=MEDIUM_VAR ST=DISK BV=2 BT=NDB$BLOB_22_5
STATE Char(1;latin1_swedish_ci) NULL AT=FIXED ST=MEMORY
DATE_INSERT Datetime2(0) NULL AT=FIXED ST=DISK
DATE_READY Datetime2(0) NULL AT=FIXED ST=DISK
DATE_UPDATE Datetime2(0) NOT NULL AT=FIXED ST=DISK
ID_SERVER_REPOSITORY Bigint NOT NULL AT=FIXED ST=DISK
ERROR_CODE Int NULL AT=FIXED ST=DISK
TYPE Char(1;latin1_swedish_ci) NULL AT=FIXED ST=DISK
-- Indexes --
PRIMARY KEY(ID_REQUEST_HISTORY) - UniqueHashIndex
REFERENCE_6_FK(ID_SERVICE) - OrderedIndex
PRIMARY(ID_REQUEST_HISTORY) - OrderedIndex
IDX_MS(MESSAGE_IDENT, ID_SERVICE, MESSAGE_CONTEXT) - OrderedIndex
IDX_STATE(STATE) - OrderedIndex
IDX_CONTEXT(MESSAGE_CONTEXT) - OrderedIndex
IDX_MS$unique(ID_SERVICE, MESSAGE_IDENT, MESSAGE_CONTEXT) - UniqueHashIndex
IDX_GLOBAL_ID(GLOBAL_ID) - OrderedIndex
-- Per partition info for REQUEST_HISTORY --
Partition       Row count       Commit count    Frag fixed memory       Frag varsized memory    Extent_space    Free extent_space
0               46394           1177853         2785280                 3440640                 4194304         1573712  
2               46062           1173374         2752512                 3407872                 4194304         1592304  
4               45666           1177312         2752512                 3375104                 4194304         1614480  
8               46102           1177863         2752512                 3407872                 4194304         1590064  
10              45768           1171176         2719744                 3375104                 4194304         1608768  
6               45808           1171806         2752512                 3375104                 4194304         1606528  
9               46001           1176237         2752512                 3407872                 4194304         1595720  
11              45746           1173807         2752512                 3407872                 4194304         1610000  
1               46524           1179192         2785280                 3440640                 4194304         1566432  
3               46132           1175593         2752512                 3407872                 4194304         1588384  
5               46115           1175119         2752512                 3407872                 4194304         1589336  
7               45610           1173252         2752512                 3375104                 4194304         1617616  

-- Per partition info for NDB$BLOB_22_5 --
Partition       Row count       Commit count    Frag fixed memory       Frag varsized memory    Extent_space    Free extent_space
9               149             419217          32768                   0                       1048576         710236   
11              186             418406          32768                   0                       1048576         619696   
1               158             420506          32768                   0                       1048576         649876   
3               167             419167          32768                   0                       1048576         653900   
5               167             418977          32768                   0                       1048576         637804   
7               156             418128          32768                   0                       1048576         639816   
8               162             419760          32768                   0                       1048576         702188   
10              182             417400          32768                   0                       1048576         613660   
0               177             419895          32768                   0                       1048576         643840   
2               153             418455          32768                   0                       1048576         690116   
4               150             419502          32768                   0                       1048576         714260   
6               154             417300          32768                   0                       1048576         708224   

NDBT_ProgramExit: 0 - OK

----------------
Checking the table with ndb_blob_tool did not show any errors, but if this problem occures again, I will do more checks.
[11 Jul 2016 20:13] MySQL Verification Team
Hi,

I'm runnning this for days ... 20+ threads .. the db is being modified during the test .. I'm not able to reproduce your problem.. 

#include <string.h>
#include <unistd.h>
#include <my_global.h>
#include <mysql.h>

int main(int argc, char **argv)
{
  MYSQL_RES *result;
  MYSQL_ROW row;

  MYSQL *con = mysql_init(NULL);
  if (con == NULL){
      fprintf(stderr, "mysql_init() failed\n");
      exit(1);
  }

  if (mysql_real_connect(con, "localhost", "root", "", "test", 0, "/tmp/mysql.sock", 0) == NULL) {
    fprintf(stderr, "%s\n", mysql_error(con));
    mysql_close(con);
    exit(1);
  }

while(1){
  if (mysql_query(con, "select MESSAGE_DATA from REQUEST_HISTORY")) {
    fprintf(stderr, "%s\n", mysql_error(con));
    mysql_close(con);
    exit(1);
  }
  result = mysql_store_result(con);
  if (result == NULL) {
    fprintf(stderr, "%s\n", mysql_error(con));
    mysql_close(con);
    exit(1);
  }

  while ((row = mysql_fetch_row(result)))
  {
    if (!row[0]){
     printf("NULL?!\n");
     continue;
    }
    if (row[0][strlen(row[0])-1] != 'X') printf ("NOT X?!?!? %c\n", row[0][strlen(row[0])-1]);
  }

  mysql_free_result(result);

}
  mysql_close(con);
  exit(0);
}

I'm just making sure in the script populating and modifying the table that LAST and ONLY LAST char in the blob is X .. 

I cannot reproduce this.. if you can create a simple script that you can reproduce this on your env. I will be happy to test it but I can't get "only part" of the blob... 

I assume your max_allowed_packet is ok?
[12 Jul 2016 5:45] Hendrik Woltersdorf
max_allowed_packet is 16MB; way more then the blob size in question.
I will see, if I can create a test case; but as I wrote earlier, we saw this _very_ rarely.
[12 Jul 2016 8:31] MySQL Verification Team
Hi,
I understand "very rarely" that's why I let that run for weeks but not able to reproduce it..
see if you can create a test case, c/c++/java whatever is easier for you

thanks
Bogdan
[5 Aug 2016 6:16] Hendrik Woltersdorf
I created a test case, that mimics the actions of our applications, but I was not able, to reproduce the issue in our test environment in the last weeks.
Sorry,
Hendrik
[30 Sep 2016 10:47] Hendrik Woltersdorf
craete table and procedure

Attachment: create_tab_and_proc.zip (application/zip, text), 2.86 KiB.

[30 Sep 2016 10:54] Hendrik Woltersdorf
Now I managed to create a test case!

The attached zip file contains create table, insert, create proc and a call template.

1. create the table and insert the test row
2. be creative and insert more, different(!) data into this table
3. wait. It takes some time until the error shows.
4. do the update and call SP_READ_BLOB...

The problem seems to be connected to the usage of a variable for the blob and the sequence select into - update - select variables as ...

to 3.: Right after inserting the test row into the table I get the correct result. After waiting some time, I get "something else"; for examle right now:

872067üŒSSAGE_DATA
$MESSAGE_DAMESSAGE_DA    2?                     363_CU`       01609270  01609270CHANGE_EVENT</MessageI</MessageIOrderIdent><NewExchangNewExchang609290852559</NewExcha</NewExcha><DateLastUpdate>04981date>04981409</DateLastUpdate><UtUpdate><U>25002821</UserOrderIdserOrderIdBankOrderId>XH30000025XH30000025ternalBankOrderId></OrderId></OrIsin>DE0007236101</Isi36101</Isi.000000</Limit><StopLiit><StopLi00</StopLimit><OcoLimit><OcoLimi</OcoLimit><TrsStopDisTrsStopDis0000</TrsStopDistance>pDistance>istance>0.000000</TrsL0000</TrsLce><BuySell>K</BuySellK</BuySell>5100.000000</Quantity</QuantityOpen>5100.000000</Quan0000</QuanCurrency>EUR</Currency</Currency>20160930</Validity><Oalidity><O/Ordertyp><Text>XS0025ext>XS0025xt><PeakQuantity>0.000tity>0.000uantity></Order><Statuder><Statu1</Status><Time>201609ime>201609Time><OverwriteValues>iteValues>writeValues></Status><</Status><</