Bug #81624 | reading a blob somtimes returns only a part of the value (disk table) | ||
---|---|---|---|
Submitted: | 27 May 2016 7:31 | Modified: | 30 Sep 2016 10:54 |
Reporter: | Hendrik Woltersdorf | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | 7.4.10 | OS: | Any |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
Tags: | BLOB, ndbcluster |
[27 May 2016 7:31]
Hendrik Woltersdorf
[27 May 2016 7:36]
Hendrik Woltersdorf
cluster config
Attachment: config.ini (application/octet-stream, text), 3.62 KiB.
[27 May 2016 13:08]
MySQL Verification Team
Hi Hendrik, I doubt this is a bug on our side just because if this bug exist there would be hundreds of reports by now, but I can't dismiss it. Anyways in order to see what's going on we first need to see "where" in your chain [datanode] ->...->[Application] this loss occurs and for that I need to know how exactly your "chain" looks like. Are you using MySQL to access data from ndbcluster or NDB API, Java/NDB, ndbd for memcached or any other method? After you retrieve data from ndbcluster, how do your application connect to that data (mysql-nd, mysql c api, mysql c++ api, mysql jdbc...)? Finally how your app handles data? You said you have logs, in what point of the chain you collect logs attm? kind regards Bogdan Kecman
[27 May 2016 13:28]
Hendrik Woltersdorf
We access the cluster through SQL-nodes (MySQL server). Some clients use JDBC others the c/c++ libraries. The applications access the database servers directly and write the logs with the queries and the results, if appropriate.
[27 May 2016 13:44]
MySQL Verification Team
Hi, and the problem you are facing (the incomplete blob content) was retrieved from mysqld trough c/c++ or jdbc? (you said you can only see it from time to time on one system and can't reproduce so I doubt you have it both on jdbc and c and c++ api) thanks Bogdan
[27 May 2016 13:57]
Hendrik Woltersdorf
Today I saw this in a c/c++ client. Last time, when it happened in a java app, the case got analyzed by a colleague, who is out of office for some time; so I can't ask him about the logs.
[27 May 2016 14:11]
MySQL Verification Team
Hi, So you seen this both with CAPI and JDBC.. weird. Every time you seen this it was 7.4.10 on the server side? We did have some blob issues with data on disk back in the day of 7.0/7.1 but all those are fixed. Note that BLOB in ndbcluster is a not straight forward type. Only first 256 bytes (what you get) are stored in the table itself, rest of the content is stored in the separate storage blocks. It can happen that those are not created but then every select from that record would have a same problem but you are saying that you can't reproduce and when you select you get a full message but only sometimes you get only first 256 bytes. Can you recheck this? That same record that you see in your log as only 256 bytes long, can you select that same record from the DB and see if you get the rest or only 256? Lemme run some tests and I'll get back to you. In the meantime can you send me the ndb_desc for the table in question and also please check your table with ndb_blob_tool https://dev.mysql.com/doc/mysql-cluster-excerpt/5.6/en/mysql-cluster-programs-ndb-blob-too... thanks Bogdan Kecman
[30 May 2016 5:27]
Hendrik Woltersdorf
The ndb_desc output: -- REQUEST_HISTORY -- Version: 16777232 Fragment type: HashMapPartition K Value: 6 Min load factor: 78 Max load factor: 80 Temporary table: no Number of attributes: 13 Number of primary keys: 1 Length of frm data: 625 Row Checksum: 1 Row GCI: 1 SingleUserMode: 0 ForceVarPart: 1 FragmentCount: 12 ExtraRowGciBits: 0 ExtraRowAuthorBits: 0 TableStatus: Retrieved HashMap: DEFAULT-HASHMAP-3840-12 -- Attributes -- ID_REQUEST_HISTORY Bigint PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR ID_SERVICE Int NULL AT=FIXED ST=MEMORY GLOBAL_ID Varchar(40;latin1_swedish_ci) NULL AT=SHORT_VAR ST=MEMORY MESSAGE_IDENT Varchar(40;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY MESSAGE_CONTEXT Varchar(40;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY MESSAGE_DATA Blob(256,2000,0) NULL AT=MEDIUM_VAR ST=DISK BV=2 BT=NDB$BLOB_22_5 STATE Char(1;latin1_swedish_ci) NULL AT=FIXED ST=MEMORY DATE_INSERT Datetime2(0) NULL AT=FIXED ST=DISK DATE_READY Datetime2(0) NULL AT=FIXED ST=DISK DATE_UPDATE Datetime2(0) NOT NULL AT=FIXED ST=DISK ID_SERVER_REPOSITORY Bigint NOT NULL AT=FIXED ST=DISK ERROR_CODE Int NULL AT=FIXED ST=DISK TYPE Char(1;latin1_swedish_ci) NULL AT=FIXED ST=DISK -- Indexes -- PRIMARY KEY(ID_REQUEST_HISTORY) - UniqueHashIndex REFERENCE_6_FK(ID_SERVICE) - OrderedIndex PRIMARY(ID_REQUEST_HISTORY) - OrderedIndex IDX_MS(MESSAGE_IDENT, ID_SERVICE, MESSAGE_CONTEXT) - OrderedIndex IDX_STATE(STATE) - OrderedIndex IDX_CONTEXT(MESSAGE_CONTEXT) - OrderedIndex IDX_MS$unique(ID_SERVICE, MESSAGE_IDENT, MESSAGE_CONTEXT) - UniqueHashIndex IDX_GLOBAL_ID(GLOBAL_ID) - OrderedIndex -- Per partition info for REQUEST_HISTORY -- Partition Row count Commit count Frag fixed memory Frag varsized memory Extent_space Free extent_space 0 46394 1177853 2785280 3440640 4194304 1573712 2 46062 1173374 2752512 3407872 4194304 1592304 4 45666 1177312 2752512 3375104 4194304 1614480 8 46102 1177863 2752512 3407872 4194304 1590064 10 45768 1171176 2719744 3375104 4194304 1608768 6 45808 1171806 2752512 3375104 4194304 1606528 9 46001 1176237 2752512 3407872 4194304 1595720 11 45746 1173807 2752512 3407872 4194304 1610000 1 46524 1179192 2785280 3440640 4194304 1566432 3 46132 1175593 2752512 3407872 4194304 1588384 5 46115 1175119 2752512 3407872 4194304 1589336 7 45610 1173252 2752512 3375104 4194304 1617616 -- Per partition info for NDB$BLOB_22_5 -- Partition Row count Commit count Frag fixed memory Frag varsized memory Extent_space Free extent_space 9 149 419217 32768 0 1048576 710236 11 186 418406 32768 0 1048576 619696 1 158 420506 32768 0 1048576 649876 3 167 419167 32768 0 1048576 653900 5 167 418977 32768 0 1048576 637804 7 156 418128 32768 0 1048576 639816 8 162 419760 32768 0 1048576 702188 10 182 417400 32768 0 1048576 613660 0 177 419895 32768 0 1048576 643840 2 153 418455 32768 0 1048576 690116 4 150 419502 32768 0 1048576 714260 6 154 417300 32768 0 1048576 708224 NDBT_ProgramExit: 0 - OK ---------------- Checking the table with ndb_blob_tool did not show any errors, but if this problem occures again, I will do more checks.
[11 Jul 2016 20:13]
MySQL Verification Team
Hi, I'm runnning this for days ... 20+ threads .. the db is being modified during the test .. I'm not able to reproduce your problem.. #include <string.h> #include <unistd.h> #include <my_global.h> #include <mysql.h> int main(int argc, char **argv) { MYSQL_RES *result; MYSQL_ROW row; MYSQL *con = mysql_init(NULL); if (con == NULL){ fprintf(stderr, "mysql_init() failed\n"); exit(1); } if (mysql_real_connect(con, "localhost", "root", "", "test", 0, "/tmp/mysql.sock", 0) == NULL) { fprintf(stderr, "%s\n", mysql_error(con)); mysql_close(con); exit(1); } while(1){ if (mysql_query(con, "select MESSAGE_DATA from REQUEST_HISTORY")) { fprintf(stderr, "%s\n", mysql_error(con)); mysql_close(con); exit(1); } result = mysql_store_result(con); if (result == NULL) { fprintf(stderr, "%s\n", mysql_error(con)); mysql_close(con); exit(1); } while ((row = mysql_fetch_row(result))) { if (!row[0]){ printf("NULL?!\n"); continue; } if (row[0][strlen(row[0])-1] != 'X') printf ("NOT X?!?!? %c\n", row[0][strlen(row[0])-1]); } mysql_free_result(result); } mysql_close(con); exit(0); } I'm just making sure in the script populating and modifying the table that LAST and ONLY LAST char in the blob is X .. I cannot reproduce this.. if you can create a simple script that you can reproduce this on your env. I will be happy to test it but I can't get "only part" of the blob... I assume your max_allowed_packet is ok?
[12 Jul 2016 5:45]
Hendrik Woltersdorf
max_allowed_packet is 16MB; way more then the blob size in question. I will see, if I can create a test case; but as I wrote earlier, we saw this _very_ rarely.
[12 Jul 2016 8:31]
MySQL Verification Team
Hi, I understand "very rarely" that's why I let that run for weeks but not able to reproduce it.. see if you can create a test case, c/c++/java whatever is easier for you thanks Bogdan
[5 Aug 2016 6:16]
Hendrik Woltersdorf
I created a test case, that mimics the actions of our applications, but I was not able, to reproduce the issue in our test environment in the last weeks. Sorry, Hendrik
[30 Sep 2016 10:47]
Hendrik Woltersdorf
craete table and procedure
Attachment: create_tab_and_proc.zip (application/zip, text), 2.86 KiB.
[30 Sep 2016 10:54]
Hendrik Woltersdorf
Now I managed to create a test case! The attached zip file contains create table, insert, create proc and a call template. 1. create the table and insert the test row 2. be creative and insert more, different(!) data into this table 3. wait. It takes some time until the error shows. 4. do the update and call SP_READ_BLOB... The problem seems to be connected to the usage of a variable for the blob and the sequence select into - update - select variables as ... to 3.: Right after inserting the test row into the table I get the correct result. After waiting some time, I get "something else"; for examle right now: 872067üSSAGE_DATA $MESSAGE_DAMESSAGE_DA 2? 363_CU` 01609270 01609270CHANGE_EVENT</MessageI</MessageIOrderIdent><NewExchangNewExchang609290852559</NewExcha</NewExcha><DateLastUpdate>04981date>04981409</DateLastUpdate><UtUpdate><U>25002821</UserOrderIdserOrderIdBankOrderId>XH30000025XH30000025ternalBankOrderId></OrderId></OrIsin>DE0007236101</Isi36101</Isi.000000</Limit><StopLiit><StopLi00</StopLimit><OcoLimit><OcoLimi</OcoLimit><TrsStopDisTrsStopDis0000</TrsStopDistance>pDistance>istance>0.000000</TrsL0000</TrsLce><BuySell>K</BuySellK</BuySell>5100.000000</Quantity</QuantityOpen>5100.000000</Quan0000</QuanCurrency>EUR</Currency</Currency>20160930</Validity><Oalidity><O/Ordertyp><Text>XS0025ext>XS0025xt><PeakQuantity>0.000tity>0.000uantity></Order><Statuder><Statu1</Status><Time>201609ime>201609Time><OverwriteValues>iteValues>writeValues></Status><</Status><</