Bug #51643 | Prepared statements with loadbalancer works on wrong connection | ||
---|---|---|---|
Submitted: | 2 Mar 2010 14:29 | Modified: | 4 Mar 2010 17:07 |
Reporter: | Gunn Olaussen | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | Connector / J | Severity: | S2 (Serious) |
Version: | 5.0.12 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[2 Mar 2010 14:29]
Gunn Olaussen
[2 Mar 2010 14:30]
Gunn Olaussen
Here is some more information about how we tracked down this bug: We have the following cluster setup: siv27: ndbd, mysqld:56200 siv28: ndbd, mysqld:56300 siv29: ndb_mgmd, mysqld:56100, ndbapi 24/02 14:30:06.393 INFO Thr{0011} ClusterJDBCClient.connect: Creating JDBC connection to MySQL Cluster using URL: 'jdbc:mysql:loadbalance://siv29:56100,siv27:56200,siv28:56300' 24/02 14:30:06.393 FINE Thr{0011} ClusterJDBCClient.connect: JDBC Properties used: [autoreconnect=false reconnect=false loadBalanceStrategy=random user=XXX loadBalanceBlacklistTimeout=5000 allowMultiQueries=true drivername=com.mysql.jdbc.Driver dumpQueriesOnException=true password=YYY logger=com.mysql.jdbc.log.StandardLogger enablePacketDebug=true traceProtocol=true] The goal is to run TPCC load, but the datamodel creation fails because the number of lines doesn't match the expected: 24/02 14:35:26.152 FINE Thr{0011} Transaction.message: com.sun.jet.load.oltp.transactions.RunOLTP$ModelCheck: Expected #WAREHOUSE to be 1 was 1, expected #DISTRICTS to be 5 was 5, expected #ORDERS to be 15000 was 14975, expected #CUSTOMERS to be 15000 was 13500, expected #ITEMS to be 20000 was 20000, expected #STOCKS to be 20000 was 19901 Lets take the ORDERS table as an example: By connecting to the mysqld we see that the missing rows are the last 25 that were inserted (id 2976-3000 for district 5). In the load log I see that a commit has been done after o_id=2975 and then another after the last 25 orders: 24/02 14:35:25.132 FINE Thr{0062} Transaction.message: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0:(populate) INSERT INTO orders: o_id: '2975', d_id: '5', w_id: '1', c_id: '2177' 24/02 14:35:25.145 FINER Thr{0062} JDBCTransaction.commit: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0 commit() 24/02 14:35:25.148 FINE Thr{0062} Transaction.message: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0:(populate) INSERT INTO orders: o_id: '2976', d_id: '5', w_id: '1', c_id: '640' skip... 24/02 14:35:25.609 FINE Thr{0062} Transaction.message: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0:(populate) INSERT INTO orders: o_id: '2999', d_id: '5', w_id: '1', c_id: '347' 24/02 14:35:25.636 FINE Thr{0062} Transaction.message: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0:(populate) INSERT INTO orders: o_id: '3000', d_id: '5', w_id: '1', c_id: '756' 24/02 14:35:25.648 FINE Thr{0062} Transaction.message: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0:loadOrders, progress = 3000 24/02 14:35:25.648 FINER Thr{0062} JDBCTransaction.commit: com.sun.jet.load.oltp.transactions.OrderPopulate OrderPopulate1_0 commit() The same queries can be found in the query log for the mysqld on siv27 (with timestamp 100224 14:35:25): 12 Query INSERT INTO orders(o_id, o_c_id, o_d_id, o_w_id, o_entry_d, o_ol_cnt,o_total_amount, o_all_local, o_carrier_id)values (2975, 2177, 5, 1, CURDATE(), 23, 6697.79, 1, null) 12 Query commit 12 Query INSERT INTO orders(o_id, o_c_id, o_d_id, o_w_id, o_entry_d, o_ol_cnt,o_total_amount, o_all_local, o_carrier_id)values (2976, 640, 5, 1, CURDATE(), 44, 5910.8, 1, null) skip... 12 Query INSERT INTO orders(o_id, o_c_id, o_d_id, o_w_id, o_entry_d, o_ol_cnt,o_total_amount, o_all_local, o_carrier_id)values (2999, 347, 5, 1, CURDATE(), 42, 9262.23, 1, null) 12 Query INSERT INTO orders(o_id, o_c_id, o_d_id, o_w_id, o_entry_d, o_ol_cnt,o_total_amount, o_all_local, o_carrier_id)values (3000, 756, 5, 1, CURDATE(), 19, 3973.39, 1, null) But here there is no commit afterwards. However, there is a commit in the query log of siv29: 100224 14:35:25 18 Query commit When seeing this problem we checked out the connector/J source code and built it ourselves so we have added some extra tracing and turned on packet debug: The last order which has been commited is 2975: Wed Feb 24 14:35:25 CET 2010 TRACE: send() packet payload: aa 00 00 00 03 49 4e 53 . . . . . I N S 45 52 54 20 49 4e 54 4f E R T . I N T O 20 6f 72 64 65 72 73 28 . o r d e r s ( 6f 5f 69 64 2c 20 6f 5f o _ i d , . o _ 63 5f 69 64 2c 20 6f 5f c _ i d , . o _ 64 5f 69 64 2c 20 6f 5f d _ i d , . o _ 77 5f 69 64 2c 20 6f 5f w _ i d , . o _ 65 6e 74 72 79 5f 64 2c e n t r y _ d , 20 6f 5f 6f 6c 5f 63 6e . o _ o l _ c n 74 2c 6f 5f 74 6f 74 61 t , o _ t o t a 6c 5f 61 6d 6f 75 6e 74 l _ a m o u n t 2c 20 6f 5f 61 6c 6c 5f , . o _ a l l _ 6c 6f 63 61 6c 2c 20 6f l o c a l , . o 5f 63 61 72 72 69 65 72 _ c a r r i e r 5f 69 64 29 76 61 6c 75 _ i d ) v a l u 65 73 20 28 32 39 37 35 e s . ( 2 9 7 5 2c 20 32 31 37 37 2c 20 , . 2 1 7 7 , . 35 2c 20 31 2c 20 43 55 5 , . 1 , . C U 52 44 41 54 45 28 29 2c R D A T E ( ) , 20 32 33 2c 20 36 36 39 . 2 3 , . 6 6 9 37 2e 37 39 2c 20 31 2c 7 . 7 9 , . 1 , 20 6e 75 6c 6c 29 . n u l l ) Then the commit with our added debug shows that the commit is done on the correct host: Wed Feb 24 14:35:25 CET 2010 INFO: Commit with host='siv27:56200' and origHost='siv27:56200' Wed Feb 24 14:35:25 CET 2010 TRACE: send() packet payload: 07 00 00 00 03 63 6f 6d . . . . . c o m 6d 69 74 m i t Then it continues to insert the remaining rows (this is the last): Wed Feb 24 14:35:25 CET 2010 TRACE: send() packet payload: a9 00 00 00 03 49 4e 53 . . . . . I N S 45 52 54 20 49 4e 54 4f E R T . I N T O 20 6f 72 64 65 72 73 28 . o r d e r s ( 6f 5f 69 64 2c 20 6f 5f o _ i d , . o _ 63 5f 69 64 2c 20 6f 5f c _ i d , . o _ 64 5f 69 64 2c 20 6f 5f d _ i d , . o _ 77 5f 69 64 2c 20 6f 5f w _ i d , . o _ 65 6e 74 72 79 5f 64 2c e n t r y _ d , 20 6f 5f 6f 6c 5f 63 6e . o _ o l _ c n 74 2c 6f 5f 74 6f 74 61 t , o _ t o t a 6c 5f 61 6d 6f 75 6e 74 l _ a m o u n t 2c 20 6f 5f 61 6c 6c 5f , . o _ a l l _ 6c 6f 63 61 6c 2c 20 6f l o c a l , . o 5f 63 61 72 72 69 65 72 _ c a r r i e r 5f 69 64 29 76 61 6c 75 _ i d ) v a l u 65 73 20 28 33 30 30 30 e s . ( 3 0 0 0 2c 20 37 35 36 2c 20 35 , . 7 5 6 , . 5 2c 20 31 2c 20 43 55 52 , . 1 , . C U R 44 41 54 45 28 29 2c 20 D A T E ( ) , . 31 39 2c 20 33 39 37 33 1 9 , . 3 9 7 3 2e 33 39 2c 20 31 2c 20 . 3 9 , . 1 , . 6e 75 6c 6c 29 n u l l ) And here is the commit (to the wrong host): Wed Feb 24 14:35:25 CET 2010 INFO: Commit with host='siv29:56100' and origHost='siv29:56100' Wed Feb 24 14:35:25 CET 2010 TRACE: send() packet payload: 07 00 00 00 03 63 6f 6d . . . . . c o m 6d 69 74 m i t We added some output to pickConnection() in RandomBalanceStrategy on a later test run and saw that it picks a new host to connect to after each commit (which is by design). So it selects another host right before I insert the last batch of rows. This is the host which ends up getting the commit, but the rows were inserted on the previous connection since that is the one the PreparedStatement is holding.
[3 Mar 2010 19:35]
Mark Matthews
Fixed for 5.1.13.
[4 Mar 2010 17:07]
Tony Bedford
An entry has been added to the 5.1.13 changelog: Objects created by ConnectionImpl, such as prepared statements, hold a reference to the ConnectionImpl that created them. However, when the load balancer picked a new connection, it did not update the reference contained in, for example, the PreparedStatement. This resulted in inserts and updates being directed to invalid connections, while commits were directed to the new connection. This resulted in silent data loss.