Bug #43987 | InnoDB drops some records during DBT2 tests | |
---|---|---|---
Submitted: | 31 Mar 2009 15:42 | Modified: | 1 Apr 2009 12:07 |
Reporter: | Trim Pershad | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance) |
Version: | 5.1, 6.0 | OS: | Any |
Assigned to: | Alexey Stroganov | CPU Architecture: | Any |
[31 Mar 2009 15:42]
Trim Pershad
[31 Mar 2009 15:59]
Philip Stoev
Original email:

This morning I ran a 5 minute, 10 warehouse dbt2 run for both Falcon and InnoDB. InnoDB whomped us:

```
                       Response Time (s)
 Transaction      %  Average :    90th %        Total  Rollbacks      %
------------  -----  -------------------  -----------  ---------  -----
    Delivery   3.88    0.086 :     0.122        15414          0   0.00
   New Order  43.63    0.066 :     0.095       173313       1749   1.01
Order Status   3.90    0.025 :     0.036        15503          0   0.00
     Payment  41.52    0.083 :     0.129       164925          0   0.00
 Stock Level   3.91    0.030 :     0.046        15522          0   0.00

33421.56 new-order transactions per minute (NOTPM)
5.1 minute duration
0 total unknown errors
10 second(s) ramping up
```

vs Falcon's:

```
                       Response Time (s)
 Transaction      %  Average :    90th %        Total  Rollbacks      %
------------  -----  -------------------  -----------  ---------  -----
    Delivery   3.87    0.121 :     0.180         9556          0   0.00
   New Order  43.16    0.095 :     0.149       106591       1061   1.00
Order Status   3.81    0.044 :     0.062         9406          0   0.00
     Payment  41.21    0.159 :     0.296       101777          0   0.00
 Stock Level   3.84    0.045 :     0.063         9473          0   0.00

20491.26 new-order transactions per minute (NOTPM)
5.2 minute duration
0 total unknown errors
11 second(s) ramping up
```

But here's the interesting part. After the two runs, I did counts of the two new_order tables:

```
mysql> select count(*) from dbt2_innodb.new_order;
+----------+
| count(*) |
+----------+
|   138299 |
+----------+
1 row in set (0.02 sec)

mysql> select count(*) from dbt2.new_order;
+----------+
| count(*) |
+----------+
|   116836 |
+----------+
1 row in set (0.07 sec)
```

If we compare the number of records in the new_order tables to the initial population plus the number of new-order transactions, we get:

Engine | Initial Records | Reported New Order Transactions | Rollbacks | Records Expected | Records Observed | Discrepancy
---|---|---|---|---|---|---
InnoDB | 9,000 | 173,313 | 1,749 | 180,564 | 138,299 | -42,265
Falcon | 9,000 | 106,591 | 1,061 | 114,530 | 116,836 | 2,306
InnoDB(2) | 9,000 | 175,081 | 1,719 | 182,362 | 139,019 | -43,343

Just to be careful, I ran the test a second time, listed above as "InnoDB(2)". Fast is one thing, but getting the right answer is more important.
[31 Mar 2009 23:00]
Alexey Stroganov
After analyzing the test internals, I believe this issue is not a bug in InnoDB or any other engine. To explain why, I need to highlight some design aspects of the TPC-C test:

...
5.2.3 ... Over the measurement interval, the terminal population must maintain a minimum percentage of mix for each transaction type as follows:

Transaction Type | Minimum % of mix
---|---
New-Order (1) | n/a
Payment | 43.0
Order-Status | 4.0
Delivery | 4.0
Stock-Level | 4.0

(1) There is no minimum for the New-Order transaction as its measured rate is the reported throughput.

...
2.4 The New-Order Transaction
2.4.2.2 ... A new row is inserted into both the NEW-ORDER table and the ORDER table to reflect the creation of the new order. O_CARRIER_ID is set to a null value. If the order includes only home order-lines, then O_ALL_LOCAL is set to 1, otherwise O_ALL_LOCAL is set to 0.

...
2.7 The Delivery Transaction
The Delivery business transaction consists of processing a batch of 10 new (not yet delivered) orders.
2.7.4.2 ... The selected row in the NEW-ORDER table is deleted.
...

So a New-Order transaction inserts _1_ row into the new_order table, and a Delivery transaction deletes _10_ rows from it. The right formula for the number of rows in the new_order table is therefore:

    initial rows in new_order
    + number of New-Order transactions
    - number of Delivery transactions * 10
    - number of rolled-back New-Order transactions

While analyzing this issue I also found some inaccuracy in how executed transactions are counted in mix.log, and I am now trying to fix it. I will post my findings and provide details that should clear up any suspicion regarding the correctness of the results.
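To make the arithmetic concrete, here is a minimal sketch of that formula in Python. The function name is mine, and the numbers plugged in are taken from the 1-minute verification runs reported later in this thread (90000 initial rows, 307 New-Order and 39 Delivery transactions for InnoDB, no rollbacks):

```python
def expected_new_order_rows(initial, new_orders, deliveries, rollbacks):
    """Expected row count of the new_order table after a dbt2 run.

    Each committed New-Order inserts 1 row, each Delivery deletes a
    batch of 10 rows, and rolled-back New-Order transactions leave
    no row behind.
    """
    return initial + new_orders - deliveries * 10 - rollbacks

# InnoDB 1-minute verification run from later in this report:
print(expected_new_order_rows(90000, 307, 39, 0))  # -> 89917
```

This matches the `select count(*)` result shown in the verification runs below, which is the whole point of the consistency check.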
[1 Apr 2009 12:07]
Alexey Stroganov
I've found a bug in the dbt2 test that leads to inaccurate logging of performed transactions. The problem was a missing check of the error code after receiving a transaction from the db client, so even if a transaction was discarded on the db side it was still logged as successful. After applying the following patch we get 100% accuracy for the number of performed transactions:

```diff
--- driver.c.orig	2008-11-12 00:52:12.000000000 +0100
+++ driver.c	2009-04-01 13:15:49.000000000 +0200
@@ -572,7 +572,14 @@
 #else /* STANDALONE */
 		length = send_transaction_data(sockfd, &client_data);
 		length = receive_transaction_data(sockfd, &client_data);
-		rc = client_data.status;
+		if (length == ERROR || length == ERROR_SOCKET_CLOSED)
+		{
+			rc = ERROR;
+		}
+		else
+		{
+			rc = client_data.status;
+		}
 #endif /* STANDALONE */
 		if (gettimeofday(&rt1, NULL) == -1) {
 			perror("gettimeofday");
```

I should note that the number of actually performed transactions and the number of transactions accounted for in the final report will differ, because the test has a warmup period and transactions performed during warmup do not get into the final report. Below I explain how transactions are logged and how one can verify that there is no data-integrity problem.

All transactions performed during a dbt2 test are logged to driver/mix.log in this format (timestamp, transaction short name, response time, thread id):

```
1238582499,n,0.915697,1073875264
```

A successful transaction is logged with a lowercase letter (as above); an unsuccessful (rolled-back) one with an uppercase letter:

```
1238586829,N,10.504595,1077254464
```

As mentioned, the test has a warmup period (to start all driver threads), so all records (transactions) in this file before a special mark (see below) are not accounted in the report.
The mark that indicates the beginning of the measured test looks like this:

```
1238584199,START
```

This mark is used later by the mix_analyzer.pl tool to calculate statistics and produce a report according to the TPC-C rules. But since that report does not include all performed transactions, we cannot rely on it to calculate the final number of rows in the database. The only way to do so is to count transactions directly in the mix.log file.

Below are the results of a consistency check for InnoDB and Falcon: a dbt2 test with my patch above, on a 10-warehouse database that initially has 90000 rows in the new_order table.

InnoDB:

```
bash# mysql -uroot dbt2w10 -e'select count(*) from new_order' -ss
89917
bash# grep ",n," mix.log | wc -l
307
bash# grep ",d," mix.log | wc -l
39
```

Now we calculate the number of rows in the database: initial rows + New-Order transactions - Delivery transactions * 10:

```
bash# mysql -uroot -e'select 90000+307-39*10'
+-----------------+
| 90000+307-39*10 |
+-----------------+
|           89917 |
+-----------------+
```

Falcon:

```
bash# mysql -uroot dbt2w10 -e'select count(*) from new_order' -ss
89985
bash# grep ",n," mix.log | wc -l
295
bash# grep ",d," mix.log | wc -l
31
bash# mysql -uroot -e'select 90000+295-31*10'
+-----------------+
| 90000+295-31*10 |
+-----------------+
|           89985 |
+-----------------+
```
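The grep pipelines above can be condensed into a small script. A minimal sketch, assuming mix.log lines in the format shown; the function name and the sample lines are mine, for illustration only. Note that it tallies the whole file, warmup included, since warmup transactions also modify the database (grep is case-sensitive, so `,n,` already excludes rolled-back `,N,` records):

```python
def count_committed(lines):
    """Tally committed (lowercase) transactions per type over an
    entire mix.log.  Uppercase letters mark rolled-back transactions
    and are not counted; the ",START" mark has fewer fields and is
    skipped automatically."""
    counts = {}
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 3:
            continue  # START mark or malformed line
        txn = fields[1]
        if txn.islower():
            counts[txn] = counts.get(txn, 0) + 1
    return counts

# Made-up sample in mix.log format:
sample = [
    "1238582499,n,0.915697,1073875264",
    "1238584199,START",
    "1238586829,N,10.504595,1077254464",  # rolled back, not counted
    "1238586900,d,0.086000,1073875264",
]
print(count_committed(sample))  # -> {'n': 1, 'd': 1}
```

The resulting counts plug directly into the row-count formula above.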
[1 Apr 2009 12:23]
Alexey Stroganov
I forgot to mention that the duration of my runs for InnoDB and Falcon was 1 minute; the purpose of these runs was to demonstrate that there are no drops for InnoDB and that the number of rows for both engines after the test is correct.
[1 Apr 2009 22:18]
Alexey Stroganov
I need to clarify the following: the issue found in the dbt2 test regarding improper logging of transactions affects only a very _insignificant_ number of transactions (actually only the very last transactions passed from the driver to the db client), and so may cause only a very slight difference between the theoretical number of rows calculated from the number of performed transactions and the actual number of rows in the database. It is almost nothing in the case of long runs with many users; I found the problem only because I performed very short test runs with 1 thread. In short: all previous results are valid, and this issue is only a cosmetic inaccuracy that does not affect results.

Another update regarding the formula to calculate the exact number of rows in the new_order table: the Delivery transaction sometimes deletes fewer than 10 rows (which is allowed by the standard). This means that if we assume every Delivery transaction deletes 10 rows when it actually deletes fewer, the number of rows in the new_order table can be a bit larger than the number computed with the formula I provided before. To get an accurate number we should count the number of rows affected by the DELETE statements, which can be achieved with the following addition to nonsp/dbc_delivery.c:

```diff
--- dbc_delivery.c.orig	2008-11-12 00:52:12.000000000 +0100
+++ dbc_delivery.c	2009-04-02 00:11:50.000000000 +0200
@@ -84,6 +84,8 @@
 	{
 		return -1;
 	}
+	LOG_ERROR_MESSAGE("DELIVERY_2: affected rows: %d\n", result.num_rows);
+
 	sprintf(query, DELIVERY_3, vals[NO_O_ID], w_id, d_id);
 #ifdef DEBUG_QUERY
```

This way we log the number of affected rows for every DELETE statement. The info will be in client/error.log. To count the actually deleted rows:

```
grep "affected rows: 1" error.log | wc -l
```
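With the logging patch in place, the exact number of deleted rows can be recovered by summing the logged affected-row counts instead of assuming 10 per Delivery transaction. A minimal sketch; the function name and the sample log lines are mine, matching the LOG_ERROR_MESSAGE format added by the patch above:

```python
import re

# Matches the "DELIVERY_2: affected rows: N" lines the patch writes
# to client/error.log.
PATTERN = re.compile(r"DELIVERY_2: affected rows: (\d+)")

def total_deleted_rows(log_lines):
    """Sum the rows actually deleted from new_order, rather than
    assuming every Delivery transaction removes exactly 10."""
    return sum(int(m.group(1)) for line in log_lines
               if (m := PATTERN.search(line)))

# Illustrative error.log excerpt: one district had no undelivered order,
# so its DELETE affected 0 rows.
sample = [
    "DELIVERY_2: affected rows: 1",
    "DELIVERY_2: affected rows: 1",
    "DELIVERY_2: affected rows: 0",
]
print(total_deleted_rows(sample))  # -> 2
```

Subtracting this sum (instead of Delivery transactions * 10) in the earlier formula gives an exact expected row count even when some districts had nothing to deliver.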