Bug #58404 | Assertion prev_view == NULL || prev_view->low_limit_no >= view->low_limit_no | ||
---|---|---|---|
Submitted: | 22 Nov 2010 21:46 | Modified: | 15 Oct 2012 13:30 |
Reporter: | Elena Stepanova | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.6.1-m5-debug | OS: | Any |
Assigned to: | Sunny Bains | CPU Architecture: | Any |
[22 Nov 2010 21:46]
Elena Stepanova
[24 Nov 2010 22:47]
Elena Stepanova
The problem is reproducible on debug builds with a long, high-concurrency sysbench workload. Every test run we have done crashes at some point, although sometimes it happens after ~20 minutes of the test and sometimes it takes several hours. I did not get the failure on a non-debug build (1 test run, 8 hours).

How to reproduce:

Start the server (only system tables in the data folder):

    mysqld --no-defaults --basedir=<basedir> --datadir=<data folder> --log-error=mysqld.err --innodb_buffer_pool_size=256M --max_connections=3000 --max_prepared_stmt_count=21100

Run sysbench prepare (I used sysbench 0.4.12):

    sysbench --test=oltp --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-db=test --mysql-table-engine=innodb --oltp-table-size=100000 prepare

Run the sysbench workload:

    sysbench --test=oltp --num-threads=2000 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-db=test --mysql-table-engine=innodb --max-requests=0 --max-time=28800 run
[1 Dec 2010 3:06]
Sunny Bains
When creating the read view we set view->low_limit_no to max(trx_sys->trx_list : no), and transactions can change state from RUNNING to COMMITTED while we are iterating over the trx_sys_t::trx_list. Therefore it is possible for concurrently running threads to end up inserting the new view out of order in the trx_sys_t::view_list.

    328     for (trx = UT_LIST_GET_FIRST(trx_sys->trx_list);
    329          trx != NULL;
    330          trx = UT_LIST_GET_NEXT(trx_list, trx)) {
    331
    332         /* Note: We are doing a dirty read of the trx_t::state
    333         without the cover of the trx_t::mutex. The state change
    334         to TRX_STATE_PREPARED is done using only the trx_t::mutex. */
    335
    336         if (trx->id != cr_trx_id
    337             && (trx->state == TRX_STATE_ACTIVE
    338                 || trx->state == TRX_STATE_PREPARED)) {
    339
    340             ut_ad(n_trx < view->n_trx_ids);
    341
    342             view->trx_ids[n_trx++] = trx->id;
    343
    344             /* NOTE that a transaction whose trx number is <
    345             trx_sys->max_trx_id can still be active, if it is
    346             in the middle of its commit! Note that when a
    347             transaction starts, we initialize trx->no to
    348             IB_ULONGLONG_MAX. */
    349
    350             if (view->low_limit_no > trx->no) {
    351
    352                 view->low_limit_no = trx->no;
    353             }
    354         }
    355     }
    356
    357     view->n_trx_ids = n_trx;
    358
    359     if (view->n_trx_ids > 0) {
    360         /* The last active transaction has the smallest id: */
    361
    362         view->up_limit_id = view->trx_ids[view->n_trx_ids - 1];
    363     } else {
    364         view->up_limit_id = view->low_limit_id;
    365     }
    366
    367     ut_ad(read_view_validate(view));
    368
    369     mutex_enter(&trx_sys->read_view_mutex);
    370
    371     UT_LIST_ADD_FIRST(view_list, trx_sys->view_list, view);
    372
    373     mutex_exit(&trx_sys->read_view_mutex);

By the time a thread gets to line 371, there is no guarantee that prepending the view to trx_sys->view_list will preserve the order. There are several solutions, in order of decreasing preference:

1. Ordered insert based on view->low_limit_no
2. Force the thread to acquire trx_sys_t::lock in X mode in lock_trx_release_locks() when changing trx_t::state to TRX_STATE_COMMITTED_IN_MEMORY
3. Hold the trx_sys_t::read_view_mutex for the duration of the for loop

I prefer 1 because the worst case should require traversing only as many list nodes as there are threads concurrently creating views. This number should be small. 2 is also OK; it is simple, but my worry is increasing the contention on trx_sys_t::lock. For 3, my concern is that with > 1024 threads it could make read view creation a bottleneck.
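
For illustration only, option 1 (ordered insert based on view->low_limit_no) could look roughly like the sketch below. This is not the patch that was pushed for this bug and it does not use InnoDB's UT_LIST macros. The types view_sketch_t and view_list_sketch_t and the functions view_list_insert_ordered() and view_list_is_ordered() are hypothetical names invented for this example. The sketch assumes the list is kept in non-increasing low_limit_no order from the head, which is what the failing assertion (prev_view == NULL || prev_view->low_limit_no >= view->low_limit_no) checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a read view: only the field
that matters for list ordering plus the list links are modelled. */
typedef struct view_sketch {
	uint64_t		low_limit_no;
	struct view_sketch*	prev;	/* neighbour towards the list head */
	struct view_sketch*	next;	/* neighbour towards the list tail */
} view_sketch_t;

/* Hypothetical stand-in for trx_sys->view_list. */
typedef struct {
	view_sketch_t*		first;
} view_list_sketch_t;

/* Insert view so that low_limit_no stays non-increasing from the head.
Instead of always prepending, skip the nodes whose low_limit_no is
strictly greater than the new view's and link the view in before the
first node that is not greater. */
static void
view_list_insert_ordered(view_list_sketch_t* list, view_sketch_t* view)
{
	view_sketch_t*	prev = NULL;
	view_sketch_t*	cur = list->first;

	while (cur != NULL && cur->low_limit_no > view->low_limit_no) {
		prev = cur;
		cur = cur->next;
	}

	view->prev = prev;
	view->next = cur;

	if (cur != NULL) {
		cur->prev = view;
	}

	if (prev != NULL) {
		prev->next = view;
	} else {
		list->first = view;
	}
}

/* Same ordering check as the assertion in the bug title, returning
1 if the whole list is ordered and 0 otherwise. */
static int
view_list_is_ordered(const view_list_sketch_t* list)
{
	const view_sketch_t*	prev_view = NULL;
	const view_sketch_t*	view;

	for (view = list->first; view != NULL; view = view->next) {
		if (prev_view != NULL
		    && prev_view->low_limit_no < view->low_limit_no) {
			return(0);
		}
		prev_view = view;
	}

	return(1);
}
```

In the server, the walk and the insert would presumably both have to happen while holding trx_sys->read_view_mutex, in place of the UT_LIST_ADD_FIRST() call at line 371. A view that arrives "late" with a larger low_limit_no than the current head still goes to the front; only a view that computed a smaller low_limit_no than views inserted concurrently has to walk past them, which matches the expectation that the traversal is short.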
[22 Dec 2010 21:31]
Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101222212842-y0t3ibtd32wd9qaw) (version source revid:alexander.nozdrin@oracle.com-20101222212842-y0t3ibtd32wd9qaw) (merge vers: 5.6.1) (pib:24)