Bug #79703 Spin rounds per wait will be negative in InnoDB status if spin waits > max32int
Submitted: 18 Dec 2015 12:54 Modified: 29 Jun 2016 12:36
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.6,5.7 OS:Any
Assigned to: CPU Architecture:Any

[18 Dec 2015 12:54] Laurynas Biveinis
Description:
InnoDB status will start printing negative values for spin rounds per wait, if the wait number, being accounted as a 64-bit integer, will not fit into regular signed int (32 bits on LP64).

For example (output from PS but we have no patches here),

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 19811539060
OS WAIT ARRAY INFO: signal count 8333711487
Mutex spin waits 192857078576, rounds 290389036704, OS waits 1568180069
RW-shared spins 19747848205, rounds 565755834160, OS waits 14799492482
RW-excl spins 7774140570, rounds 144578121130, OS waits 3188338144
Spin rounds per wait: -697.30 mutex, -327.60 RW-shared, -177.22 RW-excl

This is caused by

		(double) mutex_spin_round_count /
		(mutex_spin_wait_count ? mutex_spin_wait_count : 1),

conditional operator part: mutex_spin_wait_count is not an integral type var but a C++ object with type conversion operator to ib_int64_t. 1 is a constant of type (signed) int. Now C++ language rules somehow decide (credit to my colleague Igor Solodovnikov) that the type of the result of conditional operator is int too. This can also be confirmed by disassembling sync_print_wait_info:

...
	leaq	__ZL21mutex_spin_wait_count(%rip), %r15 # mutex_spin_wait_count.this to R15
...
	addq	264(%rcx,%r15), %rdx # counter sum to RDX (only the final add shown, RCX = 0)
...
	cvtsi2sdl	%edx, %xmm1 # EDX not RDX! opcode is also for "doubleword integer"
...
	divsd	%xmm1, %xmm0

The bug was introduced together with ib_counter_t introduction. Before, the counter variables being plain integers were working OK (thus 5.5 is not affected)

How to repeat:
Code analysis, running server long enough, then looking at the status.

Suggested fix:
Probably the easiest fix is s/1/1ULL/g in the conditional operators. But at the same time observe that each counter object mention causes a fresh summation for the counter. Thus better to introduce local vars and to sum each counter only once. Also get rid of redundant typecasts in the first printf, and resolve the ib_int64_t / ib_uint64_t mismatch.
[18 Dec 2015 14:09] Laurynas Biveinis
Bug 79703 fix for 5.6

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug79703.patch (application/octet-stream, text), 3.17 KiB.

[18 Dec 2015 14:11] Laurynas Biveinis
With proposed patch integer to double conversion uses 	cvtsi2sdq	%rax, %xmm2, which looks correct.
[29 Feb 2016 20:52] Staffan Flink
We've also hit this bug but it seems that the correct numbers for 'Spin rounds per wait' can be retrieved from I_S.INNODB_METRICS.

snippet output from 'SHOW ENGINE INNODB STATUS\G':
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
2016-02-29 17:04:50 7f1613cb2700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 59 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 27828639 srv_active, 0 srv_shutdown, 552 srv_idle
srv_master_thread log flush and writes: 27829190
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 562748487
OS WAIT ARRAY INFO: signal count 4440218264
Mutex spin waits 4031631980, rounds 19046728943, OS waits 221855330
RW-shared spins 1337835479, rounds 10346069330, OS waits 159065324
RW-excl spins 436493896, rounds 12607401587, OS waits 149384138
Spin rounds per wait: -72.33 mutex, 7.73 RW-shared, 28.88 RW-excl

Output from I_S.INNODB_METRICS:
mysql> use information_schema;
Database changed
mysql> select name,count,max_count,avg_count from INNODB_METRICS where name like '%spin%';
+-----------------------------+-------------+-------------+--------------------+
| name | count | max_count | avg_count |
+-----------------------------+-------------+-------------+--------------------+
| innodb_rwlock_s_spin_waits | 1338920718 | 1338920718 | 48.02786821549153 |
| innodb_rwlock_x_spin_waits | 436953500 | 436953500 | 15.673777268638679 |
| innodb_rwlock_s_spin_rounds | 10353523860 | 10353523860 | 371.3869483301454 |
| innodb_rwlock_x_spin_rounds | 12620856222 | 12620856222 | 452.7174844992445 |
+-----------------------------+-------------+-------------+--------------------+
4 rows in set (0.00 sec)

This would suggests that it's 'SHOW ENGINE INNODB STATUS' that interprets the numbers incorrectly.

For information, we're running 5.6.23.

Best Regards
Staffan Flink
[29 Jun 2016 12:36] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.0 release, and here's the changelog entry:

SHOW ENGINE INNODB STATUS output displayed negative spin rounds per wait
values. 

Thank you for the bug report.