Bug #74893 mdl m_unused_lock_objects shouldn't be volatile
Submitted: 17 Nov 2014 7:23 Modified: 10 Apr 2015 8:14
Reporter: Stewart Smith Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.7.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: PowerPC

[17 Nov 2014 7:23] Stewart Smith
Description:
m_unused_lock_objects in sql/mdl.cc doesn't need to be volatile. Atomics will take care of doing correct updates and if stale reads are really okay, then let the compiler optimize them!

volatile forces compiler to generate load and store instructions, which in this case is completely unneeded, and volatile is just a usual pointer to concurrency bugs.

How to repeat:
code analysis

Suggested fix:
apply my patch (attached!)
[17 Nov 2014 7:24] Stewart Smith
make mdl unused lock objects not volatile

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: volatile-mdl-unused-lock-objects.patch (text/x-patch), 752 bytes.

[17 Nov 2014 8:29] MySQL Verification Team
Hello Stewart,

Thank you for the report and contribution.

Thanks,
Umesh
[17 Nov 2014 10:24] Jon Olav Hauglid
Hi Stewart,

Our atomics API has three different implementations:
1) GCC intrinsics
2) Windows Interlocked functions
3) Solaris atomics

The latter two require the arguments to be volatile, hence our API also does.
This means we can't use your patch directly. If we want to get rid of volatile 
for 1) where it is not needed, we will have to introduce some platform specific 
typedef or similar.
[24 Nov 2014 22:49] Stewart Smith
Then you should likely create atomic_t and atomic64_t if the other two implementations require volatile, you can then do this trick:

typedef struct {
#ifdef WIN32 || SOLARIS
volatile
#endif
uint32_t v; 
} atomic_t;

and then you get the extra check of not being able to blindly do non-atomic reads and writes (you can always have an accessor for non-atomic reads). I'd heartily recommend this anyway as it'll probably prevent 18 bugs and prevent hours of wasted time of random developers checking code only to find out it's actually okay to be doing a dirty read.

But having volatile there has just made people think it was magic (which it certainly isn't) and has lead to the current situation (I haven't sat down and solved most of the replication issues there... but I'm almost certain there's some awful bugs).

That being said, the functions on Windows (and I guess Solaris) have volatile in the arguments, which is different than requiring a volatile value be passed to it. It tells the compiler that the *argument* into the function (well, the thing it's pointing to) needs to have some safe assumptions... my guess is this is kind of how the builtins/APIs/compilers work on Windows/Solaris.

So, if you had this code:
int foo;
printf("%d\n",foo);
printf("%d\n",foo);
printf("%d\n",foo);
printf("%d\n",foo);
InterlockedIncrement(&foo);
printf("%d\n",foo);
printf("%d\n",foo);
printf("%d\n",foo);

running in several threads, the compiler could do one read for the first four printfs and one for the second three, but it could not cache across the InterlockedIncrement call because of volatile there. My guess is that with GCC intrinsics, the compiler knows about this while on Windows/Solaris it's actually just some asm and there's not the appropriate ways to tell the complier that you've had a value clobbered.

I'd provide a patch, but I doubt it would easily apply to the current development tree as it's likely further ahead than 5.7.5.
[10 Apr 2015 8:14] Stewart Smith
Still a problem in 5.7.7