It all starts with volatile
I have always interpreted volatile as a way to tell the compiler that a
variable might change unexpectedly (outside the current thread), so the CPU
should not cache it and should do a full memory access (load and store) every
time. Working with volatile recently has deepened my understanding. I document
it here, hoping it clears up some doubts for someone else as well.
§why volatile
It’s indeed confusing that the same keyword means different things in
different languages. volatile in Java/C# corresponds to atomics in C11, while
volatile in C is meant to solve a different problem. This post focuses on
volatile in C, because it’s the least well understood one.
The purpose of volatile is to force an implementation to suppress optimization that could otherwise occur. For example, for a machine with memory-mapped input/output, a pointer to a device register might be declared as a pointer to volatile, in order to prevent the compiler from removing apparently redundant references through the pointer.
In other words, volatile restricts what the compiler may optimize away for
accesses to that object; no special CPU instructions (such as memory fences)
are emitted because of it.
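The device-register case from the quote can be sketched as follows. This is only an illustration: fake_status_reg, STATUS_READY and wait_until_ready are hypothetical names, and an ordinary variable stands in for what would be a fixed hardware address on a real machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device status register. On real hardware this
   would be a fixed address taken from the datasheet, not a normal variable. */
volatile uint32_t fake_status_reg = 0;

#define STATUS_READY 0x1u /* hypothetical "ready" bit */

/* Polls the register up to max_polls times. Because reg points to a volatile
   object, the compiler has to issue a real load on every iteration instead of
   reading once and reusing the value. Returns 1 if the ready bit was seen,
   0 otherwise. */
int wait_until_ready(volatile uint32_t *reg, int max_polls)
{
    assert(reg != NULL);
    for (int i = 0; i < max_polls; i++) {
        if (*reg & STATUS_READY)
            return 1;
    }
    return 0;
}
```

Without the volatile qualifier, an optimizing compiler would be allowed to treat the repeated reads as redundant and keep only the first one.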
An open question, then: would volatile plus sequential consistency (SC) be
enough for multithread programming?
§multithread programming
The above discussion of volatile doesn’t mention multithread programming at
all. But given that every volatile variable is read from and written to memory,
it seems like it should be enough for inter-thread communication. Let’s try
one simple example:
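#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

/* Swap the two declarations below to compare the behaviors. */
// volatile static int flag = 0;
static int flag = 0;

/* Sets the flag after one second. */
void* thread_A(void * arg)
{
    sleep(1);
    flag = 1;
    return NULL;
}

/* Spins until it observes the flag being set. */
void* thread_B(void * arg)
{
    while (flag == 0) ;
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t a, b;
    pthread_create(&a, NULL, &thread_A, NULL);
    pthread_create(&b, NULL, &thread_B, NULL);
    /* Exit only the main thread; the process lives until A and B finish. */
    pthread_exit(NULL);
}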
Build and run the above C program using:
clang -O3 volatile.c; ./a.out
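It never terminates. At -O3 the compiler sees nothing inside the loop that
modifies flag, so it is free to load flag once and spin on that value forever.
Adding the volatile declaration fixes the problem, because every iteration must
now reload flag from memory. Excellent! It looks like volatile ensures that
updates to a variable become visible to other threads.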
However, is it really enough for multithread programming?
In this post, Volatile: Almost Useless for Multi-Threaded Programming, the
author mentions that the two key issues in multithread programming are
atomicity and memory consistency, and volatile addresses neither. Adding or
removing volatile doesn’t affect atomicity at all. Memory consistency is
achieved by collaboration between the compiler and the CPU. If the compiler
doesn’t do any reordering, and the CPU provides sequential consistency, memory
consistency is guaranteed. In practice, however, compilers do reorder, and CPUs
don’t provide sequential consistency, both for performance reasons.
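The atomicity half of that claim can be sketched with a shared counter. This is a minimal illustration, not taken from the linked post: the names bump, run_counters and INCREMENTS are made up here, and the volatile increment is a deliberate data race kept only to show the contrast with atomic_int.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define INCREMENTS 100000

/* volatile only forces a load and a store; the load-add-store sequence of ++
   can still interleave with another thread. Deliberately racy. */
static volatile int volatile_counter = 0;
/* ++ on an atomic_int is a single indivisible read-modify-write. */
static atomic_int atomic_counter = 0;

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        volatile_counter++; /* updates may be lost */
        atomic_counter++;   /* never loses an update */
    }
    return NULL;
}

/* Runs two incrementing threads. Returns the final atomic counter and stores
   the final volatile counter through volatile_result. */
int run_counters(int *volatile_result)
{
    pthread_t a, b;
    volatile_counter = 0;
    atomic_store(&atomic_counter, 0);
    int rc = pthread_create(&a, NULL, bump, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, bump, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    *volatile_result = volatile_counter;
    return atomic_load(&atomic_counter);
}
```

The atomic counter always ends at 2 * INCREMENTS. The volatile counter may end lower, and how much lower varies from run to run and from machine to machine.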
volatile can be considered a kind of compiler barrier, since accesses to
volatile objects can’t be omitted or reordered relative to one another by the
compiler. However, no special instructions are generated, so a CPU capable of
out-of-order execution (OoE) can, in theory, still reorder them. (The reason
volatile is not broken for IO variables is that most embedded CPUs don’t have
OoE, and those that do take special care not to screw this up. See volatile in
embedded programming for more info.)
This post provides one runnable example to illustrate how using volatile can
lead to non-intuitive results due to reordering done by the CPU. (The actual
cause is the store buffer, but it’s equivalent to think of it as reordering, as
mentioned in the C11 memory model presentation.)
Transformations at all levels are equivalent.
==> Can reason about all transformations as reorderings of source code loads and stores.
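The classic store-buffer litmus test makes this concrete. The sketch below is my own reduction, not the example from the linked post, and all names in it are hypothetical. Each thread stores to one variable and then loads the other. With sequentially consistent atomics (the default for atomic_store and atomic_load), the outcome r1 == 0 && r2 == 0 is forbidden. With plain volatile int and no fences there is no such guarantee, and a store buffer can produce exactly that outcome.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;
static int r1, r2; /* written by the threads, read only after join */

static void *store_x_load_y(void *arg)
{
    (void)arg;
    atomic_store(&x, 1);  /* seq_cst store */
    r1 = atomic_load(&y); /* seq_cst load  */
    return NULL;
}

static void *store_y_load_x(void *arg)
{
    (void)arg;
    atomic_store(&y, 1);
    r2 = atomic_load(&x);
    return NULL;
}

/* Runs the litmus test `rounds` times and returns how often both loads
   observed 0. Under sequential consistency this count stays at 0. */
int count_both_zero(int rounds)
{
    int both_zero = 0;
    for (int i = 0; i < rounds; i++) {
        pthread_t a, b;
        atomic_store(&x, 0);
        atomic_store(&y, 0);
        int rc = pthread_create(&a, NULL, store_x_load_y, NULL);
        assert(rc == 0);
        rc = pthread_create(&b, NULL, store_y_load_x, NULL);
        assert(rc == 0);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        if (r1 == 0 && r2 == 0)
            both_zero++;
    }
    return both_zero;
}
```

Creating threads on every round keeps the sketch short but makes the two stores overlap only rarely, so a volatile variant of this harness may need many rounds (or long-lived threads) before the surprising outcome shows up.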
The proper way of dealing with multithread programming is to use the right synchronization mechanisms: mutexes (locks), atomic variables, etc. The new memory model in C11 walks us through the memory model guaranteed by the latest standard, which holds so long as we don’t have data races in our program.
If you are used to volatile in Java/C#, use _Atomic in C11; they mean the
same thing. Maybe _Atomic is what you really have in mind when you reach for
volatile in multithread programming.
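As a sketch of what that looks like for the flag example, here is one possible rewrite using C11 atomics. atomic_bool is the `<stdatomic.h>` typedef for _Atomic _Bool. The payload variable, the function names, and the choice of release/acquire ordering are additions of mine for illustration; the default sequentially consistent operations would work just as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;       /* ordinary data, published through `ready` */
static atomic_bool ready; /* replaces the plain or volatile int flag  */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release: the write to payload is visible to whoever acquires `ready`. */
    atomic_store_explicit(&ready, true, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *seen = arg;
    /* Each iteration performs a fresh atomic load. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;
    *seen = payload; /* safe: ordered after the acquire load */
    return NULL;
}

/* Runs one hand-off and returns the value the consumer observed. */
int run_handoff(void)
{
    pthread_t a, b;
    int seen = 0;
    payload = 0;
    atomic_store(&ready, false);
    int rc = pthread_create(&b, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&a, NULL, producer, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen; /* 42 */
}
```

Unlike the volatile version, this also orders the ordinary write to payload with respect to the flag, which is usually the reason for having a flag in the first place.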
§When shall I use volatile or not
The same post, Volatile: Almost Useless for Multi-Threaded Programming, also
documents the three valid use cases for volatile. The trouble with volatile
focuses on kernel development, where volatile is almost always wrong, except
for certain usages that already existed in the code base.
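One use of volatile that the C standard itself describes is a flag of type volatile sig_atomic_t written from a signal handler. The sketch below shows only that one case and does not reproduce the list from the linked post. The names on_signal and demo_signal_flag are hypothetical, and SIGTERM is chosen just because it is one of the signals standard C defines.

```c
#include <assert.h>
#include <signal.h>

/* The handler can interrupt the main flow at any point, so the flag has to be
   both volatile (always reloaded) and sig_atomic_t (written indivisibly). */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises the signal in the calling thread, restores the
   previous handler, and returns the flag value. */
int demo_signal_flag(void)
{
    got_signal = 0;
    void (*prev)(int) = signal(SIGTERM, on_signal);
    assert(prev != SIG_ERR);
    raise(SIGTERM);        /* returns only after the handler has run */
    signal(SIGTERM, prev); /* put the previous disposition back */
    return got_signal;
}
```

This works because the handler and the interrupted code run in the same thread; it says nothing about communication between threads, which is where atomics belong.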