It all starts with volatile
I had always interpreted volatile as a way to tell the compiler that a
variable might change unexpectedly (outside the current thread), so the CPU
should not cache it and should perform a full memory access (load or store)
every time. Recent work with volatile has deepened my understanding. I document
it here in the hope that it clears up some doubts for someone else as well.
§why volatile
It’s indeed confusing to have the same keyword with different meanings in
different languages. volatile in Java/C# roughly corresponds to _Atomic in C11,
while volatile in C is meant to solve a different problem. This post focuses on
volatile in C, for it’s the least well understood one.
The purpose of volatile is to force an implementation to suppress optimization that could otherwise occur. For example, for a machine with memory-mapped input/output, a pointer to a device register might be declared as a pointer to volatile, in order to prevent the compiler from removing apparently redundant references through the pointer.
volatile only suppresses compiler optimizations on accesses to the qualified
object; no special CPU instructions are emitted because of it.
Open question: would volatile plus a sequentially consistent (SC) CPU be enough
for multithread programming?
§multithread programming
The above discussion of volatile doesn’t mention multithread programming at
all. But given that every volatile variable is read from and written to memory,
it seems it should be enough for inter-thread communication. Let’s try it with
a simple example:
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

// volatile static int flag = 0;
static int flag = 0;

/* Sets the flag after one second. */
void* thread_A(void * arg)
{
    sleep(1);
    flag = 1;
    return NULL;
}

/* Spins until the flag becomes non-zero. */
void* thread_B(void * arg)
{
    while (flag == 0) ;
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t a, b;
    pthread_create(&a, NULL, &thread_A, NULL);
    pthread_create(&b, NULL, &thread_B, NULL);
    /* Exit the main thread only; the process lives until A and B finish. */
    pthread_exit(NULL);
}
Build and run the above C program with:
clang -O3 volatile.c; ./a.out
The program never terminates: at -O3 the compiler is free to read flag once and
spin on that value forever. Adding the volatile declaration fixes the problem.
Excellent! volatile appears to make updates to a variable visible to other
threads.
However, is it really enough for multithread programming?
In the post Volatile: Almost Useless for Multi-Threaded Programming, the author
points out that the two key issues in multithread programming are atomicity and
memory consistency, and that volatile addresses neither. Adding or removing
volatile doesn’t affect atomicity at all. Memory consistency is achieved through
collaboration between the compiler and the CPU: if the compiler does no
reordering and the CPU is sequentially consistent, memory consistency is
guaranteed. In practice, however, compilers do reorder, and CPUs give up
sequential consistency for performance reasons.
volatile can be regarded as a compiler barrier among volatile accesses:
accesses to volatile objects can’t be omitted or reordered with respect to one
another. However, no special instructions are generated, so a CPU capable of
out-of-order execution can still reorder them in theory. (The reason why
volatile is not broken for IO variables is that most embedded CPUs don’t have
OoE, and those that do take special care not to screw this up. See volatile in
embedded programming for more info.)
This post provides one runnable example to illustrate how using volatile leads
to non-intuitive results due to the reordering done by the CPU. (The actual
cause is the store buffer, but it’s equivalent to think of it as reordering, as
mentioned in the C11 memory model presentation.)
Transformations at all levels are equivalent.
==> Can reason about all transformations as reorderings of source code loads and stores.
The proper way to deal with multithread programming is to use the right synchronization mechanisms: mutexes (locks), atomic variables, etc. The new memory model in C11 walks us through the guarantees the latest standard provides, as long as our program has no data races.
If you are used to volatile in Java/C#, use _Atomic in C11; they mean roughly
the same thing. Maybe _Atomic is what you really have in mind when you reach
for volatile in multithread programming.
§When shall I use volatile or not
The post Volatile: Almost Useless for Multi-Threaded Programming also documents
three valid use cases for volatile. Another post, The trouble with volatile,
focuses on kernel development, where volatile is almost always wrong, except
for certain usages that already existed in the code base.