I had always interpreted volatile as a way to tell the compiler that a variable might change unexpectedly (outside the current thread), so its value must not be cached in a register and every use must be a full memory access (load or store). Working with volatile recently has deepened that understanding. I am documenting it here, hoping it clears up some doubts for others as well.

§why volatile

It is indeed confusing that the same keyword has different meanings in different languages. volatile in Java/C# corresponds roughly to atomic in C11, while volatile in C is meant to solve a different problem. This post focuses on volatile in C, since it’s the least well understood one.

The purpose of volatile is to force an implementation to suppress optimization that could otherwise occur. For example, for a machine with memory-mapped input/output, a pointer to a device register might be declared as a pointer to volatile, in order to prevent the compiler from removing apparently redundant references through the pointer.
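
To make the device-register case concrete, here is a minimal sketch. There is no real hardware involved: fake_status_reg is a hypothetical stand-in for a memory-mapped register, and READY_BIT, fake_device_set and poll_ready are names I made up for illustration. On a real target the pointer would be initialised from the address given in the device's datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define READY_BIT 0x1u   /* hypothetical "device ready" bit */

/* Stand-in for a device register; real code would use a fixed address. */
static uint32_t fake_status_reg = 0;

/* Pointer to volatile: every *status_reg below must be a real load,
 * even though nothing in poll_ready() ever writes the register. */
static volatile uint32_t *const status_reg = &fake_status_reg;

/* Test helper that plays the role of the hardware changing the register. */
void fake_device_set(uint32_t value)
{
    fake_status_reg = value;
}

/* Poll the register up to max_polls times.
 * Returns the index of the poll that first saw READY_BIT,
 * or -1 if the bit never appeared. */
int poll_ready(int max_polls)
{
    assert(max_polls >= 0);
    for (int i = 0; i < max_polls; i++) {
        if (*status_reg & READY_BIT)
            return i;
    }
    return -1;
}
```

Without the volatile qualifier, the compiler would be entitled to read the register once and reuse the value for every iteration, which is exactly the "apparently redundant reference" the quote talks about.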

volatile suppresses compiler optimizations on accesses to the qualified object; with compilers such as GCC and Clang, no special CPU instructions (memory barriers, for instance) are emitted because of it.

Open question: would volatile plus a sequentially consistent (SC) CPU be enough for multithreaded programming?

§multithread programming

The discussion above doesn’t mention multithreaded programming at all, but since every access to a volatile variable goes to memory, it seems as if it should be enough for inter-thread communication. Let’s try a simple example:

#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

// Swap the two declarations below to compare the volatile and plain versions.
// volatile static int flag = 0;
static int flag = 0;

void* thread_A(void * arg)
{
    sleep(1);
    flag = 1;
    return NULL;
}

void* thread_B(void * arg)
{
    while (flag == 0) ;
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t a, b;

    pthread_create(&a, NULL, &thread_A, NULL);
    pthread_create(&b, NULL, &thread_B, NULL);

    pthread_exit(NULL);
}

Build and run the above C program with:

clang -O3 volatile.c; ./a.out

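The program never terminates. At -O3 the compiler may assume that nothing else modifies flag while thread_B spins, so it can read flag once and loop forever on that stale value. Adding the volatile declaration fixes the problem, because every iteration must now reload flag from memory. Excellent! volatile ensures that updates to a variable can become visible to other threads.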
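
A plausible explanation for the hang is that the optimizer, seeing no write to flag inside the loop, loads it once and then spins on that copy. The sketch below is a hand-written model of that transformation, not actual compiler output; spin_reloading and spin_hoisted are made-up names, and the max_spins cap exists only so that the sketch terminates.

```c
#include <assert.h>
#include <stddef.h>

/* The loop as written in thread_B. The volatile-qualified pointer
 * forces a fresh load of *flag on every iteration. */
unsigned long spin_reloading(const volatile int *flag, unsigned long max_spins)
{
    assert(flag != NULL);
    unsigned long spins = 0;
    while (*flag == 0 && spins < max_spins)
        spins++;
    return spins;
}

/* What an optimizer may effectively do with a plain int: load once,
 * then loop on the copy. A later store to *flag is never noticed. */
unsigned long spin_hoisted(const int *flag, unsigned long max_spins)
{
    assert(flag != NULL);
    int seen = *flag;               /* the only load of *flag */
    unsigned long spins = 0;
    while (seen == 0 && spins < max_spins)
        spins++;
    return spins;
}
```

If the flag is already set, both versions return at once. If it is clear on entry, spin_hoisted runs to the cap regardless of what another thread does afterwards, which is the hang observed above.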

However, is it really enough for multithread programming?

In the post Volatile: Almost Useless for Multi-Threaded Programming, the author points out that the two key issues in multithreaded programming are atomicity and memory consistency, and that volatile addresses neither. Adding or removing volatile doesn’t affect atomicity at all. Memory consistency is achieved through the collaboration of the compiler and the CPU: if the compiler does no reordering and the CPU is sequentially consistent, memory consistency is guaranteed. In practice, however, compilers do reorder, and CPUs give up sequential consistency for performance reasons.
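
The atomicity point can be sketched with a shared counter. With a volatile int, counter++ is a separate load, add and store, so concurrent increments can be lost (and the data race is undefined behavior in C11), which is why that variant appears only in a comment. The version below uses atomic_fetch_add from <stdatomic.h> instead; count_with_atomics, worker and MAX_THREADS are names I picked for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 8

static atomic_int counter;          /* shared counter */
static int increments_per_thread;   /* written before any thread starts */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < increments_per_thread; i++) {
        /* One indivisible read-modify-write. With `volatile int counter`
         * and `counter++`, two threads could load the same old value
         * and one of the increments would be lost. */
        atomic_fetch_add(&counter, 1);
    }
    return NULL;
}

/* Run nthreads workers and return the final counter value. */
int count_with_atomics(int nthreads, int per_thread)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pthread_t tids[MAX_THREADS];

    atomic_store(&counter, 0);
    increments_per_thread = per_thread;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    return atomic_load(&counter);
}
```

Calling count_with_atomics(4, 100000) should always yield 400000; a volatile counter gives no such guarantee.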

volatile can be considered a limited compiler barrier: accesses to volatile objects can’t be omitted, nor reordered with respect to one another (ordinary, non-volatile accesses may still be moved around them). However, no other special instructions are generated, so a CPU capable of out-of-order execution (OoE) can, in theory, still reorder the accesses. (The reason volatile is not broken for I/O variables is that most embedded CPUs don’t have OoE, and those that do take special care not to screw this up. See volatile in embedded programming for more info.)

This post provides a runnable example showing how volatile can produce non-intuitive results because of reordering done by the CPU. (The actual cause is the store buffer, but it’s equivalent to think of it as reordering, as mentioned in the C11 memory model presentation.)

Transformations at all levels are equivalent.

==> Can reason about all transformations as reorderings of source code loads and stores.
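
The store-buffer effect is usually shown with the classic "store buffering" litmus test: each thread stores to its own variable and then loads the other one. With volatile ints, both loads may return 0 on common hardware, because each store can still be sitting in its core's store buffer. With the default sequentially consistent atomic_store and atomic_load, the C11 memory model forbids that outcome. The sketch below uses the atomic version and counts how often the forbidden result shows up; count_both_zero and the thread function names are hypothetical. Creating fresh threads every round makes this a coarse harness, so it shows the shape of the test rather than being a good detector of reordering.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;   /* the two shared locations */
static int r1, r2;        /* what each thread observed */

static void *store_x_load_y(void *arg)
{
    (void)arg;
    atomic_store(&x, 1);      /* seq_cst store */
    r1 = atomic_load(&y);     /* seq_cst load  */
    return NULL;
}

static void *store_y_load_x(void *arg)
{
    (void)arg;
    atomic_store(&y, 1);
    r2 = atomic_load(&x);
    return NULL;
}

/* Run the litmus test `rounds` times; return how often both loads saw 0. */
int count_both_zero(int rounds)
{
    assert(rounds >= 0);
    int both_zero = 0;
    for (int i = 0; i < rounds; i++) {
        pthread_t a, b;
        atomic_store(&x, 0);
        atomic_store(&y, 0);
        int rc_a = pthread_create(&a, NULL, store_x_load_y, NULL);
        int rc_b = pthread_create(&b, NULL, store_y_load_x, NULL);
        assert(rc_a == 0 && rc_b == 0);
        (void)rc_a;
        (void)rc_b;
        pthread_join(a, NULL);    /* joins make r1 and r2 safe to read */
        pthread_join(b, NULL);
        if (r1 == 0 && r2 == 0)
            both_zero++;
    }
    return both_zero;
}
```

With sequentially consistent atomics the count stays at 0 however many rounds are run. Replacing atomic_int with volatile int (and the atomic calls with plain assignments) removes that guarantee.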

The proper way to deal with multithreaded programming is to use the right synchronization mechanisms: mutexes (locks), atomic variables, etc. The new memory model in C11 walks us through the memory model guaranteed by the latest standard, which holds so long as our program has no data races.

If you are used to volatile in Java/C#, use _Atomic in C11; they mean roughly the same thing. _Atomic may well be what you really have in mind when you reach for volatile in multithreaded code.
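
As a sketch of what that looks like, here is the earlier flag example reworked with atomic_int from <stdatomic.h>. The payload variable and the run_handoff wrapper are my additions for illustration, and the sleep(1) is dropped so the sketch finishes quickly. The default atomic_store and atomic_load are sequentially consistent, so thread_B both sees the flag change and sees the payload written before it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int flag;   /* replaces `static int flag` */
static int payload;       /* plain data published through the flag */

static void *thread_A(void *arg)
{
    (void)arg;
    payload = 42;                 /* ordinary write */
    atomic_store(&flag, 1);       /* publish it: seq_cst store */
    return NULL;
}

static void *thread_B(void *arg)
{
    int *seen = arg;
    while (atomic_load(&flag) == 0)
        ;                         /* every iteration reloads flag */
    *seen = payload;              /* ordered after thread_A's write */
    return NULL;
}

/* Returns the payload value thread_B observed once the flag flipped. */
int run_handoff(void)
{
    pthread_t a, b;
    int seen = -1;

    atomic_store(&flag, 0);
    payload = 0;

    int rc_b = pthread_create(&b, NULL, thread_B, &seen);
    int rc_a = pthread_create(&a, NULL, thread_A, NULL);
    assert(rc_a == 0 && rc_b == 0);
    (void)rc_a;
    (void)rc_b;

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```

Unlike the volatile version, this one has no data race, so the guarantees of the C11 memory model apply to it.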

§when to use volatile, and when not

The post Volatile: Almost Useless for Multi-Threaded Programming also documents three valid use cases for volatile. The trouble with volatile focuses on kernel development, where volatile is almost always wrong, apart from certain usages that already exist in the code base.
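
One use that the C standard itself describes is a flag of type volatile sig_atomic_t shared between a signal handler and the code it interrupts. I am assuming this is among the cases the post has in mind; the sketch below is only an illustration, and on_signal and raise_and_check are made-up names.

```c
#include <assert.h>
#include <signal.h>

/* volatile sig_atomic_t is a type the C standard lets a handler write. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;               /* the only thing the handler does */
}

/* Install the handler, raise SIGINT at ourselves, and report the flag. */
int raise_and_check(void)
{
    got_signal = 0;

    void (*prev)(int) = signal(SIGINT, on_signal);
    assert(prev != SIG_ERR);

    raise(SIGINT);                /* handler runs before raise() returns */

    signal(SIGINT, prev);         /* restore the previous disposition */
    return got_signal;
}
```

Here volatile is the right tool: the handler interrupts the same thread, so the only requirement is that the compiler does not keep got_signal cached in a register.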