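Using the model name reported by cat /proc/cpuinfo, one can look up the corresponding CPU page on Intel's or AMD's website.
Still, it is nice to measure the frequency directly as well. The idea is to time a long chain of dependent register increments: each increment needs the result of the previous one, so on typical x86-64 cores they execute at roughly one per clock cycle, and the number of increments per second approximates the clock frequency.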
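Three things are worth mentioning explicitly:
Inline assembly is used so that the compiler cannot optimize the i++ increments away (for example, by folding fifty of them into a single addition).
“Manual unrolling” is used to hide the overhead of checking the loop condition.
The program must be compiled with optimization turned on (e.g. -O); otherwise, i is written back to memory on each change, and the extra loads and stores slow the loop down.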
```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* One register increment the compiler cannot fold away (x86-64 only). */
#define inc asm ( "addq $1, %0" : "+r" (i) )

static double get_wall_seconds() {
  struct timeval tv;
  gettimeofday(&tv, NULL);
  double seconds = tv.tv_sec + (double)tv.tv_usec / 1000000;
  return seconds;
}

int main(int argc, char** argv) {
  int nBillions = 1;
  if (argc == 2) {
    nBillions = atoi(argv[1]);
  }
  unsigned long int N_one_billion = 1000000000;
  unsigned long int N = (unsigned) nBillions*N_one_billion;
  double startTime = get_wall_seconds();
  unsigned long i = 0;
  /* 50 increments per iteration; N is a multiple of 50, so i ends at N. */
  while (i < N) {
    inc; inc; inc; inc; inc; inc; inc; inc; inc; inc;
    inc; inc; inc; inc; inc; inc; inc; inc; inc; inc;
    inc; inc; inc; inc; inc; inc; inc; inc; inc; inc;
    inc; inc; inc; inc; inc; inc; inc; inc; inc; inc;
    inc; inc; inc; inc; inc; inc; inc; inc; inc; inc;
  }
  double timeTaken = get_wall_seconds() - startTime;
  printf("N = %lu, timeTaken = %7.3f\n", N, timeTaken);  /* N is unsigned long */
  double ops_per_second = (double)N / timeTaken;
  printf("CPU clock frequency is %4.2f GHz\n", ops_per_second/N_one_billion);
  return 0;
}
```
Testing on my box:
$ gcc -O test.c ; ./a.out ; clang -O test.c ; ./a.out
N = 1000000000, timeTaken = 0.325
CPU clock frequency is 3.08 GHz
N = 1000000000, timeTaken = 0.306
CPU clock frequency is 3.27 GHz
$ clang -O test.c ; ./a.out ; gcc -O test.c ; ./a.out
N = 1000000000, timeTaken = 0.316
CPU clock frequency is 3.17 GHz
N = 1000000000, timeTaken = 0.320
CPU clock frequency is 3.13 GHz