Is clock_gettime() adequate for submicrosecond timing?
I ran some benchmarks on my system, a quad-core E5645 Xeon with a constant TSC, running kernel 3.2.54. The results were:
clock_gettime(CLOCK_MONOTONIC_RAW)       100 ns/call
clock_gettime(CLOCK_MONOTONIC)            25 ns/call
clock_gettime(CLOCK_REALTIME)             25 ns/call
clock_gettime(CLOCK_PROCESS_CPUTIME_ID)  400 ns/call
rdtsc (implementation by @DavidSchwarz)  600 ns/call
So it looks like on a reasonably modern system, rdtsc (the accepted answer) is the worst route to go down.
No. You'll have to use platform-specific code to do it. On x86 and x86-64, you can use 'rdtsc' to read the Time Stamp Counter.
Just port the rdtsc assembly you're using.
__inline__ uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    /* Serialize with cpuid so earlier instructions retire before we
       read the counter. */
    __asm__ __volatile__ (
        "xorl %%eax,%%eax \n cpuid"
        ::: "%rax", "%rbx", "%rcx", "%rdx");
    /* We cannot use "=A", since this would use %rax on x86_64 and
       return only the lower 32 bits of the TSC. */
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return (uint64_t)hi << 32 | lo;
}
I need a high-resolution timer for the embedded profiler in the Linux build of our application. Our profiler measures scopes as small as individual functions, so it needs a timer precision of better than 25 nanoseconds.
Have you considered oprofile or perf? You can use the performance counter hardware on your CPU to get profiling data without adding instrumentation to the code itself. You can see data per-function, or even per-line-of-code. The "only" drawback is that it won't measure wall clock time consumed, it will measure CPU time consumed, so it's not appropriate for all investigations.