Precise thread sleep needed. Max 1ms error


I was looking for a lightweight, cross-platform sleep function suitable for real-time applications (i.e. high resolution and high precision, with reliability). Here are my findings:

Scheduling Fundamentals

Giving up the CPU and then getting it back is expensive. According to this article, scheduler latency can be anywhere between 10 and 30 ms on Linux. So if you need to sleep for less than 10 ms with high precision, you need to use OS-specific APIs. The usual C++11 std::this_thread::sleep_for is not a high-resolution sleep. For example, on my machine, quick tests show that it often sleeps for at least 3 ms when I ask it to sleep for just 1 ms.
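If you want to check this on your own machine, a quick measurement along these lines should do. This is only a sketch: measure_sleep_us is a name I made up for illustration, and it uses std::chrono::steady_clock as the reference clock.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: returns how long a requested sleep_for actually took,
// in microseconds, measured with the monotonic steady_clock.
static long long measure_sleep_us(std::chrono::microseconds requested) {
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}
```

Call measure_sleep_us(std::chrono::microseconds(1000)) a few times and subtract 1000 from the result to see the overshoot. The numbers depend heavily on the OS, the timer settings and the system load.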

Linux

The most popular solution seems to be the nanosleep() API. However, if you want a high-resolution sleep of less than 2 ms, then you also need to call sched_setscheduler to put the thread/process under real-time scheduling. If you don't, then nanosleep() acts just like the obsolete usleep, which had a resolution of ~10 ms. Another possibility is to use alarms.
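A minimal, Linux-only sketch of that combination could look like the following. The helper names try_enable_realtime and sleep_ns are mine, and the choice of SCHED_FIFO is an assumption (SCHED_RR is the other real-time policy). Switching to a real-time policy normally requires root or CAP_SYS_NICE, so the code reports failure instead of assuming it worked.

```cpp
#include <cerrno>
#include <sched.h>
#include <time.h>

// Hypothetical helper: tries to put the calling thread under SCHED_FIFO.
// Returns true on success; this usually needs root or CAP_SYS_NICE.
static bool try_enable_realtime(int priority) {
    struct sched_param param;
    param.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &param) == 0;
}

// Hypothetical helper: sleeps for ns nanoseconds, resuming with the
// remaining time whenever a signal interrupts the sleep.
// Returns 0 on success, or -1 with errno set.
static int sleep_ns(long long ns) {
    struct timespec ts;
    ts.tv_sec = static_cast<time_t>(ns / 1000000000LL);
    ts.tv_nsec = static_cast<long>(ns % 1000000000LL);
    int rc;
    while ((rc = nanosleep(&ts, &ts)) == -1 && errno == EINTR) {
    }
    return rc;
}
```

Typical use would be to call try_enable_realtime(sched_get_priority_min(SCHED_FIFO)) once at startup and then sleep_ns(1000000) for a 1 ms sleep. Be careful with real-time priorities: a FIFO thread that never blocks can starve everything else on that core.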

Windows

The solution here is to use multimedia timers, as others have suggested. If you want to emulate Linux's nanosleep() on Windows, below is how (original ref). Note that you don't need to call CreateWaitableTimer() over and over if you are calling sleep() in a loop.

    #include <windows.h>    /* WinAPI */

    /* Windows sleep; despite the name, the argument is in 100ns units */
    BOOLEAN nanosleep(LONGLONG ns){
        /* Declarations */
        HANDLE timer;       /* Timer handle */
        LARGE_INTEGER li;   /* Time definition */

        /* Create timer */
        if(!(timer = CreateWaitableTimer(NULL, TRUE, NULL)))
            return FALSE;

        /* Set timer properties; a negative value means relative time */
        li.QuadPart = -ns;
        if(!SetWaitableTimer(timer, &li, 0, NULL, NULL, FALSE)){
            CloseHandle(timer);
            return FALSE;
        }

        /* Start & wait for timer */
        WaitForSingleObject(timer, INFINITE);

        /* Clean resources */
        CloseHandle(timer);

        /* Slept without problems */
        return TRUE;
    }

Cross Platform Code

Here's the time_util.cc, which implements sleep for Linux, Windows and Apple's platforms. Notice, however, that it doesn't set real-time mode using sched_setscheduler as I mentioned above, so if you want to sleep for less than 2 ms, that's something you need to do additionally. One other improvement you can make is to avoid calling CreateWaitableTimer over and over in the Windows version if you are calling sleep in a loop. For how to do this, see the example here.

    #include "time_util.h"

    #ifdef _WIN32
    #  define WIN32_LEAN_AND_MEAN
    #  include <windows.h>
    #else
    #  include <time.h>
    #  include <errno.h>
    #  ifdef __APPLE__
    #    include <mach/clock.h>
    #    include <mach/mach.h>
    #  endif
    #endif // _WIN32

    /**********************************=> unix ************************************/
    #ifndef _WIN32
    void SleepInMs(uint32 ms) {
        struct timespec ts;
        ts.tv_sec = ms / 1000;
        ts.tv_nsec = ms % 1000 * 1000000;
        // If a signal interrupts the sleep, resume with the remaining time.
        while (nanosleep(&ts, &ts) == -1 && errno == EINTR);
    }

    void SleepInUs(uint32 us) {
        struct timespec ts;
        ts.tv_sec = us / 1000000;
        ts.tv_nsec = us % 1000000 * 1000;
        // If a signal interrupts the sleep, resume with the remaining time.
        while (nanosleep(&ts, &ts) == -1 && errno == EINTR);
    }

    #ifndef __APPLE__
    uint64 NowInUs() {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        return static_cast<uint64>(now.tv_sec) * 1000000 + now.tv_nsec / 1000;
    }
    #else // mac
    uint64 NowInUs() {
        clock_serv_t cs;
        mach_timespec_t ts;
        host_get_clock_service(mach_host_self(), SYSTEM_CLOCK, &cs);
        clock_get_time(cs, &ts);
        mach_port_deallocate(mach_task_self(), cs);
        return static_cast<uint64>(ts.tv_sec) * 1000000 + ts.tv_nsec / 1000;
    }
    #endif // __APPLE__
    #endif // _WIN32
    /************************************ unix <=**********************************/

    /**********************************=> win *************************************/
    #ifdef _WIN32
    void SleepInMs(uint32 ms) {
        ::Sleep(ms);
    }

    void SleepInUs(uint32 us) {
        ::LARGE_INTEGER ft;
        // '-' means relative time, in 100 ns units. Cast before multiplying
        // so that us * 10 cannot overflow the 32-bit type.
        ft.QuadPart = -static_cast<int64>(us) * 10;
        ::HANDLE timer = ::CreateWaitableTimer(NULL, TRUE, NULL);
        ::SetWaitableTimer(timer, &ft, 0, NULL, NULL, 0);
        ::WaitForSingleObject(timer, INFINITE);
        ::CloseHandle(timer);
    }

    static inline uint64 GetPerfFrequency() {
        ::LARGE_INTEGER freq;
        ::QueryPerformanceFrequency(&freq);
        return freq.QuadPart;
    }

    // The frequency is fixed at boot, so query it once and cache it.
    static inline uint64 PerfFrequency() {
        static uint64 xFreq = GetPerfFrequency();
        return xFreq;
    }

    static inline uint64 PerfCounter() {
        ::LARGE_INTEGER counter;
        ::QueryPerformanceCounter(&counter);
        return counter.QuadPart;
    }

    uint64 NowInUs() {
        return static_cast<uint64>(
            static_cast<double>(PerfCounter()) * 1000000 / PerfFrequency());
    }
    #endif // _WIN32
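As for reusing the timer in a loop, one way to sketch it is a small class that creates the waitable timer once and closes it in its destructor. ReusableSleeper and SleepUs are names I made up; this is not part of time_util.cc. The non-Windows branch is just a nanosleep fallback so that the same code compiles everywhere.

```cpp
#ifdef _WIN32
#  define WIN32_LEAN_AND_MEAN
#  include <windows.h>
#else
#  include <cerrno>
#  include <time.h>
#endif

// Hypothetical helper: on Windows it creates one waitable timer and reuses
// it for every sleep; elsewhere it falls back to nanosleep.
class ReusableSleeper {
public:
    ReusableSleeper() {
#ifdef _WIN32
        timer_ = ::CreateWaitableTimer(NULL, TRUE, NULL);
#endif
    }
    ~ReusableSleeper() {
#ifdef _WIN32
        if (timer_) ::CloseHandle(timer_);
#endif
    }
    ReusableSleeper(const ReusableSleeper&) = delete;
    ReusableSleeper& operator=(const ReusableSleeper&) = delete;

    // Sleeps for us microseconds; returns false if the wait failed.
    bool SleepUs(unsigned long long us) {
#ifdef _WIN32
        if (!timer_) return false;
        ::LARGE_INTEGER due;
        // Negative means relative time, expressed in 100 ns units.
        due.QuadPart = -static_cast<LONGLONG>(us) * 10;
        if (!::SetWaitableTimer(timer_, &due, 0, NULL, NULL, FALSE)) return false;
        return ::WaitForSingleObject(timer_, INFINITE) == WAIT_OBJECT_0;
#else
        struct timespec ts;
        ts.tv_sec = static_cast<time_t>(us / 1000000ULL);
        ts.tv_nsec = static_cast<long>((us % 1000000ULL) * 1000ULL);
        int rc;
        while ((rc = nanosleep(&ts, &ts)) == -1 && errno == EINTR) {
        }
        return rc == 0;
#endif
    }

private:
#ifdef _WIN32
    ::HANDLE timer_;
#endif
};
```

Create one ReusableSleeper outside the loop and call SleepUs on it in every iteration. The object is not meant to be shared between threads, since each SetWaitableTimer call re-arms the same timer.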

Yet another, more complete cross-platform implementation can be found here.

Another Quick Solution

As you might have noticed, the above code is no longer very lightweight. It needs to include the Windows header, among other things, which might not be desirable if you are developing header-only libraries. If you need to sleep for less than 2 ms and you are not keen on using OS-specific code, then you can use the following simple solution, which is cross-platform and works very well in my tests. Just remember that you are now not using heavily optimized OS code, which might be much better at saving power and managing CPU resources.

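    #include <chrono>
    #include <thread>

    // Named hr_clock rather than clock to avoid clashing with ::clock() from <ctime>.
    typedef std::chrono::high_resolution_clock hr_clock;

    template <typename T>
    using duration = std::chrono::duration<T>;

    // Waits for dt seconds by polling the clock, calling sleep_for with a
    // zero duration on each pass.
    static void sleep_for(double dt)
    {
        static constexpr duration<double> MinSleepDuration(0);
        hr_clock::time_point start = hr_clock::now();
        while (duration<double>(hr_clock::now() - start).count() < dt) {
            std::this_thread::sleep_for(MinSleepDuration);
        }
    }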


Don't use spinning here. The requested resolution and accuracy can be reached with standard methods.

You may use Sleep() down to periods of about 1 ms when the system's interrupt period is set to operate at that high frequency. Look at the description of Sleep() for the details, and in particular at the multimedia timers and Obtaining and Setting Timer Resolution for how to set the system's interrupt period. The obtainable accuracy with such an approach is in the range of a few microseconds when implemented properly.

I suspect your loop is doing something else too. Thus I suspect you want a total period of 5 ms, which would then be the sum of the Sleep() and the rest of the time you spend on other things in the loop.
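To illustrate the idea of a fixed total period in portable C++ (this is not the Windows API approach, just a sketch of the same reasoning), you can sleep until an absolute deadline instead of sleeping for a fixed duration. run_periodic is a hypothetical name, and the actual wake-up accuracy still depends on the OS timer settings discussed here.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: runs work() the given number of times, starting one
// call every period. Sleeping until an absolute deadline means the time
// spent inside work() is absorbed into the period instead of being added
// to it, so timing errors do not accumulate from one pass to the next.
template <typename Work>
void run_periodic(std::chrono::microseconds period, int iterations, Work work) {
    auto next = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
        next += period;                       // deadline of the next pass
        std::this_thread::sleep_until(next);  // wait out the rest of the period
    }
}
```

For a 5 ms loop you would call run_periodic(std::chrono::microseconds(5000), n, your_function). If work() takes longer than the period, sleep_until returns immediately and the loop runs late until it catches up.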

For this scenario I'd suggest Waitable Timer Objects; however, these timers also rely on the setting of the multimedia timer API. I've given an overview of the relevant functions for higher-precision timing here. Much deeper insight into high-precision timing can be found here.

For even more accurate and reliable timing you may have to look into process priority classes and thread priorities. Another answer about the accuracy of Sleep() is this.

However, whether it is possible to obtain a Sleep() delay of precisely 5 ms depends on the system's hardware. Some systems allow you to operate at 1024 interrupts per second (set by the multimedia timer API). This corresponds to a period of 0.9765625 ms, so the nearest you can get is 5 periods, or 4.8828125 ms. Others allow you to get closer; in particular, since Windows 7 the timing has improved significantly when operated on hardware providing high-resolution event timers. See About Timers at MSDN and High Precision Event Timer.
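The arithmetic behind those numbers can be written down as a tiny function. nearest_achievable_ms is a made-up name, and the model is deliberately simplified: it assumes a delay can only be a whole number of interrupt periods.

```cpp
#include <cmath>

// Hypothetical helper: returns the whole multiple of the interrupt period
// that is closest to the requested delay. Argument and result are in ms.
static double nearest_achievable_ms(double requested_ms, double interrupts_per_second) {
    const double period_ms = 1000.0 / interrupts_per_second;     // 0.9765625 at 1024 Hz
    const double ticks = std::round(requested_ms / period_ms);   // 5.12 rounds to 5
    return ticks * period_ms;
}
```

With 1024 interrupts per second, nearest_achievable_ms(5.0, 1024.0) gives 4.8828125, which matches the figure above.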

Summary: Set the multimedia timer to operate at maximum frequency and use a waitable timer.
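A rough sketch of the first half of that summary, using timeBeginPeriod and timeEndPeriod from the multimedia timer API. TimerResolutionGuard is a hypothetical wrapper of my own; on Windows you need to link against winmm.lib, and outside Windows the class does nothing. The supported range of periods can be queried with timeGetDevCaps; here the period is simply passed in.

```cpp
#ifdef _WIN32
#  include <windows.h>  // declares timeBeginPeriod/timeEndPeriod; link with winmm.lib
#endif

// Hypothetical RAII guard: requests the given timer resolution for its
// lifetime and withdraws the request in the destructor. No-op off Windows.
class TimerResolutionGuard {
public:
    explicit TimerResolutionGuard(unsigned int period_ms)
        : period_ms_(period_ms), active_(false) {
#ifdef _WIN32
        active_ = (::timeBeginPeriod(period_ms_) == TIMERR_NOERROR);
#endif
    }
    ~TimerResolutionGuard() {
#ifdef _WIN32
        // Every successful timeBeginPeriod must be matched by timeEndPeriod
        // with the same value.
        if (active_) ::timeEndPeriod(period_ms_);
#endif
    }
    TimerResolutionGuard(const TimerResolutionGuard&) = delete;
    TimerResolutionGuard& operator=(const TimerResolutionGuard&) = delete;

    bool active() const { return active_; }            // true if the request was accepted
    unsigned int period_ms() const { return period_ms_; }

private:
    unsigned int period_ms_;
    bool active_;
};
```

Create a TimerResolutionGuard(1) at the start of the timing-critical section and do the Sleep() or waitable timer waits while it is alive. Keep the section short, because a higher interrupt frequency costs power.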


From the question tags I suppose you are on Windows. Take a look at Multimedia Timers; they advertise precision under 1 ms. Another option is to use spin locks, but this will basically keep a CPU core at maximum usage.