Can multithreading speed up memory allocation?

c++ multithreading memory-management new-operator boost-thread

Standard CRT

While the default CRT allocator was blocking in older versions of Visual Studio, this is no longer true at least for Visual Studio 2010 and newer, whose CRT calls the corresponding OS functions directly. The Windows heap manager was blocking until Windows XP; in XP the optional Low Fragmentation Heap (LFH) is not blocking, while the default heap is, and newer OSes (Vista/Win7) use the LFH by default. The performance of recent (Windows 7) allocators is very good, comparable to the scalable replacements listed below (you might still prefer them if you target older platforms or need some other features they provide). There are several "scalable allocators", with different licenses and different drawbacks. I think on Linux the default runtime library already uses a scalable allocator (some variant of PTMalloc).

Scalable replacements

I know about:

HOARD (GNU + commercial licenses)
MicroQuill SmartHeap for SMP (commercial license)
Google Perf Tools TCMalloc (BSD license)
NedMalloc (BSD license)
JemAlloc (BSD license)
PTMalloc (GNU, no Windows port yet?)
Intel Threading Building Blocks (GNU, commercial)

You might want to check "Scalable memory allocator experiences" for my experiences with trying to use some of them in a Windows project.

In practice most of them work by keeping a per-thread cache and per-thread pre-allocated regions for allocations, which means that small allocations mostly happen within the context of a single thread, and OS services are called only infrequently.
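
The per-thread cache idea can be sketched roughly as follows. This is only a minimal illustration, not how any of the allocators listed above is actually implemented: `ThreadCache`, `local_cache`, `count_refills`, the 64-byte block size and the batch size of 32 are all made-up names and values, and `std::malloc` stands in for the shared (potentially contended) allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <thread>
#include <vector>

// Hypothetical fixed-size block cache; one instance exists per thread.
class ThreadCache {
public:
    static constexpr std::size_t kBlockSize = 64;  // assumed size class
    static constexpr std::size_t kBatch = 32;      // blocks fetched per refill

    ~ThreadCache() {
        // Hand cached blocks back to the shared allocator when the thread exits.
        while (free_list_) {
            Node* next = free_list_->next;
            std::free(free_list_);
            free_list_ = next;
        }
    }

    void* allocate() {
        if (!free_list_) refill();  // slow path: goes to the shared allocator
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    void deallocate(void* p) {
        // Fast path: push onto the calling thread's own list, no lock needed.
        Node* n = static_cast<Node*>(p);
        n->next = free_list_;
        free_list_ = n;
    }

    std::size_t refills() const { return refills_; }

private:
    struct Node { Node* next; };  // stored inside a free block (fits in kBlockSize)

    void refill() {
        ++refills_;
        for (std::size_t i = 0; i < kBatch; ++i) {
            Node* n = static_cast<Node*>(std::malloc(kBlockSize));
            if (!n) throw std::bad_alloc();
            n->next = free_list_;
            free_list_ = n;
        }
    }

    Node* free_list_ = nullptr;
    std::size_t refills_ = 0;
};

// Each thread lazily gets its own cache.
inline ThreadCache& local_cache() {
    thread_local ThreadCache cache;
    return cache;
}

// Runs on a fresh thread: allocates and frees `live_blocks` blocks twice and
// reports how often that thread had to fall back to the shared allocator.
std::size_t count_refills(std::size_t live_blocks) {
    std::size_t refills = 0;
    std::thread worker([&refills, live_blocks] {
        ThreadCache& cache = local_cache();
        std::vector<void*> held;
        for (int pass = 0; pass < 2; ++pass) {
            for (std::size_t i = 0; i < live_blocks; ++i) held.push_back(cache.allocate());
            for (void* p : held) cache.deallocate(p);
            held.clear();
        }
        refills = cache.refills();
    });
    worker.join();
    return refills;
}
```

Calling `count_refills(64)` from `main` touches the shared allocator only on the first pass; the second pass is served entirely from the thread's own list. Real allocators add multiple size classes and cap how much each thread may hoard, which this sketch leaves out.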

Dynamic memory allocation uses the heap of the application/module/process (not of the thread). The heap can only handle one allocation request at a time. If you try to allocate memory in "parallel" threads, the heap will handle the requests one after another. You will not get behaviour like: one thread is waiting to get its memory while another can ask for some, while a third one is getting some. The threads will have to line up in a queue to get their chunk of memory.

What you would need is a pool of heaps: use whichever heap is not busy at the moment to allocate the memory. But then you have to make sure, throughout the lifetime of the allocated object, that it is never deallocated on a different heap (that would cause a crash).
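
A rough sketch of such a pool, under simplifying assumptions: `HeapPool` is a hypothetical class, every "heap" is just a mutex-protected free list of fixed-size blocks, and each block carries a small header naming its owning heap so that it is always returned to the heap it came from, whichever thread frees it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>
#include <thread>
#include <vector>

class HeapPool {
public:
    static constexpr std::size_t kHeaps = 4;            // assumed pool size
    static constexpr std::size_t kBlockSize = 64;       // payload bytes per block
    static constexpr std::size_t kBlocksPerHeap = 1024; // capacity of each heap

private:
    struct Block {
        std::size_t owner;  // index of the owning heap, written once
        Block* next;        // free-list link
        alignas(std::max_align_t) unsigned char payload[kBlockSize];
    };
    struct Heap {
        std::mutex mutex;
        std::vector<Block> storage;
        Block* free_list = nullptr;
        std::size_t live = 0;
    };

public:
    HeapPool() {
        for (std::size_t h = 0; h < kHeaps; ++h) {
            heaps_[h].storage.resize(kBlocksPerHeap);
            for (Block& b : heaps_[h].storage) {
                b.owner = h;
                b.next = heaps_[h].free_list;
                heaps_[h].free_list = &b;
            }
        }
    }

    void* allocate() {
        // First pass: take any heap whose lock is free right now.
        for (Heap& heap : heaps_) {
            std::unique_lock<std::mutex> lock(heap.mutex, std::try_to_lock);
            if (lock.owns_lock() && heap.free_list) return pop(heap);
        }
        // Every heap was busy or empty: wait for each one in turn.
        for (Heap& heap : heaps_) {
            std::lock_guard<std::mutex> lock(heap.mutex);
            if (heap.free_list) return pop(heap);
        }
        throw std::bad_alloc();
    }

    void deallocate(void* p) {
        Block* b = block_of(p);
        Heap& heap = heaps_[b->owner];  // always return to the owning heap
        std::lock_guard<std::mutex> lock(heap.mutex);
        b->next = heap.free_list;
        heap.free_list = b;
        --heap.live;
    }

    static std::size_t owner_of(void* p) { return block_of(p)->owner; }

    std::size_t total_live() {
        std::size_t total = 0;
        for (Heap& heap : heaps_) {
            std::lock_guard<std::mutex> lock(heap.mutex);
            total += heap.live;
        }
        return total;
    }

private:
    // Caller must hold heap.mutex and heap.free_list must be non-null.
    static void* pop(Heap& heap) {
        Block* b = heap.free_list;
        heap.free_list = b->next;
        ++heap.live;
        return b->payload;
    }

    // Recover the block header from a payload pointer handed out by allocate().
    static Block* block_of(void* p) {
        return reinterpret_cast<Block*>(static_cast<unsigned char*>(p) -
                                        offsetof(Block, payload));
    }

    std::array<Heap, kHeaps> heaps_;
};
```

Storing the owner in a header is one way to avoid the cross-heap deallocation problem: a block allocated by one thread can be passed to `deallocate` on any other thread and still ends up in the right heap.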

I know that the Win32 API has functions such as GetProcessHeap(), HeapCreate(), HeapAlloc() and HeapFree() that allow you to create a new heap and allocate/deallocate memory from a specific heap HANDLE. I don't know of an equivalent in other operating systems (I have looked for one, but to no avail).

You should, of course, try to avoid frequent dynamic allocations. But if you can't, you might consider (for portability) creating your own "heap" class (it doesn't have to be a heap per se, just a very efficient allocator) that manages a large chunk of memory, together with a smart pointer class that holds a reference to the heap its memory came from. This would enable you to use multiple heaps (make sure they are thread-safe).
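
One possible shape for the "smart pointer that remembers its heap" part, as a sketch only: `Heap`, `HeapDeleter`, `HeapPtr` and `make_on_heap` are hypothetical names, and the `Heap` here is a stand-in that simply forwards to `::operator new` and counts outstanding blocks instead of managing its own chunk of memory. The smart pointer is a plain `std::unique_ptr` with a custom deleter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Stand-in heap: forwards to the global allocator and tracks outstanding blocks.
// A real one would manage its own memory; only the interface matters here.
class Heap {
public:
    void* allocate(std::size_t bytes) {
        void* p = ::operator new(bytes);
        outstanding_.fetch_add(1);
        return p;
    }
    void deallocate(void* p) noexcept {
        ::operator delete(p);
        outstanding_.fetch_sub(1);
    }
    std::size_t outstanding() const { return outstanding_.load(); }

private:
    std::atomic<std::size_t> outstanding_{0};
};

// Deleter that carries a pointer to the heap the object was allocated from.
template <typename T>
struct HeapDeleter {
    Heap* heap;
    void operator()(T* p) const noexcept {
        p->~T();              // destroy the object...
        heap->deallocate(p);  // ...and return its memory to the right heap
    }
};

template <typename T>
using HeapPtr = std::unique_ptr<T, HeapDeleter<T>>;

// Constructs a T in memory taken from `heap`. Assumes T needs no more than
// the default alignment of ::operator new.
template <typename T, typename... Args>
HeapPtr<T> make_on_heap(Heap& heap, Args&&... args) {
    void* raw = heap.allocate(sizeof(T));
    try {
        T* obj = ::new (raw) T(std::forward<Args>(args)...);
        return HeapPtr<T>(obj, HeapDeleter<T>{&heap});
    } catch (...) {
        heap.deallocate(raw);  // constructor threw: give the memory back
        throw;
    }
}
```

Because the deleter travels with the pointer, moving a `HeapPtr` to another thread or container does not lose track of the owning heap. The heap must outlive every pointer created from it.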

There are 2 scalable drop-in replacements for malloc that I know of:

Google's tcmalloc
Facebook's jemalloc (link to a performance study comparing to tcmalloc)

I don't have any experience with Hoard (which performed poorly in the study), but Emery Berger lurks on this site and was astonished by the results. He said he would have a look and I surmise there might have been some specifics to either the test or implementation that "trapped" Hoard as the general feedback is usually good.

One word of caution with jemalloc: it can waste a bit of space when you rapidly create and then discard threads (as it creates a new pool for each thread you allocate from). If your threads are stable, this should not be an issue.
