Making use of all available RAM in a Haskell program?


Currently, GHC on Windows is 32-bit only; I think a 64-bit GHC for Windows is supposed to become available with 7.6.

One consequence is that on Windows you can't use more than 4G minus one block of memory, since the maximum allowed value for a size parameter is HS_WORD_MAX:

decodeSize(rts_argv[arg], 2, BLOCK_SIZE, HS_WORD_MAX) / BLOCK_SIZE;

With 32-bit Words, HS_WORD_MAX = 2^32-1.

That explains why running ./mem.exe 42000000 +RTS -s -M4G errors out with "-M4G: size outside allowed range": decodeSize() decodes 4G as 2^32, which is one more than HS_WORD_MAX.
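The range check can be illustrated with a small sketch. This is not the RTS code (decodeSize is C); sizeInBytes and withinWordMax are hypothetical helpers that model only the upper-case K/M/G suffixes, and the 4096-byte block size is an assumption not stated above.

```haskell
import Data.Word (Word32)

-- HS_WORD_MAX for a 32-bit RTS, widened to Integer so the comparison
-- below cannot overflow: 2^32 - 1.
hsWordMax32 :: Integer
hsWordMax32 = fromIntegral (maxBound :: Word32)

-- ASSUMPTION: 4096-byte blocks (not stated in the text above).
blockSize :: Integer
blockSize = 4096

-- Hypothetical, simplified stand-in for decodeSize: upper-case K/M/G only.
sizeInBytes :: Integer -> Char -> Integer
sizeInBytes n 'K' = n * 1024
sizeInBytes n 'M' = n * 1024 * 1024
sizeInBytes n 'G' = n * 1024 * 1024 * 1024
sizeInBytes n _   = n

-- Upper half of the range check: the size must not exceed HS_WORD_MAX.
withinWordMax :: Integer -> Char -> Bool
withinWordMax n suffix = sizeInBytes n suffix <= hsWordMax32

-- Largest heap expressible in whole blocks, in bytes.
maxHeapBytes :: Integer
maxHeapBytes = (hsWordMax32 `div` blockSize) * blockSize
```

In GHCi, withinWordMax 4 'G' evaluates to False while withinWordMax 4095 'M' evaluates to True, and under the block-size assumption maxHeapBytes comes out one block short of 4G.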

This limitation will remain even after you upgrade your GHC, until a 64-bit GHC for Windows is finally released.

For a 32-bit process, the user-mode virtual address space is limited to 2 or 4 GB (depending on whether the IMAGE_FILE_LARGE_ADDRESS_AWARE flag is set), cf. Memory Limits for Windows Releases.

Now, you are trying to construct a Set containing 42 million 4-byte Ints. A Data.Set.Set has five words of overhead per element (constructor, size, left and right subtree pointer, pointer to element), so with the 4-byte Int that is 24 bytes per element, and the Set will take up about 0.94 GiB of memory (1.008 'metric' GB). But the process uses about twice that or more, because the garbage collector needs extra space at least the size of the live heap.
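The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch: the function names are made up for illustration, and counting the Int as a single word follows the text's accounting (a boxed Int may also carry a header word, which would raise the figure).

```haskell
-- Words per Set element as counted in the text: five words of overhead
-- plus one word for the Int itself (ASSUMPTION: the Int costs one word).
wordsPerElement :: Integer
wordsPerElement = 5 + 1

-- Estimated size of the Set in bytes for a given word size and element count.
setBytes :: Integer -> Integer -> Integer
setBytes wordSize n = n * wordsPerElement * wordSize

-- Convert bytes to GiB (2^30 bytes).
toGiB :: Integer -> Double
toGiB b = fromIntegral b / fromIntegral (1024 * 1024 * 1024 :: Integer)

-- Bytes per element implied by a measured maximum residency.
bytesPerElement :: Integer -> Integer -> Double
bytesPerElement residency n = fromIntegral residency / fromIntegral n
```

For example, setBytes 4 42000000 gives 1,008,000,000 bytes, which toGiB turns into roughly 0.94. Applying bytesPerElement to the 64-bit run further down (1,157,426,280 bytes of residency for 21,000,000 elements) gives about 55 bytes per element, a little under seven 8-byte words, so the six-word estimate is probably somewhat low.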

Running the programme on my 64-bit Linux, with input 21000000 (to make up for the Ints and pointers being twice as large), I get

$ ./mem +RTS -s -RTS 21000000
min: 0
max: 21000000
  31,330,814,200 bytes allocated in the heap
   4,708,535,032 bytes copied during GC
   1,157,426,280 bytes maximum residency (12 sample(s))
      13,669,312 bytes maximum slop
            2261 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     59971 colls,     0 par    2.73s    2.73s     0.0000s    0.0003s
  Gen  1        12 colls,     0 par    3.31s   10.38s     0.8654s    8.8131s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   12.12s  ( 13.33s elapsed)
  GC      time    6.03s  ( 13.12s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   18.15s  ( 26.45s elapsed)

  %GC     time      33.2%  (49.6% elapsed)

  Alloc rate    2,584,429,494 bytes per MUT second

  Productivity  66.8% of total user, 45.8% of total elapsed

but top reports only 1.1g of memory use: top, and presumably the Task Manager, reports only the live heap.

So it seems IMAGE_FILE_LARGE_ADDRESS_AWARE is not set and your process is limited to an address space of 2GB. The Set of 42 million elements needs more than that, unless you specify a maximum or suggested heap size that is smaller:

$ ./mem +RTS -s -M1800M -RTS 21000000
min: 0
max: 21000000
  31,330,814,200 bytes allocated in the heap
   3,551,201,872 bytes copied during GC
   1,157,426,280 bytes maximum residency (12 sample(s))
      13,669,312 bytes maximum slop
            1154 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     59971 colls,     0 par    2.70s    2.70s     0.0000s    0.0002s
  Gen  1        12 colls,     0 par    4.23s    4.85s     0.4043s    3.3144s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   11.99s  ( 12.00s elapsed)
  GC      time    6.93s  (  7.55s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   18.93s  ( 19.56s elapsed)

  %GC     time      36.6%  (38.6% elapsed)

  Alloc rate    2,611,793,025 bytes per MUT second

  Productivity  63.4% of total user, 61.3% of total elapsed

Setting the maximal heap size below what the programme would use naturally actually lets it fit in hardly more than the space needed for the Set, at the price of a slightly longer GC time. Suggesting a heap size with -H1800M instead lets it finish using only

1831 MB total memory in use (0 MB lost due to fragmentation)

So if you specify a maximal heap size below 2GB (but large enough for the Set to fit), it should work.
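To make the effect of the heap limit concrete, the two Linux runs above can be compared by dividing total memory in use by maximum residency. A small sketch, assuming the "MB" in the -s summary means 2^20 bytes; the names are made up for illustration and the figures are copied from the output above.

```haskell
-- Convert the "MB" figure of the -s summary to bytes.
-- ASSUMPTION: MB here means 2^20 bytes.
mbToBytes :: Integer -> Integer
mbToBytes mb = mb * 1024 * 1024

-- Total memory in use divided by maximum residency.
overhead :: Integer -> Integer -> Double
overhead totalMB residency =
  fromIntegral (mbToBytes totalMB) / fromIntegral residency

-- Figures copied from the two Linux runs above.
defaultRun, cappedRun :: Double
defaultRun = overhead 2261 1157426280   -- no heap limit
cappedRun  = overhead 1154 1157426280   -- with -M1800M
```

Without a limit the process holds a little over twice the live data; with -M1800M the ratio drops to just over one, which is why a cap below 2GB can let the programme fit.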


The default heap size is unlimited.

Using GHC 7.2 on a 64-bit Windows XP machine, I can allocate more by explicitly setting a larger heap size:

$ ./A 42000000  +RTS -s -H1.6G
min: 0
max: 42000000
  32,590,763,756 bytes allocated in the heap
   3,347,044,008 bytes copied during GC
     714,186,476 bytes maximum residency (4 sample(s))
       3,285,676 bytes maximum slop
            1651 MB total memory in use (0 MB lost due to fragmentation)

and

$ ./A 42000000  +RTS -s -H1.7G
min: 0
max: 42000000
  32,590,763,756 bytes allocated in the heap
   3,399,477,240 bytes copied during GC
     757,603,572 bytes maximum residency (4 sample(s))
       3,281,580 bytes maximum slop
            1754 MB total memory in use (0 MB lost due to fragmentation)

and even:

$ ./A 42000000  +RTS -s -H1.85G
min: 0
max: 42000000
  32,590,763,784 bytes allocated in the heap
   3,492,115,128 bytes copied during GC
     821,240,344 bytes maximum residency (4 sample(s))
       3,285,676 bytes maximum slop
            1909 MB total memory in use (0 MB lost due to fragmentation)

That is, I can allocate up to the Windows XP 2G process limit. I imagine Win 7 won't have such a low limit (this table suggests either 4G or 192G), so just ask for as much as you need (and use a more recent GHC).