Linux/perl mmap performance


Ok, found the problem. As suspected, neither Linux nor Perl was to blame. To open and access the file I do something like this:

    #!/usr/bin/perl
    # Create 1 GB file if you do not have one:
    # dd if=/dev/urandom of=test.bin bs=1048576 count=1000
    use strict; use warnings;
    use Sys::Mmap;

    open (my $fh, "<test.bin")
        || die "open: $!";
    my $t = time;
    print STDERR "mmapping.. ";
    mmap (my $mh, 0, PROT_READ, MAP_SHARED, $fh)
        || die "mmap: $!";
    my $str = unpack ("A1024", substr ($mh, 0, 1024));
    print STDERR " ", time-$t, " seconds\nsleeping..";
    sleep (60*60);

If you test that code, there are no delays like those I found in my original code, and after creating the minimal sample (always do that, right!) the reason suddenly became obvious.

The error was that in my code I treated the $mh scalar as a handle, something lightweight that can be moved around easily (read: passed by value). It turns out it's actually a GB-long string, definitely not something you want to move around without creating an explicit reference (Perl lingo for a "pointer"/handle value). So if you need to store it in a hash or similar, make sure you store \$mh, and deref it when you need to use it, like ${$hash->{mh}}, typically as the first parameter in a substr or similar.
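
A minimal sketch of that pattern. To keep it runnable without Sys::Mmap or a test file, a plain 1 MB string stands in for the mapped scalar (an assumption for illustration only), and read_chunk is a hypothetical helper name, not part of any module:

```perl
use strict;
use warnings;

# Hypothetical helper: read $length bytes at $offset through a stored reference.
sub read_chunk {
    my ($state, $offset, $length) = @_;
    # Dereference in place so substr works on the original buffer.
    return substr(${ $state->{mh} }, $offset, $length);
}

# Stand-in for the mmap'ed scalar. In the real program $mh would be
# filled in by Sys::Mmap's mmap() as in the script above.
my $mh = 'x' x (1024 * 1024);
substr($mh, 0, 5) = 'hello';

# Store a reference to the big scalar, not the scalar itself.
my %state = (mh => \$mh);

my $chunk = read_chunk(\%state, 0, 5);
print "first bytes: $chunk\n";
```

Writing `mh => $mh` instead would assign the string's contents into the hash, which is the expensive copy described above.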


If you have a relatively recent version of Perl, you shouldn't be using Sys::Mmap. You should be using PerlIO's mmap layer.
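
For illustration, a small sketch of what using the :mmap layer could look like. The read_at helper and the temporary file are assumptions made up for this example; only the '<:mmap' open mode comes from PerlIO itself. Note that you get an ordinary filehandle back rather than a giant scalar, so you read with seek/read instead of substr:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read $length bytes at $offset via the :mmap layer.
sub read_at {
    my ($path, $offset, $length) = @_;
    open(my $fh, '<:mmap', $path) or die "open $path: $!";
    seek($fh, $offset, 0)         or die "seek: $!";
    my $got = read($fh, my $buf, $length);
    defined $got                  or die "read: $!";
    close($fh);
    return $buf;
}

# Small temporary file standing in for the real data file.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} '0123456789' x 100;
close($tmp) or die "close: $!";

print read_at($path, 10, 5), "\n";
```

According to the PerlIO documentation, files that cannot be mmap()ed fall back to the behaviour of the :perlio layer, so the same code should still work there, just without the mapping.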

Can you post the code you are using?


On 32-bit systems the address space for mmap()s is rather limited (and varies from OS to OS). Be aware of that if you're using multi-gigabyte files and you are only testing on a 64-bit system. (I would have preferred to write this in a comment but I don't have enough reputation points yet.)
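
A rough sketch of a sanity check along those lines. The helper name may_fit_in_address_space is made up for this example, and the test is only a necessary condition: the real usable limit is lower and OS-dependent, so treat a "may fit" as no guarantee.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: can $size bytes even theoretically be addressed
# with pointers of $ptrsize bytes? (Necessary, not sufficient.)
sub may_fit_in_address_space {
    my ($size, $ptrsize) = @_;
    $ptrsize //= $Config{ptrsize};   # 4 on 32-bit builds, 8 on 64-bit builds
    return $size < 2 ** (8 * $ptrsize);
}

my $bits = 8 * $Config{ptrsize};
print "this perl uses ${bits}-bit pointers\n";

for my $gb (1, 8) {
    my $size = $gb * 1024 ** 3;
    printf "%d GB with 32-bit pointers: %s\n", $gb,
        may_fit_in_address_space($size, 4) ? "may fit" : "cannot fit";
}
```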