
Any faster way of copying arrays in C#?


Use Buffer.BlockCopy. Its entire purpose is to perform fast copies (see the documentation for Buffer):

This class provides better performance for manipulating primitive types than similar methods in the System.Array class.

Admittedly, I haven't done any benchmarks, but that's what the documentation says. It also works on multidimensional arrays; just make sure that you always specify how many bytes to copy, not how many elements, and that you're working with an array of primitive types.
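To illustrate the byte-count point, here is a minimal sketch. The class and method names (BlockCopyExample, CopyInts, CopyGrid) are made up for illustration; only Buffer.BlockCopy and Buffer.ByteLength are real framework calls.

```csharp
using System;

static class BlockCopyExample
{
    // Hypothetical helper: copies a whole int[].
    // The last argument is a byte count, so multiply the element count by sizeof(int).
    public static int[] CopyInts(int[] source)
    {
        int[] target = new int[source.Length];
        Buffer.BlockCopy(source, 0, target, 0, source.Length * sizeof(int));
        return target;
    }

    // Hypothetical helper: copies a rectangular double[,].
    // Buffer.ByteLength returns the total size of the array in bytes.
    public static double[,] CopyGrid(double[,] source)
    {
        double[,] target = new double[source.GetLength(0), source.GetLength(1)];
        Buffer.BlockCopy(source, 0, target, 0, Buffer.ByteLength(source));
        return target;
    }
}
```

Passing source.Length instead of the byte count would copy only a quarter of an int[], which is the mistake the warning above is about.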

Also, I have not tested this, but you might be able to squeeze a bit more performance out of the system if you bind a delegate to System.Buffer.memcpyimpl and call that directly. The signature is:

internal static unsafe void memcpyimpl(byte* src, byte* dest, int len)

It does require pointers, but I believe it's optimized for the highest speed possible, and so I don't think there's any way to get faster than that, even if you had assembly at hand.


Update:

Due to requests (and to satisfy my curiosity), I tested this:

using System;
using System.Diagnostics;
using System.Reflection;

unsafe delegate void MemCpyImpl(byte* src, byte* dest, int len);

static class Temp
{
    //There really should be a generic CreateDelegate<T>() method... -___-
    static MemCpyImpl memcpyimpl = (MemCpyImpl)Delegate.CreateDelegate(
        typeof(MemCpyImpl), typeof(Buffer).GetMethod("memcpyimpl",
            BindingFlags.Static | BindingFlags.NonPublic));

    const int COUNT = 32, SIZE = 32 << 20;

    //Use different buffers to help avoid CPU cache effects
    static byte[]
        aSource = new byte[SIZE], aTarget = new byte[SIZE],
        bSource = new byte[SIZE], bTarget = new byte[SIZE],
        cSource = new byte[SIZE], cTarget = new byte[SIZE];

    static unsafe void TestUnsafe()
    {
        Stopwatch sw = Stopwatch.StartNew();
        fixed (byte* pSrc = aSource)
        fixed (byte* pDest = aTarget)
            for (int i = 0; i < COUNT; i++)
                memcpyimpl(pSrc, pDest, SIZE);
        sw.Stop();
        Console.WriteLine("Buffer.memcpyimpl: {0:N0} ticks", sw.ElapsedTicks);
    }

    static void TestBlockCopy()
    {
        Stopwatch sw = Stopwatch.StartNew();
        sw.Start();
        for (int i = 0; i < COUNT; i++)
            Buffer.BlockCopy(bSource, 0, bTarget, 0, SIZE);
        sw.Stop();
        Console.WriteLine("Buffer.BlockCopy: {0:N0} ticks",
            sw.ElapsedTicks);
    }

    static void TestArrayCopy()
    {
        Stopwatch sw = Stopwatch.StartNew();
        sw.Start();
        for (int i = 0; i < COUNT; i++)
            Array.Copy(cSource, 0, cTarget, 0, SIZE);
        sw.Stop();
        Console.WriteLine("Array.Copy: {0:N0} ticks", sw.ElapsedTicks);
    }

    static void Main(string[] args)
    {
        for (int i = 0; i < 10; i++)
        {
            TestArrayCopy();
            TestBlockCopy();
            TestUnsafe();
            Console.WriteLine();
        }
    }
}

The results:

Buffer.BlockCopy: 469,151 ticks
Array.Copy: 469,972 ticks
Buffer.memcpyimpl: 496,541 ticks

Buffer.BlockCopy: 421,011 ticks
Array.Copy: 430,694 ticks
Buffer.memcpyimpl: 410,933 ticks

Buffer.BlockCopy: 425,112 ticks
Array.Copy: 420,839 ticks
Buffer.memcpyimpl: 411,520 ticks

Buffer.BlockCopy: 424,329 ticks
Array.Copy: 420,288 ticks
Buffer.memcpyimpl: 405,598 ticks

Buffer.BlockCopy: 422,410 ticks
Array.Copy: 427,826 ticks
Buffer.memcpyimpl: 414,394 ticks

Now change the order:

Array.Copy: 419,750 ticks
Buffer.memcpyimpl: 408,919 ticks
Buffer.BlockCopy: 419,774 ticks

Array.Copy: 430,529 ticks
Buffer.memcpyimpl: 412,148 ticks
Buffer.BlockCopy: 424,900 ticks

Array.Copy: 424,706 ticks
Buffer.memcpyimpl: 427,861 ticks
Buffer.BlockCopy: 421,929 ticks

Array.Copy: 420,556 ticks
Buffer.memcpyimpl: 421,541 ticks
Buffer.BlockCopy: 436,430 ticks

Array.Copy: 435,297 ticks
Buffer.memcpyimpl: 432,505 ticks
Buffer.BlockCopy: 441,493 ticks

Now change the order again:

Buffer.memcpyimpl: 430,874 ticks
Buffer.BlockCopy: 429,730 ticks
Array.Copy: 432,746 ticks

Buffer.memcpyimpl: 415,943 ticks
Buffer.BlockCopy: 423,809 ticks
Array.Copy: 428,703 ticks

Buffer.memcpyimpl: 421,270 ticks
Buffer.BlockCopy: 428,262 ticks
Array.Copy: 434,940 ticks

Buffer.memcpyimpl: 423,506 ticks
Buffer.BlockCopy: 427,220 ticks
Array.Copy: 431,606 ticks

Buffer.memcpyimpl: 422,900 ticks
Buffer.BlockCopy: 439,280 ticks
Array.Copy: 432,649 ticks

In other words, they're very competitive: the three methods land within a few percent of one another in every run. As a general rule, memcpyimpl is fastest, but the difference is not necessarily worth worrying about.


You can use Array.Copy.

EDIT

Array.Copy does work for multidimensional arrays: see this topic.
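A small sketch of the multidimensional case, assuming source and target have the same rank and element type. The class and method names (ArrayCopyExample, CopyGrid) are invented for this example; Array.Copy itself is the real API.

```csharp
using System;

static class ArrayCopyExample
{
    // Hypothetical helper: copies every element of a rectangular int[,].
    public static int[,] CopyGrid(int[,] source)
    {
        int[,] target = new int[source.GetLength(0), source.GetLength(1)];
        // For a multidimensional array, Length is the total element count
        // across all dimensions, and Array.Copy counts elements, not bytes.
        Array.Copy(source, target, source.Length);
        return target;
    }
}
```

Unlike Buffer.BlockCopy, the length here is in elements, so no sizeof arithmetic is needed.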


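If running on .NET Core, you may consider using source.AsSpan().CopyTo(destination). Beware on Mono and the full .NET Framework (Clr), though: in the benchmark below, the span copy is the slowest option on both.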

          Method |  Job | Runtime |      Mean |     Error |    StdDev | Ratio | RatioSD |
---------------- |----- |-------- |----------:|----------:|----------:|------:|--------:|
       ArrayCopy |  Clr |     Clr |  60.08 ns | 0.8231 ns | 0.7699 ns |  1.00 |    0.00 |
        SpanCopy |  Clr |     Clr |  99.31 ns | 0.4895 ns | 0.4339 ns |  1.65 |    0.02 |
 BufferBlockCopy |  Clr |     Clr |  61.34 ns | 0.5963 ns | 0.5578 ns |  1.02 |    0.01 |
                 |      |         |           |           |           |       |         |
       ArrayCopy | Core |    Core |  63.33 ns | 0.6843 ns | 0.6066 ns |  1.00 |    0.00 |
        SpanCopy | Core |    Core |  47.41 ns | 0.5399 ns | 0.5050 ns |  0.75 |    0.01 |
 BufferBlockCopy | Core |    Core |  59.89 ns | 0.4713 ns | 0.3936 ns |  0.94 |    0.01 |
                 |      |         |           |           |           |       |         |
       ArrayCopy | Mono |    Mono | 149.82 ns | 1.6466 ns | 1.4596 ns |  1.00 |    0.00 |
        SpanCopy | Mono |    Mono | 347.87 ns | 2.0589 ns | 1.9259 ns |  2.32 |    0.02 |
 BufferBlockCopy | Mono |    Mono |  61.52 ns | 1.1691 ns | 1.0364 ns |  0.41 |    0.01 |
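For reference, the span-based copy from this answer could look roughly like the sketch below. The class and method names (SpanCopyExample, CopyInts, CopySlice) are hypothetical; AsSpan and Span<T>.CopyTo are the real calls, and on the .NET Framework they are assumed to come from the System.Memory package.

```csharp
using System;

static class SpanCopyExample
{
    // Hypothetical helper: copies a whole int[] through a span.
    public static int[] CopyInts(int[] source)
    {
        int[] destination = new int[source.Length];
        // The int[] destination converts implicitly to Span<int>.
        source.AsSpan().CopyTo(destination);
        return destination;
    }

    // Hypothetical helper: copies `length` elements starting at index `start`.
    public static int[] CopySlice(int[] source, int start, int length)
    {
        int[] destination = new int[length];
        source.AsSpan(start, length).CopyTo(destination);
        return destination;
    }
}
```

CopyTo throws an ArgumentException if the destination is shorter than the source span, so size the destination first.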