How to split an array into chunks of specific size? [closed]
Array.Copy has been around since .NET 1.1 and does an excellent job of chunking arrays.
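    string[] buffer;
    for (int i = 0; i < source.Length; i += 100)
    {
        // The last chunk may hold fewer than 100 items.
        int length = Math.Min(100, source.Length - i);
        buffer = new string[length];
        Array.Copy(source, i, buffer, 0, length);
        // process array
    }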
And to make an extension for it:
    public static class Extensions
    {
        public static T[] Slice<T>(this T[] source, int index, int length)
        {
            T[] slice = new T[length];
            Array.Copy(source, index, slice, 0, length);
            return slice;
        }
    }
And to use the extension:
    string[] source = new string[] { /* 1200 items here */ };

    // get the first 100
    string[] slice = source.Slice(0, 100);
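To chunk a whole array rather than take a single slice, the same Array.Copy idea can be wrapped in an iterator. This is only a sketch: the name ChunkBy and the ChunkExtensions class are made up for illustration, and it assumes the last chunk should simply be shorter when the length is not a multiple of the chunk size.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkExtensions
{
    // Hypothetical helper (not part of the original answer): yields consecutive
    // chunks of at most chunkSize elements, copying each one with Array.Copy.
    public static IEnumerable<T[]> ChunkBy<T>(this T[] source, int chunkSize)
    {
        // Validate eagerly, before the lazy iterator starts.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        return ChunkIterator(source, chunkSize);
    }

    private static IEnumerable<T[]> ChunkIterator<T>(T[] source, int chunkSize)
    {
        for (int i = 0; i < source.Length; i += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            int length = Math.Min(chunkSize, source.Length - i);
            T[] chunk = new T[length];
            Array.Copy(source, i, chunk, 0, length);
            yield return chunk;
        }
    }
}
```

For a 250-element array and a chunk size of 100, iterating `source.ChunkBy(100)` gives arrays of 100, 100 and 50 elements.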
Update: I think you might be looking for ArraySegment<T>.
No performance checks are needed here, because it simply uses the original array as its source and maintains Offset and Count properties to delimit the 'segment'. Unfortunately, there isn't a way to retrieve just the segment as an array, so some people have written wrappers for it, like here: ArraySegment - Returning the actual segment C#
    ArraySegment<string> segment;
    for (int i = 0; i < source.Length; i += 100)
    {
        // Clamp the count so the last segment stays within the array.
        segment = new ArraySegment<string>(source, i, Math.Min(100, source.Length - i));

        // and to loop through the segment
        for (int s = segment.Offset; s < segment.Offset + segment.Count; s++)
        {
            Console.WriteLine(segment.Array[s]);
        }
    }
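A wrapper of the kind mentioned above could look roughly like this. It is a sketch, not the code from the linked question: the name CopyToArray is hypothetical, chosen so it does not collide with anything a newer framework version may already provide for ArraySegment.

```csharp
using System;

public static class ArraySegmentExtensions
{
    // Hypothetical helper: copies only the elements covered by the segment
    // (Offset .. Offset + Count - 1) into a new array.
    public static T[] CopyToArray<T>(this ArraySegment<T> segment)
    {
        // default(ArraySegment<T>) has a null Array and a Count of 0.
        if (segment.Array == null) return new T[0];

        T[] result = new T[segment.Count];
        Array.Copy(segment.Array, segment.Offset, result, 0, segment.Count);
        return result;
    }
}
```

Note that this gives up the main benefit of ArraySegment, since it copies again; it only makes sense where an API insists on a real array.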
Performance of Array.Copy vs Skip/Take vs LINQ
Test method (in Release mode):
    static void Main(string[] args)
    {
        string[] source = new string[1000000];
        for (int i = 0; i < source.Length; i++)
        {
            source[i] = "string " + i.ToString();
        }

        string[] buffer;

        Console.WriteLine("Starting stop watch");

        Stopwatch sw = new Stopwatch();

        for (int n = 0; n < 5; n++)
        {
            sw.Reset();
            sw.Start();
            for (int i = 0; i < source.Length; i += 100)
            {
                buffer = new string[100];
                Array.Copy(source, i, buffer, 0, 100);
            }
            sw.Stop();
            Console.WriteLine("Array.Copy: " + sw.ElapsedMilliseconds.ToString());

            sw.Reset();
            sw.Start();
            for (int i = 0; i < source.Length; i += 100)
            {
                buffer = new string[100];
                buffer = source.Skip(i).Take(100).ToArray();
            }
            sw.Stop();
            Console.WriteLine("Skip/Take: " + sw.ElapsedMilliseconds.ToString());

            sw.Reset();
            sw.Start();
            String[][] chunks = source
                .Select((s, i) => new { Value = s, Index = i })
                .GroupBy(x => x.Index / 100)
                .Select(grp => grp.Select(x => x.Value).ToArray())
                .ToArray();
            sw.Stop();
            Console.WriteLine("LINQ: " + sw.ElapsedMilliseconds.ToString());
        }

        Console.ReadLine();
    }
Results (in milliseconds):
    Array.Copy: 15
    Skip/Take: 42464
    LINQ: 881

    Array.Copy: 21
    Skip/Take: 42284
    LINQ: 585

    Array.Copy: 11
    Skip/Take: 43223
    LINQ: 760

    Array.Copy: 9
    Skip/Take: 42842
    LINQ: 525

    Array.Copy: 24
    Skip/Take: 43134
    LINQ: 638
You can use LINQ to group all items by the chunk size and create new arrays afterwards.
    // build sample data with 1200 Strings
    string[] items = Enumerable.Range(1, 1200).Select(i => "Item" + i).ToArray();

    // split on groups with each 100 items
    String[][] chunks = items
        .Select((s, i) => new { Value = s, Index = i })
        .GroupBy(x => x.Index / 100)
        .Select(grp => grp.Select(x => x.Value).ToArray())
        .ToArray();

    for (int i = 0; i < chunks.Length; i++)
    {
        foreach (var item in chunks[i])
            Console.WriteLine("chunk:{0} {1}", i, item);
    }
Note that it's not necessary to create new arrays (which costs CPU cycles and memory). You could also work with an IEnumerable<IEnumerable<String>> by omitting the two ToArray calls.
Here's the running code: http://ideone.com/K7Hn2
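The variant without the two ToArray calls, as described in the note above, might be written like this. It is a sketch only: the LazyChunkDemo class, the LazyChunks method and the chunkSize parameter are names introduced here for illustration and do not appear in the linked code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyChunkDemo
{
    // Same query as in the answer, minus the two ToArray calls, so the result
    // is a deferred IEnumerable<IEnumerable<string>> instead of a string[][].
    public static IEnumerable<IEnumerable<string>> LazyChunks(
        IEnumerable<string> items, int chunkSize)
    {
        return items
            .Select((s, i) => new { Value = s, Index = i })
            .GroupBy(x => x.Index / chunkSize)   // integer division: items 0..chunkSize-1 land in group 0
            .Select(grp => grp.Select(x => x.Value));
    }
}
```

Nothing runs until the result is enumerated. GroupBy still buffers its groups internally once enumeration starts, so this saves the array copies rather than all of the allocation.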