Is there a performance impact when calling ToList()?


IEnumerable.ToList()

Yes, IEnumerable<T>.ToList() does have a performance impact: it is an O(n) operation, though it will likely only require attention in performance-critical code.

The ToList() operation uses the List<T>(IEnumerable<T> collection) constructor. This constructor must make a copy of the array (more generally, of the IEnumerable<T>); otherwise, future modifications of the list would also change the source T[], which generally wouldn't be desirable.

I would like to reiterate that this will only make a difference with a huge list; copying chunks of memory is quite a fast operation.
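To make the copy semantics concrete, here is a minimal sketch. The class and method names (ToListCopyDemo, ListIsIndependentCopy) are made up for illustration; only ToList() itself comes from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToListCopyDemo
{
    // Hypothetical helper: returns true if modifying the list
    // left the source array untouched.
    public static bool ListIsIndependentCopy()
    {
        int[] source = { 1, 2, 3 };

        // ToList() builds a new List<int> with its own storage.
        List<int> copy = source.ToList();

        copy[0] = 42; // changes the list only
        copy.Add(4);  // grows the list only

        return source[0] == 1 && source.Length == 3;
    }

    public static void Main()
    {
        Console.WriteLine(ListIsIndependentCopy()); // prints "True"
    }
}
```

Because the list owns its own storage, the O(n) copy is the price paid for being able to modify it safely.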

Handy tip: As vs To

You'll notice that in LINQ there are several methods that start with As (such as AsEnumerable()) and To (such as ToList()). The methods that start with To require a conversion like the one above (i.e. they may impact performance), while the methods that start with As do not and only require a cast or some other simple operation.
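The As/To distinction can be checked with reference equality. This is a small sketch; the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsVersusToDemo
{
    // AsEnumerable() only changes the compile-time type;
    // the returned object is the source itself.
    public static bool AsReturnsSameInstance(int[] source)
        => ReferenceEquals(source, source.AsEnumerable());

    // ToList() always builds a new list, even when the source is already a List<T>.
    public static bool ToReturnsNewInstance(List<int> source)
        => !ReferenceEquals(source, source.ToList());

    public static void Main()
    {
        Console.WriteLine(AsReturnsSameInstance(new[] { 1, 2, 3 }));          // prints "True"
        Console.WriteLine(ToReturnsNewInstance(new List<int> { 1, 2, 3 }));   // prints "True"
    }
}
```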

Additional details on List<T>

Here is a little more detail on how List<T> works in case you're interested :)

A List<T> also uses a construct called a dynamic array, which needs to be resized on demand; each resize copies the contents of the old array to a new, larger array. So it starts off small and increases in size if required.

This is the difference between the Capacity and Count properties of List<T>. Capacity is the size of the array behind the scenes; Count is the number of items in the List<T>, which is always <= Capacity. So when adding an item would push Count past Capacity, the capacity of the List<T> is doubled and the contents are copied to the new array.
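The following sketch prints Capacity each time it changes while items are added, which makes the resize events visible. The class and method names are hypothetical, and the exact capacity values printed are an implementation detail of the runtime, so none are asserted here.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper: checks the Count <= Capacity invariant after every Add.
    public static bool CountNeverExceedsCapacity(int itemsToAdd)
    {
        var list = new List<int>();
        for (int i = 0; i < itemsToAdd; i++)
        {
            list.Add(i);
            if (list.Count > list.Capacity) return false;
        }
        return true;
    }

    public static void Main()
    {
        var list = new List<int>();
        int lastCapacity = list.Capacity;
        Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}");

        for (int i = 0; i < 20; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // A capacity change means a new backing array was allocated
                // and the existing items were copied over.
                lastCapacity = list.Capacity;
                Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}");
            }
        }
    }
}
```

If the final size is known in advance, passing it to the List<T>(int capacity) constructor avoids the intermediate resizes.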


Is there a performance impact when calling ToList()?

Yes, of course. Theoretically even i++ has a performance impact; it slows the program by maybe a few ticks.

What does .ToList do?

When you invoke .ToList, the code calls Enumerable.ToList(), which is an extension method that returns new List<TSource>(source). In the corresponding constructor, in the worst case, it goes through the source container and adds the items one by one to a new container. So it has little effect on performance, and it is very unlikely to be the performance bottleneck of your application.

What's wrong with the code in the question

Directory.GetFiles goes through the folder and returns the names of all files into memory at once. The risk is that the string[] takes up a lot of memory and slows everything down.

What should be done then

It depends. If you (as well as your business logic) guarantee that the number of files in the folder is always small, the code is acceptable. But it is still advisable to use the lazy version, Directory.EnumerateFiles, available since .NET 4. It behaves much more like a query, which is not executed immediately, so you can chain further queries onto it, like:

Directory.EnumerateFiles(myPath).Any(s => s.Contains("myfile"))

which will stop searching the path as soon as a file whose name contains "myfile" is found. When a match turns up early, this performs better than .GetFiles, which always builds the complete array first.
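A self-contained sketch of the lazy and eager variants side by side, run against a throwaway temporary folder. The class name, method names and file names are invented for the example. Unlike the one-liner above, it matches against the file name only (via Path.GetFileName) rather than the full path, which is an assumption about what is usually wanted.

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileSearchDemo
{
    // Lazy: entries are produced one at a time, and Any() stops at the first match.
    public static bool AnyFileNameContains(string directory, string fragment)
        => Directory.EnumerateFiles(directory)
                    .Any(path => Path.GetFileName(path).Contains(fragment));

    // Eager, for comparison: the whole string[] is built before Any() runs.
    public static bool AnyFileNameContainsEager(string directory, string fragment)
        => Directory.GetFiles(directory)
                    .Any(path => Path.GetFileName(path).Contains(fragment));

    public static void Main()
    {
        // Temporary folder with two sample files (hypothetical names).
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        try
        {
            File.WriteAllText(Path.Combine(dir, "myfile.txt"), "");
            File.WriteAllText(Path.Combine(dir, "other.txt"), "");

            Console.WriteLine(AnyFileNameContains(dir, "myfile"));      // prints "True"
            Console.WriteLine(AnyFileNameContainsEager(dir, "myfile")); // prints "True"
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

Both variants return the same answer; the difference is only in how much of the folder has to be read and held in memory before the answer is known.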


Is there a performance impact when calling ToList()?

Yes, there is. Using the extension method Enumerable.ToList() will construct a new List<T> object from the IEnumerable<T> source collection, which of course has a performance impact.

However, understanding List<T> may help you determine if the performance impact is significant.

List<T> uses an array (T[]) to store the elements of the list. Arrays cannot be extended once they are allocated, so List<T> will use an over-sized array to store the elements of the list. When the List<T> grows beyond the size of the underlying array, a new array has to be allocated and the contents of the old array have to be copied to the new, larger array before the list can grow.

When a new List<T> is constructed from an IEnumerable<T> there are two cases:

  1. The source collection implements ICollection<T>: Then ICollection<T>.Count is used to get the exact size of the source collection, and a matching backing array is allocated before all elements of the source collection are copied to the backing array using ICollection<T>.CopyTo(). This operation is quite efficient and will probably map to some CPU instruction for copying blocks of memory. However, in terms of performance, memory is required for the new array and CPU cycles are required for copying all the elements.

  2. Otherwise the size of the source collection is unknown, and the enumerator of IEnumerable<T> is used to add each source element one at a time to the new List<T>. Initially the backing array is empty, and an array of size 4 is created when the first element is added. Then, whenever this array is too small, the size is doubled, so the backing array grows like this: 4, 8, 16, 32 etc. Every time the backing array grows it has to be reallocated and all elements stored so far have to be copied. This operation is much more costly than the first case, where an array of the correct size can be created right away.

    Also, if your source collection contains, say, 33 elements, the list will end up using an array of 64 elements, wasting some memory.
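The two cases can be observed by comparing Capacity after ToList() on an array with Capacity after ToList() on a lazy sequence of the same length. This is only a sketch: the class name and the LazyNumbers iterator are invented for the example, and the capacities printed depend on the runtime version, so no output is asserted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToListCases
{
    // Hypothetical lazy sequence: a yield-based iterator does not implement
    // ICollection<T>, so its size is not known up front (case 2).
    public static IEnumerable<int> LazyNumbers(int count)
    {
        for (int i = 0; i < count; i++)
            yield return i;
    }

    public static void Main()
    {
        // Case 1: an array implements ICollection<T>, so the size is known.
        int[] array = Enumerable.Range(0, 33).ToArray();
        List<int> fromArray = array.ToList();

        // Case 2: elements are added one at a time, growing the backing array.
        List<int> fromLazy = LazyNumbers(33).ToList();

        Console.WriteLine($"from array: Count={fromArray.Count}, Capacity={fromArray.Capacity}");
        Console.WriteLine($"from lazy:  Count={fromLazy.Count}, Capacity={fromLazy.Capacity}");
    }
}
```

If the extra capacity in the second case matters, List<T>.TrimExcess() can be called afterwards, at the cost of one more copy.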

In your case the source collection is an array which implements ICollection<T> so the performance impact is not something you should be concerned about unless your source array is very large. Calling ToList() will simply copy the source array and wrap it in a List<T> object. Even the performance of the second case is not something to worry about for small collections.