After seeing some posts on LinkedIn discussing collection initializers, I became curious. There was a claim that using collection expressions, instead of collection initializers, would boost performance. As a result, I set out to measure collection initializer performance in C# using BenchmarkDotNet. And yes, while these might be micro-optimizations for many people, I thought it would be cool to explore.
Besides, maybe there’s someone out there with something like this on their hot-path that needs to squeeze a bit more out of their application 🙂
In one of my most recent articles, I explain the basics of collection initializers with some simple code examples. Simply put, instead of manually writing code like the following to initialize a collection:
List<string> devLeaderCoolList = new List<string>();
devLeaderCoolList.Add("Hello");
devLeaderCoolList.Add(", ");
devLeaderCoolList.Add("World!");
… we can instead reduce it to something more succinct like the following:
List<string> devLeaderCoolList = [ "Hello", ", ", "World!" ];
Pretty neat, right?
This collection expression syntax is even more lightweight than what we’ve had access to until recently. But syntax and readability aside (not to minimize the benefits of readable code, but I’m trying not to put you to sleep), what about the performance?! I bet you didn’t even consider that, with all of the different collection initializer syntaxes available, we’d see a difference in performance!
Well, Dave Callan got me thinking about that when he posted this on LinkedIn:
This image was originally posted by Dave Callan on LinkedIn, and that has inspired this entire article.
Let’s jump into some benchmarks!
This section will detail the benchmarks for initializing lists in C# in various ways. I’ll provide coverage on different collection initializers, the newer collection expression syntax, and even compare it to doing it manually! Surely, adding everything by hand would be slower than setting ourselves up for success by doing it all with a collection initializer — but we should cover our bases.
I will not be covering the spread operator in these benchmarks because I’d like to focus on that more for collection combination benchmarks. Admittedly, yes, it is still creating a collection… but I feel like the use case is different and I’d like to split it up.
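For anyone who hasn’t run into the spread operator yet, here is a minimal sketch of what I’m setting aside for later. The variable names are made up for illustration, and it assumes C# 12 or newer so that collection expressions are available.

```csharp
using System;
using System.Collections.Generic;

// Two existing collections (illustrative names, not from the benchmarks).
List<string> fruits = ["Apple", "Banana"];
List<string> more = ["Orange"];

// The spread element (..) inlines the items of an existing collection
// into the new collection expression, alongside any literal items.
List<string> combined = [.. fruits, .. more, "Pear"];

Console.WriteLine(string.Join(", ", combined)); // prints "Apple, Banana, Orange, Pear"
```

That is really combining collections rather than initializing one from scratch, which is why it gets its own set of benchmarks.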
I’ll be using BenchmarkDotNet for all of these benchmarks, so if you’re not familiar with it, it’s worth learning the basics of setting up and running a benchmark before you try these for yourself.
With the BenchmarkDotNet NuGet installed, here’s what I am using at the entry point to kick things off (for all benchmark examples in this article):
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Reflection;

BenchmarkRunner.Run(
    Assembly.GetExecutingAssembly(),
    args: args);
It’s not very exciting — but I wanted to show you there’s nothing fancy going on here. Just running all of the benchmarks we have access to. And here is the list benchmark code:
[MemoryDiagnoser]
[MediumRunJob]
public class ListBenchmarks
{
    private static readonly string[] _dataAsArray = new string[]
    {
        "Apple",
        "Banana",
        "Orange",
    };

    private static IEnumerable<string> GetDataAsIterator()
    {
        yield return "Apple";
        yield return "Banana";
        yield return "Orange";
    }

    [Benchmark(Baseline = true)]
    public List<string> ClassicCollectionInitializer_NoCapacity()
    {
        return new List<string>()
        {
            "Apple",
            "Banana",
            "Orange",
        };
    }

    [Benchmark]
    public List<string> ClassicCollectionInitializer_SetCapacity()
    {
        return new List<string>(3)
        {
            "Apple",
            "Banana",
            "Orange",
        };
    }

    [Benchmark]
    public List<string> CollectionExpression()
    {
        return
        [
            "Apple",
            "Banana",
            "Orange",
        ];
    }

    [Benchmark]
    public List<string> CopyConstructor_Array()
    {
        return new List<string>(_dataAsArray);
    }

    [Benchmark]
    public List<string> CopyConstructor_Iterator()
    {
        return new List<string>(GetDataAsIterator());
    }

    [Benchmark]
    public List<string> ManuallyAdd_NoCapacitySet()
    {
        List<string> list = [];
        list.Add("Apple");
        list.Add("Banana");
        list.Add("Orange");
        return list;
    }

    [Benchmark]
    public List<string> ManuallyAdd_CapacitySet()
    {
        List<string> list = new(3);
        list.Add("Apple");
        list.Add("Banana");
        list.Add("Orange");
        return list;
    }
}
Note that in the code above, the baseline we will be comparing against is what I consider the traditional collection initializer:
return new List<string>()
{
    "Apple",
    "Banana",
    "Orange",
};
And of course, I wouldn’t make you go compile and run these yourself, so let’s look at the results below:
Let’s go through the results from worst to best based on the Ratio column (Higher is worse):
Here is where we start to see some speed up!
One of the common themes here is that providing a capacity is a BIG performance gain. We realized an ~87% gain over our baseline simply by providing it a capacity. Side note: why couldn’t the compiler do some kind of optimization here if we know the collection size in the braces?!
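If you want to see what capacity each approach actually ends up with, here is a rough sketch that prints the `Capacity` property of a few lists. It is not part of the benchmarks above, and the growth behavior it reveals is a runtime implementation detail, so treat what you see on your machine as an observation rather than a guarantee. The last case touches on the side note about what the compiler could do with a known size: I deliberately don’t state what it prints, so run it on your own target framework.

```csharp
using System;
using System.Collections.Generic;

// No capacity: inspect the backing storage before and after the first Add.
List<string> noCapacity = new List<string>();
Console.WriteLine($"No capacity, before Add:      {noCapacity.Capacity}");
noCapacity.Add("Apple");
Console.WriteLine($"No capacity, after first Add: {noCapacity.Capacity}");

// Capacity supplied up front: the backing array is sized exactly once.
List<string> withCapacity = new List<string>(3) { "Apple", "Banana", "Orange" };
Console.WriteLine($"Capacity of 3 requested:      {withCapacity.Capacity}");

// Collection expression: whatever the compiler and runtime decide to do.
List<string> fromExpression = ["Apple", "Banana", "Orange"];
Console.WriteLine($"Collection expression:        {fromExpression.Capacity}");
```

Comparing the second and fourth lines of output is a quick way to check whether the collection expression is getting any help with sizing.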
Dictionaries don’t yet have a fancy collection expression that uses square brackets and removes even more bloat, but we do have several variations of collection initializers to use. These benchmarks will be very similar, also using BenchmarkDotNet, and they also use the same entry point program — so I won’t repeat it here.
I know dictionaries have two type parameters to work with, and I wanted to keep this similar to the list example. That’s not because they are similar implementations of collections, but because I didn’t want to pollute this article with more variations of things for no reason. I decided to go with a Dictionary<string, string> where the keys are what we already looked at, and the values are just some short, unique strings to work with.
Here’s the code for the dictionary benchmarks:
[MemoryDiagnoser]
[MediumRunJob]
public class DictionaryBenchmarks
{
    private static readonly Dictionary<string, string> _sourceData = new()
    {
        ["Apple"] = "The first value",
        ["Banana"] = "The next value",
        ["Orange"] = "The last value",
    };

    private static IEnumerable<KeyValuePair<string, string>> GetDataAsIterator()
    {
        foreach (var item in _sourceData)
        {
            yield return item;
        }
    }

    [Benchmark(Baseline = true)]
    public Dictionary<string, string> CollectionInitializer_BracesWithoutCapacity()
    {
        return new Dictionary<string, string>()
        {
            { "Apple", "The first value" },
            { "Banana", "The next value" },
            { "Orange", "The last value" },
        };
    }

    [Benchmark]
    public Dictionary<string, string> CollectionInitializer_BracesWithCapacity()
    {
        return new Dictionary<string, string>(3)
        {
            { "Apple", "The first value" },
            { "Banana", "The next value" },
            { "Orange", "The last value" },
        };
    }

    [Benchmark]
    public Dictionary<string, string> CollectionInitializer_BracketsWithoutCapacity()
    {
        return new Dictionary<string, string>()
        {
            ["Apple"] = "The first value",
            ["Banana"] = "The next value",
            ["Orange"] = "The last value",
        };
    }

    [Benchmark]
    public Dictionary<string, string> CollectionInitializer_BracketsWithCapacity()
    {
        return new Dictionary<string, string>(3)
        {
            ["Apple"] = "The first value",
            ["Banana"] = "The next value",
            ["Orange"] = "The last value",
        };
    }

    [Benchmark]
    public Dictionary<string, string> CopyConstructor_Dictionary()
    {
        return new Dictionary<string, string>(_sourceData);
    }

    [Benchmark]
    public Dictionary<string, string> CopyConstructor_Iterator()
    {
        return new Dictionary<string, string>(GetDataAsIterator());
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAdd_NoCapacitySet()
    {
        Dictionary<string, string> dict = [];
        dict.Add("Apple", "The first value");
        dict.Add("Banana", "The next value");
        dict.Add("Orange", "The last value");
        return dict;
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAdd_CapacitySet()
    {
        Dictionary<string, string> dict = new(3);
        dict.Add("Apple", "The first value");
        dict.Add("Banana", "The next value");
        dict.Add("Orange", "The last value");
        return dict;
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAssign_NoCapacitySet()
    {
        Dictionary<string, string> dict = [];
        dict["Apple"] = "The first value";
        dict["Banana"] = "The next value";
        dict["Orange"] = "The last value";
        return dict;
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAssign_CapacitySet()
    {
        Dictionary<string, string> dict = new(3);
        dict["Apple"] = "The first value";
        dict["Banana"] = "The next value";
        dict["Orange"] = "The last value";
        return dict;
    }
}
You’ll notice two themes creeping up:
Otherwise, we still have capacity considerations just like the list benchmarks!
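One thing worth keeping in mind when comparing the brace and bracket variations: they are not just two spellings of the same thing. The brace form goes through `Add`, while the bracket form goes through the indexer, and those behave differently when a key repeats. Here is a small sketch of that difference, using a deliberately duplicated key that does not appear in the benchmarks:

```csharp
using System;
using System.Collections.Generic;

// Bracket (indexer) form: each entry becomes dict[key] = value,
// so a repeated key silently overwrites the earlier value.
var brackets = new Dictionary<string, string>
{
    ["Apple"] = "The first value",
    ["Apple"] = "The replacement value",
};
Console.WriteLine(brackets["Apple"]); // prints "The replacement value"

// Brace form: each entry becomes dict.Add(key, value),
// so a repeated key throws ArgumentException at run time.
try
{
    _ = new Dictionary<string, string>
    {
        { "Apple", "The first value" },
        { "Apple", "The replacement value" },
    };
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Add rejected the duplicate: {ex.Message}");
}
```

So even if one form benchmarks slightly faster for you, the choice between them can also be about whether you want duplicates to fail loudly.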
The dictionary benchmarks are as follows:
Doing the same exercise of highest to lowest ratio:
Everything beyond here is technically faster according to our benchmarks:
Okay, wait a second. Now we’re going to see that doing a dictionary copy is the FASTEST with a ~35% speed boost? I’m not sure how we’ve started to see known capacities not helping and copy constructors being fastest.
Even I’m skeptical now. So I wanted to rerun the benchmarks and I wanted to add a variant of each of the manual benchmarks that uses new() instead of an empty collection expression, [].
In this run of the benchmarks, things are much closer across the board. I don’t think that discredits the previous benchmarks, because truly many of them were also relatively close with the iterator copy constructor remaining the huge outlier. But the other huge outlier that remains is the copy constructor using another dictionary!
My takeaway is this:
I’ve written hundreds of articles, made hundreds of YouTube videos, and more posts across social media platforms than I could ever count. There will be people who want to pick these benchmarks apart and, unfortunately, it will seem like their goal is just to discredit what’s being presented.
However, I *DO* think it’s important to discuss the context of the benchmarks and look at what’s being considered in these scenarios:
The goal of presenting these benchmarks is not to tell you that you must do things a certain way; it’s simply to show you some interesting information. Even if you are hyper-focused on performance, you should benchmark and profile your own code! Don’t rely on my results here. Let these serve as a starting point for finding things on your hot path that you didn’t realize you could tune.
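As one example of checking things for yourself, here is a rough, standard-library-only sketch that compares how many bytes get allocated when a list is filled with and without a capacity. The item count and the `MeasureAllocatedBytes` helper are my own assumptions for illustration; BenchmarkDotNet with `[MemoryDiagnoser]` is still the better tool for real measurements.

```csharp
using System;
using System.Collections.Generic;

// Assumption for illustration: 1,000 items instead of the 3 used in the benchmarks.
const int ItemCount = 1_000;

long withoutCapacity = MeasureAllocatedBytes(() =>
{
    var list = new List<int>();
    for (int i = 0; i < ItemCount; i++)
    {
        list.Add(i);
    }
    return list;
});

long withCapacity = MeasureAllocatedBytes(() =>
{
    var list = new List<int>(ItemCount);
    for (int i = 0; i < ItemCount; i++)
    {
        list.Add(i);
    }
    return list;
});

Console.WriteLine($"No capacity:   {withoutCapacity} bytes allocated");
Console.WriteLine($"With capacity: {withCapacity} bytes allocated");

// Hypothetical helper (not part of BenchmarkDotNet): reports the bytes
// allocated on the current thread while the factory delegate runs.
static long MeasureAllocatedBytes(Func<object> factory)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    object result = factory();
    long after = GC.GetAllocatedBytesForCurrentThread();
    GC.KeepAlive(result); // keep the result reachable until measurement is done
    return after - before;
}
```

Changing `ItemCount` is a quick way to see whether a conclusion drawn from three elements still holds at the sizes your application actually works with.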
What other considerations can you think of? Feel free to share in the comments — but be conversational, please.
Overall, I consider most of what we see in this article on collection initializer performance in C# to be, more than likely, micro-optimizations. I wouldn’t lose sleep over using one way over another, as long as you’re optimizing for readability and your profiling results don’t show you spending most of your time doing collection initialization. I hope that you had fun exploring this with me and saw that, if you’re ever curious, you can set up some simple benchmarks to experiment!