This is the second part in the series about choosing the most suitable associative container (dictionary) in C++23. In the first part, we covered ordered containers, and in this part, we'll discuss unordered ones in detail.
These containers do not provide any defined order for their keys. Moreover, key order can change with each modification, so the program should never rely on it. Such containers are always implemented as hash maps.
Generally, when adding, removing, or searching for a key, a hash map first computes some integral hash value from that key using the hash function. The hash (or rather its part) is then used as an index into the contiguous pre-allocated array. Each entry in this array is called a bucket. Some entries in that array will be empty, some will contain a single element, and some might map to more than one element due to hash collisions. This happens when different keys have hash values that point to the same array index. Hash maps utilize various strategies to deal with hash collisions (see this Wikipedia article). The number of elements in the hash map divided by the total size of the array is called the load factor. The higher the load factor is, the more possible hash collisions become with each newly inserted element.
As opposed to ordered containers, hash maps do not support range queries, rank/select operations, in-order iteration over keys, or searching for the closest key smaller/larger than a given key. In return, hash maps achieve the best possible performance: the time complexity of search/insertion/removal operations is amortized O(1). I say "amortized" because when the number of elements becomes too large, a hash map needs to rehash its contents to reduce the load factor (effectively growing the bucket array). The time complexity of rehashing is O(n). However, it is expected to happen rarely, so on average, each operation is still O(1). Another reason for potentially reduced performance is a hash function with poor distribution, which increases the frequency of collisions.
Starting with C++11, the standard library provides the following hash maps: std::unordered_map (link), std::unordered_set (link), std::unordered_multimap (link), and std::unordered_multiset (link). Maps associate a key with a value, while sets only store the key and are useful to check whether the key is present in the container, but not to retrieve an associated value. Multi containers allow storing multiple duplicate keys.
All standard unordered containers are node-based and use separate chaining to deal with hash collisions, which means they store each key or key-value pair in a separate linked-list node. Memory for each new node is allocated individually, which is not particularly efficient. This also makes the data structure not CPU cache-friendly, because each key access requires an extra indirection. Here is what the unordered_map internal structure may look like:
On the left, there is an array of buckets, which is pre-allocated to some fixed size (8 in our example). Initially, all buckets are empty. When we start adding elements to the hash map, some buckets become occupied. In the example above, there is an element with key 10 (which got into bucket 1), and two elements with keys 50 and 256, which both ended up in bucket 3 because their hash values collided. There is also an element with key 3, which landed in bucket 6. Each bucket points to a linked list, which ideally contains no more than one element. Buckets 0, 2, 4, 5, and 7 are empty and point to nullptr.
Let's assume we need to find the value for key 256. The map computes the hash of the key and takes the remainder of dividing it by the number of buckets (8 in our case). In our example, the remainder value is 3. The map then walks the linked list of bucket 3. The first element in the list has key 50, which is not the same as the 256 we are looking for, so the map continues to iterate. The next element has the key 256, which is exactly the one we need. The corresponding value is xyz. If none of the elements in the list matched, the map would return the end iterator, indicating that the key does not exist.
You may notice that the last element of each list points to the first element of the next list. This is done in some implementations to improve the iteration efficiency. Instead of checking bucket by bucket when iterating over all hash map elements, we can use those connections to jump from one non-empty linked list to another much faster.
If we continue adding more elements to the hash map above, at some point the load factor will become too high (for example, 80%), and the hash map will decide to rehash. It will grow the pre-allocated array (e.g., from 8 to 16 elements), recompute hashes for all existing elements, and distribute the elements into the new buckets.
Standard unordered containers provide reference and iterator stability guarantees, but they are weaker than those offered by ordered containers. References to existing elements are never invalidated. Iterators to existing elements can be invalidated only when the insertion of a new element causes a rehash, or when rehashing is triggered manually:
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, int> map{
        {"test", 123},
        {"xyz", 456},
        {"abc", 10}
    };
    auto& value = map["abc"];
    auto valueIt = map.find("abc");
    // Might cause a rehash if the load factor
    // is too high
    map.emplace("zzz", 1000);
    // Triggers the rehash manually
    map.rehash(1000);
    // References remain valid in any case:
    std::cout << value << "\n";
    // Iterators may be invalidated: the line below
    // is undefined behavior and might crash
    std::cout << valueIt->second << "\n";
}
Since C++17, unordered containers allow node manipulation: it is possible to take a node out of one map and move it into another without copying the key or the value:
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, int> map1{
        {"test", 123},
        {"xyz", 456},
        {"abc", 10}
    };
    // Take the node out of map1:
    auto node = map1.extract("xyz");
    // We can even change its key
    // (we can also change the value
    // if needed):
    node.key() = "xyzxyz";
    std::unordered_map<std::string, int> map2;
    // Insert the node into another map:
    map2.insert(std::move(node));
    // Prints "xyzxyz: 456":
    for (const auto& [k, v] : map2) {
        std::cout << k << ": " << v << "\n";
    }
}
This is what happens in the above program: extract detaches the node holding "xyz" from map1, the key is then modified in place, and insert relinks the node into map2. The key and the value are never copied or reallocated.
Another strategy to deal with hash collisions is called open addressing. In open-addressing hash maps, each bucket stores at most one element. If the bucket is already occupied, the element with the colliding hash value goes to some other free bucket. Such hash maps try to group elements in contiguous memory blocks to improve efficiency and make the data structure more CPU cache-friendly. The C++ standard library provides no open-addressing hash maps, but there are plenty of third-party alternatives.
Boost.Unordered is an awesome library that provides a wide range of hash map implementations.
There are boost::unordered_set, boost::unordered_map, boost::unordered_multiset, and boost::unordered_multimap, which are analogs of the std containers, and everything written above applies to them. These containers use a slightly more complex internal structure with "bucket groups" to improve iteration efficiency.
The library also provides boost::unordered_node_set, boost::unordered_node_map, boost::unordered_flat_set, and boost::unordered_flat_map, which are open-addressing containers. Their internal structure differs from that of the separate-chaining containers described above:
You can read more about the internal structure in this blog post.
Node-based containers (boost::unordered_node_set, boost::unordered_node_map) still use nodes, which are pointed to by the buckets. These containers provide the same iterator and reference stability guarantees as the std containers and also provide the same API for node manipulation (i.e., the extract method).
In flat containers (boost::unordered_flat_set, boost::unordered_flat_map), keys and values are stored directly in the bucket array. Flat containers are the most efficient, but provide the weakest guarantees: iterators and references to all elements are invalidated when rehashing happens, and no node manipulation API is provided at all. Flat containers might also use more memory than node-based ones, especially if the sizeof of the key or the value is large.
Another third-party library that implements open-addressing containers is Folly F14 (provided by Meta). There are a few dictionary variants that are very close to the Boost library containers described above:
- folly::F14NodeSet is the same as boost::unordered_node_set, and folly::F14NodeMap is the same as boost::unordered_node_map.
- folly::F14ValueSet is the same as boost::unordered_flat_set, and folly::F14ValueMap is the same as boost::unordered_flat_map.
- folly::F14VectorMap / folly::F14VectorSet keep keys/values packed in a contiguous array, and the buckets contain indexes into that array.
- folly::F14FastMap / folly::F14FastSet is an umbrella class. It chooses the most efficient implementation (either F14Value or F14Vector) based on the parameters you specify (such as key and value types).
An interesting extra efficiency feature of F14 hash maps is prehashing. For example, if you need to search for the same key multiple times, and hashing it is expensive (e.g., the key is a string), you can prehash it once and then pass the resulting F14HashToken to all subsequent hash map calls to avoid re-hashing the same key. The starting point is the prehash method (link).
You can read more about the internal structure of F14 hash containers in this FB blog post.
The Abseil Swiss Tables library (provided by Google) also implements open-addressing node-based and flat hash containers: absl::flat_hash_map, absl::flat_hash_set, absl::node_hash_map, and absl::node_hash_set. They are similar to the Boost and F14 ones. You can read more about them here and here.
A lesser-known ankerl library (github) claims to be very efficient in most scenarios. The author of this library provides extensive benchmark results for many hash maps in different use cases (blog post). You can use these results as guidance, but take them with a grain of salt, and always test the hash map you pick in your production environment. Benchmarks are always synthetic: they do not cover cases when the CPU and memory do other work besides the hash map operations, nor cases when various parts of the hash map memory are not in the CPU cache, which frequently happens in real programs.
Hash function quality is important to keep the time complexity of look-up operations O(1), and flat hash maps are particularly sensitive to it. An ideal hash function can be defined like this: if a single bit in the key changes, half of the bits of the hash value change, seemingly at random. Such a hash function is called avalanching. Unfortunately, some C++ standard library implementations of integer and pointer hash functions are non-avalanching. For example, libstdc++ simply returns the pointer or integer value directly without any additional transformations: github.
Advanced hash tables, such as the boost and F14 implementations, deal with this issue by introducing the hash_is_avalanching trait. If the hash function does not declare itself as avalanching, the hash table performs an additional mixing step to improve the hash quality, which comes at an extra cost. If you implement a custom hash function and you know it is of decent quality, you can mark it as avalanching as shown in this example. Boost uses the is_avalanching property name, and the F14 property is named folly_is_avalanching. Ideally, you should add both.
If you use key types supported out of the box (e.g., string, int, pointers) and the default hash functions provided by boost or F14, they will already be marked correctly as needed, and you won't need to think about this unless you decide to implement a custom key type, which will need a custom hash function.
None of the containers above is thread-safe in general, so you will have to implement external synchronization (for example, using mutexes) whenever one thread may modify the hash map while another thread accesses it.
If all threads only read the map, no synchronization is needed. Make sure that you only call methods marked as const; all non-const methods may modify the container. In particular, keep in mind that operator[] is not const and will modify the container. A common pitfall in multithreaded code is:
std::unordered_map<std::string, bool> map;
// ...
if (map["key"]) {
    // ...
}
In the code above, the goal is to check whether the value corresponding to "key" is true. However, map["key"] will add the key to the map if it is not yet there, with the newly added value set to the default (false in the example above). This means such code is not thread-safe, and it is also inefficient. A more efficient and thread-safe way to check the same condition is the following:
if (auto it = map.find("key"); it != map.end() && it->second == true) {
    // ...
}
This snippet checks two things: that the key exists (it != map.end()) and that the stored value is true (it->second == true). In this example, find and end do not modify the map, so the search becomes thread-safe. If the goal was only to check whether the key exists in the map, you could simply call map.contains("key").
There are a few implementations of thread-safe hash maps:
- boost::concurrent_flat_set and boost::concurrent_flat_map from the Boost.Unordered library. Boost containers provide a visitation-based API, which is vastly different from the API provided by the C++ standard library.
- folly::ConcurrentHashMap (github). Folly tries to keep its API as close as possible to the C++ standard unordered containers.
- Lock-free containers from the libcds library (MichaelHashMap, SkipListMap, SkipListSet, FeldmanHashMap, FeldmanHashSet). This library has not been updated in a while and does not provide detailed documentation, but I am still sharing it because the containers it offers are unique. If your use case implies high contention on a hash map, try these lock-free dictionaries offered by libcds.

Let's look at the summarized points that explain how to pick the most suitable container.
- If you need iterator and reference stability, use std, boost, folly, or abseil node-based containers (such as std::unordered_map, boost::unordered_map, boost::unordered_node_map, or folly::F14NodeMap). boost::unordered_node_... and folly node containers will likely be the most efficient.
- Otherwise, use flat containers (boost::unordered_flat_map or folly::F14FastMap).
- Keep in mind that flat containers may use more memory than node-based ones, especially if the sizeof of the key and/or the value is large. If memory usage is a concern, use node-based containers instead.