Faster Elixir map data structure creation

2021-11-29

UPDATE: Map.new is another great choice, as it also uses :maps.from_list internally and shows the same performance as Enum.into. Thanks, Shane, for the tip!


Elixir and Erlang share a great map data structure. It's an associative data structure of keys and values. It looks like this:

%{a: 1, b: true, c: [821.99]}

Sometimes you'll want to create a map from another collection of things. One common way to do this is to build up the map one key-value pair at a time, using Enum.reduce:

kvs = [{"a", 5}, {"x", 9194}, ...]
Enum.reduce(kvs, %{}, fn {k, v}, acc -> Map.put(acc, k, v) end)

There are other ways to do this in Elixir. In the following code, we build a list of key-value pairs and then pass it to the Erlang function :maps.from_list, which converts the list to a map.

# imagine we didn't build this input list ourselves, as is possible in real-world code
kvs = [{"a", 5}, {"x", 9194}, ...]
Enum.reduce(kvs, [], fn pair, acc -> [pair | acc] end) |> :maps.from_list()

Another common way is to use Enum.into, which is designed specifically to convert one collection into another. It looks like this:

kvs = [{"a", 5}, {"x", 9194}, ...]
Enum.into(kvs, %{})
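
One detail worth checking before swapping one version for another is how each handles duplicate keys. The sketch below uses a small made-up input (not from any real code) and relies on the documented behaviour of :maps.from_list, where the right-most value for a repeated key wins. Because the reduce/list version prepends each pair, it reverses the list before conversion, so it keeps the first value for a repeated key rather than the last.

```elixir
# Hypothetical input with the key "a" appearing twice.
kvs = [{"a", 1}, {"b", 2}, {"a", 3}]

# reduce/map: later pairs overwrite earlier ones.
put_map = Enum.reduce(kvs, %{}, fn {k, v}, acc -> Map.put(acc, k, v) end)

# reduce/list: prepending reverses the list, so the FIRST "a" ends up
# right-most and is the one :maps.from_list keeps.
list_map =
  kvs
  |> Enum.reduce([], fn pair, acc -> [pair | acc] end)
  |> :maps.from_list()

# Enum.into: later pairs win, same as reduce/map.
into_map = Enum.into(kvs, %{})

IO.inspect(put_map == %{"a" => 3, "b" => 2})  # true
IO.inspect(list_map == %{"a" => 1, "b" => 2}) # true
IO.inspect(into_map == put_map)               # true
```

If the input has no duplicate keys, all three produce the same map.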

So which to use? I recommend Enum.into, for a few reasons. The first is that its 2-arity form is deliberately less powerful: it translates one collection into another (say, a list into a map) and does nothing else. There is also a 3-arity form that takes a mapper function, which transforms each input element before it is inserted into the map. That gives it the same transformation power as both Enum.reduce versions above.
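
As a minimal sketch of the 3-arity form, here is Enum.into/3 building a map from a plain list of strings. The input list and the choice of String.length as the value are made up for illustration; the mapper just needs to return a {key, value} tuple for each element.

```elixir
# Hypothetical input: a list of words that are not yet key-value pairs.
words = ["apple", "kiwi", "banana"]

# The third argument transforms each element into a {key, value} tuple
# before it is collected into the map.
lengths = Enum.into(words, %{}, fn word -> {word, String.length(word)} end)

IO.inspect(lengths == %{"apple" => 5, "kiwi" => 4, "banana" => 6}) # true
```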

The second reason is performance. I did some simple benchmarking of the three variations (reduce/map, reduce/list, and Enum.into) and found that on larger inputs (>=10,000 pairs), reduce/list and Enum.into are between 2x and 2.5x faster than calling Map.put in a reduce loop.
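
If you want to try a comparison like this yourself, here is a rough sketch using Erlang's :timer.tc. It is not the benchmark behind the numbers above: the input size, the key format, and the time_us helper are all assumptions made for illustration, and a single timed run is noisy, so treat the output as a rough indication only. A dedicated tool such as Benchee would give more trustworthy results.

```elixir
# Assumed input: 100_000 pairs with string keys.
kvs = for i <- 1..100_000, do: {"key#{i}", i}

# Hypothetical helper: run a zero-arity function once and return the
# elapsed time in microseconds, discarding the result.
time_us = fn fun ->
  {micros, _result} = :timer.tc(fun)
  micros
end

put_us =
  time_us.(fn ->
    Enum.reduce(kvs, %{}, fn {k, v}, acc -> Map.put(acc, k, v) end)
  end)

list_us =
  time_us.(fn ->
    kvs
    |> Enum.reduce([], fn pair, acc -> [pair | acc] end)
    |> :maps.from_list()
  end)

into_us = time_us.(fn -> Enum.into(kvs, %{}) end)

IO.puts("reduce/map:  #{put_us} us")
IO.puts("reduce/list: #{list_us} us")
IO.puts("Enum.into:   #{into_us} us")
```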

Digging deeper, I checked out the Enum.into source code. Enum.into is faster because it actually calls :maps.from_list if it can determine that the target collection is a map!
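
A quick way to convince yourself that these calls are interchangeable for list input is to compare their results directly. This small check uses a made-up list and only shows that the resulting maps are equal; it says nothing about what happens internally. Map.new is included because of the update note at the top of the post.

```elixir
# Hypothetical input list of key-value pairs.
kvs = [{"a", 5}, {"x", 9194}, {"m", 42}]

via_into = Enum.into(kvs, %{})
via_new = Map.new(kvs)
via_erlang = :maps.from_list(kvs)

IO.inspect(via_into == via_erlang) # true
IO.inspect(via_new == via_erlang)  # true
```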

So, I recommend using Enum.into: its API is simpler for simple cases, it is powerful enough for cases where you need to transform the input, and it automatically specializes to use fast Erlang built-in functions (written in C, rather than Erlang or Elixir) when it determines that it can.