A Map is an abstract data structure to store pairs of data: key and value. It also has a fast key lookup of O(1)
for [hashmap-chap] or O(log n)
for TreeMap.
We can implement a Map using two different underlying data structures:
-
HashMap: it’s a map implementation using an array and a hash function. The job of the hash function is to convert the
key
into an index that maps to thevalue
. Optimized HashMap can have an average runtime of O(1). -
TreeMap: it’s a map implementation that uses a self-balanced Binary Search Tree (like [c-avl-tree] or Red-Black Tree). The BST nodes store the key, and the value and nodes are sorted by key guaranteeing an O(log n) look up.
We already covered [hashmap-chap], so this chapter we are going to focus on TreeMap.
A TreeMap is a Map implementation using a Balanced Binary Search Trees. Implementing a Map with a tree, TreeMap, has a couple of advantages over a HashMap:
-
Keys are always sorted.
-
Statistical data can be easily obtained like the median, highest, lowest key.
-
Collisions are not a concern so in the worst case is still O(log n).
-
Trees are more space efficient and don’t need to allocate memory beforehand (e.g.
HashMap
’s initial capacity) nor you have to rehash when is getting full.
Ok, now that you know the advantages, let’s implement it! For a full comparison read the HashMap vs TreeMap section.
Let’s get started with the essential functions. They have the same interface as the HashMap
(but the implementation is different).
class TreeMap {
constructor(){}
set(key, value) {}
get(key) {}
has(key) {}
delete(key) {}
}
For inserting a value on a TreeMap, we first need to inialize the tree:
link:../../../src/data-structures/maps/tree-maps/tree-map.js[role=include]
The tree can be an instance of any Binary Search Tree that we implemented so far. However, for better performance, it should be a self-balanced tree like a Red-Black Tree or AVL Tree.
Let’s implement the method to add values to the tree.
add
method and size
attributelink:../../../src/data-structures/maps/tree-maps/tree-map.js[role=include]
Adding values is very easy (once we have the underlying tree implementation).
When We search by key in a tree map, it takes O(log n). This is the implementation:
get
and has
methodlink:../../../src/data-structures/maps/tree-maps/tree-map.js[role=include]
One side effect of storing keys in a tree is that they don’t come up in insertion order. Instead, they ordered by value.
link:../../../src/data-structures/maps/tree-maps/tree-map.js[role=include]
-
We implemented the default iterator using the in-order traversal. That’s useful for getting the keys sorted.
Generators are useful for producing values that can you can iterate in a for…of
loop. Generators use the function*
syntax which expects to have a yield
with a value.
Removing elements from TreeMap is simple.
delete
methodlink:../../../src/data-structures/maps/tree-maps/tree-map.js[role=include]
The BST implementation does all the heavy lifting.
That’s it! To see the full file in context, click here: here
-
HashMap: it’s a map implementation using an array and hash function. The job of the hash function is to convert the key into an index that contains the matching data. Optimized HashMap can have an average runtime of O(1).
-
TreeMap: it’s a map implementation that uses a self-balanced Binary Search Tree (red-black tree). The BST nodes store the key, and the value and nodes are sorted by key guaranteeing an O(log n) look up.
-
HashMap
is more time-efficient. ATreeMap
is more space-efficient. -
TreeMap
search complexity is O(log n), while an optimizedHashMap
is O(1) on average. -
HashMap
’s keys are in insertion order (or random depending in the implementation).TreeMap
’s keys are always sorted. -
TreeMap
offers some statistical data for free such as: get minimum, get maximum, median, find ranges of keys.HashMap
doesn’t. -
TreeMap
has a guarantee always an O(log n), while `HashMap`s has an amortized time of O(1) but in the rare case of a rehash, it would take an O(n).
As we discussed so far, there is a trade-off between the implementations.
Data Structure |
Searching By |
Insert |
Delete |
Space Complexity |
|
Index/Key |
Value |
||||
O(1) |
O(n) |
O(1)* |
O(1) |
O(n) |
|
O(log n) |
O(n) |
O(log n) |
O(log n) |
O(n) |
* = Amortized run time. E.g. rehashing might affect run time to O(n).