The Swift Algorithm Club is an open source project on implementing data structures and algorithms in Swift.
Every month, Kelvin Lau, Vincent Ngo and I feature a cool data structure or algorithm from the club in a tutorial on this site. If you want to learn more about algorithms and data structures, follow along with us!
In this tutorial, you’ll learn how to implement a heap in Swift 3. A heap is frequently used to implement a priority queue.
The Heap data structure was first implemented for the Swift Algorithm Club by Kevin Randrup, and is presented here in tutorial form.
You won’t need to have done any other tutorials to understand this one, but it might help to read the tutorials for the Tree and Queue data structures, and be familiar with their terminology.
Getting Started
The heap data structure was first introduced by J. W. J. Williams in 1964 as a data structure for the heapsort sorting algorithm.
In theory, the heap resembles the binary tree data structure (similar to the Binary Search Tree). The heap is a tree, and all of the nodes in the tree have 0, 1 or 2 children.
Here’s what it looks like:
Elements in a heap are partially sorted by their priority. Every node in the tree has a priority at least as high as that of its children. There are two different ways values can represent priorities:
- maxheaps: Elements with a higher value represent higher priority.
- minheaps: Elements with a lower value represent higher priority.
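As a rough sketch of the difference, the two orderings can be written as comparison closures. The names maxHeapPriority and minHeapPriority are made up for this illustration; all that matters is a function that returns true when its first argument has the higher priority.

```swift
// Hypothetical names, chosen only for this illustration.
// Each closure returns true when the first value has the higher priority.
let maxHeapPriority: (Int, Int) -> Bool = { $0 > $1 }  // higher value wins
let minHeapPriority: (Int, Int) -> Bool = { $0 < $1 }  // lower value wins

print(maxHeapPriority(8, 3))  // true: 8 outranks 3 in a maxheap
print(minHeapPriority(8, 3))  // false: 3 outranks 8 in a minheap
```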
The heap also has a compact height. If you think of the heap as having levels, like this:
…then the heap has the fewest possible number of levels to contain all its nodes. Before a new level can be added, all the existing levels must be full.
Whenever we add nodes to a heap, we add them in the leftmost possible position in the incomplete level.
Whenever we remove nodes from a heap, we remove the rightmost node from the lowest level.
Removing the highest priority element
The heap is useful as a priority queue because the root node of the tree contains the element with the highest priority in the heap.
However, simply removing the root node would not leave behind a heap. Or rather, it would leave two heaps!
Instead, we swap the root node with the last node in the heap. Then we remove it:
Then, we compare the new root node to each of its children, and swap it with whichever child has the highest priority.
Now the new root node is the node with the highest priority in the tree, but the heap might not be ordered yet. We compare the new child node with its children again, and swap it with the child with the highest priority.
We keep sifting down until either the former last element has a higher priority than its children, or it becomes a leaf node. Since every node once again has a higher priority than its children, the heap quality of the tree is restored.
Adding a new element
Adding a new element uses a very similar technique. First we add the new element at the left-most position in the incomplete level of the heap:
Then we compare the priority of the new element to its parent, and if it has a higher priority, we sift up.
We keep sifting up until the new element has a lower priority than its parent, or it becomes the root of the heap.
And once again, the ordering of the heap is preserved.
Practical Representation
If you’ve worked through the Binary Search Tree tutorial, it might surprise you to learn the heap data structure doesn’t have a Node data type to contain its element and links to its children. Under the hood, the heap data structure is actually an array!
Every node in the heap is assigned an index. We start by assigning 0 to the root node, and then we iterate down through the levels, counting each node from left to right:
If we then used those indices to make an array, with each element stored in its indexed position, it would look like this:
A bit of clever math now connects each node to its children. Notice how each level of the tree has room for twice as many nodes as the level above it. We have a little formula for calculating the child indices of any node.
Given the node at index i, its left child node can be found at index 2i + 1 and its right child node can be found at index 2i + 2.
This is why it’s important for the heap to be a compact tree, and why we add each new element to the leftmost position: we’re actually adding new elements to an array, and we can’t leave any gaps.
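To see the formula at work, here is a small sketch that walks a sample array and prints each node's children. The sample values and the childIndices helper are assumptions made for this illustration, not part of the heap you'll build below.

```swift
// Sample max-heap in level order; the values are made up for illustration.
let elements = [8, 5, 3, 2, 0]

// Hypothetical helper: returns the child indices of the node at `index`,
// or nil where the array is too short to contain that child.
func childIndices(of index: Int, count: Int) -> (left: Int?, right: Int?) {
  let left = 2 * index + 1
  let right = 2 * index + 2
  let leftChild: Int? = left < count ? left : nil
  let rightChild: Int? = right < count ? right : nil
  return (leftChild, rightChild)
}

for index in elements.indices {
  let children = childIndices(of: index, count: elements.count)
  let left = children.left.map { String(elements[$0]) } ?? "none"
  let right = children.right.map { String(elements[$0]) } ?? "none"
  print("index \(index) holds \(elements[index]): left child \(left), right child \(right)")
}
```

With five elements, only indices 0 and 1 have children; indices 2, 3 and 4 are leaves because their computed child indices fall past the end of the array.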
Implementing a Swift Heap
That’s all the theory. Let’s start coding.
Start by creating a new Swift playground, and add the following struct declaration:
struct Heap<Element> {
  var elements : [Element]
  let priorityFunction : (Element, Element) -> Bool

  // TODO: priority queue functions
  // TODO: helper functions
}
You’ve declared a struct named Heap. The <Element> syntax declares this to be a generic struct that allows it to infer its own type information at the call site.
The Heap has two properties: an array of Element types, and a priority function. The function takes two Elements and returns true if the first has a higher priority than the second.
You’ve also left some space for the priority queue functions – adding a new element, and removing the highest priority element, as described above – and for helper functions, to help keep your code clear and readable.
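Because the priority function is just a closure, the element type doesn't have to be a number. The sketch below repeats the two-property struct so it stands alone, and pairs it with a made-up Job type whose urgency decides the priority; Job, its fields and the sample values are assumptions for this illustration only.

```swift
// The struct as declared so far, repeated so this snippet stands alone.
struct Heap<Element> {
  var elements: [Element]
  let priorityFunction: (Element, Element) -> Bool
}

// Hypothetical element type, not part of the tutorial.
struct Job {
  let name: String
  let urgency: Int
}

// A higher urgency means a higher priority, so this behaves like a maxheap.
let jobs = Heap<Job>(elements: [], priorityFunction: { $0.urgency > $1.urgency })

let backup = Job(name: "backup", urgency: 1)
let deploy = Job(name: "deploy", urgency: 5)
print(jobs.priorityFunction(deploy, backup))  // true: deploy is more urgent
```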
Simple functions
All the code snippets in this section are small, independent computed properties or functions. Remove the TODO comment for priority queue functions, and replace it with these.
var isEmpty : Bool {
  return elements.isEmpty
}

var count : Int {
  return elements.count
}
You might recognize these property names from using arrays, or from the Queue data structure. The Heap is empty if its elements array is empty, and its count is the elements array’s count. We’ll be needing to know how many elements are in the heap a lot in the coming code.
Below the two computed properties, add this function:
func peek() -> Element? {
  return elements.first
}
This will definitely be familiar to you if you’ve used the Queue. All it does is return the first element in the array, allowing the caller to access the element with the highest priority in the heap.
Now remove the TODO comment for helper functions, and replace it with these four functions:
func isRoot(_ index: Int) -> Bool {
  return (index == 0)
}

func leftChildIndex(of index: Int) -> Int {
  return (2 * index) + 1
}

func rightChildIndex(of index: Int) -> Int {
  return (2 * index) + 2
}

func parentIndex(of index: Int) -> Int {
  return (index - 1) / 2
}
These four functions are all about taking the formulas for calculating the array indices of child or parent nodes, and hiding them inside easy to read function calls.
You might have realised that the formulas for calculating the child indices only tell you what the left or right child indices should be. They don’t use optionals or throw errors to suggest that the heap might be too small to actually have an element at those indices. We’ll have to be mindful of this.
You might also have realised that because of the left and right child index formula, or because of the tree diagrams above, all left children will have odd indices and all right children will have even indices. However, the parentIndex function doesn’t attempt to determine if the index argument is a left or right child before calculating the parent index; it just uses integer division to get the answer.
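A quick way to convince yourself that integer division is enough is to print the parent of the first few indices. The snippet below copies parentIndex as a free function so it runs on its own.

```swift
// Same formula as the parentIndex helper, as a free function.
func parentIndex(of index: Int) -> Int {
  return (index - 1) / 2
}

// Both children of a node map back to the same parent:
// 1 and 2 share parent 0, 3 and 4 share parent 1, 5 and 6 share parent 2.
for child in 1...6 {
  print("child \(child) has parent \(parentIndex(of: child))")
}

// Swift integer division truncates toward zero, so the root reports
// itself as its own parent.
print(parentIndex(of: 0))  // 0
```

That last result is worth keeping in mind: asking for the parent of the root doesn't fail, it just returns 0, so code that climbs the tree needs its own check for the root.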
Comparing priority
In the theory, we compared the priorities of elements with their parent or children nodes a lot. In this section we determine which index, of a node and its children, points to the highest priority element.
Below the parentIndex function, add this function:
func isHigherPriority(at firstIndex: Int, than secondIndex: Int) -> Bool {
  return priorityFunction(elements[firstIndex], elements[secondIndex])
}
This helper function is a wrapper for the priority function property. It takes two indices and returns true if the element at the first index has higher priority.
This helps us write two more comparison helper functions, which you can now write below isHigherPriority:
func highestPriorityIndex(of parentIndex: Int, and childIndex: Int) -> Int {
  guard childIndex < count && isHigherPriority(at: childIndex, than: parentIndex)
    else { return parentIndex }
  return childIndex
}

func highestPriorityIndex(for parent: Int) -> Int {
  return highestPriorityIndex(of: highestPriorityIndex(of: parent, and: leftChildIndex(of: parent)), and: rightChildIndex(of: parent))
}
Let’s review these two functions. The first assumes that a parent node has a valid index in the array, checks if the child node has a valid index in the array, and then compares the priorities of the nodes at those indices, and returns a valid index for whichever node has the highest priority.
The second function also assumes that the parent node index is valid, and compares the index to both of its left and right children – if they exist. Whichever of the three has the highest priority is the index returned.
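To make the three-way comparison concrete, here is a standalone sketch of the same idea as a free function over an Int array, with > hard-coded as the priority function (a maxheap). It is an illustration with assumed sample data, not a replacement for the two methods above.

```swift
// Sketch: returns the index of the highest priority element among
// `parent` and whichever of its children exist. Assumes `parent` is a
// valid index and that a larger Int means a higher priority.
func highestPriorityIndex(in elements: [Int], for parent: Int) -> Int {
  var best = parent
  // Check the left child first, then the right, skipping any index
  // that falls past the end of the array.
  for child in [2 * parent + 1, 2 * parent + 2] where child < elements.count {
    if elements[child] > elements[best] {
      best = child
    }
  }
  return best
}

print(highestPriorityIndex(in: [2, 5, 8], for: 0))  // 2: the right child wins
print(highestPriorityIndex(in: [8, 5, 3], for: 0))  // 0: the parent already wins
print(highestPriorityIndex(in: [3, 5], for: 1))     // 1: a leaf has no children
```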
The last helper function is another wrapper, and it’s the only helper function which changes the Heap data structure at all.
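mutating func swapElement(at firstIndex: Int, with secondIndex: Int) {
  guard firstIndex != secondIndex
    else { return }
  // Swift 3 form. Swift 4 and later reject this as an overlapping access to
  // elements; there, use elements.swapAt(firstIndex, secondIndex) instead.
  swap(&elements[firstIndex], &elements[secondIndex])
}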
This function takes two indices, and swaps the elements at those indices. Because Swift throws a runtime error if the caller attempts to swap array elements with the same index, we guard for this and return early if the indices are the same.
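If you are following along with Swift 4 or later rather than Swift 3, the standard library's swapAt(_:_:) method on arrays does the same job. A minimal sketch, using made-up sample values:

```swift
// Requires Swift 4 or later, where MutableCollection gained swapAt(_:_:).
var elements = [3, 2, 8]

elements.swapAt(0, 2)  // exchanges the first and last elements
print(elements)        // [8, 2, 3]

elements.swapAt(1, 1)  // swapping an index with itself has no effect
print(elements)        // [8, 2, 3]
```

Since swapAt tolerates equal indices, the guard becomes a small optimisation rather than a safety check in that version.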
Enqueueing a new element
If we’ve written useful helper functions, then the big and important functions should now be easy to write. So, first we’re going to write a function which enqueues a new element to the last position in the heap, and then sifts it up.
It looks as simple as you would expect. Write this with the priority queue functions, under the peek() function:
mutating func enqueue(_ element: Element) {
  elements.append(element)
  siftUp(elementAtIndex: count - 1)
}
count - 1 is the highest legal index value in the array, with the new element added.
This won’t compile until you write the siftUp function, though:
mutating func siftUp(elementAtIndex index: Int) {
  let parent = parentIndex(of: index) // 1
  guard !isRoot(index), // 2
    isHigherPriority(at: index, than: parent) // 3
    else { return }
  swapElement(at: index, with: parent) // 4
  siftUp(elementAtIndex: parent) // 5
}
Now we see all the helper functions coming to good use! Let’s review what you’ve written.
- First you calculate what the parent index of the index argument is, because it’s used several times in this function and you only need to calculate it once.
- Then you guard to ensure you’re not trying to sift up the root node of the heap,
- or sift an element up above a higher priority parent. The function ends if you attempt either of these things.
- Once you know the indexed node has a higher priority than its parent, you swap the two values,
- and call siftUp on the parent index, in case the element isn’t yet in position.
This is a recursive function. It keeps calling itself until its terminal conditions are reached.
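The same logic can also be written as a loop. The sketch below is an iterative variant over a plain Int array with > as the priority function; it is an alternative for illustration, not the method used in this tutorial, and it assumes Swift 4 or later for swapAt.

```swift
// Iterative sift-up sketch for a maxheap stored in an Int array.
func siftUp(_ elements: inout [Int], from start: Int) {
  var index = start
  // Stop at the root, or as soon as the parent has equal or higher priority.
  while index > 0 {
    let parent = (index - 1) / 2
    guard elements[index] > elements[parent] else { return }
    elements.swapAt(index, parent)
    index = parent
  }
}

// Enqueue 9 into a valid maxheap: append it, then sift it up.
var heap = [8, 5, 3, 2, 0]
heap.append(9)
siftUp(&heap, from: heap.count - 1)
print(heap)  // [9, 5, 8, 2, 0, 3]
```

The new element starts at index 5, swaps with its parent 3 at index 2, then swaps with the root 8, and stops because it has become the root.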
Dequeueing the highest priority element
What we can sift up, we can sift down, surely.
To dequeue the highest priority element, and leave a consistent heap behind, write the following function under the siftUp function:
mutating func dequeue() -> Element? {
  guard !isEmpty // 1
    else { return nil }
  swapElement(at: 0, with: count - 1) // 2
  let element = elements.removeLast() // 3
  if !isEmpty { // 4
    siftDown(elementAtIndex: 0) // 5
  }
  return element // 6
}
Let’s review what you’ve written.
- First you guard that the heap has a first element to return. If there isn’t, you return nil.
- If there is an element, you swap it with the last node in the heap.
- Now you remove the highest priority element from the last position in the heap, and store it in element.
- If the heap isn’t empty now, then you sift the current root element down the heap to its proper prioritized place.
- Finally you return the highest priority element from the function.
This won’t compile without the accompanying siftDown function:
mutating func siftDown(elementAtIndex index: Int) {
  let childIndex = highestPriorityIndex(for: index) // 1
  if index == childIndex { // 2
    return
  }
  swapElement(at: index, with: childIndex) // 3
  siftDown(elementAtIndex: childIndex)
}
Let’s review this function too:
- First you find out which index, of the argument index and its child indices, points to the element with the highest priority. Remember that if the argument index is a leaf node in the heap, it has no children, and the highestPriorityIndex(for:) function will return the argument index.
- If the argument index is that index, then you stop sifting here.
- If not, then one of the child elements has a higher priority; swap the two elements, and keep recursively sifting down.
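To watch a dequeue happen step by step, here is a standalone sketch that performs the swap, the removal and the sift down on a plain Int array. It uses an iterative sift down with > as the priority function, which is a variation on the recursive method above, and assumes Swift 4 or later for swapAt.

```swift
// Iterative sift-down sketch for a maxheap stored in an Int array.
func siftDown(_ elements: inout [Int], from start: Int) {
  var index = start
  while true {
    // Find the highest priority among this node and its existing children.
    var best = index
    for child in [2 * index + 1, 2 * index + 2] where child < elements.count {
      if elements[child] > elements[best] {
        best = child
      }
    }
    // The node already outranks its children (or is a leaf): done.
    if best == index { return }
    elements.swapAt(index, best)
    index = best
  }
}

var heap = [9, 5, 8, 2, 0, 3]
heap.swapAt(0, heap.count - 1)   // move the root to the last position
let removed = heap.removeLast()  // take the old root out of the array
siftDown(&heap, from: 0)         // restore the heap ordering
print(removed)  // 9
print(heap)     // [8, 5, 3, 2, 0]
```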
One last first thing
The only essential thing left to do is to check the Heap’s initializer. Because the Heap is a struct, it comes with a default init function, which you can call like this:
var heap = Heap(elements: [3, 2, 8, 5, 0], priorityFunction: >)
Swift’s generic inference will assume that heap has a type of Heap<Int>, and the comparison operator > will make it a maxheap, prioritizing higher values over lower values.
But there’s a danger here. Can you spot it?
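The default initializer stores the elements array exactly as it was passed in, and nothing guarantees that the array is already in heap order. To fix that, write these two functions at the beginning of the Heap struct, just below the two properties.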
init(elements: [Element] = [], priorityFunction: @escaping (Element, Element) -> Bool) { // 1 // 2
  self.elements = elements
  self.priorityFunction = priorityFunction // 3
  buildHeap() // 4
}

mutating func buildHeap() {
  for index in (0 ..< count / 2).reversed() { // 5
    siftDown(elementAtIndex: index) // 6
  }
}
Let's review these two functions.
- First, you've written an explicit init function which takes an array of elements and a priority function, just as before. However, you've also specified that by default the array of elements is empty, so the caller can initialise a Heap with just the priority function if they so choose.
- You also had to explicitly specify that the priority function is @escaping, because the struct will hold onto it after this function is complete.
- Now you explicitly assign the arguments to the Heap's properties.
- You finish off the init() function by building the heap, putting it in priority order.
- In the buildHeap() function, you iterate through the first half of the array in reverse order. If you remember that every level of the heap has room for twice as many elements as the level above, you can also work out that every full level of the heap has one more element than all the levels above it combined, so the first half of the array holds every parent node in the heap.
- One by one, you sift every parent node down into its children. In turn this will sift the high priority children towards the root.
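To see why building the heap matters, the sketch below checks the sample array from earlier before and after a build pass. It works on a plain Int array with > as the priority function; isHeap and the free-function forms of siftDown and buildHeap are assumptions for this illustration, and swapAt assumes Swift 4 or later.

```swift
// Sketch: true if every non-root element has a priority no higher than
// its parent's (maxheap ordering on Ints).
func isHeap(_ elements: [Int]) -> Bool {
  for index in elements.indices.dropFirst() {
    if elements[index] > elements[(index - 1) / 2] { return false }
  }
  return true
}

// Iterative sift down, equivalent in effect to the recursive method above.
func siftDown(_ elements: inout [Int], from start: Int) {
  var index = start
  while true {
    var best = index
    for child in [2 * index + 1, 2 * index + 2] where child < elements.count {
      if elements[child] > elements[best] { best = child }
    }
    if best == index { return }
    elements.swapAt(index, best)
    index = best
  }
}

// Sift every parent node down, starting from the last parent.
func buildHeap(_ elements: inout [Int]) {
  for index in (0 ..< elements.count / 2).reversed() {
    siftDown(&elements, from: index)
  }
}

var elements = [3, 2, 8, 5, 0]
print(isHeap(elements))  // false: 8 sits below the lower priority 3
buildHeap(&elements)
print(elements)          // [8, 5, 3, 2, 0]
print(isHeap(elements))  // true
```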
And that's it. You wrote a heap in Swift!
A final thought
Let me leave you with a final thought.
What would happen if you had a huge, populated heap full of prioritised elements, and you kept dequeueing the highest priority element until the heap was empty?
You would dequeue every element in priority order. The elements would be perfectly sorted by their priority.
That's the heapsort algorithm!
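Here is a minimal sketch of that idea on a plain Int array, using > as the priority function so the result comes out in descending order. It collects the dequeued elements into a new array, so it is not the in-place variant of heapsort; heapSorted is a made-up name, and swapAt assumes Swift 4 or later.

```swift
// Iterative sift down for a maxheap stored in an Int array.
func siftDown(_ elements: inout [Int], from start: Int) {
  var index = start
  while true {
    var best = index
    for child in [2 * index + 1, 2 * index + 2] where child < elements.count {
      if elements[child] > elements[best] { best = child }
    }
    if best == index { return }
    elements.swapAt(index, best)
    index = best
  }
}

// Sketch: build a maxheap from the input, then dequeue until it is empty.
func heapSorted(_ input: [Int]) -> [Int] {
  var heap = input
  for index in (0 ..< heap.count / 2).reversed() {
    siftDown(&heap, from: index)
  }
  var result: [Int] = []
  while !heap.isEmpty {
    heap.swapAt(0, heap.count - 1)    // move the root to the end
    result.append(heap.removeLast())  // dequeue it
    if !heap.isEmpty {
      siftDown(&heap, from: 0)        // restore the heap ordering
    }
  }
  return result
}

print(heapSorted([3, 2, 8, 5, 0]))  // [8, 5, 3, 2, 0]
```

Swapping > for < in siftDown would turn this into a minheap and give ascending order instead.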
Where To Go From Here?
I hope you enjoyed this tutorial on making a heap data structure!
Here is a Swift playground with the above code. You can also find alternative implementations and further discussion in the Heap section of the Swift Algorithm Club repository.
This was just one of the many algorithms in the Swift Algorithm Club repository. If you're interested in more, check out the repo.
It's in your best interest to know about algorithms and data structures - they're solutions to many real world problems, and are frequently asked as interview questions. Plus it's fun!
So stay tuned for many more tutorials from the Swift Algorithm Club in the future. In the meantime, if you have any questions on implementing heaps in Swift, please join the forum discussion below!
The post Swift Algorithm Club: Heap and Priority Queue Data Structure appeared first on Ray Wenderlich.