Swift Algorithm Club: Heap and Priority Queue Data Structure

The Swift Algorithm Club is an open source project on implementing data structures and algorithms in Swift.

Every month, Kelvin Lau, Vincent Ngo and I feature a cool data structure or algorithm from the club in a tutorial on this site. If you want to learn more about algorithms and data structures, follow along with us!

In this tutorial, you’ll learn how to implement a heap in Swift 3. A heap is frequently used to implement a priority queue.

The Heap data structure was first implemented for the Swift Algorithm Club by Kevin Randrup, and has been adapted here into tutorial form.

You won’t need to have done any other tutorials to understand this one, but it might help to read the tutorials for the Tree and Queue data structures, and be familiar with their terminology.

Getting Started

The heap data structure was first introduced by J. W. J. Williams in 1964 as a data structure for the heapsort sorting algorithm.

In theory, the heap resembles the binary tree data structure (similar to the Binary Search Tree). The heap is a tree, and all of the nodes in the tree have 0, 1 or 2 children.

Here’s what it looks like:

Technical Image 1: Illustration of Heap

Elements in a heap are partially sorted by their priority. Every node in the tree has a higher priority than its children. There are two different ways values can represent priorities:

maxheaps: Elements with a higher value represent higher priority.
minheaps: Elements with a lower value represent higher priority.

The heap also has a compact height. If you think of the heap as having levels, like this:

Technical Image 2: Illustration of levels

…then the heap has the fewest possible number of levels to contain all its nodes. Before a new level can be added, all the existing levels must be full.

Whenever we add nodes to a heap, we add them in the leftmost possible position in the incomplete level.

Technical Image 3: Illustration of adding nodes

Whenever we remove nodes from a heap, we remove the rightmost node from the lowest level.

Removing the highest priority element

The heap is useful as a priority queue because the root node of the tree contains the element with the highest priority in the heap.

However, simply removing the root node would not leave behind a heap. Or rather, it would leave two heaps!

Technical image 4: two heaps

Instead, we swap the root node with the last node in the heap. Then we remove it:

Technical image 5: swapped nodes

Then, we compare the new root node to each of its children, and if either child has a higher priority, we swap the root with whichever child has the highest priority.

Technical image 6: sifting down

Now the new root node is the node with the highest priority in the tree, but the heap might not be ordered yet. We compare the new child node with its children again, and swap it with the child with the highest priority.

Technical image 7: sifting down

We keep sifting down until either the former last element has a higher priority than its children, or it becomes a leaf node. Since every node once again has a higher priority than its children, the heap property of the tree is restored.

Adding a new element

Adding a new element uses a very similar technique. First we add the new element at the left-most position in the incomplete level of the heap:

Technical image 8: new element

Then we compare the priority of the new element to its parent, and if it has a higher priority, we sift up.

Technical image 9: sifting up

We keep sifting up until the new element has a lower priority than its parent, or it becomes the root of the heap.

Technical image 10: sifting up

And once again, the ordering of the heap is preserved.

Practical Representation

If you’ve worked through the Binary Search Tree tutorial, it might surprise you to learn the heap data structure doesn’t have a Node data type to contain its element and links to its children. Under the hood, the heap data structure is actually an array!

Every node in the heap is assigned an index. We start by assigning 0 to the root node, and then we iterate down through the levels, counting each node from left to right:

Technical image 11: indexed tree

If we then used those indices to make an array, with each element stored in its indexed position, it would look like this:

Technical image 12: the array

A bit of clever math now connects each node to its children. Notice how each level of the tree has twice as many nodes as the level above it. We have a little formula for calculating the child indices of any node.

Given the node at index i, its left child node can be found at index 2i + 1 and its right child node can be found at index 2i + 2.

Technical image 13: all nodes pointing to their children

This is why it’s important for the heap to be a compact tree, and why we add each new element to the leftmost position: we’re actually adding new elements to an array, and we can’t leave any gaps.
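
As a quick check of the formulas, here is a minimal sketch that walks a small array and lists each node's children. The sample values and the free-standing helper names (leftChild, rightChild, children) are made up for illustration; they are not part of the Heap you will write later in this tutorial.

```swift
// Hypothetical example data: a maxheap stored level by level.
let values = [8, 5, 3, 2, 0]

// The child index formulas from the text, as free functions for illustration.
func leftChild(of index: Int) -> Int { return 2 * index + 1 }
func rightChild(of index: Int) -> Int { return 2 * index + 2 }

// Collects the values of the children that actually exist in the array.
func children(of index: Int, in array: [Int]) -> [Int] {
  return [leftChild(of: index), rightChild(of: index)]
    .filter { $0 < array.count }
    .map { array[$0] }
}

for index in values.indices {
  print("node \(values[index]) at index \(index) has children \(children(of: index, in: values))")
}
```

The root at index 0 reports the children stored at indices 1 and 2, and the nodes in the second half of the array report no children at all, because their computed child indices fall past the end of the array.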

Note: This array isn’t sorted. As you may have noticed from the above diagrams, the only relationships between nodes that the heap cares about are that parents have a higher priority than their children. The heap doesn’t care which of the left child and right child has higher priority. A node which is closer to the root node isn’t always of higher priority than a node which is further away.

Implementing a Swift Heap

That’s all the theory. Let’s start coding.

Start by creating a new Swift playground, and add the following struct declaration:

struct Heap<Element> {
  var elements : [Element]
  let priorityFunction : (Element, Element) -> Bool

  // TODO: priority queue functions
  // TODO: helper functions
}

You’ve declared a struct named Heap. The <Element> syntax declares this to be a generic struct, which allows Swift to infer the element type at the call site.

The Heap has two properties: an array of Element types, and a priority function. The function takes two Elements and returns true if the first has a higher priority than the second.

You’ve also left some space for the priority queue functions – adding a new element, and removing the highest priority element, as described above – and for helper functions, to help keep your code clear and readable.
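
To make the priority function described above a little more concrete, here is a small sketch of values that would fit that property's type. The constant names and the string rule are hypothetical examples, not part of the tutorial's code.

```swift
// Any function of type (Element, Element) -> Bool can act as a priority function.
let maxPriority: (Int, Int) -> Bool = (>)   // higher values win: a maxheap
let minPriority: (Int, Int) -> Bool = (<)   // lower values win: a minheap

// Hypothetical custom rule: shorter strings have higher priority.
let shortestFirst: (String, String) -> Bool = { $0.count < $1.count }

print(maxPriority(8, 3))             // does 8 outrank 3 in a maxheap?
print(minPriority(8, 3))             // does 8 outrank 3 in a minheap?
print(shortestFirst("a", "abc"))     // does "a" outrank "abc"?
```

Because the heap only ever asks "does the first element outrank the second?", swapping the function is all it takes to change how the same struct orders its elements.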

Simple functions

All the code snippets in this section are small, independent computed properties or functions. Remove the TODO comment for priority queue functions, and replace it with these.

var isEmpty : Bool {
  return elements.isEmpty
}

var count : Int {
  return elements.count
}

You might recognize these property names from using arrays, or from the Queue data structure. The Heap is empty if its elements array is empty, and its count is the elements array’s count. We’ll often need to know how many elements are in the heap in the coming code.

Below the two computed properties, add this function:

func peek() -> Element? {
  return elements.first
}

This will definitely be familiar to you if you’ve used the Queue. All it does is return the first element in the array – allowing the caller to access the element with the highest priority in the heap.

Now remove the TODO comment for helper functions, and replace it with these four functions:

func isRoot(_ index: Int) -> Bool {
  return (index == 0)
}

func leftChildIndex(of index: Int) -> Int {
  return (2 * index) + 1
}

func rightChildIndex(of index: Int) -> Int {
  return (2 * index) + 2
}

func parentIndex(of index: Int) -> Int {
  return (index - 1) / 2
}

These four functions are all about taking the formulas for calculating the array indices of child or parent nodes, and hiding them inside easy to read function calls.

You might have realised that the formulas for calculating the child indices only tell you what the left or right child indices should be. They don’t use optionals or throw errors to suggest that the heap might be too small to actually have an element at those indices. We’ll have to be mindful of this.

You might also have realised that because of the left and right child index formula, or because of the tree diagrams above, all left children will have odd indices and all right children will have even indices. However, the parentIndex function doesn’t attempt to determine if the index argument is a left or right child before calculating the parent index; it just uses integer division to get the answer.
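
Here is a small sketch of that integer division at work. The free function parent(of:) simply repeats the formula from parentIndex(of:) so the snippet can run on its own.

```swift
// Same formula as parentIndex(of:), as a free function for illustration.
func parent(of index: Int) -> Int { return (index - 1) / 2 }

// Indices 1 and 2 are the children of the root (index 0),
// 3 and 4 are the children of index 1, and 5 and 6 are the children of index 2.
let parents = (1...6).map { parent(of: $0) }
print(parents)

// Swift's integer division truncates toward zero, so the root maps to itself.
let parentOfRoot = parent(of: 0)
print(parentOfRoot)
```

Odd and even children land on the same parent because the remainder is discarded. Note that the formula also returns 0 for the root itself, so it cannot be used to detect the root; that is what isRoot is for.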

Comparing priority

In the theory, we compared the priorities of elements with their parent or children nodes a lot. In this section we determine which index, of a node and its children, points to the highest priority element.

Below the parentIndex function, add this function:

func isHigherPriority(at firstIndex: Int, than secondIndex: Int) -> Bool {
  return priorityFunction(elements[firstIndex], elements[secondIndex])
}

This helper function is a wrapper for the priority function property. It takes two indices and returns true if the element at the first index has higher priority.

This helps us write two more comparison helper functions, which you can now write below isHigherPriority:

func highestPriorityIndex(of parentIndex: Int, and childIndex: Int) -> Int {
  guard childIndex < count && isHigherPriority(at: childIndex, than: parentIndex)
    else { return parentIndex }
  return childIndex
}

func highestPriorityIndex(for parent: Int) -> Int {
  return highestPriorityIndex(of: highestPriorityIndex(of: parent, and: leftChildIndex(of: parent)), and: rightChildIndex(of: parent))
}

Let’s review these two functions. The first assumes that a parent node has a valid index in the array, checks if the child node has a valid index in the array, and then compares the priorities of the nodes at those indices, and returns a valid index for whichever node has the highest priority.

The second function also assumes that the parent node index is valid, and compares the index to both of its left and right children – if they exist. Whichever of the three has the highest priority is the index returned.
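
To see the two helpers in action outside the struct, here is a sketch that rewrites them as free functions. The extra parameters elements and hasPriority are assumptions that stand in for the Heap's stored properties, and the sample array is made up.

```swift
// Free-function versions of the two helpers, for illustration only.
func highestPriorityIndex(of parentIndex: Int, and childIndex: Int,
                          in elements: [Int], by hasPriority: (Int, Int) -> Bool) -> Int {
  guard childIndex < elements.count,
    hasPriority(elements[childIndex], elements[parentIndex])
    else { return parentIndex }
  return childIndex
}

func highestPriorityIndex(for parent: Int, in elements: [Int],
                          by hasPriority: (Int, Int) -> Bool) -> Int {
  // Compare the parent with its left child first, then the winner with the right child.
  let best = highestPriorityIndex(of: parent, and: 2 * parent + 1, in: elements, by: hasPriority)
  return highestPriorityIndex(of: best, and: 2 * parent + 2, in: elements, by: hasPriority)
}

// Hypothetical array that is not yet a valid maxheap: the root is too small.
let sample = [3, 5, 8, 2]
let winner = highestPriorityIndex(for: 0, in: sample, by: >)   // index of the 8
let leaf = highestPriorityIndex(for: 3, in: sample, by: >)     // a leaf returns itself
print(winner, leaf)
```

For the root, the left child 5 beats 3, and then the right child 8 beats 5, so the index of 8 is returned. For the leaf at index 3, both child indices are past the end of the array, so the guard returns the parent index unchanged.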

The last helper function is another wrapper, and it’s the only helper function which changes the Heap data structure at all.

mutating func swapElement(at firstIndex: Int, with secondIndex: Int) {
  guard firstIndex != secondIndex
    else { return }
  swap(&elements[firstIndex], &elements[secondIndex])
}

This function takes two indices, and swaps the elements at those indices. Because Swift throws a runtime error if the caller attempts to swap array elements with the same index, we guard for this and return early if the indices are the same.
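
If you're following along on a newer toolchain than Swift 3, be aware that passing two elements of the same array to swap is rejected by the compiler from Swift 4 onwards, because both arguments access the array at the same time. A minimal sketch of the same helper, assuming Swift 4 or later and using the standard library's swapAt(_:_:) method, could look like this. The free-function form and the sample array are for illustration only.

```swift
// Free-function sketch of swapElement(at:with:), assuming Swift 4 or later.
func swapElement(at firstIndex: Int, with secondIndex: Int, in elements: inout [Int]) {
  // swapAt(_:_:) has no effect for equal indices, so this guard only mirrors the original.
  guard firstIndex != secondIndex
    else { return }
  // swapAt(_:_:) is a standard library method on MutableCollection.
  elements.swapAt(firstIndex, secondIndex)
}

var numbers = [3, 2, 8]
swapElement(at: 0, with: 2, in: &numbers)
print(numbers)
```

Inside the Heap struct, the equivalent change would be to replace the swap line with elements.swapAt(firstIndex, secondIndex).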

Enqueueing a new element

If we’ve written useful helper functions, then the big and important functions should now be easy to write. So, first we’re going to write a function which enqueues a new element at the last position in the heap, and then sifts it up.

It looks as simple as you would expect. Write this with the priority queue functions, under the peek() function:

mutating func enqueue(_ element: Element) {
  elements.append(element)
  siftUp(elementAtIndex: count - 1)
}

count - 1 is the highest legal index value in the array, with the new element added.

This won’t compile until you write the siftUp function, though:

mutating func siftUp(elementAtIndex index: Int) {
  let parent = parentIndex(of: index) // 1
  guard !isRoot(index), // 2
    isHigherPriority(at: index, than: parent) // 3
    else { return }
  swapElement(at: index, with: parent) // 4
  siftUp(elementAtIndex: parent) // 5
}

Now we see all the helper functions coming to good use! Let’s review what you’ve written.

1. First you calculate what the parent index of the index argument is, because it’s used several times in this function and you only need to calculate it once.
2. Then you guard to ensure you’re not trying to sift up the root node of the heap,
3. or sift an element up above a higher priority parent. The function ends if you attempt either of these things.
4. Once you know the indexed node has a higher priority than its parent, you swap the two values,
5. and call siftUp on the parent index, in case the element isn’t yet in position.

This is a recursive function. It keeps calling itself until its terminal conditions are reached.
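
Recursion isn't the only way to express this. As a sketch, the same sift-up can be written as a loop over a plain array. The function below is a hypothetical free-standing variant that hardcodes a maxheap of Ints and assumes Swift 4 or later for swapAt; it is not a replacement for the method above.

```swift
// Iterative sift-up on a plain array, assuming a maxheap of Ints.
func siftUp(_ elements: inout [Int], from startIndex: Int) {
  var index = startIndex
  while index > 0 {                      // index 0 is the root: nothing left to climb
    let parent = (index - 1) / 2
    // Stop as soon as the element no longer outranks its parent.
    guard elements[index] > elements[parent] else { return }
    elements.swapAt(index, parent)       // swapAt requires Swift 4 or later
    index = parent
  }
}

// A valid maxheap with a new, larger value appended at the end.
var afterAppend = [8, 5, 3, 2, 0, 9]
siftUp(&afterAppend, from: afterAppend.count - 1)
print(afterAppend)
```

The 9 first swaps with its parent 3, then with the root 8, and the loop ends once it reaches index 0. The two terminal conditions are the same as in the recursive version; they just appear as the loop condition and the guard.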

Dequeueing the highest priority element

What we can sift up, we can sift down, surely.

To dequeue the highest priority element, and leave a consistent heap behind, write the following function under the siftUp function:

mutating func dequeue() -> Element? {
  guard !isEmpty // 1
    else { return nil }
  swapElement(at: 0, with: count - 1) // 2
  let element = elements.removeLast() // 3
  if !isEmpty { // 4
    siftDown(elementAtIndex: 0) // 5
  }
  return element // 6
}

Let’s review what you’ve written.

1. First you guard that the heap has a first element to return. If there isn’t, you return nil.
2. If there is an element, you swap it with the last node in the heap.
3. Now you remove the highest priority element from the last position in the heap, and store it in element.
4. If the heap isn’t empty now,
5. then you sift the current root element down the heap to its proper prioritized place.
6. Finally you return the highest priority element from the function.

This won’t compile without the accompanying siftDown function:

mutating func siftDown(elementAtIndex index: Int) {
  let childIndex = highestPriorityIndex(for: index) // 1
  if index == childIndex { // 2
    return
  }
  swapElement(at: index, with: childIndex) // 3
  siftDown(elementAtIndex: childIndex)
}

Let’s review this function too:

1. First you find out which index, of the argument index and its child indices, points to the element with the highest priority. Remember that if the argument index is a leaf node in the heap, it has no children, and the highestPriorityIndex(for:) function will return the argument index.
2. If the argument index is that index, then you stop sifting here.
3. If not, then one of the child elements has a higher priority; swap the two elements, and keep recursively sifting down.
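
To watch the dequeue steps work on real numbers, here is a sketch that applies them by hand to a plain array. The free function below is a simplified stand-in for siftDown(elementAtIndex:) that hardcodes a maxheap of Ints and assumes Swift 4 or later for swapAt; the sample array is made up.

```swift
// Recursive sift-down on a plain array, assuming a maxheap of Ints.
func siftDown(_ elements: inout [Int], from index: Int) {
  var best = index
  // Only consider child indices that actually exist in the array.
  for child in [2 * index + 1, 2 * index + 2] where child < elements.count {
    if elements[child] > elements[best] { best = child }
  }
  guard best != index else { return }    // already in place, or a leaf
  elements.swapAt(index, best)           // swapAt requires Swift 4 or later
  siftDown(&elements, from: best)
}

// The steps of dequeue() applied by hand to a valid maxheap.
var beforeDequeue = [9, 5, 8, 2, 0, 3]
beforeDequeue.swapAt(0, beforeDequeue.count - 1)   // move the root to the end
let removed = beforeDequeue.removeLast()           // take it off
siftDown(&beforeDequeue, from: 0)                  // restore the heap ordering
print(removed, beforeDequeue)
```

After the swap and removal the array is [3, 5, 8, 2, 0]. The 3 at the root loses to its right child 8, swaps with it, and then stops because index 2 has no children in a five-element array.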

One last first thing

The only essential thing left to do is to check the Heap’s initializer. Because the Heap is a struct, it comes with a default init function, which you can call like this:

var heap = Heap(elements: [3, 2, 8, 5, 0], priorityFunction: >)

Swift’s generic inference will assume that heap has a type of Heap<Int>, and the comparison operator > will make it a maxheap, prioritizing higher values over lower values.

But there’s a danger here. Can you spot it?

The elements array isn’t ordered. You’ll have to create an explicit init function which does some initial prioritizing of elements.

Write this function at the beginning of the Heap struct, just below the two properties.

init(elements: [Element] = [], priorityFunction: @escaping (Element, Element) -> Bool) { // 1 // 2
  self.elements = elements
  self.priorityFunction = priorityFunction // 3
  buildHeap() // 4
}

mutating func buildHeap() {
  for index in (0 ..< count / 2).reversed() { // 5
    siftDown(elementAtIndex: index) // 6
  }
}

Let's review these two functions.

1. First, you've written an explicit init function which takes an array of elements and a priority function, just as before. However, you've also specified that by default the array of elements is empty, so the caller can initialise a Heap with just the priority function if they so choose.
2. You also had to explicitly specify that the priority function is @escaping, because the struct will hold onto it after this function is complete.
3. Now you explicitly assign the arguments to the Heap's properties.
4. You finish off the init() function by building the heap, putting it in priority order.
5. In the buildHeap() function, you iterate through the first half of the array in reverse order. If you remember that every level of the heap has room for twice as many elements as the level above, you can also work out that every full level of the heap has one more element than all the levels above it combined, so the first half of the array contains every parent node in the heap.
6. One by one, you sift every parent node down into its children. In turn this will sift the high priority children towards the root.
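
As a sanity check of the idea behind buildHeap(), here is a sketch that heapifies the array from the earlier example and verifies the result. Both functions are hypothetical free-standing helpers that hardcode a maxheap of Ints and assume a recent Swift toolchain (swapAt and allSatisfy); they are not part of the Heap struct.

```swift
// Rearranges an array into a maxheap using the same idea as buildHeap():
// sift down every parent node, starting from the last one.
func buildMaxHeap(_ elements: inout [Int]) {
  for start in (0 ..< elements.count / 2).reversed() {
    var index = start
    while true {
      var best = index
      for child in [2 * index + 1, 2 * index + 2] where child < elements.count {
        if elements[child] > elements[best] { best = child }
      }
      if best == index { break }       // this parent is in place
      elements.swapAt(index, best)
      index = best
    }
  }
}

// Checks that no child has a higher value than its parent.
func isMaxHeap(_ elements: [Int]) -> Bool {
  guard elements.count > 1 else { return true }   // 0 or 1 elements: trivially a heap
  return (1 ..< elements.count).allSatisfy { elements[($0 - 1) / 2] >= elements[$0] }
}

var unordered = [3, 2, 8, 5, 0]
let wasHeap = isMaxHeap(unordered)    // the raw array breaks the heap ordering
buildMaxHeap(&unordered)
print(wasHeap, unordered, isMaxHeap(unordered))
```

Only indices 1 and 0 are visited, because a five-element array has two parent nodes. The 2 at index 1 sinks below the 5, and then the 3 at the root sinks below the 8.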

And that's it. You wrote a heap in Swift!

A final thought

Let me leave you with a final thought.

What would happen if you had a huge, populated heap full of prioritised elements, and you kept dequeueing the highest priority element until the heap was empty?

You would dequeue every element in priority order. The elements would be perfectly sorted by their priority.

That's the heapsort algorithm!
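
Here is a rough sketch of that idea. MiniHeap is a hypothetical, condensed restatement of the heap from this tutorial (only the parts needed for dequeueing, and using swapAt, which assumes Swift 4 or later), and heapSorted is a made-up helper name. A textbook heapsort usually sorts the array in place; this version simply collects the dequeued elements into a new array.

```swift
// Condensed, hypothetical restatement of the tutorial's heap.
struct MiniHeap<Element> {
  var elements: [Element]
  let priorityFunction: (Element, Element) -> Bool

  init(elements: [Element] = [], priorityFunction: @escaping (Element, Element) -> Bool) {
    self.elements = elements
    self.priorityFunction = priorityFunction
    // Build the heap by sifting down every parent node, last parent first.
    for index in (0 ..< elements.count / 2).reversed() {
      siftDown(elementAtIndex: index)
    }
  }

  var isEmpty: Bool { return elements.isEmpty }

  mutating func dequeue() -> Element? {
    guard !isEmpty else { return nil }
    elements.swapAt(0, elements.count - 1)
    let element = elements.removeLast()
    if !isEmpty { siftDown(elementAtIndex: 0) }
    return element
  }

  mutating func siftDown(elementAtIndex index: Int) {
    var best = index
    for child in [2 * index + 1, 2 * index + 2] where child < elements.count {
      if priorityFunction(elements[child], elements[best]) { best = child }
    }
    guard best != index else { return }
    elements.swapAt(index, best)
    siftDown(elementAtIndex: best)
  }
}

// Dequeue until the heap is empty to get the elements in priority order.
func heapSorted<Element>(_ values: [Element],
                         by priorityFunction: @escaping (Element, Element) -> Bool) -> [Element] {
  var heap = MiniHeap(elements: values, priorityFunction: priorityFunction)
  var result: [Element] = []
  while let next = heap.dequeue() {
    result.append(next)
  }
  return result
}

let descending = heapSorted([3, 2, 8, 5, 0], by: >)
let ascending = heapSorted([3, 2, 8, 5, 0], by: <)
print(descending, ascending)
```

With > as the priority function the values come out from largest to smallest, and with < they come out from smallest to largest.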

Where To Go From Here?

I hope you enjoyed this tutorial on making a heap data structure!

Here is a Swift playground with the above code. You can also find alternative implementations and further discussion in the Heap section of the Swift Algorithm Club repository.

This was just one of the many algorithms in the Swift Algorithm Club repository. If you're interested in more, check out the repo.

It's in your best interest to know about algorithms and data structures - they're solutions to many real world problems, and are frequently asked as interview questions. Plus it's fun!

So stay tuned for many more tutorials from the Swift Algorithm Club in the future. In the meantime, if you have any questions on implementing heaps in Swift, please join the forum discussion below!

Note: The Swift Algorithm Club is always looking for more contributors. If you've got an interesting data structure, algorithm, or even an interview question to share, don't hesitate to contribute! To learn more about the contribution process, check out our Join the Swift Algorithm Club article.

The post Swift Algorithm Club: Heap and Priority Queue Data Structure appeared first on Ray Wenderlich.