Algorithms: Part 5 - Heaps, Heapsort and Priority Queues


Heap and Heapsort
Definitions
- Heap: A tree based data structure
- Binary Heap: A tree data structure with 2 children for each node
- Max Heap: A heap where the maximum value is at the root
- Heapsort: A sorting algorithm based on a binary heap
Binary Heap
A Binary Heap is a datastructure that has two children per each node.
It’s used as the foundational datastucture when creating Priority Queues and sorting using Heapsort.

Binary Heap is characterized by storing the items in an array and referencing the root and children using the following calculation of their index positions:
- left child: 2*idx + 1
- right child: 2*idx + 2
- root: [i-1 / 2]
Heapsort
Heapsort is a sorting algorithm that has:
- Runtime: $O(n log n)$
- Space Complexity: $O(1)$
Heapsort is an inplace sorting algorithm that uses an array to represent the heap.
This is beneficial since it doesn’t need auxiliary space and can store all items in a contiguous array.
Heapsort starts with the heapify function, this function moves the largest item to the root. This is required to prepare the Heap for sorting.
Heapify
Heapify has a runtime of $O(log n)$.
It starts from the last non-leaf node, meaning the last node with children. This is calculated using $n / 2 - 1$.
From this point, heapify is applied to each subheap. It compares the values at the children and root, swapping the largest valued item to the root.
After swapping the largest child to root, heapify is called recursively to check if the child position has itself, smaller child nodes. If that’s the case, it will continue to swap and maintain the order of the heap, causing the largest item in the heap to bubble up.
This process repeats until the root, allowing the largest item to be placed at the highest root (0th index).
This process builds a Max Heap, which is required before sorting.
void heapify(int *arr, int arr_size, int idx) {
int largest = idx;
int left = 2 * idx + 1;
int right = 2 * idx + 2;
if (left < arr_size && arr[left] > arr[largest]) {
largest = left;
}
if (right < arr_size && arr[right] > arr[largest]) {
largest = right;
}
if (largest != idx) {
_swap(&arr[idx], &arr[largest]);
heapify(arr, arr_size, largest);
}
}
void build_max_heap(int *arr, int arr_size) {
// Start from a non-leaf node, since this is already a heap.
for (int i = arr_size / 2 - 1; i >= 0; i--) {
heapify(arr, arr_size, i);
}
}
Heapsort Algorithm
After building the max heap, we can now sort the heap since we have the largest item at the root.
We iterate in reverse, from the end to start. By swapping the the highest position (initial root) with the end, we are essentially moving the largest item to the end of the heap/array, and calling heapify to allow the bubbling up of the next largest item to the root.
This process repeats, decrementing the end position (i) by one each time, allowing the already sorted items to the right of i (the end of the array/heap) to remain in place.
At the end of the loop, the heap will be sorted. Since heapify is called $n$ times, and heapify has a runtime of $O(log n)$, this results in a final runtime of $O(n log n)$.
bool heapsort(int *arr, size_t arr_size) {
if (arr_size < 2) {
return false;
}
build_max_heap(arr, arr_size);
// Start sorting by swapping the root with last position and continuously
// calling using `n-1`.
for (int i = arr_size - 1; i >= 0; i--) {
_swap(&arr[0], &arr[i]);
heapify(arr, i, 0);
}
return true;
}
Priority Queues
Priority Queues are datastructures that use a heap to maintain the highest priority item and a loosely structure ordering of priority for the rest of the items.
Priority Queues are used in a variety of settings:
- OS use them to schedule the processes with highest priority
- Load Balancing, when certain tasks need to be prioritzed
A priority queue is a very efficient Datastructure when processing the most important task.
Since its based on a min/max heap, most behaviour of functions in the datastructure has a runtime of $O(log n)$.
Below is an example of the methods of a PriorityQueue:
insert
: Adds an item to the PriorityQueue (heap). It has a runtime of $O(log n)$, since after adding the new item,heapify
needs to be called to rebalance the heap and ensure the highest priority is bubbled up to the root.peek
: Has a runtime of $O(1)$ since it can return a copy or a reference to the root element.extract_max
: Has a runtime of $O(log n)$. It returns the root element, removes it from the heap and callsheapify
to place the next highest prioty at the root.increase_key
: Increases the value of a certain item. It has a runtime of $O(log n)$ since after updating the value of the key, it callsheapify
to update the rough ordering of items and also the highest root.
Below is a simple example of a Priority Queue:
#include "../include/priority_queue.h"
#include "../../../algos/heap-sort/include/heap_sort.h"
#include "../../../common/include/swap.h"
#include "../../../common/include/types.h"
#include <stdlib.h>
#include <string.h>
PriorityQueue init_queue(size_t size) {
PriorityQueue pqueue;
pqueue.init_size = size;
pqueue.curr_size = 0;
pqueue.arr = malloc(size * sizeof(int));
// Set all values initially to -1
memset(pqueue.arr, -1, pqueue.init_size * sizeof(int));
return pqueue;
}
void free_queue(PriorityQueue *pqueue) {
if (pqueue->arr != NULL) {
free(pqueue->arr);
pqueue->curr_size = 0;
pqueue->init_size = 0;
}
}
int peek(PriorityQueue *pqueue) { return pqueue->arr[0]; }
int extract_max(PriorityQueue *pqueue) {
int max = pqueue->arr[0];
if (max == -1) {
return max;
}
pqueue->arr[0] = -1;
pqueue->curr_size--;
build_max_heap(pqueue->arr, pqueue->init_size);
return max;
}
void insert(PriorityQueue *pqueue, int value) {
pqueue->arr[pqueue->init_size - 1] = value;
pqueue->curr_size++;
build_max_heap(pqueue->arr, pqueue->init_size);
}
int get_parent_val(PriorityQueue *queue, size_t idx) {
size_t pos = idx - 1 / 2;
return queue->arr[pos];
}
void increase_key(PriorityQueue *pqueue, size_t idx, int new_val) {
// Index must be within bounds.
if (idx > pqueue->init_size) {
return;
}
// Can't increase value if less than the index value.
if (new_val < pqueue->arr[idx]) {
return;
}
pqueue->arr[idx] = new_val;
build_max_heap(pqueue->arr, pqueue->init_size);
}