Heapsort Explained: Algorithm, Implementation, and Complexity Analysis

What is heapsort?

Heapsort is a comparison-based sorting algorithm that uses a binary heap to sort an array in O(n log n) time.

Heapsort sorts arrays by inserting the data into the heap and then repeatedly extracting the root of the heap.

How the heapsort algorithm works:

  1. Build a max-heap to store the data from an unsorted list.
  2. Extract the largest value (the root) from the heap and place it into a sorted list.
  3. Replace the root of the heap with the last element in the heap. Then, restore the heap property.
  4. Repeat steps 2 and 3. Once the max-heap is empty, return the sorted list.

Now that we’ve got a brief overview of heapsort, let’s dive deep and explore how it works in detail.

How heapsort works

We’ll explore the detailed process of how the heapsort algorithm transforms an unsorted array into a sorted one using heap operations.

Step 1: Build a max-heap

In the heapsort algorithm, we first convert our unsorted array into a max-heap. A max-heap keeps the largest value at the root, and every parent node is greater than or equal to its children. Starting with this unsorted array:

[14, 11, 2, 20, 3, 10, 5]

Arranging these values as a max-heap gives a list like this (this is one valid layout; the exact order depends on how the heap is built):

[20, 11, 14, 2, 10, 5, 3]

We can visualize the max-heap like this:

A diagram visualizing the max-heap created for implementing heapsort
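One way to sanity-check the heap array above is to test the max-heap property directly. The sketch below assumes the usual array layout, where the children of index i sit at indices 2*i + 1 and 2*i + 2. The helper name is_max_heap is illustrative and is not part of the implementation shown later in this article.

```python
def is_max_heap(values):
    """Return True if every parent is >= each of its children."""
    for child in range(1, len(values)):
        parent = (child - 1) // 2  # parent index in the array layout
        if values[parent] < values[child]:
            return False
    return True


print(is_max_heap([20, 11, 14, 2, 10, 5, 3]))  # True
print(is_max_heap([11, 20, 14, 2, 10, 5, 3]))  # False: 20 sits below 11
```

Note that the check only compares parents with children. Siblings such as 11 and 14 can appear in either order, which is why a heap is not yet a sorted list.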

Step 2: Extract the root of the heap

In order to sort our data, we’ll repeatedly extract and remove the largest value from the heap until it’s empty. By following the rule of heaps, we can expect to find the largest value located at the root of the heap.

After removing the largest value, we can’t just leave our heap without a root because that would cause us to have two orphaned nodes. Instead, we can swap our root node with the last element in the heap. Since the last element has no children, we can easily remove the value from the heap.

This step does cause one major problem. After swapping the two elements, our root node no longer holds the largest value in the heap. So, we’ll need to restructure the heap to restore the max-heap property.

Step 3: Restore the heap

With the root no longer holding the largest value, we’ve violated an important rule about max-heaps: a parent must contain a value that is greater than or equal to its children’s values.

We can fix this by using heapify down. To perform heapify down, we’ll first compare our new root value to its children. If the larger child is greater than the root value, we’ll swap the two. We’ll continue working our way down the heap until the value is at least as large as both of its children or has no children left:

A GIF demonstrating heapify down in heapsort

In this example, we swap the original root value 20 with the last element, 3, and remove 20 from the heap. With 3 as the new root, we compare it to its children, 11 and 14. Since 14 is the larger child and is greater than 3, we swap the two values and make 14 the new root. Next, we compare 3 to its one remaining child, 5. Once again, the child value is greater than its parent, so we swap 3 and 5. With no more children to compare 3 to, the heap property has been restored.
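The single extraction described above can be traced in code. This is a minimal sketch, assuming the heap is stored as the array [20, 11, 14, 2, 10, 5, 3]. The function name sift_down is a hypothetical helper that performs heapify down with a loop.

```python
def sift_down(heap, size, i):
    """Move heap[i] down until both children (within size) are <= it."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < size and heap[left] > heap[largest]:
            largest = left
        if right < size and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return  # heap property holds at this node
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


heap = [20, 11, 14, 2, 10, 5, 3]
last = len(heap) - 1
heap[0], heap[last] = heap[last], heap[0]  # 20 moves to the end, 3 becomes the root
sift_down(heap, last, 0)                   # only the first `last` slots are still a heap
print(heap)  # [14, 11, 5, 2, 10, 3, 20]
```

The first six slots form a valid max-heap again, and 20 rests in its final sorted position at the end of the array.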

Step 4: Repeat the process

We’ll repeat the process of swapping the root and the last element, extracting the largest value, and restoring the heap while the heap has a size greater than 1. Once only one element remains, the values are in sorted order.

Now that we understand the working mechanism, let’s implement heapsort in Python.

Heapsort Python implementation

Here’s a complete Python implementation of the heapsort algorithm with detailed explanations:

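def heapify(arr, n, i):
    """Restore the max-heap property for the subtree rooted at index i."""
    largest = i        # Assume the root of this subtree is the largest
    left = 2 * i + 1   # Index of the left child
    right = 2 * i + 2  # Index of the right child

    # Pick the largest of the root and its children (within the first n items)
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right

    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]  # Swap
        heapify(arr, n, largest)  # Continue down the affected subtree


def heap_sort(arr):
    """Sort arr in place in ascending order."""
    n = len(arr)

    # Build max heap, starting from the last non-leaf node
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    # Extract elements one-by-one
    for i in range(n - 1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]  # Move the current maximum to the end
        heapify(arr, i, 0)  # Restore the heap on the remaining i items


# Example usage
arr = [4, 10, 3, 5, 1]
heap_sort(arr)
print("Sorted array:", arr)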

The output will look like this:

Sorted array: [1, 3, 4, 5, 10]

Let’s now go through the time and space complexity of heapsort.

Time and space complexity analysis of heapsort

Heapsort has the same O(n log n) bound regardless of the initial order of the input: building the heap takes O(n) time, and each of the n - 1 extractions takes O(log n) time to restore the heap. Here is the time and space complexity of heapsort:

Type                            Complexity
Best-case time complexity       O(n log n)
Average-case time complexity    O(n log n)
Worst-case time complexity      O(n log n)
Space complexity                O(1)
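To see these bounds in practice, we can count comparisons for differently ordered inputs. The following is a rough sketch rather than a benchmark: heap_sort_counting is a hypothetical instrumented variant that sorts a copy of the input with a loop-based heapify down, and the input size of 1024 is an arbitrary choice.

```python
import math
import random


def heap_sort_counting(values):
    """Heapsort a copy of values; return (sorted_list, comparisons)."""
    arr = list(values)
    comparisons = 0

    def sift_down(size, i):
        nonlocal comparisons
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < size:
                comparisons += 1
                if arr[left] > arr[largest]:
                    largest = left
            if right < size:
                comparisons += 1
                if arr[right] > arr[largest]:
                    largest = right
            if largest == i:
                return
            arr[i], arr[largest] = arr[largest], arr[i]
            i = largest

    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):  # build phase
        sift_down(n, i)
    for end in range(n - 1, 0, -1):      # extraction phase
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(end, 0)
    return arr, comparisons


n = 1024
inputs = {
    "sorted": list(range(n)),
    "reversed": list(range(n, 0, -1)),
    "random": random.sample(range(n), n),
}
for name, data in inputs.items():
    result, count = heap_sort_counting(data)
    print(f"{name:>8}: {count} comparisons, n*log2(n) = {int(n * math.log2(n))}")
```

For all three orderings, the count should stay within a small constant multiple of n*log2(n), since each heapify down makes at most two comparisons per level of the heap.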

Lastly, let’s discuss the advantages and disadvantages of heapsort.

Heapsort advantages and disadvantages

Heapsort offers several advantages, including:

  • Time efficiency: Always guarantees O(n log n) time complexity for best, average, and worst cases.
  • In-place sorting: Needs only O(1) auxiliary space when heapify is written iteratively (the recursive heapify shown above also uses O(log n) call-stack space), making it suitable for memory-constrained environments.
  • Not recursive: Can be implemented without recursion, reducing stack overhead.
  • No worst-case performance drop: Unlike Quick Sort, heapsort doesn’t degrade to O(n²) in any scenario.
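As an illustration of the non-recursive point, here is a minimal sketch of heapsort in which heapify down is a loop. The names heap_sort_iterative and sift_down are illustrative. The logic mirrors the recursive implementation shown earlier, but it avoids the call stack.

```python
def heap_sort_iterative(arr):
    """In-place heapsort whose heapify down uses a loop, not recursion."""
    n = len(arr)

    def sift_down(size, root):
        while True:
            child = 2 * root + 1              # left child
            if child >= size:
                return                        # root is a leaf
            if child + 1 < size and arr[child + 1] > arr[child]:
                child += 1                    # right child is larger
            if arr[root] >= arr[child]:
                return                        # heap property holds
            arr[root], arr[child] = arr[child], arr[root]
            root = child

    for i in range(n // 2 - 1, -1, -1):       # build the max-heap
        sift_down(n, i)
    for end in range(n - 1, 0, -1):           # move each maximum to the end
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(end, 0)


data = [4, 10, 3, 5, 1]
heap_sort_iterative(data)
print(data)  # [1, 3, 4, 5, 10]
```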

However, heapsort has some disadvantages as well:

  • Not stable: Does not preserve the relative order of equal elements.
  • Slower in practice than Quick Sort: Due to higher constant factors and less efficient memory access patterns.
  • Poor cache performance: Accesses memory in a non-sequential manner, leading to more cache misses compared to algorithms like Merge Sort.
  • Less intuitive to implement: Heap operations can be trickier to implement and debug, especially when building and maintaining the heap property.
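The stability issue is easy to reproduce. The sketch below assumes records are (key, label) tuples sorted by key only. The function heap_sort_by_key is a hypothetical variant written for this demonstration, and Python's built-in sorted, which is stable, serves as the comparison.

```python
from operator import itemgetter


def heap_sort_by_key(records, key):
    """Heapsort a copy of records, comparing only key(record)."""
    arr = list(records)

    def sift_down(size, root):
        while True:
            largest = root
            for child in (2 * root + 1, 2 * root + 2):
                if child < size and key(arr[child]) > key(arr[largest]):
                    largest = child
            if largest == root:
                return
            arr[root], arr[largest] = arr[largest], arr[root]
            root = largest

    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(n, i)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(end, 0)
    return arr


records = [(2, "a"), (1, "b"), (2, "c")]
print(heap_sort_by_key(records, itemgetter(0)))  # [(1, 'b'), (2, 'c'), (2, 'a')]
print(sorted(records, key=itemgetter(0)))        # [(1, 'b'), (2, 'a'), (2, 'c')]
```

Both results are ordered by key, but heapsort places (2, 'c') before (2, 'a'), reversing their original order.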

These trade-offs make heapsort a good fit when a guaranteed worst case or low memory use matters more than raw speed or stability.

Conclusion

In this guide, we discussed what heapsort is and how it works. We went through its Python implementation and analyzed its time and space complexity. We also took a look at the advantages and disadvantages of heapsort.

Heapsort is a powerful sorting algorithm with consistent performance and minimal memory overhead. Though it’s not the fastest for general-purpose sorting compared to Quick Sort, its worst-case performance and in-place behavior make it valuable in specific contexts.

If you want to learn more about sorting algorithms, check out the Learn Sorting Algorithms with Python course on Codecademy.

Frequently asked questions

1. What is the relationship between a heap and heapsort?

Heapsort relies directly on the heap data structure, particularly a binary max-heap. It uses the heap to repeatedly identify and remove the largest remaining element and place it at the end of the array. This process ensures a sorted order is achieved.
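Python's standard library makes this relationship easy to see. The sketch below uses the heapq module, which provides a binary min-heap rather than the max-heap used in this article, so the smallest element comes out first. It also builds a new list instead of sorting in place, so it illustrates the idea rather than replacing the in-place algorithm.

```python
import heapq


def heap_sort_with_heapq(values):
    """Return a new sorted list by draining a min-heap."""
    heap = list(values)
    heapq.heapify(heap)  # O(n) build; the smallest item is at heap[0]
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort_with_heapq([4, 10, 3, 5, 1]))  # [1, 3, 4, 5, 10]
```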

2. What is the difference between heapify and heapsort?

  • Heapify is the process of arranging a binary tree (or array) so that it satisfies the heap property, as either a max-heap or a min-heap. In the implementation above, heapify(arr, n, i) restores the property for the subtree rooted at index i, and calling it on every non-leaf node from the bottom up builds the whole heap.
  • Heapsort uses the heapify process as a foundational step. Once the array is heapified into a max-heap, heapsort repeatedly removes the maximum element and re-heapifies the reduced heap to sort the array.

3. Is heapsort the fastest?

Not in practice. While heapsort has a guaranteed worst-case time complexity of O(n log n), it is generally slower than Quick Sort due to higher constant factors and poor cache performance. However, heapsort’s consistent performance and low memory usage make it valuable in specific scenarios.

4. Why is heapsort not stable?

Heapsort is not stable because it swaps elements that are not adjacent, and equal elements may not maintain their relative order. This instability is due to the structure of the heap and how elements are moved during re-heapification.

5. Which is better, Quick Sort or Heapsort?

  • Quick Sort is usually faster in real-world scenarios due to better cache performance and lower constant overhead, especially with optimized implementations.
  • Heapsort, on the other hand, is more predictable with O(n log n) performance in all cases and is memory-efficient since it’s in-place and doesn’t require additional space.

In short:

  • Choose Quick Sort for speed (unless worst-case performance matters).
  • Choose heapsort for space-constrained environments or when consistent performance is critical.
Codecademy Team

'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'
