12.2 heapq and Priority Queues in Python

The heapq module in Python provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. Heaps are binary trees that maintain a special property: for a min-heap, the smallest element is always at the root, and for a max-heap, the largest element is at the root.

The heapq module provides functions to create and manipulate heaps stored in ordinary Python lists. A heap is a common way to implement a priority queue, because it keeps the highest-priority element cheap to reach. Heaps are useful for tasks like finding the smallest or largest items in a collection, managing queues based on priority, and sorting datasets.

In this section, we’ll explore how to use heapq for priority queues, how to create and manipulate heaps, and how to solve common problems using heap-based approaches.


12.2.1 What is a Heap?

A heap is a specialized binary tree data structure where every parent node is either less than or equal to (in a min-heap) or greater than or equal to (in a max-heap) its children. This property lets a heap report the minimum or maximum element in constant time (O(1)), because that element is always at the root. Inserting an element or removing the root takes logarithmic time (O(log n)), since the tree has to be rebalanced along one path.
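
heapq stores the tree in a flat list: the children of the element at index k sit at indexes 2k + 1 and 2k + 2. The following is a minimal sketch of what the min-heap property means for such a list. The helper is_min_heap is a hypothetical name introduced here for illustration; it is not part of heapq.

```python
def is_min_heap(items):
    """Return True if the list satisfies the min-heap property."""
    n = len(items)
    for parent in range(n):
        # Children of index k live at 2k + 1 and 2k + 2
        for child in (2 * parent + 1, 2 * parent + 2):
            if child < n and items[child] < items[parent]:
                return False
    return True


print(is_min_heap([1, 5, 20, 10]))  # True
print(is_min_heap([10, 5, 20, 1]))  # False
```

Note that a valid heap is not necessarily a sorted list: only the parent-child relationship is guaranteed.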

Python’s heapq module implements a min-heap by default, but you can easily adapt it to work as a max-heap.


12.2.2 Basic Heap Operations with heapq

The heapq module provides several useful functions to work with heaps:

  • heapq.heappush(heap, item): Pushes a new item onto the heap, maintaining the heap property.
  • heapq.heappop(heap): Pops the smallest item off the heap, maintaining the heap property.
  • heapq.heappushpop(heap, item): Pushes a new item onto the heap, then pops and returns the smallest item.
  • heapq.heapify(x): Transforms the list x into a heap in-place, in linear time.
  • heapq.nlargest(n, iterable): Returns the n largest elements from the iterable.
  • heapq.nsmallest(n, iterable): Returns the n smallest elements from the iterable.

Example: Creating a Min-Heap

import heapq

# Create an empty list to use as a heap
min_heap = []

# Push elements onto the heap
heapq.heappush(min_heap, 10)
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 20)
heapq.heappush(min_heap, 1)

# Print the heap
print(min_heap)  # Output: [1, 5, 20, 10]

# Pop the smallest element
smallest = heapq.heappop(min_heap)
print(smallest)  # Output: 1

# Print the heap after popping
print(min_heap)  # Output: [5, 10, 20]

In this example:

  • heappush() adds elements to the heap, maintaining the heap property.
  • heappop() removes and returns the smallest element, which is always at the root of the heap.
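
The heappushpop() function from the list above is not used in this example. One place it may come in handy is keeping only the k largest values of a stream in a min-heap of size k. The sketch below assumes a hypothetical helper named top_k; it is one possible approach, not the only one.

```python
import heapq


def top_k(stream, k):
    """Return the k largest values seen in stream, largest first."""
    heap = []  # min-heap holding the k largest values so far
    for value in stream:
        if len(heap) < k:
            heapq.heappush(heap, value)
        else:
            # Push the new value, then drop the smallest of the k + 1 candidates
            heapq.heappushpop(heap, value)
    return sorted(heap, reverse=True)


print(top_k([15, 3, 9, 20, 8, 12], 3))  # [20, 15, 12]
```

Because the heap never grows beyond k elements, memory use stays bounded even when the stream is very long.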

12.2.3 Using heapq for Priority Queues

A priority queue is a data structure where each element has a priority associated with it. Elements are dequeued in order of their priority, not necessarily in the order they were enqueued. The heapq module can be used to implement a priority queue by storing each element together with its priority, so that the heap orders entries by priority.

Example: Priority Queue with Tuples

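In Python, tuples are compared lexicographically: the first elements are compared first, and later elements are consulted only to break ties. Storing entries as (priority, item) tuples therefore orders the queue by priority. When two priorities are equal, the items themselves are compared, so they must be comparable with each other.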

import heapq

# Create an empty list to use as a priority queue
priority_queue = []

# Push elements as (priority, item) tuples
heapq.heappush(priority_queue, (2, 'task 2'))
heapq.heappush(priority_queue, (1, 'task 1'))
heapq.heappush(priority_queue, (3, 'task 3'))

# Pop the element with the highest priority (lowest number)
priority_item = heapq.heappop(priority_queue)
print(priority_item)  # Output: (1, 'task 1')

# Pop the next item
priority_item = heapq.heappop(priority_queue)
print(priority_item)  # Output: (2, 'task 2')

In this example:

  • heappush() adds tuples to the priority queue, where the first element is the priority.
  • heappop() removes the item with the highest priority (the lowest priority number).
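
When two entries share a priority, the tuple comparison falls through to the items, which raises a TypeError if the items cannot be compared (dictionaries, for example). A common remedy is to add a unique, increasing counter as the second tuple element. The sketch below assumes hypothetical helpers add_task and pop_task and made-up task data.

```python
import heapq
import itertools

queue = []
counter = itertools.count()  # monotonically increasing tie-breaker


def add_task(priority, task):
    """Push a task; the counter keeps equal-priority tasks in insertion order."""
    heapq.heappush(queue, (priority, next(counter), task))


def pop_task():
    """Remove and return the task with the lowest priority number."""
    priority, _, task = heapq.heappop(queue)
    return task


add_task(2, {"name": "write report"})
add_task(1, {"name": "fix bug"})
add_task(1, {"name": "review PR"})

print(pop_task())  # {'name': 'fix bug'}
print(pop_task())  # {'name': 'review PR'}
print(pop_task())  # {'name': 'write report'}
```

Since every counter value is distinct, the comparison never reaches the task itself, and tasks with equal priority come out in the order they were added.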

12.2.4 Converting a List into a Heap

You can use heapq.heapify() to convert an existing list into a min-heap in-place. It runs in linear time, which is faster than pushing the elements one by one, so it is useful when you already have a dataset and want efficient access to its smallest element.

Example: Using heapq.heapify()

import heapq

# Create a list of numbers
nums = [15, 3, 9, 20, 8, 12]

# Convert the list into a heap
heapq.heapify(nums)

# Print the heap
print(nums)  # Output: [3, 8, 9, 20, 15, 12]

# Pop the smallest element
smallest = heapq.heappop(nums)
print(smallest)  # Output: 3

In this example:

  • heapq.heapify() rearranges the list into a valid heap structure.
  • You can now use heap operations like heappop() to efficiently retrieve the smallest element.
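
Combining heapify() with repeated heappop() calls gives a simple heapsort. The following is a minimal sketch; heapsort is a name chosen here for illustration, and for everyday sorting the built-in sorted() is normally the better choice.

```python
import heapq


def heapsort(iterable):
    """Return a new sorted list built by repeatedly popping a heap."""
    heap = list(iterable)  # copy so the caller's data is not modified
    heapq.heapify(heap)
    # Each pop returns the smallest remaining element
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heapsort([15, 3, 9, 20, 8, 12]))  # [3, 8, 9, 12, 15, 20]
```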

12.2.5 Finding the Largest or Smallest n Items

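The heapq.nlargest() and heapq.nsmallest() functions retrieve the n largest or smallest elements from a dataset without sorting all of it. They work best when n is small relative to the size of the dataset. For n = 1, the built-in min() and max() are simpler, and when n is close to the size of the dataset, sorted() is usually faster.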

Example: Finding the Largest and Smallest Elements

import heapq

# List of numbers
nums = [15, 3, 9, 20, 8, 12]

# Find the 3 largest elements
largest_three = heapq.nlargest(3, nums)
print(largest_three)  # Output: [20, 15, 12]

# Find the 2 smallest elements
smallest_two = heapq.nsmallest(2, nums)
print(smallest_two)  # Output: [3, 8]

In this example:

  • heapq.nlargest() retrieves the 3 largest elements from the list.
  • heapq.nsmallest() retrieves the 2 smallest elements from the list.
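
Both functions also accept an optional key argument, which works like the key argument of sorted(). The sketch below uses made-up product records to show how it might be applied to a list of dictionaries.

```python
import heapq

# Hypothetical sample records for illustration
products = [
    {"name": "keyboard", "price": 45},
    {"name": "monitor", "price": 180},
    {"name": "mouse", "price": 20},
    {"name": "laptop", "price": 950},
]

# Rank the records by their "price" field
cheapest = heapq.nsmallest(2, products, key=lambda p: p["price"])
priciest = heapq.nlargest(1, products, key=lambda p: p["price"])

print([p["name"] for p in cheapest])  # ['mouse', 'keyboard']
print([p["name"] for p in priciest])  # ['laptop']
```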

12.2.6 Implementing a Max-Heap

By default, Python’s heapq module implements a min-heap. To simulate a max-heap with numeric values, negate each value when pushing it onto the heap and negate it again when popping. The largest original value becomes the smallest negated value, so it ends up at the root.

Example: Creating a Max-Heap

import heapq

# Create an empty list to use as a max-heap
max_heap = []

# Push negated values onto the heap to simulate a max-heap
heapq.heappush(max_heap, -10)
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -20)
heapq.heappush(max_heap, -1)

# Pop the largest value (remember to negate it back)
largest = -heapq.heappop(max_heap)
print(largest)  # Output: 20

# Print the heap after popping
print([-x for x in max_heap])  # Output: [10, 5, 1]

In this example:

  • The values are negated before pushing onto the heap to simulate a max-heap.
  • When popping, the values are negated again to return the correct (positive) value.
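
Negating by hand at every call site is easy to forget. One option is to hide the trick behind a small class. MaxHeap below is a hypothetical wrapper written for this illustration, not something heapq provides, and it assumes the values are numeric.

```python
import heapq


class MaxHeap:
    """Hypothetical wrapper that hides the negation trick (numeric values only)."""

    def __init__(self, values=()):
        self._data = [-v for v in values]
        heapq.heapify(self._data)

    def push(self, value):
        heapq.heappush(self._data, -value)

    def pop(self):
        """Remove and return the largest value."""
        return -heapq.heappop(self._data)

    def peek(self):
        """Return the largest value without removing it."""
        return -self._data[0]

    def __len__(self):
        return len(self._data)


heap = MaxHeap([10, 5, 20, 1])
heap.push(15)
print(heap.pop())   # 20
print(heap.peek())  # 15
print(len(heap))    # 4
```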

12.2.7 Merging Sorted Iterables

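The heapq.merge() function merges multiple sorted iterables into a single sorted sequence. It returns an iterator that yields elements one at a time rather than building a full list in memory, which makes it useful for merging large, already sorted datasets. Each input must already be sorted for the result to be sorted.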

Example: Merging Sorted Lists

import heapq

# Two sorted lists
list1 = [1, 3, 5, 7]
list2 = [2, 4, 6, 8]

# Merge the two lists
merged = heapq.merge(list1, list2)

# Print the merged sorted list
print(list(merged))  # Output: [1, 2, 3, 4, 5, 6, 7, 8]

In this example:

  • heapq.merge() merges two sorted lists into a single sorted iterable without requiring additional sorting.
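
heapq.merge() also accepts key and reverse keyword arguments. The sketch below uses made-up (timestamp, message) log entries to show how records from two sources might be merged by timestamp.

```python
import heapq

# Hypothetical (timestamp, message) records, each list already sorted by timestamp
server_a = [(1, "a: start"), (4, "a: request"), (9, "a: stop")]
server_b = [(2, "b: start"), (3, "b: request"), (8, "b: stop")]

# merge() returns an iterator; entries are produced one at a time
merged = heapq.merge(server_a, server_b, key=lambda entry: entry[0])

for timestamp, message in merged:
    print(timestamp, message)

# Inputs sorted in descending order can be merged with reverse=True
print(list(heapq.merge([7, 5, 1], [8, 4, 2], reverse=True)))  # [8, 7, 5, 4, 2, 1]
```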

12.2.8 Summary

The heapq module provides efficient heap-based operations, making it ideal for tasks that require constant-time access to the smallest element in a dataset (or the largest, when values are negated). You can use heapq to implement priority queues, min-heaps, max-heaps, and other efficient sorting or queuing algorithms.

Key Functions:

  • heappush(): Adds an item to the heap, maintaining the heap property.
  • heappop(): Removes and returns the smallest item from the heap.
  • heapify(): Converts a list into a heap.
  • nlargest() and nsmallest(): Return the n largest or smallest items from a dataset.
  • merge(): Merges multiple sorted iterables.

By mastering the heapq module, you can write efficient algorithms for tasks like scheduling, resource management, and finding the top n elements in large datasets.