Module bin_heap

Source
Expand description

An arena-allocated (min) binary heap implementation

§About

This structure represents a min heap (by default). This means that elements with the smallest ordered values sit at the top of the heap (and get popped first). E.g. an element with priorty 6 gets popped before an element with the priority 9. You can implement std::cmp::Reverse to construct max heaps, though it does introduce wrapping which prevents seamless type compatability or interface consistency.

Push and pop operations are guaranteed to run in O(log(n)) time, but sorting is amortized to O(n * log(n)) time, which is exacerbated by reverse-sorted inputs.

§Design

A Vec-backed structure is particularly appropriate for heap implementation due to the complete binary tree invariant. Complete binary trees allow you to infer a lot of information in an indexed structure that you would not get with a linked structure.

  • With a complete tree you know that the maximum depth (height) of the tree is always always ⌊log2(n)⌋ where n is the number of elements in the heap.

  • Finding an insert point is always O(1) in an indexed structure with a known length. Finding an insert point in a linked backing structure requires O(log(n)) traversal.

  • Element types (parent, left child, and right child) can be determined by a mathematical relationship in the backing structure. For example, for index i, if heap[i] == root, then i + 1 == root.left, and i + 2 == root.right. This can be extended to any element in the heap such that left children are always 2i + 1, and right children are always 2i + 2, and parents are always (i - 1) / 2.

  • You also know that indexes in a heap with backing structure list contain (list.len() / 2)..list.len() leaf nodes, and all indexes 0..(list.len() / 2) are parent nodes. This, of course, relies on integer division (division in which all remainders are truncated/discarded, not rounded).

For a visual example, consider the following layout:

Index01234567891011
ValueABCDEFGHIJKL

This produes the following heap structure:

             A
         /       \
       B           C
     /   \        / \
    D     E      F   G
   / \   / \    /
  H   I J   K  L

You can clearly see that nodes at indexes 6 through 11 are leaf nodes, corresponding to the range (list.len() / 2)..list.len() when list.len() == 12. Conversely, indexes 0 through 5 are parent nodes, corresponding to the range 0..(list.len() / 2).

You can also see that all indexes of the form 2i + 1 are left children. E.g. indexes [1, 3, 5, 7, 9, 11] are all left children. The parent of a node at index i is always at (i - 1) / 2, which produces the mapping:

    M = { (0,3), (1,2), (2,2), (3,2), (4,2), (5,1) }

§Example

    use dsa_rust::hierarchies::bin_heap::BinHeap;

    // Defines an arbitrary entry object
    // NOTE: Clone is only necessary for from_slice() and list reuse
    #[derive(Clone, Eq, PartialEq, PartialOrd)]
    pub struct Job<'a> {
        pub priority: usize,
        pub title: &'a str,
    }

    // Heaps require total ordering, and BinHeap places an Ord trait bound on T.
    // This implementation guarantees ordering by the priority field only.
    // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
    // are made using the Ord implementation.
    use std::cmp::Ordering;
    impl Ord for Job<'_> {
        fn cmp(&self, other: &Self) -> Ordering {
            self.priority.cmp(&other.priority)
        }
    }

    let list = vec![
        Job {
            priority: 9,
            title: "Dichael",
        },
        Job {
            priority: 13,
            title: "Damiel",
        },
        Job {
            priority: 8,
            title: "Sleve",
        },
        Job {
            priority: 6,
            title: "Peter",
        },
        Job {
            priority: 14,
            title: "Flank",
        },
        Job {
            priority: 11,
            title: "Cork",
        },
        Job {
            priority: 12,
            title: "Bobson",
        },
    ];

    // Heapify options
    //////////////////

    // Option 1: from_slice() in O(n) time
    let mut heap = BinHeap::from_slice(&list);
    assert_eq!(heap.pop().unwrap().title, "Peter");
    assert_eq!(heap.pop().unwrap().title, "Sleve");

    // Option 2: from_vec() in O(n) time, clones purely for
    // list reuse in this test
    let mut heap = BinHeap::from_vec(list.clone());
    assert_eq!(heap.pop().unwrap().title, "Peter");
    assert_eq!(heap.pop().unwrap().title, "Sleve");

    // Examples that use Reverse to construct a max heap
    use std::cmp::Reverse;

    // Option 1: O(n) reverse heapify, clones purely for list reuse
    let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
    let mut heap = BinHeap::from_vec(reversed_vec);
    assert_eq!(heap.pop().unwrap().0.title, "Flank");
    assert_eq!(heap.pop().unwrap().0.title, "Damiel");

    // Option 2: O(n * log(n)) incremental reverse heap construction
    let mut heap = BinHeap::new();
    for e in list {
        heap.push(Reverse(e))
    }
    assert_eq!(heap.pop().unwrap().0.title, "Flank");
    assert_eq!(heap.pop().unwrap().0.title, "Damiel");

    // Dirty lil heap sort
    //////////////////////

    let list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];

    // Creates a heap in O(n) time
    let mut heap = BinHeap::from_slice(&list);

    // Creates a sorted list in O(n * log(n)) time
    let mut sorted_list: Vec<i32> = Vec::new();
    for _ in 0..list.len() {
        sorted_list.push(heap.pop().expect("oh noez"))
    }

    assert_eq!(sorted_list, [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56]);

Structs§

BinHeap
About