dsa_rust/hierarchies/
bin_heap.rs

1/*! An arena-allocated (min) binary heap implementation
2
3# About
4This structure represents a min heap (by default). This means that elements with the smallest ordered values sit at the top of the heap (and get popped first). E.g. an element with priorty 6 gets popped before an element with the priority 9. You can implement [std::cmp::Reverse] to construct max heaps, though it does introduce wrapping which prevents seamless type compatability or interface consistency.
5
6Push and pop operations are guaranteed to run in `O(log(n))` time, but sorting is amortized to `O(n * log(n))` time, which is exacerbated by reverse-sorted inputs.
7
8# Design
9A `Vec`-backed structure is particularly appropriate for heap implementation due to the complete binary tree invariant. Complete binary trees allow you to infer a lot of information in an indexed structure that you would not get with a linked structure.
10
11- With a complete tree you know that the maximum depth (height) of the tree is always always ⌊log2(`n`)⌋ where `n` is the number of elements in the heap.
12
13- Finding an insert point is always `O(1)` in an indexed structure with a known length. Finding an insert point in a linked backing structure requires `O(log(n))` traversal.
14
15- Element types (parent, left child, and right child) can be determined by a mathematical relationship in the backing structure. For example, for index `i`, if `heap[i] == root`, then `i + 1 == root.left`, and `i + 2 == root.right`. This can be extended to any element in the heap such that left children are always `2i + 1`, and right children are always `2i + 2`, and parents are always `(i - 1) / 2`.
16
17- You also know that indexes in a heap with backing structure `list` contain `(list.len() / 2)..list.len()` leaf nodes, and all indexes `0..(list.len() / 2)` are parent nodes. This, of course, relies on _integer division_ (division in which all remainders are truncated/discarded, not rounded).
18
19For a visual example, consider the following layout:
20| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9  |  10 | 11 |
21|-------|---|---|---|---|---|---|---|---|---|----|-----|----|
22| Value | A | B | C | D | E | F | G | H | I | J  |  K  | L  |
23
24This produes the following heap structure:
25```text
26             A
27         /       \
28       B           C
29     /   \        / \
30    D     E      F   G
31   / \   / \    /
32  H   I J   K  L
33```
34You can clearly see that nodes at indexes 6 through 11 are leaf nodes, corresponding to the range
35`(list.len() / 2)..list.len()` when `list.len() == 12`. Conversely, indexes 0 through 5 are parent nodes, corresponding to the range `0..(list.len() / 2)`.
36
37You can also see that all indexes of the form `2i + 1` are left children. E.g. indexes [1, 3, 5, 7, 9, 11] are all left children. The parent of a node at index `i` is always at `(i - 1) / 2`, which produces the mapping:
38
39```text
40    M = { (0,3), (1,2), (2,2), (3,2), (4,2), (5,1) }
41```
42
43# Example
44
45```rust
46    use dsa_rust::hierarchies::bin_heap::BinHeap;
47
48    // Defines an arbitrary entry object
49    // NOTE: Clone is only necessary for from_slice() and list reuse
50    #[derive(Clone, Eq, PartialEq, PartialOrd)]
51    pub struct Job<'a> {
52        pub priority: usize,
53        pub title: &'a str,
54    }
55
56    // Heaps require total ordering, and BinHeap places an Ord trait bound on T.
57    // This implementation guarantees ordering by the priority field only.
58    // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
59    // are made using the Ord implementation.
60    use std::cmp::Ordering;
61    impl Ord for Job<'_> {
62        fn cmp(&self, other: &Self) -> Ordering {
63            self.priority.cmp(&other.priority)
64        }
65    }
66
67    let list = vec![
68        Job {
69            priority: 9,
70            title: "Dichael",
71        },
72        Job {
73            priority: 13,
74            title: "Damiel",
75        },
76        Job {
77            priority: 8,
78            title: "Sleve",
79        },
80        Job {
81            priority: 6,
82            title: "Peter",
83        },
84        Job {
85            priority: 14,
86            title: "Flank",
87        },
88        Job {
89            priority: 11,
90            title: "Cork",
91        },
92        Job {
93            priority: 12,
94            title: "Bobson",
95        },
96    ];
97
98    // Heapify options
99    //////////////////
100
101    // Option 1: from_slice() in O(n) time
102    let mut heap = BinHeap::from_slice(&list);
103    assert_eq!(heap.pop().unwrap().title, "Peter");
104    assert_eq!(heap.pop().unwrap().title, "Sleve");
105
106    // Option 2: from_vec() in O(n) time, clones purely for
107    // list reuse in this test
108    let mut heap = BinHeap::from_vec(list.clone());
109    assert_eq!(heap.pop().unwrap().title, "Peter");
110    assert_eq!(heap.pop().unwrap().title, "Sleve");
111
112    // Examples that use Reverse to construct a max heap
113    use std::cmp::Reverse;
114
115    // Option 1: O(n) reverse heapify, clones purely for list reuse
116    let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
117    let mut heap = BinHeap::from_vec(reversed_vec);
118    assert_eq!(heap.pop().unwrap().0.title, "Flank");
119    assert_eq!(heap.pop().unwrap().0.title, "Damiel");
120
121    // Option 2: O(n * log(n)) incremental reverse heap construction
122    let mut heap = BinHeap::new();
123    for e in list {
124        heap.push(Reverse(e))
125    }
126    assert_eq!(heap.pop().unwrap().0.title, "Flank");
127    assert_eq!(heap.pop().unwrap().0.title, "Damiel");
128
129    // Dirty lil heap sort
130    //////////////////////
131
132    let list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
133
134    // Creates a heap in O(n) time
135    let mut heap = BinHeap::from_slice(&list);
136
137    // Creates a sorted list in O(n * log(n)) time
138    let mut sorted_list: Vec<i32> = Vec::new();
139    for _ in 0..list.len() {
140        sorted_list.push(heap.pop().expect("oh noez"))
141    }
142
143    assert_eq!(sorted_list, [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56]);
144
145
146```
147
148*/
149
150/// # About
151/// It's `Vec` with a brand new dress.
152///
153/// See the [module-level documentation](crate::hierarchies::bin_heap) for more information.
154#[derive(Debug)]
155pub struct BinHeap<T> {
156    arena: Vec<T>,
157    size: usize,
158}
159impl<T: Ord> Default for BinHeap<T> {
160    fn default() -> Self {
161        Self::new()
162    }
163}
164impl<T> BinHeap<T>
165where
166    T: Ord,
167{
168    // PUBLIC API
169    /////////////
170
171    /// Creates a new, zero-sized `BinHeap`.
172    pub fn new() -> Self {
173        BinHeap {
174            arena: Vec::new(),
175            size: 0,
176        }
177    }
178
179    /// Creates a new `BinHeap` that pre-allocates to the given capacity.
180    pub fn new_with_capacity(capacity: usize) -> Self {
181        BinHeap {
182            arena: Vec::with_capacity(capacity),
183            size: 0,
184        }
185    }
186
187    /// A "heapify" operation that creates a `BinHeap` from any slice of entries
188    /// in `O(n)` time. The entries must be `Clone`.
189    pub fn from_slice(list: &[T]) -> Self
190    where
191        T: Clone,
192    {
193        let mut heap = BinHeap::new();
194        heap.arena = Vec::from(list);
195        let start = (list.len() / 2).saturating_sub(1);
196        for i in (0..=start).rev() {
197            heap.sift_down(i);
198        }
199        heap
200    }
201
202    /// A "heapify" operation that creates a `BinHeap` from a `Vec<T>` in `O(n)`
203    /// time. `T` does not need to be `Clone`.
204    pub fn from_vec(list: Vec<T>) -> Self {
205        let size = list.len();
206        let mut heap = BinHeap { arena: list, size };
207        let start = (size / 2).saturating_sub(1);
208        for i in (0..=start).rev() {
209            heap.sift_down(i);
210        }
211        heap
212    }
213
214    /// Returns the number of elements in the heap.
215    pub fn size(&self) -> usize {
216        self.size
217    }
218
219    /// Adds an element to the heap in `O(log(n))` time.
220    pub fn push(&mut self, data: T) {
221        // 1) Create and insert a Node into the array in a way that maintains the
222        // complete binary tree structure
223        self.arena.push(data);
224
225        // 2) Sift/bubble/percolate the heap to maintain order
226        // in O(log(n)) time
227        self.sift_up(self.arena.len() - 1);
228    }
229
230    /// Returns an immutable reference to the data at the top of the heap
231    /// in `O(1)` time.
232    pub fn peek(&self) -> Option<&T> {
233        if self.arena.is_empty() {
234            None
235        } else {
236            Some(&self.arena[0])
237        }
238    }
239
240    /// Returns and removes the element at the top of the heap in `O(log(n))` time.
241    pub fn pop(&mut self) -> Option<T> {
242        if !self.arena.is_empty() {
243            let node = self.arena.swap_remove(0); // O(1)
244            self.sift_down(0); // O(log(n))
245            Some(node)
246        } else {
247            None
248        }
249    }
250
251    // UTILITIES
252    ////////////
253
254    /// Sifts a node upward from the given index in the tree to maintain the
255    /// heap property.
256    fn sift_up(&mut self, mut index: usize) {
257        // Only sift up if the element is not the root (index 0)
258        while index > 0 {
259            // Calculate the parent's index
260            let parent_index = (index - 1) / 2;
261
262            // For a Min-Heap: If the current node's priority is LESS than its
263            // parent's, swap.
264            // For a Max-Heap: If the current node's priority is GREATER than
265            // its parent's, swap.
266            // Switch < to > for max heap
267            if self.arena[index] < self.arena[parent_index] {
268                self.arena.swap(index, parent_index);
269                // Move up to the parent's old position to continue sifting
270                index = parent_index;
271            } else {
272                break; // No-op, heap is heapy 🍃🧘
273            }
274        }
275    }
276
277    /// Sifts a new node downward in the tree to maintain the heap property.
278    fn sift_down(&mut self, mut index: usize) {
279        loop {
280            let left_child = 2 * index + 1;
281            let right_child = 2 * index + 2;
282            let mut target = index;
283
284            // 1) Finds a target index to swap. Sibling order is not guaranteed,
285            // but you still need to check both children to ensure heap property
286            // when sifting
287            //
288            // NOTE: (Adjust < to > for a max heap)
289            if left_child < self.arena.len() && self.arena[left_child] < self.arena[target] {
290                target = left_child;
291            }
292            if right_child < self.arena.len() && self.arena[right_child] < self.arena[target] {
293                target = right_child;
294            }
295
296            // 2) If the target is not equal to the index, do the swap operation
297            // and continue sifting, otherwise the heap property is true
298            if target != index {
299                self.arena.swap(index, target);
300                index = target; // Move down
301            } else {
302                break; // No-op, heap is heapy 🍃🧘
303            }
304        }
305    }
306
307    /// Sorts a mutable slice into ascending order in place via (max) heap operations in `O(n * log(n))` time.
308    ///
309    /// ```rust
310    /// use dsa_rust::hierarchies::bin_heap::BinHeap;
311    /// let mut v = Vec::from([8, 6, 7, 5, 3, 0, 9]);
312    /// BinHeap::heap_sort(&mut v);
313    /// assert_eq!(v, [0, 3, 5, 6, 7, 8, 9]);
314    /// ```
315    pub fn heap_sort(list: &mut [T])
316    where
317        T: Ord,
318    {
319        let len = list.len();
320
321        // Transform the list into a heap in O(n) time
322        for i in (0..len / 2).rev() {
323            Self::generic_sift_down(list, i, len); // O(log(n))
324        }
325
326        // Sort the heap by sifting/sorting in place in O(n * log(n))
327        for end in (1..len).rev() {
328            list.swap(0, end);
329            Self::generic_sift_down(list, 0, end); // O(log(n))
330        }
331    }
332
333    /// Essentially the as sift_down but takes a list (not a heap as self), and
334    /// performs max heap sifting instead of default min heap sifting.
335    fn generic_sift_down(list: &mut [T], mut root: usize, end: usize) {
336        loop {
337            let left = 2 * root + 1;
338            let right = 2 * root + 2;
339            let mut largest = root;
340
341            if left < end && list[left] > list[largest] {
342                largest = left;
343            }
344            if right < end && list[right] > list[largest] {
345                largest = right;
346            }
347
348            if largest != root {
349                list.swap(root, largest);
350                root = largest;
351            } else {
352                break;
353            }
354        }
355    }
356}
357
358#[cfg(test)]
359mod tests {
360
361    use super::*;
362
363    #[test]
364    fn min_heap_test() {
365        // Defines an arbitrary entry object,
366        // NOTE: Clone is only necessary for from_slice() usage
367        //#[derive(Clone, Debug, Eq, PartialEq)]
368        #[derive(Clone, Debug, Eq, PartialEq, PartialOrd)]
369        pub struct Job<'a> {
370            pub priority: usize,
371            pub title: &'a str,
372        }
373
374        // Heaps require total ordering, so BinHeap places an Ord trait bound on T.
375        // This implementation guarantees ordering by the priority field only.
376        // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
377        // are made using the Ord implementation.
378        use std::cmp::Ordering;
379        impl Ord for Job<'_> {
380            fn cmp(&self, other: &Self) -> Ordering {
381                self.priority.cmp(&other.priority)
382            }
383        }
384
385        // Heaps require total order, so a custom PartialOrd impl is unnecessary
386        //impl PartialOrd for Job<'_> {
387        //    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
388        //        Some(self.cmp(other))
389        //    }
390        //}
391
392        let list = vec![
393            Job {
394                priority: 9,
395                title: "Dichael",
396            },
397            Job {
398                priority: 13,
399                title: "Damiel",
400            },
401            Job {
402                priority: 8,
403                title: "Sleve",
404            },
405            Job {
406                priority: 6,
407                title: "Peter",
408            },
409            Job {
410                priority: 14,
411                title: "Flank",
412            },
413            Job {
414                priority: 11,
415                title: "Cork",
416            },
417            Job {
418                priority: 12,
419                title: "Bobson",
420            },
421        ];
422
423        // Heapify options!
424        ///////////////////
425
426        // Option 1: from_slice()
427        let mut heap = BinHeap::from_slice(&list);
428        assert_eq!(heap.pop().unwrap().title, "Peter");
429        assert_eq!(heap.pop().unwrap().title, "Sleve");
430        assert_eq!(heap.pop().unwrap().title, "Dichael");
431        assert_eq!(heap.pop().unwrap().title, "Cork");
432        assert_eq!(heap.peek().unwrap().title, "Bobson");
433        assert_eq!(heap.pop().unwrap().title, "Bobson");
434        assert_eq!(heap.pop().unwrap().title, "Damiel");
435        assert_eq!(heap.peek().unwrap().title, "Flank");
436
437        // Option 2: from_vec(), clones purely for list reuse
438        let mut heap = BinHeap::from_vec(list.clone());
439        assert_eq!(heap.pop().unwrap().title, "Peter");
440        assert_eq!(heap.pop().unwrap().title, "Sleve");
441        assert_eq!(heap.pop().unwrap().title, "Dichael");
442        assert_eq!(heap.peek().unwrap().title, "Cork");
443        assert_eq!(heap.pop().unwrap().title, "Cork");
444        assert_eq!(heap.pop().unwrap().title, "Bobson");
445        assert_eq!(heap.pop().unwrap().title, "Damiel");
446        assert_eq!(heap.peek().unwrap().title, "Flank");
447
448        // Examples that use Reverse to construct a max heap
449        use std::cmp::Reverse;
450
451        // Option 1: O(n) reverse heapify, clones purely for list reuse
452        let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
453        let mut heap = BinHeap::from_vec(reversed_vec);
454        assert_eq!(heap.pop().unwrap().0.title, "Flank");
455        assert_eq!(heap.pop().unwrap().0.title, "Damiel");
456        assert_eq!(heap.pop().unwrap().0.title, "Bobson");
457        assert_eq!(heap.pop().unwrap().0.title, "Cork");
458        assert_eq!(heap.peek().unwrap().0.title, "Dichael");
459        assert_eq!(heap.pop().unwrap().0.title, "Dichael");
460        assert_eq!(heap.pop().unwrap().0.title, "Sleve");
461        assert_eq!(heap.peek().unwrap().0.title, "Peter");
462
463        // Option 2: O(n * log(n)) incremental reverse heap construction
464        let mut heap = BinHeap::new();
465        for e in list {
466            heap.push(Reverse(e))
467        }
468        assert_eq!(heap.pop().unwrap().0.title, "Flank");
469        assert_eq!(heap.pop().unwrap().0.title, "Damiel");
470        assert_eq!(heap.pop().unwrap().0.title, "Bobson");
471        assert_eq!(heap.pop().unwrap().0.title, "Cork");
472        assert_eq!(heap.peek().unwrap().0.title, "Dichael");
473        assert_eq!(heap.pop().unwrap().0.title, "Dichael");
474        assert_eq!(heap.pop().unwrap().0.title, "Sleve");
475        assert_eq!(heap.peek().unwrap().0.title, "Peter");
476    }
477
478    #[test]
479    // Tests two approaches to heap sort:
480    // 1) Creating a separate binary heap buffer, pushing values, and popping
481    //    them back to a third sorted list.
482    // 2) Treating the backing list as the buffer for in-place heap creation
483    //    and sorting via pop & swap loop.
484    fn heap_sort() {
485        let mut list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
486        let sorted = [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56];
487
488        // 1) Sorting via heap buffer (secondary storage)
489        // Creates a heap (buffer) in O(n) time from borrowed values
490        let mut heap = BinHeap::from_slice(&list);
491        // Pops the buffered heap values to a (secondary) sorted list in O(n * log(n)) time
492        let mut sorted_list: Vec<i32> = Vec::new();
493        for _ in 0..list.len() {
494            sorted_list.push(heap.pop().expect("uh oh"))
495        }
496        assert_eq!(sorted_list, sorted);
497
498        // 2) Sorting in place
499        BinHeap::heap_sort(&mut list);
500        assert_eq!(list, sorted);
501
502        let mut v = Vec::from([6, 5, 8, 3]);
503        BinHeap::heap_sort(&mut v);
504        assert_eq!(v, [3, 5, 6, 8]);
505    }
506}