dsa_rust/hierarchies/
bin_heap.rs

1/*! An arena-allocated (min) binary heap implementation
2
3# About
4This structure represents a min heap (by default). This means that elements with the smallest ordered values sit at the top of the heap (and get popped first). E.g. an element with priorty 6 gets popped before an element with the priority 9. You can implement [std::cmp::Reverse] to construct max heaps, though it does introduce wrapping which prevents seamless type compatability or interface consistency.
5
6Push and pop operations are guaranteed to run in `O(log(n))` time, but sorting is amortized to `O(n * log(n))` time, which is exacerbated by reverse-sorted inputs.
7
8# Design
9A `Vec`-backed structure is particularly appropriate for heap implementation due to the complete binary tree invariant. Complete binary trees allow you to infer a lot of information in an indexed structure that you would not get with a linked structure.
10
11- With a complete tree you know that the maximum depth (height) of the tree is always always ⌊log2(`n`)⌋ where `n` is the number of elements in the heap.
12
13- Finding an insert point is always `O(1)` in an indexed structure with a known length. Finding an insert point in a linked backing structure requires `O(log(n))` traversal.
14
15- Element types (parent, left child, and right child) can be determined by a mathematical relationship in the backing structure. For example, for index `i`, if `heap[i] == root`, then `i + 1 == root.left`, and `i + 2 == root.right`. This can be extended to any element in the heap such that left children are always `2i + 1`, and right children are always `2i + 2`, and parents are always `(i - 1) / 2`.
16
17- You also know that indexes in a heap with backing structure `list` contain `(list.len() / 2)..list.len()` leaf nodes, and all indexes `0..(list.len() / 2)` are parent nodes. This, of course, relies on _integer division_ (division in which all remainders are truncated/discarded, not rounded).
18
19For a visual example, consider the following layout:
20| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9  |  10 | 11 |
21|-------|---|---|---|---|---|---|---|---|---|----|-----|----|
22| Value | A | B | C | D | E | F | G | H | I | J  |  K  | L  |
23
24This produes the following heap structure:
25```text
26             A
27         /       \
28       B           C
29     /   \        / \
30    D     E      F   G
31   / \   / \    /
32  H   I J   K  L
33```
34You can clearly see that nodes at indexes 6 through 11 are leaf nodes, corresponding to the range
35`(list.len() / 2)..list.len()` when `list.len() == 12`. Conversely, indexes 0 through 5 are parent nodes, corresponding to the range `0..(list.len() / 2)`.
36
37You can also see that all indexes of the form `2i + 1` are left children. E.g. indexes [1, 3, 5, 7, 9, 11] are all left children. The parent of a node at index `i` is always at `(i - 1) / 2`, which produces the mapping:
38
39```text
40    M = { (0,3), (1,2), (2,2), (3,2), (4,2), (5,1) }
41```
42
43# Example
44
45```rust
46    use dsa_rust::hierarchies::bin_heap::BinHeap;
47
48    // Defines an arbitrary entry object
49    // NOTE: Clone is only necessary for from_slice() and list reuse
50    #[derive(Clone, Eq, PartialEq, PartialOrd)]
51    pub struct Job<'a> {
52        pub priority: usize,
53        pub title: &'a str,
54    }
55
56    // Heaps require total ordering, and BinHeap places an Ord trait bound on T.
57    // This implementation guarantees ordering by the priority field only.
58    // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
59    // are made using the Ord implementation.
60    use std::cmp::Ordering;
61    impl Ord for Job<'_> {
62        fn cmp(&self, other: &Self) -> Ordering {
63            self.priority.cmp(&other.priority)
64        }
65    }
66
67    let list = vec![
68        Job {
69            priority: 9,
70            title: "Dichael",
71        },
72        Job {
73            priority: 13,
74            title: "Damiel",
75        },
76        Job {
77            priority: 8,
78            title: "Sleve",
79        },
80        Job {
81            priority: 6,
82            title: "Peter",
83        },
84        Job {
85            priority: 14,
86            title: "Flank",
87        },
88        Job {
89            priority: 11,
90            title: "Cork",
91        },
92        Job {
93            priority: 12,
94            title: "Bobson",
95        },
96    ];
97
98    // Heapify options
99    //////////////////
100
101    // Option 1: from_slice() in O(n) time
102    let mut heap = BinHeap::from_slice(&list);
103    assert_eq!(heap.pop().unwrap().title, "Peter");
104    assert_eq!(heap.pop().unwrap().title, "Sleve");
105
106    // Option 2: from_vec() in O(n) time, clones purely for
107    // list reuse in this test
108    let mut heap = BinHeap::from_vec(list.clone());
109    assert_eq!(heap.pop().unwrap().title, "Peter");
110    assert_eq!(heap.pop().unwrap().title, "Sleve");
111
112    // Examples that use Reverse to construct a max heap
113    use std::cmp::Reverse;
114
115    // Option 1: O(n) reverse heapify, clones purely for list reuse
116    let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
117    let mut heap = BinHeap::from_vec(reversed_vec);
118    assert_eq!(heap.pop().unwrap().0.title, "Flank");
119    assert_eq!(heap.pop().unwrap().0.title, "Damiel");
120
121    // Option 2: O(n * log(n)) incremental reverse heap construction
122    let mut heap = BinHeap::new();
123    for e in list {
124        heap.push(Reverse(e))
125    }
126    assert_eq!(heap.pop().unwrap().0.title, "Flank");
127    assert_eq!(heap.pop().unwrap().0.title, "Damiel");
128
129    // Dirty lil heap sort
130    //////////////////////
131
132    let list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
133
134    // Creates a heap in O(n) time
135    let mut heap = BinHeap::from_slice(&list);
136
137    // Creates a sorted list in O(n * log(n)) time
138    let mut sorted_list: Vec<i32> = Vec::new();
139    for _ in 0..list.len() {
140        sorted_list.push(heap.pop().expect("oh noez"))
141    }
142
143    assert_eq!(sorted_list, [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56]);
144
145
146```
147
148*/
149
150/// # About
151/// It's `Vec` with a brand new dress.
152///
153/// See the [module-level documentation](crate::hierarchies::bin_heap) for more information.
154#[derive(Debug)]
155pub struct BinHeap<T>
156where
157    T: Ord,
158{
159    arena: Vec<T>,
160    size: usize,
161}
162impl<T> Default for BinHeap<T>
163where
164    T: Ord,
165{
166    fn default() -> Self {
167        Self::new()
168    }
169}
170impl<T> BinHeap<T>
171where
172    T: Ord,
173{
174    // PUBLIC API
175    /////////////
176
177    /// Creates a new, zero-sized `BinHeap`.
178    pub fn new() -> Self {
179        BinHeap {
180            arena: Vec::new(),
181            size: 0,
182        }
183    }
184
185    /// Creates a new `BinHeap` that pre-allocates to the given capacity.
186    pub fn new_with_capacity(capacity: usize) -> Self {
187        BinHeap {
188            arena: Vec::with_capacity(capacity),
189            size: 0,
190        }
191    }
192
193    /// A "heapify" operation that creates a `BinHeap` from any slice of entries
194    /// in `O(n)` time. The entries must be `Clone`.
195    pub fn from_slice(list: &[T]) -> Self
196    where
197        T: Clone,
198    {
199        let mut heap = BinHeap::new();
200        heap.arena = Vec::from(list);
201        let start = (list.len() / 2).saturating_sub(1);
202        for i in (0..=start).rev() {
203            heap.sift_down(i);
204        }
205        heap
206    }
207
208    /// A "heapify" operation that creates a `BinHeap` from a `Vec<T>` in `O(n)`
209    /// time. `T` does not need to be `Clone`.
210    pub fn from_vec(list: Vec<T>) -> Self {
211        let size = list.len();
212        let mut heap = BinHeap { arena: list, size };
213        let start = (size / 2).saturating_sub(1);
214        for i in (0..=start).rev() {
215            heap.sift_down(i);
216        }
217        heap
218    }
219
220    /// Returns the number of elements in the heap.
221    pub fn size(&self) -> usize {
222        self.size
223    }
224
225    /// Adds an element to the heap in `O(log(n))` time.
226    pub fn push(&mut self, data: T) {
227        // 1) Create and insert a Node into the array in a way that maintains the
228        // complete binary tree structure
229        self.arena.push(data);
230
231        // 2) Sift/bubble/percolate the heap to maintain order
232        // in O(log(n)) time
233        self.sift_up(self.arena.len() - 1);
234    }
235
236    /// Returns an immutable reference to the data at the top of the heap
237    /// in `O(1)` time.
238    pub fn peek(&self) -> Option<&T> {
239        if self.arena.is_empty() {
240            None
241        } else {
242            Some(&self.arena[0])
243        }
244    }
245
246    /// Returns and removes the element at the top of the heap in `O(log(n))` time.
247    pub fn pop(&mut self) -> Option<T> {
248        if !self.arena.is_empty() {
249            let node = self.arena.swap_remove(0); // O(1)
250            self.sift_down(0); // O(log(n))
251            Some(node)
252        } else {
253            None
254        }
255    }
256
257    // UTILITIES
258    ////////////
259
260    /// Sifts a node upward from the given index in the tree to maintain the
261    /// heap property.
262    fn sift_up(&mut self, mut index: usize) {
263        // Only sift up if the element is not the root (index 0)
264        while index > 0 {
265            // Calculate the parent's index
266            let parent_index = (index - 1) / 2;
267
268            // For a Min-Heap: If the current node's priority is LESS than its
269            // parent's, swap.
270            // For a Max-Heap: If the current node's priority is GREATER than
271            // its parent's, swap.
272            // Switch < to > for max heap
273            if self.arena[index] < self.arena[parent_index] {
274                self.arena.swap(index, parent_index);
275                // Move up to the parent's old position to continue sifting
276                index = parent_index;
277            } else {
278                break; // No-op, heap is heapy 🍃🧘
279            }
280        }
281    }
282
283    /// Sifts a new node downward in the tree to maintain the heap property.
284    fn sift_down(&mut self, mut index: usize) {
285        loop {
286            let left_child = 2 * index + 1;
287            let right_child = 2 * index + 2;
288            let mut target = index;
289
290            // 1) Finds a target index to swap. Sibling order is not guaranteed,
291            // but you still need to check both children to ensure heap property
292            // when sifting
293            //
294            // NOTE: (Adjust < to > for a max heap)
295            if left_child < self.arena.len() && self.arena[left_child] < self.arena[target] {
296                target = left_child;
297            }
298            if right_child < self.arena.len() && self.arena[right_child] < self.arena[target] {
299                target = right_child;
300            }
301
302            // 2) If the target is not equal to the index, do the swap operation
303            // and continue sifting, otherwise the heap property is true
304            if target != index {
305                self.arena.swap(index, target);
306                index = target; // Move down
307            } else {
308                break; // No-op, heap is heapy 🍃🧘
309            }
310        }
311    }
312
313    /// Sorts a mutable slice into ascending order in place via (max) heap operations in `O(n * log(n))` time.
314    ///
315    /// ```rust
316    /// use dsa_rust::hierarchies::bin_heap::BinHeap;
317    /// let mut v = Vec::from([8, 6, 7, 5, 3, 0, 9]);
318    /// BinHeap::heap_sort(&mut v);
319    /// assert_eq!(v, [0, 3, 5, 6, 7, 8, 9]);
320    /// ```
321    pub fn heap_sort(list: &mut [T])
322    where
323        T: Ord,
324    {
325        let len = list.len();
326
327        // Transform the list into a heap in O(n) time
328        for i in (0..len / 2).rev() {
329            Self::generic_sift_down(list, i, len); // O(log(n))
330        }
331
332        // Sort the heap by sifting/sorting in place in O(n * log(n))
333        for end in (1..len).rev() {
334            list.swap(0, end);
335            Self::generic_sift_down(list, 0, end); // O(log(n))
336        }
337    }
338
339    /// Essentially the as sift_down but takes a list (not a heap as self), and
340    /// performs max heap sifting instead of default min heap sifting.
341    fn generic_sift_down(list: &mut [T], mut root: usize, end: usize) {
342        loop {
343            let left = 2 * root + 1;
344            let right = 2 * root + 2;
345            let mut largest = root;
346
347            if left < end && list[left] > list[largest] {
348                largest = left;
349            }
350            if right < end && list[right] > list[largest] {
351                largest = right;
352            }
353
354            if largest != root {
355                list.swap(root, largest);
356                root = largest;
357            } else {
358                break;
359            }
360        }
361    }
362}
363
364#[cfg(test)]
365mod tests {
366
367    use super::*;
368
369    #[test]
370    fn min_heap_test() {
371        // Defines an arbitrary entry object,
372        // NOTE: Clone is only necessary for from_slice() usage
373        //#[derive(Clone, Debug, Eq, PartialEq)]
374        #[derive(Clone, Debug, Eq, PartialEq, PartialOrd)]
375        pub struct Job<'a> {
376            pub priority: usize,
377            pub title: &'a str,
378        }
379
380        // Heaps require total ordering, so BinHeap places an Ord trait bound on T.
381        // This implementation guarantees ordering by the priority field only.
382        // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
383        // are made using the Ord implementation.
384        use std::cmp::Ordering;
385        impl Ord for Job<'_> {
386            fn cmp(&self, other: &Self) -> Ordering {
387                self.priority.cmp(&other.priority)
388            }
389        }
390
391        // Heaps require total order, so a custom PartialOrd impl is unnecessary
392        //impl PartialOrd for Job<'_> {
393        //    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
394        //        Some(self.cmp(other))
395        //    }
396        //}
397
398        let list = vec![
399            Job {
400                priority: 9,
401                title: "Dichael",
402            },
403            Job {
404                priority: 13,
405                title: "Damiel",
406            },
407            Job {
408                priority: 8,
409                title: "Sleve",
410            },
411            Job {
412                priority: 6,
413                title: "Peter",
414            },
415            Job {
416                priority: 14,
417                title: "Flank",
418            },
419            Job {
420                priority: 11,
421                title: "Cork",
422            },
423            Job {
424                priority: 12,
425                title: "Bobson",
426            },
427        ];
428
429        // Heapify options!
430        ///////////////////
431
432        // Option 1: from_slice()
433        let mut heap = BinHeap::from_slice(&list);
434        assert_eq!(heap.pop().unwrap().title, "Peter");
435        assert_eq!(heap.pop().unwrap().title, "Sleve");
436        assert_eq!(heap.pop().unwrap().title, "Dichael");
437        assert_eq!(heap.pop().unwrap().title, "Cork");
438        assert_eq!(heap.peek().unwrap().title, "Bobson");
439        assert_eq!(heap.pop().unwrap().title, "Bobson");
440        assert_eq!(heap.pop().unwrap().title, "Damiel");
441        assert_eq!(heap.peek().unwrap().title, "Flank");
442
443        // Option 2: from_vec(), clones purely for list reuse
444        let mut heap = BinHeap::from_vec(list.clone());
445        assert_eq!(heap.pop().unwrap().title, "Peter");
446        assert_eq!(heap.pop().unwrap().title, "Sleve");
447        assert_eq!(heap.pop().unwrap().title, "Dichael");
448        assert_eq!(heap.peek().unwrap().title, "Cork");
449        assert_eq!(heap.pop().unwrap().title, "Cork");
450        assert_eq!(heap.pop().unwrap().title, "Bobson");
451        assert_eq!(heap.pop().unwrap().title, "Damiel");
452        assert_eq!(heap.peek().unwrap().title, "Flank");
453
454        // Examples that use Reverse to construct a max heap
455        use std::cmp::Reverse;
456
457        // Option 1: O(n) reverse heapify, clones purely for list reuse
458        let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
459        let mut heap = BinHeap::from_vec(reversed_vec);
460        assert_eq!(heap.pop().unwrap().0.title, "Flank");
461        assert_eq!(heap.pop().unwrap().0.title, "Damiel");
462        assert_eq!(heap.pop().unwrap().0.title, "Bobson");
463        assert_eq!(heap.pop().unwrap().0.title, "Cork");
464        assert_eq!(heap.peek().unwrap().0.title, "Dichael");
465        assert_eq!(heap.pop().unwrap().0.title, "Dichael");
466        assert_eq!(heap.pop().unwrap().0.title, "Sleve");
467        assert_eq!(heap.peek().unwrap().0.title, "Peter");
468
469        // Option 2: O(n * log(n)) incremental reverse heap construction
470        let mut heap = BinHeap::new();
471        for e in list {
472            heap.push(Reverse(e))
473        }
474        assert_eq!(heap.pop().unwrap().0.title, "Flank");
475        assert_eq!(heap.pop().unwrap().0.title, "Damiel");
476        assert_eq!(heap.pop().unwrap().0.title, "Bobson");
477        assert_eq!(heap.pop().unwrap().0.title, "Cork");
478        assert_eq!(heap.peek().unwrap().0.title, "Dichael");
479        assert_eq!(heap.pop().unwrap().0.title, "Dichael");
480        assert_eq!(heap.pop().unwrap().0.title, "Sleve");
481        assert_eq!(heap.peek().unwrap().0.title, "Peter");
482    }
483
484    #[test]
485    // Tests two approaches to heap sort:
486    // 1) Creating a separate binary heap buffer, pushing values, and popping
487    //    them back to a third sorted list.
488    // 2) Treating the backing list as the buffer for in-place heap creation
489    //    and sorting via pop & swap loop.
490    fn heap_sort() {
491        let mut list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
492        let sorted = [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56];
493
494        // 1) Sorting via heap buffer (secondary storage)
495        // Creates a heap (buffer) in O(n) time from borrowed values
496        let mut heap = BinHeap::from_slice(&list);
497        // Pops the buffered heap values to a (secondary) sorted list in O(n * log(n)) time
498        let mut sorted_list: Vec<i32> = Vec::new();
499        for _ in 0..list.len() {
500            sorted_list.push(heap.pop().expect("uh oh"))
501        }
502        assert_eq!(sorted_list, sorted);
503
504        // 2) Sorting in place
505        BinHeap::heap_sort(&mut list);
506        assert_eq!(list, sorted);
507
508        let mut v = Vec::from([6, 5, 8, 3]);
509        BinHeap::heap_sort(&mut v);
510        assert_eq!(v, [3, 5, 6, 8]);
511    }
512}