dsa_rust/hierarchies/bin_heap.rs
1/*! An arena-allocated (min) binary heap implementation
2
3# About
4This structure represents a min heap (by default). This means that elements with the smallest ordered values sit at the top of the heap (and get popped first). E.g. an element with priorty 6 gets popped before an element with the priority 9. You can implement [std::cmp::Reverse] to construct max heaps, though it does introduce wrapping which prevents seamless type compatability or interface consistency.
5
6Push and pop operations are guaranteed to run in `O(log(n))` time, but sorting is amortized to `O(n * log(n))` time, which is exacerbated by reverse-sorted inputs.
7
8# Design
9A `Vec`-backed structure is particularly appropriate for heap implementation due to the complete binary tree invariant. Complete binary trees allow you to infer a lot of information in an indexed structure that you would not get with a linked structure.
10
11- With a complete tree you know that the maximum depth (height) of the tree is always always ⌊log2(`n`)⌋ where `n` is the number of elements in the heap.
12
13- Finding an insert point is always `O(1)` in an indexed structure with a known length. Finding an insert point in a linked backing structure requires `O(log(n))` traversal.
14
15- Element types (parent, left child, and right child) can be determined by a mathematical relationship in the backing structure. For example, for index `i`, if `heap[i] == root`, then `i + 1 == root.left`, and `i + 2 == root.right`. This can be extended to any element in the heap such that left children are always `2i + 1`, and right children are always `2i + 2`, and parents are always `(i - 1) / 2`.
16
17- You also know that indexes in a heap with backing structure `list` contain `(list.len() / 2)..list.len()` leaf nodes, and all indexes `0..(list.len() / 2)` are parent nodes. This, of course, relies on _integer division_ (division in which all remainders are truncated/discarded, not rounded).
18
19For a visual example, consider the following layout:
20| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
21|-------|---|---|---|---|---|---|---|---|---|----|-----|----|
22| Value | A | B | C | D | E | F | G | H | I | J | K | L |
23
24This produes the following heap structure:
25```text
26 A
27 / \
28 B C
29 / \ / \
30 D E F G
31 / \ / \ /
32 H I J K L
33```
34You can clearly see that nodes at indexes 6 through 11 are leaf nodes, corresponding to the range
35`(list.len() / 2)..list.len()` when `list.len() == 12`. Conversely, indexes 0 through 5 are parent nodes, corresponding to the range `0..(list.len() / 2)`.
36
37You can also see that all indexes of the form `2i + 1` are left children. E.g. indexes [1, 3, 5, 7, 9, 11] are all left children. The parent of a node at index `i` is always at `(i - 1) / 2`, which produces the mapping:
38
39```text
40 M = { (0,3), (1,2), (2,2), (3,2), (4,2), (5,1) }
41```
42
43# Example
44
45```rust
46 use dsa_rust::hierarchies::bin_heap::BinHeap;
47
48 // Defines an arbitrary entry object
49 // NOTE: Clone is only necessary for from_slice() and list reuse
50 #[derive(Clone, Eq, PartialEq, PartialOrd)]
51 pub struct Job<'a> {
52 pub priority: usize,
53 pub title: &'a str,
54 }
55
56 // Heaps require total ordering, and BinHeap places an Ord trait bound on T.
57 // This implementation guarantees ordering by the priority field only.
58 // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
59 // are made using the Ord implementation.
60 use std::cmp::Ordering;
61 impl Ord for Job<'_> {
62 fn cmp(&self, other: &Self) -> Ordering {
63 self.priority.cmp(&other.priority)
64 }
65 }
66
67 let list = vec![
68 Job {
69 priority: 9,
70 title: "Dichael",
71 },
72 Job {
73 priority: 13,
74 title: "Damiel",
75 },
76 Job {
77 priority: 8,
78 title: "Sleve",
79 },
80 Job {
81 priority: 6,
82 title: "Peter",
83 },
84 Job {
85 priority: 14,
86 title: "Flank",
87 },
88 Job {
89 priority: 11,
90 title: "Cork",
91 },
92 Job {
93 priority: 12,
94 title: "Bobson",
95 },
96 ];
97
98 // Heapify options
99 //////////////////
100
101 // Option 1: from_slice() in O(n) time
102 let mut heap = BinHeap::from_slice(&list);
103 assert_eq!(heap.pop().unwrap().title, "Peter");
104 assert_eq!(heap.pop().unwrap().title, "Sleve");
105
106 // Option 2: from_vec() in O(n) time, clones purely for
107 // list reuse in this test
108 let mut heap = BinHeap::from_vec(list.clone());
109 assert_eq!(heap.pop().unwrap().title, "Peter");
110 assert_eq!(heap.pop().unwrap().title, "Sleve");
111
112 // Examples that use Reverse to construct a max heap
113 use std::cmp::Reverse;
114
115 // Option 1: O(n) reverse heapify, clones purely for list reuse
116 let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
117 let mut heap = BinHeap::from_vec(reversed_vec);
118 assert_eq!(heap.pop().unwrap().0.title, "Flank");
119 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
120
121 // Option 2: O(n * log(n)) incremental reverse heap construction
122 let mut heap = BinHeap::new();
123 for e in list {
124 heap.push(Reverse(e))
125 }
126 assert_eq!(heap.pop().unwrap().0.title, "Flank");
127 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
128
129 // Dirty lil heap sort
130 //////////////////////
131
132 let list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
133
134 // Creates a heap in O(n) time
135 let mut heap = BinHeap::from_slice(&list);
136
137 // Creates a sorted list in O(n * log(n)) time
138 let mut sorted_list: Vec<i32> = Vec::new();
139 for _ in 0..list.len() {
140 sorted_list.push(heap.pop().expect("oh noez"))
141 }
142
143 assert_eq!(sorted_list, [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56]);
144
145
146```
147
148*/
149
150/// # About
151/// It's `Vec` with a brand new dress.
152///
153/// See the [module-level documentation](crate::hierarchies::bin_heap) for more information.
154#[derive(Debug)]
155pub struct BinHeap<T>
156where
157 T: Ord,
158{
159 arena: Vec<T>,
160 size: usize,
161}
162impl<T> Default for BinHeap<T>
163where
164 T: Ord,
165{
166 fn default() -> Self {
167 Self::new()
168 }
169}
170impl<T> BinHeap<T>
171where
172 T: Ord,
173{
174 // PUBLIC API
175 /////////////
176
177 /// Creates a new, zero-sized `BinHeap`.
178 pub fn new() -> Self {
179 BinHeap {
180 arena: Vec::new(),
181 size: 0,
182 }
183 }
184
185 /// Creates a new `BinHeap` that pre-allocates to the given capacity.
186 pub fn new_with_capacity(capacity: usize) -> Self {
187 BinHeap {
188 arena: Vec::with_capacity(capacity),
189 size: 0,
190 }
191 }
192
193 /// A "heapify" operation that creates a `BinHeap` from any slice of entries
194 /// in `O(n)` time. The entries must be `Clone`.
195 pub fn from_slice(list: &[T]) -> Self
196 where
197 T: Clone,
198 {
199 let mut heap = BinHeap::new();
200 heap.arena = Vec::from(list);
201 let start = (list.len() / 2).saturating_sub(1);
202 for i in (0..=start).rev() {
203 heap.sift_down(i);
204 }
205 heap
206 }
207
208 /// A "heapify" operation that creates a `BinHeap` from a `Vec<T>` in `O(n)`
209 /// time. `T` does not need to be `Clone`.
210 pub fn from_vec(list: Vec<T>) -> Self {
211 let size = list.len();
212 let mut heap = BinHeap { arena: list, size };
213 let start = (size / 2).saturating_sub(1);
214 for i in (0..=start).rev() {
215 heap.sift_down(i);
216 }
217 heap
218 }
219
220 /// Returns the number of elements in the heap.
221 pub fn size(&self) -> usize {
222 self.size
223 }
224
225 /// Adds an element to the heap in `O(log(n))` time.
226 pub fn push(&mut self, data: T) {
227 // 1) Create and insert a Node into the array in a way that maintains the
228 // complete binary tree structure
229 self.arena.push(data);
230
231 // 2) Sift/bubble/percolate the heap to maintain order
232 // in O(log(n)) time
233 self.sift_up(self.arena.len() - 1);
234 }
235
236 /// Returns an immutable reference to the data at the top of the heap
237 /// in `O(1)` time.
238 pub fn peek(&self) -> Option<&T> {
239 if self.arena.is_empty() {
240 None
241 } else {
242 Some(&self.arena[0])
243 }
244 }
245
246 /// Returns and removes the element at the top of the heap in `O(log(n))` time.
247 pub fn pop(&mut self) -> Option<T> {
248 if !self.arena.is_empty() {
249 let node = self.arena.swap_remove(0); // O(1)
250 self.sift_down(0); // O(log(n))
251 Some(node)
252 } else {
253 None
254 }
255 }
256
257 // UTILITIES
258 ////////////
259
260 /// Sifts a node upward from the given index in the tree to maintain the
261 /// heap property.
262 fn sift_up(&mut self, mut index: usize) {
263 // Only sift up if the element is not the root (index 0)
264 while index > 0 {
265 // Calculate the parent's index
266 let parent_index = (index - 1) / 2;
267
268 // For a Min-Heap: If the current node's priority is LESS than its
269 // parent's, swap.
270 // For a Max-Heap: If the current node's priority is GREATER than
271 // its parent's, swap.
272 // Switch < to > for max heap
273 if self.arena[index] < self.arena[parent_index] {
274 self.arena.swap(index, parent_index);
275 // Move up to the parent's old position to continue sifting
276 index = parent_index;
277 } else {
278 break; // No-op, heap is heapy 🍃🧘
279 }
280 }
281 }
282
283 /// Sifts a new node downward in the tree to maintain the heap property.
284 fn sift_down(&mut self, mut index: usize) {
285 loop {
286 let left_child = 2 * index + 1;
287 let right_child = 2 * index + 2;
288 let mut target = index;
289
290 // 1) Finds a target index to swap. Sibling order is not guaranteed,
291 // but you still need to check both children to ensure heap property
292 // when sifting
293 //
294 // NOTE: (Adjust < to > for a max heap)
295 if left_child < self.arena.len() && self.arena[left_child] < self.arena[target] {
296 target = left_child;
297 }
298 if right_child < self.arena.len() && self.arena[right_child] < self.arena[target] {
299 target = right_child;
300 }
301
302 // 2) If the target is not equal to the index, do the swap operation
303 // and continue sifting, otherwise the heap property is true
304 if target != index {
305 self.arena.swap(index, target);
306 index = target; // Move down
307 } else {
308 break; // No-op, heap is heapy 🍃🧘
309 }
310 }
311 }
312
313 /// Sorts a mutable slice into ascending order in place via (max) heap operations in `O(n * log(n))` time.
314 ///
315 /// ```rust
316 /// use dsa_rust::hierarchies::bin_heap::BinHeap;
317 /// let mut v = Vec::from([8, 6, 7, 5, 3, 0, 9]);
318 /// BinHeap::heap_sort(&mut v);
319 /// assert_eq!(v, [0, 3, 5, 6, 7, 8, 9]);
320 /// ```
321 pub fn heap_sort(list: &mut [T])
322 where
323 T: Ord,
324 {
325 let len = list.len();
326
327 // Transform the list into a heap in O(n) time
328 for i in (0..len / 2).rev() {
329 Self::generic_sift_down(list, i, len); // O(log(n))
330 }
331
332 // Sort the heap by sifting/sorting in place in O(n * log(n))
333 for end in (1..len).rev() {
334 list.swap(0, end);
335 Self::generic_sift_down(list, 0, end); // O(log(n))
336 }
337 }
338
339 /// Essentially the as sift_down but takes a list (not a heap as self), and
340 /// performs max heap sifting instead of default min heap sifting.
341 fn generic_sift_down(list: &mut [T], mut root: usize, end: usize) {
342 loop {
343 let left = 2 * root + 1;
344 let right = 2 * root + 2;
345 let mut largest = root;
346
347 if left < end && list[left] > list[largest] {
348 largest = left;
349 }
350 if right < end && list[right] > list[largest] {
351 largest = right;
352 }
353
354 if largest != root {
355 list.swap(root, largest);
356 root = largest;
357 } else {
358 break;
359 }
360 }
361 }
362}
363
364#[cfg(test)]
365mod tests {
366
367 use super::*;
368
369 #[test]
370 fn min_heap_test() {
371 // Defines an arbitrary entry object,
372 // NOTE: Clone is only necessary for from_slice() usage
373 //#[derive(Clone, Debug, Eq, PartialEq)]
374 #[derive(Clone, Debug, Eq, PartialEq, PartialOrd)]
375 pub struct Job<'a> {
376 pub priority: usize,
377 pub title: &'a str,
378 }
379
380 // Heaps require total ordering, so BinHeap places an Ord trait bound on T.
381 // This implementation guarantees ordering by the priority field only.
382 // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
383 // are made using the Ord implementation.
384 use std::cmp::Ordering;
385 impl Ord for Job<'_> {
386 fn cmp(&self, other: &Self) -> Ordering {
387 self.priority.cmp(&other.priority)
388 }
389 }
390
391 // Heaps require total order, so a custom PartialOrd impl is unnecessary
392 //impl PartialOrd for Job<'_> {
393 // fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
394 // Some(self.cmp(other))
395 // }
396 //}
397
398 let list = vec![
399 Job {
400 priority: 9,
401 title: "Dichael",
402 },
403 Job {
404 priority: 13,
405 title: "Damiel",
406 },
407 Job {
408 priority: 8,
409 title: "Sleve",
410 },
411 Job {
412 priority: 6,
413 title: "Peter",
414 },
415 Job {
416 priority: 14,
417 title: "Flank",
418 },
419 Job {
420 priority: 11,
421 title: "Cork",
422 },
423 Job {
424 priority: 12,
425 title: "Bobson",
426 },
427 ];
428
429 // Heapify options!
430 ///////////////////
431
432 // Option 1: from_slice()
433 let mut heap = BinHeap::from_slice(&list);
434 assert_eq!(heap.pop().unwrap().title, "Peter");
435 assert_eq!(heap.pop().unwrap().title, "Sleve");
436 assert_eq!(heap.pop().unwrap().title, "Dichael");
437 assert_eq!(heap.pop().unwrap().title, "Cork");
438 assert_eq!(heap.peek().unwrap().title, "Bobson");
439 assert_eq!(heap.pop().unwrap().title, "Bobson");
440 assert_eq!(heap.pop().unwrap().title, "Damiel");
441 assert_eq!(heap.peek().unwrap().title, "Flank");
442
443 // Option 2: from_vec(), clones purely for list reuse
444 let mut heap = BinHeap::from_vec(list.clone());
445 assert_eq!(heap.pop().unwrap().title, "Peter");
446 assert_eq!(heap.pop().unwrap().title, "Sleve");
447 assert_eq!(heap.pop().unwrap().title, "Dichael");
448 assert_eq!(heap.peek().unwrap().title, "Cork");
449 assert_eq!(heap.pop().unwrap().title, "Cork");
450 assert_eq!(heap.pop().unwrap().title, "Bobson");
451 assert_eq!(heap.pop().unwrap().title, "Damiel");
452 assert_eq!(heap.peek().unwrap().title, "Flank");
453
454 // Examples that use Reverse to construct a max heap
455 use std::cmp::Reverse;
456
457 // Option 1: O(n) reverse heapify, clones purely for list reuse
458 let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
459 let mut heap = BinHeap::from_vec(reversed_vec);
460 assert_eq!(heap.pop().unwrap().0.title, "Flank");
461 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
462 assert_eq!(heap.pop().unwrap().0.title, "Bobson");
463 assert_eq!(heap.pop().unwrap().0.title, "Cork");
464 assert_eq!(heap.peek().unwrap().0.title, "Dichael");
465 assert_eq!(heap.pop().unwrap().0.title, "Dichael");
466 assert_eq!(heap.pop().unwrap().0.title, "Sleve");
467 assert_eq!(heap.peek().unwrap().0.title, "Peter");
468
469 // Option 2: O(n * log(n)) incremental reverse heap construction
470 let mut heap = BinHeap::new();
471 for e in list {
472 heap.push(Reverse(e))
473 }
474 assert_eq!(heap.pop().unwrap().0.title, "Flank");
475 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
476 assert_eq!(heap.pop().unwrap().0.title, "Bobson");
477 assert_eq!(heap.pop().unwrap().0.title, "Cork");
478 assert_eq!(heap.peek().unwrap().0.title, "Dichael");
479 assert_eq!(heap.pop().unwrap().0.title, "Dichael");
480 assert_eq!(heap.pop().unwrap().0.title, "Sleve");
481 assert_eq!(heap.peek().unwrap().0.title, "Peter");
482 }
483
484 #[test]
485 // Tests two approaches to heap sort:
486 // 1) Creating a separate binary heap buffer, pushing values, and popping
487 // them back to a third sorted list.
488 // 2) Treating the backing list as the buffer for in-place heap creation
489 // and sorting via pop & swap loop.
490 fn heap_sort() {
491 let mut list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
492 let sorted = [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56];
493
494 // 1) Sorting via heap buffer (secondary storage)
495 // Creates a heap (buffer) in O(n) time from borrowed values
496 let mut heap = BinHeap::from_slice(&list);
497 // Pops the buffered heap values to a (secondary) sorted list in O(n * log(n)) time
498 let mut sorted_list: Vec<i32> = Vec::new();
499 for _ in 0..list.len() {
500 sorted_list.push(heap.pop().expect("uh oh"))
501 }
502 assert_eq!(sorted_list, sorted);
503
504 // 2) Sorting in place
505 BinHeap::heap_sort(&mut list);
506 assert_eq!(list, sorted);
507
508 let mut v = Vec::from([6, 5, 8, 3]);
509 BinHeap::heap_sort(&mut v);
510 assert_eq!(v, [3, 5, 6, 8]);
511 }
512}