dsa_rust/hierarchies/bin_heap.rs
1/*! An arena-allocated (min) binary heap implementation
2
3# About
4This structure represents a min heap (by default). This means that elements with the smallest ordered values sit at the top of the heap (and get popped first). E.g. an element with priorty 6 gets popped before an element with the priority 9. You can implement [std::cmp::Reverse] to construct max heaps, though it does introduce wrapping which prevents seamless type compatability or interface consistency.
5
6Push and pop operations are guaranteed to run in `O(log(n))` time, but sorting is amortized to `O(n * log(n))` time, which is exacerbated by reverse-sorted inputs.
7
8# Design
9A `Vec`-backed structure is particularly appropriate for heap implementation due to the complete binary tree invariant. Complete binary trees allow you to infer a lot of information in an indexed structure that you would not get with a linked structure.
10
11- With a complete tree you know that the maximum depth (height) of the tree is always always ⌊log2(`n`)⌋ where `n` is the number of elements in the heap.
12
13- Finding an insert point is always `O(1)` in an indexed structure with a known length. Finding an insert point in a linked backing structure requires `O(log(n))` traversal.
14
15- Element types (parent, left child, and right child) can be determined by a mathematical relationship in the backing structure. For example, for index `i`, if `heap[i] == root`, then `i + 1 == root.left`, and `i + 2 == root.right`. This can be extended to any element in the heap such that left children are always `2i + 1`, and right children are always `2i + 2`, and parents are always `(i - 1) / 2`.
16
17- You also know that indexes in a heap with backing structure `list` contain `(list.len() / 2)..list.len()` leaf nodes, and all indexes `0..(list.len() / 2)` are parent nodes. This, of course, relies on _integer division_ (division in which all remainders are truncated/discarded, not rounded).
18
19For a visual example, consider the following layout:
20| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
21|-------|---|---|---|---|---|---|---|---|---|----|-----|----|
22| Value | A | B | C | D | E | F | G | H | I | J | K | L |
23
24This produes the following heap structure:
25```text
26 A
27 / \
28 B C
29 / \ / \
30 D E F G
31 / \ / \ /
32 H I J K L
33```
34You can clearly see that nodes at indexes 6 through 11 are leaf nodes, corresponding to the range
35`(list.len() / 2)..list.len()` when `list.len() == 12`. Conversely, indexes 0 through 5 are parent nodes, corresponding to the range `0..(list.len() / 2)`.
36
37You can also see that all indexes of the form `2i + 1` are left children. E.g. indexes [1, 3, 5, 7, 9, 11] are all left children. The parent of a node at index `i` is always at `(i - 1) / 2`, which produces the mapping:
38
39```text
40 M = { (0,3), (1,2), (2,2), (3,2), (4,2), (5,1) }
41```
42
43# Example
44
45```rust
46 use dsa_rust::hierarchies::bin_heap::BinHeap;
47
48 // Defines an arbitrary entry object
49 // NOTE: Clone is only necessary for from_slice() and list reuse
50 #[derive(Clone, Eq, PartialEq, PartialOrd)]
51 pub struct Job<'a> {
52 pub priority: usize,
53 pub title: &'a str,
54 }
55
56 // Heaps require total ordering, and BinHeap places an Ord trait bound on T.
57 // This implementation guarantees ordering by the priority field only.
58 // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
59 // are made using the Ord implementation.
60 use std::cmp::Ordering;
61 impl Ord for Job<'_> {
62 fn cmp(&self, other: &Self) -> Ordering {
63 self.priority.cmp(&other.priority)
64 }
65 }
66
67 let list = vec![
68 Job {
69 priority: 9,
70 title: "Dichael",
71 },
72 Job {
73 priority: 13,
74 title: "Damiel",
75 },
76 Job {
77 priority: 8,
78 title: "Sleve",
79 },
80 Job {
81 priority: 6,
82 title: "Peter",
83 },
84 Job {
85 priority: 14,
86 title: "Flank",
87 },
88 Job {
89 priority: 11,
90 title: "Cork",
91 },
92 Job {
93 priority: 12,
94 title: "Bobson",
95 },
96 ];
97
98 // Heapify options
99 //////////////////
100
101 // Option 1: from_slice() in O(n) time
102 let mut heap = BinHeap::from_slice(&list);
103 assert_eq!(heap.pop().unwrap().title, "Peter");
104 assert_eq!(heap.pop().unwrap().title, "Sleve");
105
106 // Option 2: from_vec() in O(n) time, clones purely for
107 // list reuse in this test
108 let mut heap = BinHeap::from_vec(list.clone());
109 assert_eq!(heap.pop().unwrap().title, "Peter");
110 assert_eq!(heap.pop().unwrap().title, "Sleve");
111
112 // Examples that use Reverse to construct a max heap
113 use std::cmp::Reverse;
114
115 // Option 1: O(n) reverse heapify, clones purely for list reuse
116 let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
117 let mut heap = BinHeap::from_vec(reversed_vec);
118 assert_eq!(heap.pop().unwrap().0.title, "Flank");
119 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
120
121 // Option 2: O(n * log(n)) incremental reverse heap construction
122 let mut heap = BinHeap::new();
123 for e in list {
124 heap.push(Reverse(e))
125 }
126 assert_eq!(heap.pop().unwrap().0.title, "Flank");
127 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
128
129 // Dirty lil heap sort
130 //////////////////////
131
132 let list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
133
134 // Creates a heap in O(n) time
135 let mut heap = BinHeap::from_slice(&list);
136
137 // Creates a sorted list in O(n * log(n)) time
138 let mut sorted_list: Vec<i32> = Vec::new();
139 for _ in 0..list.len() {
140 sorted_list.push(heap.pop().expect("oh noez"))
141 }
142
143 assert_eq!(sorted_list, [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56]);
144
145
146```
147
148*/
149
150/// # About
151/// It's `Vec` with a brand new dress.
152///
153/// See the [module-level documentation](crate::hierarchies::bin_heap) for more information.
154#[derive(Debug)]
155pub struct BinHeap<T> {
156 arena: Vec<T>,
157 size: usize,
158}
159impl<T: Ord> Default for BinHeap<T> {
160 fn default() -> Self {
161 Self::new()
162 }
163}
164impl<T> BinHeap<T>
165where
166 T: Ord,
167{
168 // PUBLIC API
169 /////////////
170
171 /// Creates a new, zero-sized `BinHeap`.
172 pub fn new() -> Self {
173 BinHeap {
174 arena: Vec::new(),
175 size: 0,
176 }
177 }
178
179 /// Creates a new `BinHeap` that pre-allocates to the given capacity.
180 pub fn new_with_capacity(capacity: usize) -> Self {
181 BinHeap {
182 arena: Vec::with_capacity(capacity),
183 size: 0,
184 }
185 }
186
187 /// A "heapify" operation that creates a `BinHeap` from any slice of entries
188 /// in `O(n)` time. The entries must be `Clone`.
189 pub fn from_slice(list: &[T]) -> Self
190 where
191 T: Clone,
192 {
193 let mut heap = BinHeap::new();
194 heap.arena = Vec::from(list);
195 let start = (list.len() / 2).saturating_sub(1);
196 for i in (0..=start).rev() {
197 heap.sift_down(i);
198 }
199 heap
200 }
201
202 /// A "heapify" operation that creates a `BinHeap` from a `Vec<T>` in `O(n)`
203 /// time. `T` does not need to be `Clone`.
204 pub fn from_vec(list: Vec<T>) -> Self {
205 let size = list.len();
206 let mut heap = BinHeap { arena: list, size };
207 let start = (size / 2).saturating_sub(1);
208 for i in (0..=start).rev() {
209 heap.sift_down(i);
210 }
211 heap
212 }
213
214 /// Returns the number of elements in the heap.
215 pub fn size(&self) -> usize {
216 self.size
217 }
218
219 /// Adds an element to the heap in `O(log(n))` time.
220 pub fn push(&mut self, data: T) {
221 // 1) Create and insert a Node into the array in a way that maintains the
222 // complete binary tree structure
223 self.arena.push(data);
224
225 // 2) Sift/bubble/percolate the heap to maintain order
226 // in O(log(n)) time
227 self.sift_up(self.arena.len() - 1);
228 }
229
230 /// Returns an immutable reference to the data at the top of the heap
231 /// in `O(1)` time.
232 pub fn peek(&self) -> Option<&T> {
233 if self.arena.is_empty() {
234 None
235 } else {
236 Some(&self.arena[0])
237 }
238 }
239
240 /// Returns and removes the element at the top of the heap in `O(log(n))` time.
241 pub fn pop(&mut self) -> Option<T> {
242 if !self.arena.is_empty() {
243 let node = self.arena.swap_remove(0); // O(1)
244 self.sift_down(0); // O(log(n))
245 Some(node)
246 } else {
247 None
248 }
249 }
250
251 // UTILITIES
252 ////////////
253
254 /// Sifts a node upward from the given index in the tree to maintain the
255 /// heap property.
256 fn sift_up(&mut self, mut index: usize) {
257 // Only sift up if the element is not the root (index 0)
258 while index > 0 {
259 // Calculate the parent's index
260 let parent_index = (index - 1) / 2;
261
262 // For a Min-Heap: If the current node's priority is LESS than its
263 // parent's, swap.
264 // For a Max-Heap: If the current node's priority is GREATER than
265 // its parent's, swap.
266 // Switch < to > for max heap
267 if self.arena[index] < self.arena[parent_index] {
268 self.arena.swap(index, parent_index);
269 // Move up to the parent's old position to continue sifting
270 index = parent_index;
271 } else {
272 break; // No-op, heap is heapy 🍃🧘
273 }
274 }
275 }
276
277 /// Sifts a new node downward in the tree to maintain the heap property.
278 fn sift_down(&mut self, mut index: usize) {
279 loop {
280 let left_child = 2 * index + 1;
281 let right_child = 2 * index + 2;
282 let mut target = index;
283
284 // 1) Finds a target index to swap. Sibling order is not guaranteed,
285 // but you still need to check both children to ensure heap property
286 // when sifting
287 //
288 // NOTE: (Adjust < to > for a max heap)
289 if left_child < self.arena.len() && self.arena[left_child] < self.arena[target] {
290 target = left_child;
291 }
292 if right_child < self.arena.len() && self.arena[right_child] < self.arena[target] {
293 target = right_child;
294 }
295
296 // 2) If the target is not equal to the index, do the swap operation
297 // and continue sifting, otherwise the heap property is true
298 if target != index {
299 self.arena.swap(index, target);
300 index = target; // Move down
301 } else {
302 break; // No-op, heap is heapy 🍃🧘
303 }
304 }
305 }
306
307 /// Sorts a mutable slice into ascending order in place via (max) heap operations in `O(n * log(n))` time.
308 ///
309 /// ```rust
310 /// use dsa_rust::hierarchies::bin_heap::BinHeap;
311 /// let mut v = Vec::from([8, 6, 7, 5, 3, 0, 9]);
312 /// BinHeap::heap_sort(&mut v);
313 /// assert_eq!(v, [0, 3, 5, 6, 7, 8, 9]);
314 /// ```
315 pub fn heap_sort(list: &mut [T])
316 where
317 T: Ord,
318 {
319 let len = list.len();
320
321 // Transform the list into a heap in O(n) time
322 for i in (0..len / 2).rev() {
323 Self::generic_sift_down(list, i, len); // O(log(n))
324 }
325
326 // Sort the heap by sifting/sorting in place in O(n * log(n))
327 for end in (1..len).rev() {
328 list.swap(0, end);
329 Self::generic_sift_down(list, 0, end); // O(log(n))
330 }
331 }
332
333 /// Essentially the as sift_down but takes a list (not a heap as self), and
334 /// performs max heap sifting instead of default min heap sifting.
335 fn generic_sift_down(list: &mut [T], mut root: usize, end: usize) {
336 loop {
337 let left = 2 * root + 1;
338 let right = 2 * root + 2;
339 let mut largest = root;
340
341 if left < end && list[left] > list[largest] {
342 largest = left;
343 }
344 if right < end && list[right] > list[largest] {
345 largest = right;
346 }
347
348 if largest != root {
349 list.swap(root, largest);
350 root = largest;
351 } else {
352 break;
353 }
354 }
355 }
356}
357
358#[cfg(test)]
359mod tests {
360
361 use super::*;
362
363 #[test]
364 fn min_heap_test() {
365 // Defines an arbitrary entry object,
366 // NOTE: Clone is only necessary for from_slice() usage
367 //#[derive(Clone, Debug, Eq, PartialEq)]
368 #[derive(Clone, Debug, Eq, PartialEq, PartialOrd)]
369 pub struct Job<'a> {
370 pub priority: usize,
371 pub title: &'a str,
372 }
373
374 // Heaps require total ordering, so BinHeap places an Ord trait bound on T.
375 // This implementation guarantees ordering by the priority field only.
376 // Although PartialOrd is derived, it is unused in BinHeap because all comparisons
377 // are made using the Ord implementation.
378 use std::cmp::Ordering;
379 impl Ord for Job<'_> {
380 fn cmp(&self, other: &Self) -> Ordering {
381 self.priority.cmp(&other.priority)
382 }
383 }
384
385 // Heaps require total order, so a custom PartialOrd impl is unnecessary
386 //impl PartialOrd for Job<'_> {
387 // fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
388 // Some(self.cmp(other))
389 // }
390 //}
391
392 let list = vec![
393 Job {
394 priority: 9,
395 title: "Dichael",
396 },
397 Job {
398 priority: 13,
399 title: "Damiel",
400 },
401 Job {
402 priority: 8,
403 title: "Sleve",
404 },
405 Job {
406 priority: 6,
407 title: "Peter",
408 },
409 Job {
410 priority: 14,
411 title: "Flank",
412 },
413 Job {
414 priority: 11,
415 title: "Cork",
416 },
417 Job {
418 priority: 12,
419 title: "Bobson",
420 },
421 ];
422
423 // Heapify options!
424 ///////////////////
425
426 // Option 1: from_slice()
427 let mut heap = BinHeap::from_slice(&list);
428 assert_eq!(heap.pop().unwrap().title, "Peter");
429 assert_eq!(heap.pop().unwrap().title, "Sleve");
430 assert_eq!(heap.pop().unwrap().title, "Dichael");
431 assert_eq!(heap.pop().unwrap().title, "Cork");
432 assert_eq!(heap.peek().unwrap().title, "Bobson");
433 assert_eq!(heap.pop().unwrap().title, "Bobson");
434 assert_eq!(heap.pop().unwrap().title, "Damiel");
435 assert_eq!(heap.peek().unwrap().title, "Flank");
436
437 // Option 2: from_vec(), clones purely for list reuse
438 let mut heap = BinHeap::from_vec(list.clone());
439 assert_eq!(heap.pop().unwrap().title, "Peter");
440 assert_eq!(heap.pop().unwrap().title, "Sleve");
441 assert_eq!(heap.pop().unwrap().title, "Dichael");
442 assert_eq!(heap.peek().unwrap().title, "Cork");
443 assert_eq!(heap.pop().unwrap().title, "Cork");
444 assert_eq!(heap.pop().unwrap().title, "Bobson");
445 assert_eq!(heap.pop().unwrap().title, "Damiel");
446 assert_eq!(heap.peek().unwrap().title, "Flank");
447
448 // Examples that use Reverse to construct a max heap
449 use std::cmp::Reverse;
450
451 // Option 1: O(n) reverse heapify, clones purely for list reuse
452 let reversed_vec: Vec<Reverse<_>> = list.clone().into_iter().map(Reverse).collect();
453 let mut heap = BinHeap::from_vec(reversed_vec);
454 assert_eq!(heap.pop().unwrap().0.title, "Flank");
455 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
456 assert_eq!(heap.pop().unwrap().0.title, "Bobson");
457 assert_eq!(heap.pop().unwrap().0.title, "Cork");
458 assert_eq!(heap.peek().unwrap().0.title, "Dichael");
459 assert_eq!(heap.pop().unwrap().0.title, "Dichael");
460 assert_eq!(heap.pop().unwrap().0.title, "Sleve");
461 assert_eq!(heap.peek().unwrap().0.title, "Peter");
462
463 // Option 2: O(n * log(n)) incremental reverse heap construction
464 let mut heap = BinHeap::new();
465 for e in list {
466 heap.push(Reverse(e))
467 }
468 assert_eq!(heap.pop().unwrap().0.title, "Flank");
469 assert_eq!(heap.pop().unwrap().0.title, "Damiel");
470 assert_eq!(heap.pop().unwrap().0.title, "Bobson");
471 assert_eq!(heap.pop().unwrap().0.title, "Cork");
472 assert_eq!(heap.peek().unwrap().0.title, "Dichael");
473 assert_eq!(heap.pop().unwrap().0.title, "Dichael");
474 assert_eq!(heap.pop().unwrap().0.title, "Sleve");
475 assert_eq!(heap.peek().unwrap().0.title, "Peter");
476 }
477
478 #[test]
479 // Tests two approaches to heap sort:
480 // 1) Creating a separate binary heap buffer, pushing values, and popping
481 // them back to a third sorted list.
482 // 2) Treating the backing list as the buffer for in-place heap creation
483 // and sorting via pop & swap loop.
484 fn heap_sort() {
485 let mut list = [1, 5, 7, 23, 45, 3, 2, 17, 56, 9, 21];
486 let sorted = [1, 2, 3, 5, 7, 9, 17, 21, 23, 45, 56];
487
488 // 1) Sorting via heap buffer (secondary storage)
489 // Creates a heap (buffer) in O(n) time from borrowed values
490 let mut heap = BinHeap::from_slice(&list);
491 // Pops the buffered heap values to a (secondary) sorted list in O(n * log(n)) time
492 let mut sorted_list: Vec<i32> = Vec::new();
493 for _ in 0..list.len() {
494 sorted_list.push(heap.pop().expect("uh oh"))
495 }
496 assert_eq!(sorted_list, sorted);
497
498 // 2) Sorting in place
499 BinHeap::heap_sort(&mut list);
500 assert_eq!(list, sorted);
501
502 let mut v = Vec::from([6, 5, 8, 3]);
503 BinHeap::heap_sort(&mut v);
504 assert_eq!(v, [3, 5, 6, 8]);
505 }
506}