dsa_rust/associative/probing_hash_table.rs
1/*! Safe, open addressing hash table with MAD compression and quadratic probing
2
3# About
4This hash map implementation uses open-addressing to take advantage of cache locality, and relies on a "multiply, add, and divide" (MAD) compression algorithm with quadratic probing to handle collisions.
5
6This structure provides the basis for several other structures in this library, including the [PriorityQueue<K, P>](crate::composite::priority_queue) and [HashSet\<T>](crate::associative::hash_set) implementations.
7
8# Design
9This structure uses Rust's [default hasher](std::hash::DefaultHasher) to hash keys. The default hasher currently represents a cryptographically strong [SipHash-1-3](https://en.wikipedia.org/wiki/SipHash) design as of the publication date (9/2025). In practice, hashing small key objects <= 128 bytes provide similar performance as ULID generation, but hashing arbitrarily large key objects may incur performance penalties.
10
11The primary [HashMap] struct uses a private byte mask (as a "control" list) to track positions with valid ("live"), empty, and removed ("tombstone") entries. Using a conventional free list and overwriting data for a marked slot would break the quadratic probing logic.
12
13```text
14+-------------------------------------------------+
15| ctrl | Option<Entry<K, V>> |
16|------+------------------------------------------|
17| 0 | None |
18| 0 | None |
19| 0 | None |
20| 0 | None |
21| 0 | None |
22| 1 | Some(Entry { key: "Dingus", value: 18 }) |
23| 1 | Some(Entry { key: "Brain", value: 37 }) |
24| 0 | None |
25| 0 | None |
26| 0 | None |
27| 0 | None |
28| 0 | None |
29| 0 | None |
30| 0 | None |
31| 1 | Some(Entry { key: "Remus", value: 22 }) |
32| 187 | None |
33| 0 | None |
34| 1 | Some(Entry { key: "Bobson", value: 36 }) |
35| 0 | None |
36| 0 | None |
37| 0 | None |
38| 187 | None |
39+-------------------------------------------------+
40```
41
42## Insertion
43The structure contains two primary ways to insert entries into the map.
44
45The [put()](crate::associative::probing_hash_table::HashMap::put) operation overwrites values for existing keys, but does not mutate the existing key. This keeps with the standard library's implementation where two keys can be `==` without being identical.
46
47```rust
48 use dsa_rust::associative::probing_hash_table::HashMap;
49
50 // Build a map with put()
51 let mut map = HashMap::<&str, u8>::new();
52 let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
53 let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
54 for (k, v) in names.iter().zip(values.into_iter()) {
55 map.put(k, v);
56 }
57```
58
59The structure also provides a [mut_val_or()](crate::associative::probing_hash_table::HashMap::mut_val_or) operation for mutating the value(s) of the map or inserting a default value. This is useful with the structures iterators to make linear passes over source data during mapping.
60
61
62```rust
63 use dsa_rust::associative::probing_hash_table::HashMap;
64
65 // Build a map with mut_val_or()
66 let mut map = HashMap::<char, u8>::new();
67 let word: &str = "Hello, sickos!";
68 for k in word.chars() {
69 map.mut_val_or(k, |x| *x += 1, 1);
70 }
71
72```
73
74## Removal & Rehashing
75The structure contains a [remove()](crate::associative::probing_hash_table::HashMap::remove) operation that returns an owned entry if it exists in the map. The `remove()` operation marks the index with a tombstone, and uses [std::mem::take] to remove and return the value. Even though `take()` swaps out a `None` variant the index remains unusable until the calling code invokes a rehash operation. This is done in order to retain probing logic. For long-lived maps with many removals/tombstones, the structure provides two re-hashing operations. The [rehash()](crate::associative::probing_hash_table::HashMap::rehash) operation rehashes the entire map (freeing up tombstones) _at the current underlying structure's capacity_. This essentially allows you to permanently remove deleted entries and reduce the map's load factor. Conversely, the [shrink_to_fit()](crate::associative::probing_hash_table::HashMap::shrink_to_fit) operation rehashes the map (freeing up tombstones) down to the minimum backing capacity that still provides a load factor of _<= .5_. Both processes necessarily require _O(n)_ time where _n_ is the number of live entries in the map.
76
77Example of a map that has had many entries removed:
78
79```text
80+-------------------------------------------------+
81| ctrl | Option<Entry<K, V>> |
82|------+------------------------------------------|
83| 1 | Some(Entry { key: 'o', value: 2 }) |
84| 187 | None |
85| 0 | None |
86| 187 | None |
87| 1 | Some(Entry { key: 'l', value: 2 }) |
88| 187 | None |
89| 0 | None |
90| 0 | None |
91| 0 | None |
92| 187 | None |
93| 187 | None |
94| 0 | None |
95| 0 | None |
96| 0 | None |
97| 0 | None |
98| 1 | Some(Entry { key: 'H', value: 1 }) |
99| 0 | None |
100| 187 | None |
101| 0 | None |
102| 0 | None |
103| 1 | Some(Entry { key: 'e', value: 1 }) |
104| 187 | None |
105| 0 | None |
106+-------------------------------------------------+
107```
108
109This example illustrates how the map might look after a `rehash()`:
110
111```text
112+-------------------------------------------------+
113| ctrl | Option<Entry<K, V>> |
114|------+------------------------------------------|
115| 1 | Some(Entry { key: 'H', value: 1 }) |
116| 1 | Some(Entry { key: 'o', value: 2 }) |
117| 0 | None |
118| 0 | None |
119| 0 | None |
120| 0 | None |
121| 0 | None |
122| 0 | None |
123| 0 | None |
124| 0 | None |
125| 0 | None |
126| 0 | None |
127| 1 | Some(Entry { key: 'l', value: 2 }) |
128| 0 | None |
129| 0 | None |
130| 0 | None |
131| 0 | None |
132| 0 | None |
133| 0 | None |
134| 0 | None |
135| 1 | Some(Entry { key: 'e', value: 1 }) |
136| 0 | None |
137| 0 | None |
138+-------------------------------------------------+
139```
140
141This example illustrates how the map might look after a `shrink_to_fit()`:
142
143```text
144+-------------------------------------------------+
145| ctrl | Option<Entry<K, V>> |
146|------+------------------------------------------|
147| 1 | Some(Entry { key: 'H', value: 1 }) |
148| 0 | None |
149| 0 | None |
150| 1 | Some(Entry { key: 'l', value: 2 }) |
151| 1 | Some(Entry { key: 'o', value: 2 }) |
152| 0 | None |
153| 0 | None |
154| 1 | Some(Entry { key: 'e', value: 1 }) |
155| 0 | None |
156| 0 | None |
157| 0 | None |
158+-------------------------------------------------+
159```
160
161# Example
162
163```rust
164
165 use dsa_rust::associative::probing_hash_table::HashMap;
166
167 //Creates a new hash map with str keys and u8 values
168 let mut map = HashMap::<&str, u8>::new();
169
170 println!("Map stats: size: {}, capacity: {}, active entries: {}",
171 map.occupied(),
172 map.capacity(),
173 map.len()
174 );
175
176 // Puts some entries into the map
177 println!("Building the map...");
178 let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
179 let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
180 for (k, v) in names.iter().zip(values.into_iter()) {
181 map.put(k, v);
182 }
183
184 // Checks that the map contains what we'd expect
185 if map.contains("Peter") == false {
186 panic!()
187 };
188 let val = map.get("Peter").unwrap();
189 println!("Peter is {val}");
190
191 // Replaces a value for a given key and
192 // checks that the new value took
193 let new = 41;
194
195 //let old = map.put("Peter", new).unwrap().value;
196 let old = map.get("Peter").unwrap().clone();
197 map.put("Peter", new);
198
199 println!("[Peter's age increased from {old} to {new}]");
200 let val = map.get("Peter").unwrap();
201 println!("Uhhhhh, I meant Peter is {val}");
202
203 // Shows the map and its data
204 println!(
205 "\nMap stats: size: {}, capacity: {}, active entries: {}",
206 map.occupied(),
207 map.capacity(),
208 map.len()
209 );
210 map.contents();
211
212 // Illustrates removing entries
213 println!("\nThere can be only one!");
214 names.remove(0);
215 for e in names {
216 let removed = map.remove(&e);
217 if let Some(entry) = removed {
218 println!("Remove: {}", entry.key());
219 }
220 }
221
222 // The final result
223 println!("\nMap stats: size: {}, capacity: {}, active entries: {}",
224 map.occupied(),
225 map.capacity(),
226 map.len()
227 );
228 map.contents();
229```
230
231*/
232
233use crate::associative::hash_lib;
234//use crate::maps::traits;
235use rand::Rng;
236use std::collections::hash_map::DefaultHasher;
237use std::fmt::Debug;
238use std::hash::{Hash, Hasher};
239
240#[derive(Debug, PartialEq)]
241// Contains the actual key:value pair
242pub struct Entry<K, V> {
243 key: K,
244 value: V,
245}
246impl<K, V> Entry<K, V>
247where
248 K: Debug + Hash + PartialEq,
249 V: Debug,
250{
251 /// Constructs a new Entry
252 fn new(key: K, value: V) -> Entry<K, V> {
253 Entry { key, value }
254 }
255
256 /// Returns the key from an Entry
257 pub fn key(&self) -> &K {
258 &self.key
259 }
260
261 /// Returns the value from an Entry
262 pub fn value(&self) -> &V {
263 &self.value
264 }
265}
266impl<K, V> Default for Entry<K, V>
267where
268 K: Default + Debug + Hash + PartialEq,
269 V: Debug + Default,
270{
271 fn default() -> Self {
272 Self {
273 key: K::default(),
274 value: V::default(),
275 }
276 }
277}
278
279// Byte masks were deemed more performant, but this is valuable historic pattern
280//#[derive(Debug, PartialEq)]
281//pub enum Entry<K, V> {
282// Live { key: K, value: V },
283// Tombstone,
284//}
285//impl<K, V> Entry<K, V>
286//where
287// K: Debug + Hash + PartialEq,
288// V: PartialEq,
289//{
290// /// Constructs a new Entry
291// fn new(key: K, value: V) -> Entry<K, V> {
292// Entry::Live { key, value }
293// }
294//
295// /// Returns the key from an Entry
296// pub fn key(&self) -> Option<&K> {
297// match self {
298// Entry::Live { key, .. } => Some(key),
299// Entry::Tombstone => None,
300// }
301// }
302//
303// /// Returns the value from an Entry
304// pub fn value(&self) -> Option<&V> {
305// match self {
306// Entry::Live { value, .. } => Some(value),
307// Entry::Tombstone => None,
308// }
309// }
310//
311// /// Entry adapter that takes a closure to mutate the value of an entry, if it exists.
312// pub fn mut_val<F>(&mut self, f: F)
313// where
314// F: FnOnce(&mut V),
315// {
316// if let Entry::Live { value, .. } = self {
317// f(value);
318// } else {
319// // No-op: Entry is a Tombstone and that would break the probing logic.
320// }
321// }
322//}
323
324#[derive(Debug)]
325pub struct HashMap<K, V> {
326 pub data: Vec<Option<Entry<K, V>>>, // The primary memory backing
327 pub ctrl: Vec<u8>, // A byte mask to identify available positions
328 size: usize, // The total number of entries in the map (live + deleted)
329 live: usize, // The number of "live" entries in the map
330
331 // NOTE: Prime, scale, and shift are used by the MAD compression algorithm.
332 // These randomly-generated values must remain static for the
333 // lifetime of the backing structure; The grow() operation changes these values
334 prime: usize,
335 scale: usize,
336 shift: usize,
337}
338impl<K, V> Default for HashMap<K, V>
339where
340 K: Debug + Default + Eq + Hash + PartialEq,
341 V: Default + PartialEq + std::fmt::Debug,
342{
343 fn default() -> Self {
344 Self::new()
345 }
346}
347impl<K, V> HashMap<K, V>
348where
349 K: Debug + Eq + Hash + PartialEq,
350 V: std::fmt::Debug,
351{
352 /// Constructor for an empty map with a default capacity of 2.
353 pub fn new() -> Self {
354 let new_capacity = 2;
355 let mut table = Self {
356 data: Vec::with_capacity(new_capacity),
357 ctrl: Vec::with_capacity(new_capacity),
358 prime: 13,
359 scale: 5,
360 shift: 7,
361 size: 0,
362 live: 0,
363 };
364 // Initialize storage to ensure access
365 table.data.resize_with(new_capacity, || None);
366 table.ctrl.resize_with(new_capacity, || 0x00);
367 table
368 }
369
370 /// Constructor for an empty map with a given capacity.
371 pub fn new_with_capacity(size: usize) -> Self {
372 let mut table = Self {
373 data: Vec::with_capacity(size),
374 ctrl: Vec::with_capacity(size),
375 prime: 13,
376 scale: 5,
377 shift: 7,
378 size: 0,
379 live: 0,
380 };
381 // Initialize storage to ensure access
382 table.data.resize_with(size, || None);
383 table.ctrl.resize_with(size, || 0x00);
384 table
385 }
386
387 /// Returns the number of active entries in the map in _O(1)_ time.
388 pub fn len(&self) -> usize {
389 self.live
390 }
391
392 /// Returns the total number of entries in the map (active + deleted entries) in _O(n)_ time.
393 pub fn occupied(&self) -> usize {
394 let mut occupied = 0;
395 for e in self.ctrl.iter() {
396 if *e == 1 || *e == 187 {
397 occupied += 1
398 }
399 }
400 occupied
401 }
402
403 /// Returns the total number of _deleted_ entries in the map in _O(n)_ time.
404 pub fn deleted(&self) -> usize {
405 let mut occupied = 0;
406 for e in self.ctrl.iter() {
407 if *e == 187 {
408 occupied += 1
409 }
410 }
411 occupied
412 }
413
414 /// Returns the total number of available slots in the map in _O(1)_ time.
415 ///
416 /// NOTE: The load factor is the quotient of `len() / capacity()`.
417 pub fn capacity(&self) -> usize {
418 self.data.len()
419 }
420
421 /// Returns true if the map is either empty or contains only empty slots and deleted entries.
422 pub fn is_empty(&self) -> bool {
423 self.live == 0 || self.data.is_empty()
424 }
425
426 /// Pretty-prints the map's contents.
427 pub fn contents(&self) {
428 for (e, m) in self.data.iter().zip(self.ctrl.iter()) {
429 println!(" {m:>3}: {e:?}")
430 }
431 }
432
433 /// Takes a key as a reference and returns a Boolean indicating whether its in
434 /// the map. The expected temporal complexity is _O(1)_, as the map
435 /// maintains a [laod factor](https://www.headyimage.com/cs/dsa/maps/#collision-handling-schemes) of <=.5.
436 pub fn contains<Q>(&self, key: &Q) -> bool
437 where
438 K: std::borrow::Borrow<Q>,
439 Q: Debug + Hash + Eq + ?Sized,
440 {
441 self.find_index(key) >= 0
442 }
443
444 /// Returns a reference to the value associated with a key, if the key exists.
445 pub fn get<Q>(&self, key: &Q) -> Option<&V>
446 //pub fn get(&self, key: &K) -> Option<&V> {
447 where
448 K: std::borrow::Borrow<Q>,
449 Q: Debug + Hash + Eq + ?Sized,
450 {
451 // find_index() uses quadratic probing
452 let location = self.find_index(key);
453 if location >= 0 {
454 let value = &self.data[location as usize].as_ref().unwrap().value;
455 Some(value)
456 } else {
457 None
458 }
459 }
460
461 /// Adds entry `(k, v)`, overwriting any value `v` associated with an
462 /// existing key `k`, returns old value. If a new addition increases
463 /// the map's load factor above the designated threshhold of 0.5
464 /// the map resizes.
465 pub fn put(&mut self, key: K, value: V) -> Option<Entry<K, V>>
466//where
467 // K: std::default::Default,
468 {
469 // Checks if the addition will bring the load factor above threshold
470 if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
471 self.grow();
472 }
473
474 let location = self.find_index(&key);
475
476 // Creates a new Entry and inserts it
477 let entry = Entry::new(key, value);
478 let mut old_entry: Option<Entry<K, V>> = None;
479 if location >= 0 {
480 // Replace an entry
481 //println!("COLLISION!!!! {:?}", &entry.key);
482 old_entry = self.data[location as usize].take();
483 self.data[location as usize] = Some(entry);
484 } else {
485 // Add a new entry
486 self.data[-(location + 1) as usize] = Some(entry);
487 self.ctrl[-(location + 1) as usize] = 0x01;
488 self.size += 1;
489 self.live += 1;
490 };
491 old_entry
492 }
493
494 // Attempts to add entry `(k, v)` to the map, if key `k` does not
495 // already exist. If a new addition increases
496 // the map's load factor above the designated threshhold of 0.5
497 // the map resizes.
498 pub fn try_put(&mut self, key: K, value: V) {
499 // Checks if the addition will bring the load factor above threshold
500 if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
501 self.grow();
502 }
503
504 let location = self.find_index(&key);
505
506 // Avoids duplicates all together
507 if location < 0 {
508 let entry = Entry::new(key, value);
509 self.data[-(location + 1) as usize] = Some(entry);
510 self.ctrl[-(location + 1) as usize] = 0x01;
511 self.size += 1;
512 self.live += 1;
513 }
514 }
515
516 /// Takes a key, a closure, and a default value. If the key is in the map, applies the closure
517 /// to the corresponding entry's value. If the key is not in the map, creates a new entry with
518 /// the provided value.
519 ///
520 /// Example:
521 /// ```rust
522 /// use dsa_rust::associative::probing_hash_table::HashMap;
523 /// let mut count: HashMap<char, u32> = HashMap::new();
524 /// let word: &str = "Hello";
525 /// for k in word.chars() {
526 /// count.mut_val_or(k, |x| *x += 1, 1);
527 /// }
528 /// ```
529 pub fn mut_val_or<F>(&mut self, key: K, f: F, default: V)
530 where
531 F: FnOnce(&mut V),
532 K: std::default::Default,
533 {
534 // Find the appropriate index
535 let index = self.find_index(&key);
536
537 if index >= 0 {
538 // Found existing key, mutate value
539 if let Some(Entry { value, .. }) = self.data[index as usize].as_mut() {
540 f(value);
541 }
542 } else {
543 // Create new entry
544 self.put(key, default);
545 }
546 }
547
548 /// Removes and returns an entry from the map for a given key, if it exists in the map.
549 pub fn remove<Q>(&mut self, key: &Q) -> Option<Entry<K, V>>
550 where
551 K: std::borrow::Borrow<Q>,
552 Q: Debug + Hash + Eq + ?Sized,
553 {
554 //pub fn remove(&mut self, key: K) -> Option<Entry<K, V>> {
555 let location = self.find_index(key);
556
557 let mut entry = None;
558 if location >= 0 {
559 //entry = self.data[location as usize].replace(Entry::default());
560 entry = self.data[location as usize].take();
561 self.ctrl[location as usize] = 0xBB; // Mark with tombstone
562 self.live -= 1;
563 }
564 entry
565 }
566
567 /// Consumes self and returns a new map of the same size, but without any tombstones.
568 /// Works like a cleanup operation in _O(capacity)_ time because `into_iter` checks all positions.
569 pub fn rehash(self) -> Self {
570 let mut new = Self::new_with_capacity(self.data.len());
571 for (k, v) in self.into_iter() {
572 new.put(k, v);
573 }
574 new
575 }
576
577 /// Rebuilds the map to eliminate accumulated tombstones thereby reducing
578 /// spatial bloat. This operation runs in _O(n)_ time and is intended for
579 /// long-lived maps that have undergone many deletions.
580 ///
581 /// The operation consumes the existing map and returns a new `HashMap`
582 /// with the same live entries.
583 pub fn shrink_to_fit(self) -> Self {
584 let mut new = Self::new();
585 for (k, v) in self.into_iter() {
586 new.put(k, v);
587 }
588 new
589 }
590
591 /// Returns an iterator over borrowed values. The resulting values
592 /// appear in random order.
593 ///
594 /// Example use:
595 /// ```rust
596 /// use dsa_rust::associative::probing_hash_table::HashMap;
597 /// let mut count: HashMap<char, u8> = HashMap::new();
598 /// let mut v = Vec::new();
599 /// for e in count.iter() {
600 /// v.push(*e.0);
601 /// }
602 /// ```
603 pub fn iter(&self) -> Iter<'_, K, V> {
604 Iter {
605 iter: self.data.iter(),
606 }
607 }
608
609 /// Returns an iterator over borrowed keys only. The keys
610 /// appear in random order.
611 ///
612 /// Example use:
613 /// ```text
614 /// for e in map.keys() {
615 /// println!("{e}");
616 /// }
617 /// ```
618 pub fn keys(&self) -> Keys<'_, K, V> {
619 Keys {
620 iter: self.data.iter(),
621 }
622 }
623
624 /// Returns an iterator over borrowed values only. The values
625 /// appear in random order.
626 ///
627 /// Example use:
628 /// ```text
629 /// for e in map.values() {
630 /// println!("{e}");
631 /// }
632 /// ```
633 pub fn values(&self) -> Values<'_, K, V> {
634 Values {
635 iter: self.data.iter(),
636 }
637 }
638
639 // UTILITY FUNCTIONS
640 ////////////////////
641
642 /// Finds the correct insertion location using probing
643 /// Searches the map for key:
644 /// if `>= 0`, overwrite the location and return the old Entry,
645 /// if `< 0`, insert new entry at that location, return None
646 fn find_index<Q>(&self, key: &Q) -> isize
647 where
648 K: std::borrow::Borrow<Q>,
649 Q: Debug + Hash + Eq + ?Sized,
650 {
651 let mut i: usize = 1;
652 let hash = Self::hash(&key);
653 let mut current_index = self.compress(hash);
654
655 // Quadratic probing logic
656 loop {
657 match &self.data[current_index] {
658 Some(val) => {
659 if val.key.borrow() == key {
660 return current_index as isize;
661 }
662 }
663 None => {
664 if self.ctrl[current_index] != 0xBB {
665 return -(current_index as isize + 1);
666 }
667 }
668 }
669 current_index = (current_index + i.pow(2)) % self.data.len();
670 i += 1;
671 }
672 //while let Some(entry) = &self.data[current_index] {
673 // if entry.key.borrow() == key && self.ctrl[current_index] != 0xBB {
674 // return current_index as isize;
675 // } else {
676 // current_index = (current_index + i.pow(2)) % self.data.len();
677 // i += 1;
678 // }
679 //}
680 //-(current_index as isize + 1)
681 }
682
683 /// Takes a reference to a type `T` and uses Rust's default hasher to return a
684 /// 64-bit digest.
685 fn hash<T: Hash + Debug + ?Sized>(key: &T) -> usize {
686 let mut hasher = DefaultHasher::new();
687 key.hash(&mut hasher); // Hash::hash()
688 hasher.finish() as usize // Hasher::finish()
689 }
690
691 // Compresses the hash using the MAD algorithm
692 fn compress(&self, hash: usize) -> usize {
693 (hash.wrapping_mul(self.scale)).wrapping_add(self.shift) % (self.prime) % (self.data.len())
694 }
695
696 /// Grows the base (storage) vector to the next prime larger than
697 /// double the length of the original vector, rehashes and compresses
698 /// hashes for new distribution.
699 fn grow(&mut self) {
700 // Create a new base vector with_capacity() and resize_with()
701 // to ensure all indexes exist, otherwise you could push to an
702 // index that doesn't exist causing a panic.
703 // NOTE: Vec::resize_with() may result in "hidden allocation"
704 // despite description that indicates that the function resizes
705 // "in place", initializes new_base with all None values.
706 let new_capacity = hash_lib::next_prime(self.data.len() * 2);
707 let mut new_base: Vec<Option<Entry<K, V>>> = Vec::with_capacity(new_capacity);
708 new_base.resize_with(new_capacity, || None);
709 let mut new_ctrl: Vec<u8> = Vec::with_capacity(new_capacity);
710 new_ctrl.resize_with(new_capacity, || 0x00);
711
712 //println!("Growing from {} to {}", self.data.len(), new_capacity);
713
714 // Reset the instance values for MAD compression
715 // Finds a prime > len, starting much larger to ensure even spread
716 self.prime = hash_lib::next_prime(new_capacity * 2);
717 let mut rng = rand::rng(); // Thread-local RNG
718 self.scale = rng.random_range(1..=self.prime - 1);
719 self.shift = rng.random_range(0..=self.prime - 1);
720
721 // Move entries from self.data into new_base
722 // For each Some entry in old table, calculate new location
723 // via hash/compression and insert the entry into new_base[location]
724 for e in &mut self.data {
725 if let Some(v) = e.take() {
726 // MAD compression algorithm
727 let mut location: usize = ((hash_lib::hash(&v.key)).wrapping_mul(self.scale))
728 //let mut location: usize = ((hash_lib::hash(&v.key().unwrap())).wrapping_mul(self.scale))
729 .wrapping_add(self.shift)
730 % (self.prime)
731 % (new_capacity);
732
733 // Handle potential collisions for new vec insert
734 let mut i: usize = 0;
735 while new_base[location].is_some() {
736 location = (location + i.pow(2)) % new_capacity;
737 i += 1;
738 }
739 new_base[location] = Some(v);
740 new_ctrl[location] = 0x01;
741 }
742 }
743
744 //Update the struct instance to the new collections
745 self.data = new_base;
746 self.ctrl = new_ctrl;
747 }
748}
749
750// Returns a tuple of borrowed key:value pairs (&K, &V)
751//pub struct Iter<'a, K, V> {
752// iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
753//}
754//impl<'a, K, V> Iterator for Iter<'a, K, V>
755// where
756// K: Debug + Hash + PartialEq,
757// V: PartialEq,
758//{
759//
760// type Item = (&'a K, &'a V);
761//
762// fn next(&mut self) -> Option<Self::Item> {
763// for opt_entry in self.iter.by_ref() {
764// if let Some(entry) = opt_entry {
765// return Some((&entry.key, &entry.value));
766// }
767// }
768// None
769// }
770//}
771pub struct Iter<'a, K, V> {
772 iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
773}
774
775impl<'a, K, V> Iterator for Iter<'a, K, V>
776where
777 K: Debug + Hash + PartialEq,
778 V: PartialEq,
779{
780 type Item = (&'a K, &'a V);
781
782 fn next(&mut self) -> Option<Self::Item> {
783 self.iter
784 .find_map(|opt_entry| opt_entry.as_ref().map(|entry| (&entry.key, &entry.value)))
785 }
786}
787
788// Returns a tuple of owned key:value pairs (K, V)
789pub struct IntoIter<K, V> {
790 inner: std::iter::Zip<std::vec::IntoIter<Option<Entry<K, V>>>, std::vec::IntoIter<u8>>,
791}
792impl<K, V> Iterator for IntoIter<K, V> {
793 type Item = (K, V);
794
795 fn next(&mut self) -> Option<Self::Item> {
796 for (slot, tag) in self.inner.by_ref() {
797 if tag == 1 {
798 if let Some(entry) = slot {
799 return Some((entry.key, entry.value));
800 }
801 }
802 }
803 None
804 }
805}
806
807// Allows caller to consume HashMap as an iterator
808impl<K, V> IntoIterator for HashMap<K, V> {
809 type Item = (K, V);
810 type IntoIter = IntoIter<K, V>;
811
812 fn into_iter(self) -> Self::IntoIter {
813 IntoIter {
814 inner: self.data.into_iter().zip(self.ctrl),
815 }
816 }
817}
818
819pub struct Keys<'a, K, V> {
820 iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
821}
822impl<'a, K, V> Iterator for Keys<'a, K, V> {
823 type Item = &'a K;
824
825 fn next(&mut self) -> Option<Self::Item> {
826 if let Some(entry) = self.iter.by_ref().flatten().next() {
827 return Some(&entry.key);
828 }
829 None
830 }
831}
832
833pub struct Values<'a, K, V> {
834 iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
835}
836impl<'a, K, V> Iterator for Values<'a, K, V> {
837 type Item = &'a V;
838
839 fn next(&mut self) -> Option<Self::Item> {
840 if let Some(entry) = self.iter.by_ref().flatten().next() {
841 return Some(&entry.value);
842 }
843 None
844 }
845}
846
847// Unit tests
848/////////////
849
850#[test]
851// Generic type test
852fn probing_hash_table_test() {
853 //Creates a new hash map
854 let mut map = HashMap::<&str, u8>::new();
855
856 assert_eq!(map.data.len(), 2);
857 assert_eq!(map.ctrl.len(), 2);
858
859 // Illustrates that put() and get() work
860 map.put("Peter", 40);
861 assert_eq!(map.size, 1);
862 assert_eq!(map.data.len(), 5);
863 assert_eq!(map.ctrl.len(), 5);
864
865 let fetch = map.get("Peter").unwrap();
866 //assert_eq!(*fetch, 40 as u8);
867 assert_eq!(*fetch, 40);
868
869 // Illustrates that the map grows correctly
870 map.put("Brain", 39); // Grows the map
871 assert_eq!(map.data.len(), 5);
872 map.put("Remus", 22);
873 map.put("Bobson", 36); // Grows the map
874 assert_eq!(map.data.len(), 11);
875 map.put("Dingus", 18);
876 map.put("Dangus", 27); // Grows the map
877 assert_eq!(map.size, 6);
878 assert_eq!(map.data.len(), 23);
879
880 // Illustrates that contains() works as intended
881 assert!(map.contains("Dingus"));
882
883 // Illustrates that put() returns old values and
884 // overwrites existing values upon collision...
885 let collision = map.put("Peter", 41).unwrap();
886 assert_eq!(collision.value, 40_u8);
887 //assert_eq!(*collision.value().unwrap(), 40_u8);
888 let new_val = map.get("Peter").unwrap();
889 //assert_eq!(*new_val, 41 as u8);
890 assert_eq!(*new_val, 41);
891 // Without increasing the list size
892 assert_eq!(map.size, 6);
893
894 // Illustrates that removes entries by key and returns the value
895 assert!(map.contains("Dangus"));
896 let removed = map.remove(&"Dangus").unwrap();
897 assert_eq!(
898 removed,
899 Entry {
900 key: "Dangus",
901 value: 27
902 }
903 );
904 assert!(!map.contains("Dangus"));
905}
906
907#[test]
908// Tests the custom update value function
909fn mut_val_test() {
910 //Creates a new hash map
911 let mut count = HashMap::<char, u8>::new();
912 let phrase: &str = "Hello, sickos";
913
914 // Seeds the map with just the characters and a default value
915 for char in phrase.chars() {
916 count.put(char, 0);
917 }
918
919 eprintln!("\nInitial state:");
920 count.contents();
921
922 // Iterates through the map again, updating each value based on its occurence
923 // NOTE: Can also be used as initial mapping operation with the same code,
924 // but was split here for illustrative purposes
925 for char in phrase.chars() {
926 count.mut_val_or(char, |x| *x += 1, 1);
927 }
928
929 // Pretty-prints the contents of the map
930 eprintln!("\nModified:");
931 count.contents();
932
933 // Uncomment to trigger debug print
934 //assert!(count.is_empty());
935}
936
937#[test]
938// Tests that the structure is iterable
939fn iter_test() {
940 //Creates a new hash map of the frequence letters in the given phrase
941 let mut map = HashMap::<char, u8>::new();
942 let phrase: &str = "Hello, sickos";
943 for char in phrase.chars() {
944 map.mut_val_or(char, |x| *x += 1, 1);
945 }
946
947 eprintln!("\nInitial map state:");
948 map.contents();
949
950 // Iterates through the map, pushing entries
951 // to a Vec as tuples returned from the iterator
952 eprintln!("\nCompact map: ");
953 let mut vec = Vec::new();
954 for e in map.iter() {
955 vec.push(*e.0);
956 eprintln!("{e:?}")
957 }
958
959 eprintln!("\nFinal map state:");
960 map.contents();
961
962 eprintln!("\nFinal vec state:");
963 for e in vec.iter() {
964 eprintln!("{:?}", *e)
965 }
966
967 // Proves that all entries are still valid
968 // Vec contains only keys
969 assert!(vec.contains(&'s'));
970 assert!(vec.contains(&'i'));
971 assert!(vec.contains(&'c'));
972 assert!(vec.contains(&'k'));
973 assert!(vec.contains(&'o'));
974
975 // Map contains whole entries
976 assert!(map.contains(&'s'));
977 assert!(map.contains(&'i'));
978 assert!(map.contains(&'c'));
979 assert!(map.contains(&'k'));
980 assert!(map.contains(&'o'));
981
982 eprintln!("\nPrinting just the keys:");
983 for key in map.keys() {
984 eprintln!("{key}");
985 }
986
987 eprintln!("\nPrinting just the values:");
988 for value in map.values() {
989 eprintln!("{value}");
990 }
991
992 // Consume the map and transfer it to a new map
993 let mut new_map = HashMap::<char, u8>::new();
994 for e in map.into_iter() {
995 new_map.put(e.0, e.1);
996 }
997 // Illegal, map is moved/consumed!
998 //assert!(map.is_empty());
999
1000 // But new_map contains data now
1001 assert!(new_map.contains(&'s'));
1002 assert!(new_map.contains(&'i'));
1003 assert!(new_map.contains(&'c'));
1004 assert!(new_map.contains(&'k'));
1005 assert!(new_map.contains(&'o'));
1006
1007 // Uncomment to trigger debug print
1008 //assert!(map.is_empty());
1009}
1010
1011#[test]
1012// Tests that into_iter consumes the map,
1013// and that rehash and shrink_to_fit delete removed entries
1014fn rehash_test() {
1015 //Creates a new hash map
1016 let mut map = HashMap::<char, u8>::new();
1017 let word: &str = "Hello, sickos!";
1018 for k in word.chars() {
1019 map.mut_val_or(k, |x| *x += 1, 1);
1020 }
1021
1022 // Prints initial state of the map
1023 // and illustrates that values exist
1024 eprintln!("\nInitial state: ");
1025 map.contents();
1026 assert!(map.contains(&','));
1027 assert!(map.contains(&' '));
1028 assert!(map.contains(&'!'));
1029 assert!(map.contains(&'s'));
1030 assert_eq!(map.len(), 11);
1031 assert_eq!(map.data.len(), 23);
1032
1033 // Removes the values
1034 map.remove(&',');
1035 map.remove(&' ');
1036 map.remove(&'!');
1037 map.remove(&'s');
1038 map.remove(&'i');
1039 map.remove(&'c');
1040 map.remove(&'k');
1041
1042 // Checks that the values are removed
1043 eprintln!("\nModified: ");
1044 map.contents();
1045 assert!(!map.contains(&','));
1046 assert!(!map.contains(&' '));
1047 assert!(!map.contains(&'!'));
1048 assert!(!map.contains(&'s'));
1049 assert_eq!(map.len(), 4);
1050 assert_eq!(map.data.len(), 23);
1051
1052 // Rehashes to get rid of deleted items
1053 // map goes out of scope here
1054 let new_map = map.rehash();
1055
1056 // Illegal operation! into_iter has consumed map!
1057 // assert!(map.contains(&'H'));
1058
1059 // Checks that the values are removed
1060 // but the capacity remains the same as before
1061 eprintln!("\nRehash: ");
1062 new_map.contents();
1063 assert!(!new_map.contains(&','));
1064 assert!(!new_map.contains(&' '));
1065 assert!(!new_map.contains(&'!'));
1066 assert!(!new_map.contains(&'s'));
1067 assert_eq!(new_map.len(), 4);
1068 assert_eq!(new_map.data.len(), 23);
1069
1070 // Repeats the process for shrink_to_fit
1071 //
1072 //Creates a new hash map
1073 let mut map = HashMap::<char, u8>::new();
1074 let word: &str = "Hello, sickos!";
1075 for k in word.chars() {
1076 map.mut_val_or(k, |x| *x += 1, 1);
1077 }
1078
1079 // Prints initial state of the map
1080 // and illustrates that values exist
1081 eprintln!("\nInitial state: ");
1082 map.contents();
1083 assert!(map.contains(&','));
1084 assert!(map.contains(&' '));
1085 assert!(map.contains(&'!'));
1086 assert!(map.contains(&'s'));
1087 assert_eq!(map.len(), 11);
1088 assert_eq!(map.data.len(), 23);
1089
1090 // Removes the values
1091 map.remove(&',');
1092 map.remove(&' ');
1093 map.remove(&'!');
1094 map.remove(&'s');
1095 map.remove(&'i');
1096 map.remove(&'c');
1097 map.remove(&'k');
1098
1099 // Checks that the values are removed
1100 eprintln!("\nModified: ");
1101 map.contents();
1102 assert!(!map.contains(&','));
1103 assert!(!map.contains(&' '));
1104 assert!(!map.contains(&'!'));
1105 assert!(!map.contains(&'s'));
1106 assert_eq!(map.len(), 4);
1107 assert_eq!(map.data.len(), 23);
1108
1109 // Rehashes to get rid of deleted items
1110 // map goes out of scope here
1111 let new_map = map.shrink_to_fit();
1112
1113 // Checks that the values are removed
1114 // but the capacity remains the same as before
1115 eprintln!("\nRehash (shrink to fit): ");
1116 new_map.contents();
1117 assert!(!new_map.contains(&','));
1118 assert!(!new_map.contains(&' '));
1119 assert!(!new_map.contains(&'!'));
1120 assert!(!new_map.contains(&'s'));
1121 assert_eq!(new_map.len(), 4);
1122 assert_eq!(new_map.data.len(), 11);
1123
1124 // Uncomment to trigger debug print
1125 //panic!("MANUAL TEST FAILURE");
1126}
1127
1128// TODO: This belongs in either an external runner or in an integration test module
1129pub fn example() {
1130 //Creates a new hash map
1131 let mut map = HashMap::<&str, u8>::new();
1132
1133 let s = format!(
1134 "Map stats: size: {}, capacity: {}, active entries: {}",
1135 map.size,
1136 map.data.len(),
1137 map.live
1138 );
1139 let l = "=".repeat(s.len());
1140 println!("{l}\n{s}");
1141
1142 // Puts some entries into the map
1143 println!("Building the map...");
1144 let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
1145 let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
1146 for (k, v) in names.iter().zip(values.into_iter()) {
1147 map.put(k, v);
1148 }
1149
1150 // Checks that the map contains what we'd expect
1151 if !map.contains("Peter") {
1152 panic!()
1153 };
1154 let val = map.get("Peter").unwrap();
1155 println!("Peter is {val}");
1156
1157 // Replaces a value for a given key and
1158 // checks that the new value took
1159 let new = 41;
1160 let old = map.put("Peter", new).unwrap().value;
1161 println!("[Peter's age increased from {old} to {new}]");
1162 let val = map.get("Peter").unwrap();
1163 println!("Uhhhhh, I meant Peter is {val}");
1164
1165 // Shows the map and its data
1166 println!(
1167 "\nMap stats: size: {}, capacity: {}, active entries: {}",
1168 map.size,
1169 map.data.len(),
1170 map.live
1171 );
1172 for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1173 println!("\t{m:>3}: {e:?}")
1174 }
1175
1176 // Illustrates removing entries
1177 println!("\nThere can be only one!");
1178 names.remove(0);
1179 for e in names {
1180 println!("Remove: {}", map.remove(&e).unwrap().key);
1181 }
1182
1183 // The final result
1184 let s = format!(
1185 "\nMap stats: size: {}, capacity: {}, active entries: {}",
1186 map.size,
1187 map.data.len(),
1188 map.live
1189 );
1190 println!("{s}");
1191 for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1192 println!("\t{m:>3}: {e:?}")
1193 }
1194
1195 let mut count: HashMap<char, u8> = HashMap::new();
1196 let word: &str = "The quick brown fox jumps over the lazy dog";
1197 for k in word.chars() {
1198 count.mut_val_or(k, |x| *x += 1, 1);
1199 }
1200 println!(
1201 "\n\nString: {word:?}
1202\nNOTE: There are 28 entries because T != t, and the space literal ' ' is included.
1203\nMap stats: size: {}, capacity: {}, active entries: {}",
1204 count.occupied(), // Total number of elements
1205 count.capacity(),
1206 count.len() // Active entries
1207 );
1208 for (e, m) in count.data.iter().zip(count.ctrl.iter()) {
1209 println!("\t{m:>3}: {e:?}")
1210 }
1211
1212 let l = "=".repeat(s.len());
1213 println!("\n{l}");
1214}