dsa_rust/associative/probing_hash_table.rs
1/*! Safe, open addressing hash table with MAD compression and quadratic probing
2
3# About
4This hash map implementation uses open-addressing to take advantage of cache locality, and relies on a "multiply, add, and divide" (MAD) compression algorithm with quadratic probing to handle collisions.
5
6This structure provides the basis for several other structures in this library, including the [PriorityQueue<K, P>](crate::composite::priority_queue) and [HashSet\<T>](crate::associative::hash_set) implementations.
7
8# Design
9This structure uses Rust's [default hasher](std::hash::DefaultHasher) to hash keys. The default hasher currently represents a cryptographically strong [SipHash-1-3](https://en.wikipedia.org/wiki/SipHash) design as of the publication date (9/2025). In practice, hashing small key objects <= 128 bytes provide similar performance as ULID generation, but hashing arbitrarily large key objects may incur performance penalties.
10
11The primary [HashMap] struct uses a private byte mask (as a "control" list) to track positions with valid ("live"), empty, and removed ("tombstone") entries. Using a conventional free list and overwriting data for a marked slot would break the quadratic probing logic.
12
13```text
14+-------------------------------------------------+
15| ctrl | Option<Entry<K, V>> |
16|------+------------------------------------------|
17| 0 | None |
18| 0 | None |
19| 0 | None |
20| 0 | None |
21| 0 | None |
22| 1 | Some(Entry { key: "Dingus", value: 18 }) |
23| 1 | Some(Entry { key: "Brain", value: 37 }) |
24| 0 | None |
25| 0 | None |
26| 0 | None |
27| 0 | None |
28| 0 | None |
29| 0 | None |
30| 0 | None |
31| 1 | Some(Entry { key: "Remus", value: 22 }) |
32| 187 | None |
33| 0 | None |
34| 1 | Some(Entry { key: "Bobson", value: 36 }) |
35| 0 | None |
36| 0 | None |
37| 0 | None |
38| 187 | None |
39+-------------------------------------------------+
40```
41
42## Insertion
43The structure contains two primary ways to insert entries into the map.
44
45The [put()](crate::associative::probing_hash_table::HashMap::put) operation overwrites values for existing keys, but does not mutate the existing key. This keeps with the standard library's implementation where two keys can be `==` without being identical.
46
47```rust
48 use dsa_rust::associative::probing_hash_table::HashMap;
49
50 // Build a map with put()
51 let mut map = HashMap::<&str, u8>::new();
52 let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
53 let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
54 for (k, v) in names.iter().zip(values.into_iter()) {
55 map.put(k, v);
56 }
57```
58
59The structure also provides a [mut_val_or()](crate::associative::probing_hash_table::HashMap::mut_val_or) operation for mutating the value(s) of the map or inserting a default value. This is useful with the structures iterators to make linear passes over source data during mapping.
60
61
62```rust
63 use dsa_rust::associative::probing_hash_table::HashMap;
64
65 // Build a map with mut_val_or()
66 let mut map = HashMap::<char, u8>::new();
67 let word: &str = "Hello, sickos!";
68 for k in word.chars() {
69 map.mut_val_or(k, |x| *x += 1, 1);
70 }
71
72```
73
74## Removal & Rehashing
75The structure contains a [remove()](crate::associative::probing_hash_table::HashMap::remove) operation that returns an owned entry if it exists in the map. The `remove()` operation marks the index with a tombstone, and uses [std::mem::take] to remove and return the value. Even though `take()` swaps out a `None` variant the index remains unusable until the calling code invokes a rehash operation. This is done in order to retain probing logic. For long-lived maps with many removals/tombstones, the structure provides two re-hashing operations. The [rehash()](crate::associative::probing_hash_table::HashMap::rehash) operation rehashes the entire map (freeing up tombstones) _at the current underlying structure's capacity_. This essentially allows you to permanently remove deleted entries and reduce the map's load factor. Conversely, the [shrink_to_fit()](crate::associative::probing_hash_table::HashMap::shrink_to_fit) operation rehashes the map (freeing up tombstones) down to the minimum backing capacity that still provides a load factor of _<= .5_. Both processes necessarily require _O(n)_ time where _n_ is the number of live entries in the map.
76
77Example of a map that has had many entries removed:
78
79```text
80+-------------------------------------------------+
81| ctrl | Option<Entry<K, V>> |
82|------+------------------------------------------|
83| 1 | Some(Entry { key: 'o', value: 2 }) |
84| 187 | None |
85| 0 | None |
86| 187 | None |
87| 1 | Some(Entry { key: 'l', value: 2 }) |
88| 187 | None |
89| 0 | None |
90| 0 | None |
91| 0 | None |
92| 187 | None |
93| 187 | None |
94| 0 | None |
95| 0 | None |
96| 0 | None |
97| 0 | None |
98| 1 | Some(Entry { key: 'H', value: 1 }) |
99| 0 | None |
100| 187 | None |
101| 0 | None |
102| 0 | None |
103| 1 | Some(Entry { key: 'e', value: 1 }) |
104| 187 | None |
105| 0 | None |
106+-------------------------------------------------+
107```
108
109This example illustrates how the map might look after a `rehash()`:
110
111```text
112+-------------------------------------------------+
113| ctrl | Option<Entry<K, V>> |
114|------+------------------------------------------|
115| 1 | Some(Entry { key: 'H', value: 1 }) |
116| 1 | Some(Entry { key: 'o', value: 2 }) |
117| 0 | None |
118| 0 | None |
119| 0 | None |
120| 0 | None |
121| 0 | None |
122| 0 | None |
123| 0 | None |
124| 0 | None |
125| 0 | None |
126| 0 | None |
127| 1 | Some(Entry { key: 'l', value: 2 }) |
128| 0 | None |
129| 0 | None |
130| 0 | None |
131| 0 | None |
132| 0 | None |
133| 0 | None |
134| 0 | None |
135| 1 | Some(Entry { key: 'e', value: 1 }) |
136| 0 | None |
137| 0 | None |
138+-------------------------------------------------+
139```
140
141This example illustrates how the map might look after a `shrink_to_fit()`:
142
143```text
144+-------------------------------------------------+
145| ctrl | Option<Entry<K, V>> |
146|------+------------------------------------------|
147| 1 | Some(Entry { key: 'H', value: 1 }) |
148| 0 | None |
149| 0 | None |
150| 1 | Some(Entry { key: 'l', value: 2 }) |
151| 1 | Some(Entry { key: 'o', value: 2 }) |
152| 0 | None |
153| 0 | None |
154| 1 | Some(Entry { key: 'e', value: 1 }) |
155| 0 | None |
156| 0 | None |
157| 0 | None |
158+-------------------------------------------------+
159```
160
161# Example
162
163```rust
164
165 use dsa_rust::associative::probing_hash_table::HashMap;
166
167 //Creates a new hash map with str keys and u8 values
168 let mut map = HashMap::<&str, u8>::new();
169
170 println!("Map stats: size: {}, capacity: {}, active entries: {}",
171 map.occupied(),
172 map.capacity(),
173 map.len()
174 );
175
176 // Puts some entries into the map
177 println!("Building the map...");
178 let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
179 let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
180 for (k, v) in names.iter().zip(values.into_iter()) {
181 map.put(k, v);
182 }
183
184 // Checks that the map contains what we'd expect
185 if map.contains("Peter") == false {
186 panic!()
187 };
188 let val = map.get("Peter").unwrap();
189 println!("Peter is {val}");
190
191 // Replaces a value for a given key and
192 // checks that the new value took
193 let new = 41;
194
195 //let old = map.put("Peter", new).unwrap().value;
196 let old = map.get("Peter").unwrap().clone();
197 map.put("Peter", new);
198
199 println!("[Peter's age increased from {old} to {new}]");
200 let val = map.get("Peter").unwrap();
201 println!("Uhhhhh, I meant Peter is {val}");
202
203 // Shows the map and its data
204 println!(
205 "\nMap stats: size: {}, capacity: {}, active entries: {}",
206 map.occupied(),
207 map.capacity(),
208 map.len()
209 );
210 map.contents();
211
212 // Illustrates removing entries
213 println!("\nThere can be only one!");
214 names.remove(0);
215 for e in names {
216 let removed = map.remove(&e);
217 if let Some(entry) = removed {
218 println!("Remove: {}", entry.key());
219 }
220 }
221
222 // The final result
223 println!("\nMap stats: size: {}, capacity: {}, active entries: {}",
224 map.occupied(),
225 map.capacity(),
226 map.len()
227 );
228 map.contents();
229```
230
231*/
232
233use crate::associative::hash_lib;
234//use crate::maps::traits;
235use rand::Rng;
236use std::collections::hash_map::DefaultHasher;
237use std::fmt::Debug;
238use std::hash::{Hash, Hasher};
239
240#[derive(Debug, PartialEq)]
241// Contains the actual key:value pair
242pub struct Entry<K, V> {
243 key: K,
244 value: V,
245}
246impl<K, V> Entry<K, V>
247where
248 K: Debug + Hash + PartialEq,
249 V: Debug,
250{
251 /// Constructs a new Entry
252 fn new(key: K, value: V) -> Entry<K, V> {
253 Entry { key, value }
254 }
255
256 /// Returns the key from an Entry
257 pub fn key(&self) -> &K {
258 &self.key
259 }
260
261 /// Returns the value from an Entry
262 pub fn value(&self) -> &V {
263 &self.value
264 }
265}
266impl<K, V> Default for Entry<K, V>
267where
268 K: Default + Debug + Hash + PartialEq,
269 V: Debug + Default,
270{
271 fn default() -> Self {
272 Self {
273 key: K::default(),
274 value: V::default(),
275 }
276 }
277}
278
279// Byte masks were deemed more performant, but this is valuable historic pattern
280//#[derive(Debug, PartialEq)]
281//pub enum Entry<K, V> {
282// Live { key: K, value: V },
283// Tombstone,
284//}
285//impl<K, V> Entry<K, V>
286//where
287// K: Debug + Hash + PartialEq,
288// V: PartialEq,
289//{
290// /// Constructs a new Entry
291// fn new(key: K, value: V) -> Entry<K, V> {
292// Entry::Live { key, value }
293// }
294//
295// /// Returns the key from an Entry
296// pub fn key(&self) -> Option<&K> {
297// match self {
298// Entry::Live { key, .. } => Some(key),
299// Entry::Tombstone => None,
300// }
301// }
302//
303// /// Returns the value from an Entry
304// pub fn value(&self) -> Option<&V> {
305// match self {
306// Entry::Live { value, .. } => Some(value),
307// Entry::Tombstone => None,
308// }
309// }
310//
311// /// Entry adapter that takes a closure to mutate the value of an entry, if it exists.
312// pub fn mut_val<F>(&mut self, f: F)
313// where
314// F: FnOnce(&mut V),
315// {
316// if let Entry::Live { value, .. } = self {
317// f(value);
318// } else {
319// // No-op: Entry is a Tombstone and that would break the probing logic.
320// }
321// }
322//}
323
324#[derive(Debug)]
325pub struct HashMap<K, V> {
326 pub data: Vec<Option<Entry<K, V>>>, // The primary memory backing
327 pub ctrl: Vec<u8>, // A byte mask to identify available positions
328 size: usize, // The total number of entries in the map (live + deleted)
329 live: usize, // The number of "live" entries in the map
330
331 // NOTE: Prime, scale, and shift are used by the MAD compression algorithm.
332 // These randomly-generated values must remain static for the
333 // lifetime of the backing structure; The grow() operation changes these values
334 prime: usize,
335 scale: usize,
336 shift: usize,
337}
338impl<K, V> Default for HashMap<K, V>
339where
340 K: Debug + Default + Eq + Hash + PartialEq,
341 V: Default + PartialEq + std::fmt::Debug,
342{
343 fn default() -> Self {
344 Self::new()
345 }
346}
347impl<K, V> HashMap<K, V>
348where
349 K: Debug + Eq + Hash + PartialEq,
350 V: std::fmt::Debug,
351{
352 /// Constructor for an empty map with a default capacity of 2.
353 pub fn new() -> Self {
354 let new_capacity = 2;
355 let mut table = Self {
356 data: Vec::with_capacity(new_capacity),
357 ctrl: Vec::with_capacity(new_capacity),
358 prime: 13,
359 scale: 5,
360 shift: 7,
361 size: 0,
362 live: 0,
363 };
364 // Initialize storage to ensure access
365 table.data.resize_with(new_capacity, || None);
366 table.ctrl.resize_with(new_capacity, || 0x00);
367 table
368 }
369
370 /// Constructor for an empty map with a given capacity.
371 pub fn new_with_capacity(size: usize) -> Self {
372 let mut table = Self {
373 data: Vec::with_capacity(size),
374 ctrl: Vec::with_capacity(size),
375 prime: 13,
376 scale: 5,
377 shift: 7,
378 size: 0,
379 live: 0,
380 };
381 // Initialize storage to ensure access
382 table.data.resize_with(size, || None);
383 table.ctrl.resize_with(size, || 0x00);
384 table
385 }
386
387 /// Returns the number of active entries in the map in _O(1)_ time.
388 pub fn len(&self) -> usize {
389 self.live
390 }
391
392 /// Returns the total number of entries in the map (active + deleted entries) in _O(n)_ time.
393 pub fn occupied(&self) -> usize {
394 let mut occupied = 0;
395 for e in self.ctrl.iter() {
396 if *e == 1 || *e == 187 { occupied += 1 }
397 }
398 occupied
399 }
400
401 /// Returns the total number of _deleted_ entries in the map in _O(n)_ time.
402 pub fn deleted(&self) -> usize {
403 let mut occupied = 0;
404 for e in self.ctrl.iter() {
405 if *e == 187 { occupied += 1 }
406 }
407 occupied
408 }
409
410 /// Returns the total number of available slots in the map in _O(1)_ time.
411 ///
412 /// NOTE: The load factor is the quotient of `len() / capacity()`.
413 pub fn capacity(&self) -> usize {
414 self.data.len()
415 }
416
417 /// Returns true if the map is either empty or contains only empty slots and deleted entries.
418 pub fn is_empty(&self) -> bool {
419 self.live == 0 || self.data.is_empty()
420 }
421
422 /// Pretty-prints the map's contents.
423 pub fn contents(&self) {
424 for (e, m) in self.data.iter().zip(self.ctrl.iter()) {
425 println!(" {m:>3}: {e:?}")
426 }
427 }
428
429 /// Takes a key as a reference and returns a Boolean indicating whether its in
430 /// the map. The expected temporal complexity is _O(1)_, as the map
431 /// maintains a [laod factor](https://www.headyimage.com/cs/dsa/maps/#collision-handling-schemes) of <=.5.
432 pub fn contains<Q>(&self, key: &Q) -> bool
433 where
434 K: std::borrow::Borrow<Q>,
435 Q: Debug + Hash + Eq + ?Sized,
436 {
437 self.find_index(key) >= 0
438 }
439
440 /// Returns a reference to the value associated with a key, if the key exists.
441 pub fn get<Q>(&self, key: &Q) -> Option<&V>
442 //pub fn get(&self, key: &K) -> Option<&V> {
443 where
444 K: std::borrow::Borrow<Q>,
445 Q: Debug + Hash + Eq + ?Sized,
446 {
447 // find_index() uses quadratic probing
448 let location = self.find_index(key);
449 if location >= 0 {
450 let value = &self.data[location as usize].as_ref().unwrap().value;
451 Some(value)
452 } else {
453 None
454 }
455 }
456
457 /// Adds entry `(k, v)`, overwriting any value `v` associated with an
458 /// existing key `k`, returns old value. If a new addition increases
459 /// the map's load factor above the designated threshhold of 0.5
460 /// the map resizes.
461 pub fn put(&mut self, key: K, value: V) -> Option<Entry<K, V>>
462 //where
463 // K: std::default::Default,
464 {
465 // Checks if the addition will bring the load factor above threshold
466 if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
467 self.grow();
468 }
469
470 let location = self.find_index(&key);
471
472 // Creates a new Entry and inserts it
473 let entry = Entry::new(key, value);
474 let mut old_entry: Option<Entry<K, V>> = None;
475 if location >= 0 {
476 // Replace an entry
477 //println!("COLLISION!!!! {:?}", &entry.key);
478 old_entry = self.data[location as usize].take();
479 self.data[location as usize] = Some(entry);
480 } else {
481 // Add a new entry
482 self.data[-(location + 1) as usize] = Some(entry);
483 self.ctrl[-(location + 1) as usize] = 0x01;
484 self.size += 1;
485 self.live += 1;
486 };
487 old_entry
488 }
489
490 // Attempts to add entry `(k, v)` to the map, if key `k` does not
491 // already exist. If a new addition increases
492 // the map's load factor above the designated threshhold of 0.5
493 // the map resizes.
494 pub fn try_put(&mut self, key: K, value: V) {
495 // Checks if the addition will bring the load factor above threshold
496 if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
497 self.grow();
498 }
499
500 let location = self.find_index(&key);
501
502 // Avoids duplicates all together
503 if location < 0 {
504 let entry = Entry::new(key, value);
505 self.data[-(location + 1) as usize] = Some(entry);
506 self.ctrl[-(location + 1) as usize] = 0x01;
507 self.size += 1;
508 self.live += 1;
509 }
510 }
511
512 /// Takes a key, a closure, and a default value. If the key is in the map, applies the closure
513 /// to the corresponding entry's value. If the key is not in the map, creates a new entry with
514 /// the provided value.
515 ///
516 /// Example:
517 /// ```rust
518 /// use dsa_rust::associative::probing_hash_table::HashMap;
519 /// let mut count: HashMap<char, u32> = HashMap::new();
520 /// let word: &str = "Hello";
521 /// for k in word.chars() {
522 /// count.mut_val_or(k, |x| *x += 1, 1);
523 /// }
524 /// ```
525 pub fn mut_val_or<F>(&mut self, key: K, f: F, default: V)
526 where
527 F: FnOnce(&mut V),
528 K: std::default::Default,
529 {
530 // Find the appropriate index
531 let index = self.find_index(&key);
532
533 if index >= 0 {
534 // Found existing key, mutate value
535 if let Some(Entry { value, .. }) = self.data[index as usize].as_mut() {
536 f(value);
537 }
538 } else {
539 // Create new entry
540 self.put(key, default);
541 }
542 }
543
544 /// Removes and returns an entry from the map for a given key, if it exists in the map.
545 pub fn remove<Q>(&mut self, key: &Q) -> Option<Entry<K, V>>
546 where
547 K: std::borrow::Borrow<Q>,
548 Q: Debug + Hash + Eq + ?Sized,
549 {
550 //pub fn remove(&mut self, key: K) -> Option<Entry<K, V>> {
551 let location = self.find_index(key);
552
553 let mut entry = None;
554 if location >= 0 {
555 //entry = self.data[location as usize].replace(Entry::default());
556 entry = self.data[location as usize].take();
557 self.ctrl[location as usize] = 0xBB; // Mark with tombstone
558 self.live -= 1;
559 }
560 entry
561 }
562
563 /// Consumes self and returns a new map of the same size, but without any tombstones.
564 /// Works like a cleanup operation in _O(capacity)_ time because `into_iter` checks all positions.
565 pub fn rehash(self) -> Self {
566 let mut new = Self::new_with_capacity(self.data.len());
567 for (k, v) in self.into_iter() {
568 new.put(k, v);
569 }
570 new
571 }
572
573 /// Rebuilds the map to eliminate accumulated tombstones thereby reducing
574 /// spatial bloat. This operation runs in _O(n)_ time and is intended for
575 /// long-lived maps that have undergone many deletions.
576 ///
577 /// The operation consumes the existing map and returns a new `HashMap`
578 /// with the same live entries.
579 pub fn shrink_to_fit(self) -> Self {
580 let mut new = Self::new();
581 for (k, v) in self.into_iter() {
582 new.put(k, v);
583 }
584 new
585 }
586
587 /// Returns an iterator over borrowed values. The resulting values
588 /// appear in random order.
589 ///
590 /// Example use:
591 /// ```rust
592 /// use dsa_rust::associative::probing_hash_table::HashMap;
593 /// let mut count: HashMap<char, u8> = HashMap::new();
594 /// let mut v = Vec::new();
595 /// for e in count.iter() {
596 /// v.push(*e.0);
597 /// }
598 /// ```
599 pub fn iter(&self) -> Iter<'_, K, V> {
600 Iter {
601 iter: self.data.iter(),
602 }
603 }
604
605 /// Returns an iterator over borrowed keys only. The keys
606 /// appear in random order.
607 ///
608 /// Example use:
609 /// ```text
610 /// for e in map.keys() {
611 /// println!("{e}");
612 /// }
613 /// ```
614 pub fn keys(&self) -> Keys<'_, K, V> {
615 Keys {
616 iter: self.data.iter(),
617 }
618 }
619
620 /// Returns an iterator over borrowed values only. The values
621 /// appear in random order.
622 ///
623 /// Example use:
624 /// ```text
625 /// for e in map.values() {
626 /// println!("{e}");
627 /// }
628 /// ```
629 pub fn values(&self) -> Values<'_, K, V> {
630 Values {
631 iter: self.data.iter(),
632 }
633 }
634
635 // UTILITY FUNCTIONS
636 ////////////////////
637
638 /// Finds the correct insertion location using probing
639 /// Searches the map for key:
640 /// if `>= 0`, overwrite the location and return the old Entry,
641 /// if `< 0`, insert new entry at that location, return None
642 fn find_index<Q>(&self, key: &Q) -> isize
643 where
644 K: std::borrow::Borrow<Q>,
645 Q: Debug + Hash + Eq + ?Sized,
646 {
647 let mut i: usize = 1;
648 let hash = Self::hash(&key);
649 let mut current_index = self.compress(hash);
650
651 // Quadratic probing logic
652 loop {
653 match &self.data[current_index] {
654 Some(val) => {
655 if val.key.borrow() == key {
656 return current_index as isize;
657 }
658 },
659 None => {
660 if self.ctrl[current_index] != 0xBB {
661 return -(current_index as isize + 1)
662 }
663 }
664 }
665 current_index = (current_index + i.pow(2)) % self.data.len();
666 i += 1;
667 }
668 //while let Some(entry) = &self.data[current_index] {
669 // if entry.key.borrow() == key && self.ctrl[current_index] != 0xBB {
670 // return current_index as isize;
671 // } else {
672 // current_index = (current_index + i.pow(2)) % self.data.len();
673 // i += 1;
674 // }
675 //}
676 //-(current_index as isize + 1)
677 }
678
679 /// Takes a reference to a type `T` and uses Rust's default hasher to return a
680 /// 64-bit digest.
681 fn hash<T: Hash + Debug + ?Sized>(key: &T) -> usize {
682 let mut hasher = DefaultHasher::new();
683 key.hash(&mut hasher); // Hash::hash()
684 hasher.finish() as usize // Hasher::finish()
685 }
686
687 // Compresses the hash using the MAD algorithm
688 fn compress(&self, hash: usize) -> usize {
689 (hash.wrapping_mul(self.scale)).wrapping_add(self.shift) % (self.prime) % (self.data.len())
690 }
691
692 /// Grows the base (storage) vector to the next prime larger than
693 /// double the length of the original vector, rehashes and compresses
694 /// hashes for new distribution.
695 fn grow(&mut self) {
696 // Create a new base vector with_capacity() and resize_with()
697 // to ensure all indexes exist, otherwise you could push to an
698 // index that doesn't exist causing a panic.
699 // NOTE: Vec::resize_with() may result in "hidden allocation"
700 // despite description that indicates that the function resizes
701 // "in place", initializes new_base with all None values.
702 let new_capacity = hash_lib::next_prime(self.data.len() * 2);
703 let mut new_base: Vec<Option<Entry<K, V>>> = Vec::with_capacity(new_capacity);
704 new_base.resize_with(new_capacity, || None);
705 let mut new_ctrl: Vec<u8> = Vec::with_capacity(new_capacity);
706 new_ctrl.resize_with(new_capacity, || 0x00);
707
708 //println!("Growing from {} to {}", self.data.len(), new_capacity);
709
710 // Reset the instance values for MAD compression
711 // Finds a prime > len, starting much larger to ensure even spread
712 self.prime = hash_lib::next_prime(new_capacity * 2);
713 let mut rng = rand::rng(); // Thread-local RNG
714 self.scale = rng.random_range(1..=self.prime - 1);
715 self.shift = rng.random_range(0..=self.prime - 1);
716
717 // Move entries from self.data into new_base
718 // For each Some entry in old table, calculate new location
719 // via hash/compression and insert the entry into new_base[location]
720 for e in &mut self.data {
721 if let Some(v) = e.take() {
722 // MAD compression algorithm
723 let mut location: usize = ((hash_lib::hash(&v.key)).wrapping_mul(self.scale))
724 //let mut location: usize = ((hash_lib::hash(&v.key().unwrap())).wrapping_mul(self.scale))
725 .wrapping_add(self.shift)
726 % (self.prime)
727 % (new_capacity);
728
729 // Handle potential collisions for new vec insert
730 let mut i: usize = 0;
731 while new_base[location].is_some() {
732 location = (location + i.pow(2)) % new_capacity;
733 i += 1;
734 }
735 new_base[location] = Some(v);
736 new_ctrl[location] = 0x01;
737 }
738 }
739
740 //Update the struct instance to the new collections
741 self.data = new_base;
742 self.ctrl = new_ctrl;
743 }
744}
745
746// Returns a tuple of borrowed key:value pairs (&K, &V)
747//pub struct Iter<'a, K, V> {
748// iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
749//}
750//impl<'a, K, V> Iterator for Iter<'a, K, V>
751// where
752// K: Debug + Hash + PartialEq,
753// V: PartialEq,
754//{
755//
756// type Item = (&'a K, &'a V);
757//
758// fn next(&mut self) -> Option<Self::Item> {
759// for opt_entry in self.iter.by_ref() {
760// if let Some(entry) = opt_entry {
761// return Some((&entry.key, &entry.value));
762// }
763// }
764// None
765// }
766//}
767pub struct Iter<'a, K, V> {
768 iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
769}
770
771impl<'a, K, V> Iterator for Iter<'a, K, V>
772where
773 K: Debug + Hash + PartialEq,
774 V: PartialEq,
775{
776 type Item = (&'a K, &'a V);
777
778 fn next(&mut self) -> Option<Self::Item> {
779 self.iter
780 .find_map(|opt_entry| opt_entry.as_ref().map(|entry| (&entry.key, &entry.value)))
781 }
782}
783
784// Returns a tuple of owned key:value pairs (K, V)
785pub struct IntoIter<K, V> {
786 inner: std::iter::Zip<
787 std::vec::IntoIter<Option<Entry<K, V>>>,
788 std::vec::IntoIter<u8>
789 >,
790}
791impl<K, V> Iterator for IntoIter<K, V> {
792 type Item = (K, V);
793
794 fn next(&mut self) -> Option<Self::Item> {
795 for (slot, tag) in self.inner.by_ref() {
796 if tag == 1 {
797 if let Some(entry) = slot {
798 return Some((entry.key, entry.value));
799 }
800 }
801
802 }
803 None
804 }
805}
806
807// Allows caller to consume HashMap as an iterator
808impl<K, V> IntoIterator for HashMap<K, V> {
809 type Item = (K, V);
810 type IntoIter = IntoIter<K, V>;
811
812 fn into_iter(self) -> Self::IntoIter {
813 IntoIter {
814 inner: self.data.into_iter().zip(self.ctrl),
815 }
816 }
817}
818
819pub struct Keys<'a, K, V> {
820 iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
821}
822impl<'a, K, V> Iterator for Keys<'a, K, V> {
823 type Item = &'a K;
824
825 fn next(&mut self) -> Option<Self::Item> {
826 if let Some(entry) = self.iter.by_ref().flatten().next() {
827 return Some(&entry.key);
828 }
829 None
830 }
831}
832
833pub struct Values<'a, K, V> {
834 iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
835}
836impl<'a, K, V> Iterator for Values<'a, K, V> {
837 type Item = &'a V;
838
839 fn next(&mut self) -> Option<Self::Item> {
840 if let Some(entry) = self.iter.by_ref().flatten().next() {
841 return Some(&entry.value);
842 }
843 None
844 }
845}
846
847// Unit tests
848/////////////
849
850#[test]
851// Generic type test
852fn probing_hash_table_test() {
853 //Creates a new hash map
854 let mut map = HashMap::<&str, u8>::new();
855
856 assert_eq!(map.data.len(), 2);
857 assert_eq!(map.ctrl.len(), 2);
858
859 // Illustrates that put() and get() work
860 map.put("Peter", 40);
861 assert_eq!(map.size, 1);
862 assert_eq!(map.data.len(), 5);
863 assert_eq!(map.ctrl.len(), 5);
864
865 let fetch = map.get("Peter").unwrap();
866 //assert_eq!(*fetch, 40 as u8);
867 assert_eq!(*fetch, 40);
868
869 // Illustrates that the map grows correctly
870 map.put("Brain", 39); // Grows the map
871 assert_eq!(map.data.len(), 5);
872 map.put("Remus", 22);
873 map.put("Bobson", 36); // Grows the map
874 assert_eq!(map.data.len(), 11);
875 map.put("Dingus", 18);
876 map.put("Dangus", 27); // Grows the map
877 assert_eq!(map.size, 6);
878 assert_eq!(map.data.len(), 23);
879
880 // Illustrates that contains() works as intended
881 assert!(map.contains("Dingus"));
882
883 // Illustrates that put() returns old values and
884 // overwrites existing values upon collision...
885 let collision = map.put("Peter", 41).unwrap();
886 assert_eq!(collision.value, 40_u8);
887 //assert_eq!(*collision.value().unwrap(), 40_u8);
888 let new_val = map.get("Peter").unwrap();
889 //assert_eq!(*new_val, 41 as u8);
890 assert_eq!(*new_val, 41);
891 // Without increasing the list size
892 assert_eq!(map.size, 6);
893
894 // Illustrates that removes entries by key and returns the value
895 assert!(map.contains("Dangus"));
896 let removed = map.remove(&"Dangus").unwrap();
897 assert_eq!(
898 removed,
899 Entry {
900 key: "Dangus",
901 value: 27
902 }
903 );
904 assert!(!map.contains("Dangus"));
905}
906
907#[test]
908// Tests the custom update value function
909fn mut_val_test() {
910 //Creates a new hash map
911 let mut count = HashMap::<char, u8>::new();
912 let phrase: &str = "Hello, sickos";
913
914 // Seeds the map with just the characters and a default value
915 for char in phrase.chars() {
916 count.put(char, 0);
917 }
918
919 eprintln!("\nInitial state:");
920 count.contents();
921
922 // Iterates through the map again, updating each value based on its occurence
923 // NOTE: Can also be used as initial mapping operation with the same code,
924 // but was split here for illustrative purposes
925 for char in phrase.chars() {
926 count.mut_val_or(char, |x| *x += 1, 1);
927 }
928
929 // Pretty-prints the contents of the map
930 eprintln!("\nModified:");
931 count.contents();
932
933 // Uncomment to trigger debug print
934 //assert!(count.is_empty());
935}
936
937#[test]
938// Tests that the structure is iterable
939fn iter_test() {
940 //Creates a new hash map of the frequence letters in the given phrase
941 let mut map = HashMap::<char, u8>::new();
942 let phrase: &str = "Hello, sickos";
943 for char in phrase.chars() {
944 map.mut_val_or(char, |x| *x += 1, 1);
945 }
946
947 eprintln!("\nInitial map state:");
948 map.contents();
949
950 // Iterates through the map, pushing entries
951 // to a Vec as tuples returned from the iterator
952 eprintln!("\nCompact map: ");
953 let mut vec = Vec::new();
954 for e in map.iter() {
955 vec.push(*e.0);
956 eprintln!("{e:?}")
957 }
958
959 eprintln!("\nFinal map state:");
960 map.contents();
961
962 eprintln!("\nFinal vec state:");
963 for e in vec.iter() {
964 eprintln!("{:?}", *e)
965 }
966
967 // Proves that all entries are still valid
968 // Vec contains only keys
969 assert!(vec.contains(&'s'));
970 assert!(vec.contains(&'i'));
971 assert!(vec.contains(&'c'));
972 assert!(vec.contains(&'k'));
973 assert!(vec.contains(&'o'));
974
975 // Map contains whole entries
976 assert!(map.contains(&'s'));
977 assert!(map.contains(&'i'));
978 assert!(map.contains(&'c'));
979 assert!(map.contains(&'k'));
980 assert!(map.contains(&'o'));
981
982 eprintln!("\nPrinting just the keys:");
983 for key in map.keys() {
984 eprintln!("{key}");
985 }
986
987 eprintln!("\nPrinting just the values:");
988 for value in map.values() {
989 eprintln!("{value}");
990 }
991
992 // Consume the map and transfer it to a new map
993 let mut new_map = HashMap::<char, u8>::new();
994 for e in map.into_iter() {
995 new_map.put(e.0, e.1);
996 }
997 // Illegal, map is moved/consumed!
998 //assert!(map.is_empty());
999
1000 // But new_map contains data now
1001 assert!(new_map.contains(&'s'));
1002 assert!(new_map.contains(&'i'));
1003 assert!(new_map.contains(&'c'));
1004 assert!(new_map.contains(&'k'));
1005 assert!(new_map.contains(&'o'));
1006
1007 // Uncomment to trigger debug print
1008 //assert!(map.is_empty());
1009}
1010
1011#[test]
1012// Tests that into_iter consumes the map,
1013// and that rehash and shrink_to_fit delete removed entries
1014fn rehash_test() {
1015 //Creates a new hash map
1016 let mut map = HashMap::<char, u8>::new();
1017 let word: &str = "Hello, sickos!";
1018 for k in word.chars() {
1019 map.mut_val_or(k, |x| *x += 1, 1);
1020 }
1021
1022 // Prints initial state of the map
1023 // and illustrates that values exist
1024 eprintln!("\nInitial state: ");
1025 map.contents();
1026 assert!(map.contains(&','));
1027 assert!(map.contains(&' '));
1028 assert!(map.contains(&'!'));
1029 assert!(map.contains(&'s'));
1030 assert_eq!(map.len(), 11);
1031 assert_eq!(map.data.len(), 23);
1032
1033 // Removes the values
1034 map.remove(&',');
1035 map.remove(&' ');
1036 map.remove(&'!');
1037 map.remove(&'s');
1038 map.remove(&'i');
1039 map.remove(&'c');
1040 map.remove(&'k');
1041
1042 // Checks that the values are removed
1043 eprintln!("\nModified: ");
1044 map.contents();
1045 assert!(!map.contains(&','));
1046 assert!(!map.contains(&' '));
1047 assert!(!map.contains(&'!'));
1048 assert!(!map.contains(&'s'));
1049 assert_eq!(map.len(), 4);
1050 assert_eq!(map.data.len(), 23);
1051
1052 // Rehashes to get rid of deleted items
1053 // map goes out of scope here
1054 let new_map = map.rehash();
1055
1056 // Illegal operation! into_iter has consumed map!
1057 // assert!(map.contains(&'H'));
1058
1059 // Checks that the values are removed
1060 // but the capacity remains the same as before
1061 eprintln!("\nRehash: ");
1062 new_map.contents();
1063 assert!(!new_map.contains(&','));
1064 assert!(!new_map.contains(&' '));
1065 assert!(!new_map.contains(&'!'));
1066 assert!(!new_map.contains(&'s'));
1067 assert_eq!(new_map.len(), 4);
1068 assert_eq!(new_map.data.len(), 23);
1069
1070 // Repeats the process for shrink_to_fit
1071 //
1072 //Creates a new hash map
1073 let mut map = HashMap::<char, u8>::new();
1074 let word: &str = "Hello, sickos!";
1075 for k in word.chars() {
1076 map.mut_val_or(k, |x| *x += 1, 1);
1077 }
1078
1079 // Prints initial state of the map
1080 // and illustrates that values exist
1081 eprintln!("\nInitial state: ");
1082 map.contents();
1083 assert!(map.contains(&','));
1084 assert!(map.contains(&' '));
1085 assert!(map.contains(&'!'));
1086 assert!(map.contains(&'s'));
1087 assert_eq!(map.len(), 11);
1088 assert_eq!(map.data.len(), 23);
1089
1090 // Removes the values
1091 map.remove(&',');
1092 map.remove(&' ');
1093 map.remove(&'!');
1094 map.remove(&'s');
1095 map.remove(&'i');
1096 map.remove(&'c');
1097 map.remove(&'k');
1098
1099 // Checks that the values are removed
1100 eprintln!("\nModified: ");
1101 map.contents();
1102 assert!(!map.contains(&','));
1103 assert!(!map.contains(&' '));
1104 assert!(!map.contains(&'!'));
1105 assert!(!map.contains(&'s'));
1106 assert_eq!(map.len(), 4);
1107 assert_eq!(map.data.len(), 23);
1108
1109 // Rehashes to get rid of deleted items
1110 // map goes out of scope here
1111 let new_map = map.shrink_to_fit();
1112
1113 // Checks that the values are removed
1114 // but the capacity remains the same as before
1115 eprintln!("\nRehash (shrink to fit): ");
1116 new_map.contents();
1117 assert!(!new_map.contains(&','));
1118 assert!(!new_map.contains(&' '));
1119 assert!(!new_map.contains(&'!'));
1120 assert!(!new_map.contains(&'s'));
1121 assert_eq!(new_map.len(), 4);
1122 assert_eq!(new_map.data.len(), 11);
1123
1124 // Uncomment to trigger debug print
1125 //panic!("MANUAL TEST FAILURE");
1126}
1127
1128
1129// TODO: This belongs in either an external runner or in an integration test module
1130pub fn example() {
1131 //Creates a new hash map
1132 let mut map = HashMap::<&str, u8>::new();
1133
1134 let s = format!(
1135 "Map stats: size: {}, capacity: {}, active entries: {}",
1136 map.size,
1137 map.data.len(),
1138 map.live
1139 );
1140 let l = "=".repeat(s.len());
1141 println!("{l}\n{s}");
1142
1143 // Puts some entries into the map
1144 println!("Building the map...");
1145 let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
1146 let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
1147 for (k, v) in names.iter().zip(values.into_iter()) {
1148 map.put(k, v);
1149 }
1150
1151 // Checks that the map contains what we'd expect
1152 if !map.contains("Peter") {
1153 panic!()
1154 };
1155 let val = map.get("Peter").unwrap();
1156 println!("Peter is {val}");
1157
1158 // Replaces a value for a given key and
1159 // checks that the new value took
1160 let new = 41;
1161 let old = map.put("Peter", new).unwrap().value;
1162 println!("[Peter's age increased from {old} to {new}]");
1163 let val = map.get("Peter").unwrap();
1164 println!("Uhhhhh, I meant Peter is {val}");
1165
1166 // Shows the map and its data
1167 println!(
1168 "\nMap stats: size: {}, capacity: {}, active entries: {}",
1169 map.size,
1170 map.data.len(),
1171 map.live
1172 );
1173 for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1174 println!("\t{m:>3}: {e:?}")
1175 }
1176
1177 // Illustrates removing entries
1178 println!("\nThere can be only one!");
1179 names.remove(0);
1180 for e in names {
1181 println!("Remove: {}", map.remove(&e).unwrap().key);
1182 }
1183
1184 // The final result
1185 let s = format!(
1186 "\nMap stats: size: {}, capacity: {}, active entries: {}",
1187 map.size,
1188 map.data.len(),
1189 map.live
1190 );
1191 println!("{s}");
1192 for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1193 println!("\t{m:>3}: {e:?}")
1194 }
1195
1196 let mut count: HashMap<char, u8> = HashMap::new();
1197 let word: &str = "The quick brown fox jumps over the lazy dog";
1198 for k in word.chars() {
1199 count.mut_val_or(k, |x| *x += 1, 1);
1200 }
1201 println!(
1202 "\n\nString: {word:?}
1203\nNOTE: There are 28 entries because T != t, and the space literal ' ' is included.
1204\nMap stats: size: {}, capacity: {}, active entries: {}",
1205 count.occupied(), // Total number of elements
1206 count.capacity(),
1207 count.len() // Active entries
1208
1209 );
1210 for (e, m) in count.data.iter().zip(count.ctrl.iter()) {
1211 println!("\t{m:>3}: {e:?}")
1212 }
1213
1214 let l = "=".repeat(s.len());
1215 println!("\n{l}");
1216
1217}