dsa_rust/associative/
probing_hash_table.rs

1/*! Safe, open addressing hash table with MAD compression and quadratic probing
2
3# About
4This hash map implementation uses open-addressing to take advantage of cache locality, and relies on a "multiply, add, and divide" (MAD) compression algorithm with quadratic probing to handle collisions.
5
6This structure provides the basis for several other structures in this library, including the [PriorityQueue<K, P>](crate::composite::priority_queue) and [HashSet\<T>](crate::associative::hash_set) implementations.
7
8# Design
9This structure uses Rust's [default hasher](std::hash::DefaultHasher) to hash keys. The default hasher currently represents a cryptographically strong [SipHash-1-3](https://en.wikipedia.org/wiki/SipHash) design as of the publication date (9/2025). In practice, hashing small key objects <= 128 bytes provide similar performance as ULID generation, but hashing arbitrarily large key objects may incur performance penalties.
10
11The primary [HashMap] struct uses a private byte mask (as a "control" list) to track positions with valid ("live"), empty, and removed ("tombstone") entries. Using a conventional free list and overwriting data for a marked slot would break the quadratic probing logic.
12
13```text
14+-------------------------------------------------+
15| ctrl | Option<Entry<K, V>>                      |
16|------+------------------------------------------|
17|    0 | None                                     |
18|    0 | None                                     |
19|    0 | None                                     |
20|    0 | None                                     |
21|    0 | None                                     |
22|    1 | Some(Entry { key: "Dingus", value: 18 }) |
23|    1 | Some(Entry { key: "Brain", value: 37 })  |
24|    0 | None                                     |
25|    0 | None                                     |
26|    0 | None                                     |
27|    0 | None                                     |
28|    0 | None                                     |
29|    0 | None                                     |
30|    0 | None                                     |
31|    1 | Some(Entry { key: "Remus", value: 22 })  |
32|  187 | None                                     |
33|    0 | None                                     |
34|    1 | Some(Entry { key: "Bobson", value: 36 }) |
35|    0 | None                                     |
36|    0 | None                                     |
37|    0 | None                                     |
38|  187 | None                                     |
39+-------------------------------------------------+
40```
41
42## Insertion
43The structure contains two primary ways to insert entries into the map.
44
45The [put()](crate::associative::probing_hash_table::HashMap::put) operation overwrites values for existing keys, but does not mutate the existing key. This keeps with the standard library's implementation where two keys can be `==` without being identical.
46
47```rust
48    use dsa_rust::associative::probing_hash_table::HashMap;
49
50    // Build a map with put()
51    let mut map = HashMap::<&str, u8>::new();
52    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
53    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
54    for (k, v) in names.iter().zip(values.into_iter()) {
55        map.put(k, v);
56    }
57```
58
59The structure also provides a [mut_val_or()](crate::associative::probing_hash_table::HashMap::mut_val_or) operation for mutating the value(s) of the map or inserting a default value. This is useful with the structures iterators to make linear passes over source data during mapping.
60
61
62```rust
63    use dsa_rust::associative::probing_hash_table::HashMap;
64
65    // Build a map with mut_val_or()
66    let mut map = HashMap::<char, u8>::new();
67    let word: &str = "Hello, sickos!";
68    for k in word.chars() {
69        map.mut_val_or(k, |x| *x += 1, 1);
70    }
71
72```
73
74## Removal & Rehashing
75The structure contains a [remove()](crate::associative::probing_hash_table::HashMap::remove) operation that returns an owned entry if it exists in the map. The `remove()` operation marks the index with a tombstone, and uses [std::mem::take] to remove and return the value. Even though `take()` swaps out a `None` variant the index remains unusable until the calling code invokes a rehash operation. This is done in order to retain probing logic. For long-lived maps with many removals/tombstones, the structure provides two re-hashing operations. The [rehash()](crate::associative::probing_hash_table::HashMap::rehash) operation rehashes the entire map (freeing up tombstones) _at the current underlying structure's capacity_. This essentially allows you to permanently remove deleted entries and reduce the map's load factor. Conversely, the [shrink_to_fit()](crate::associative::probing_hash_table::HashMap::shrink_to_fit) operation rehashes the map (freeing up tombstones) down to the minimum backing capacity that still provides a load factor of _<= .5_. Both processes necessarily require _O(n)_ time where _n_ is the number of live entries in the map.
76
77Example of a map that has had many entries removed:
78
79```text
80+-------------------------------------------------+
81| ctrl | Option<Entry<K, V>>                      |
82|------+------------------------------------------|
83|    1 | Some(Entry { key: 'o', value: 2 })       |
84|  187 | None                                     |
85|    0 | None                                     |
86|  187 | None                                     |
87|    1 | Some(Entry { key: 'l', value: 2 })       |
88|  187 | None                                     |
89|    0 | None                                     |
90|    0 | None                                     |
91|    0 | None                                     |
92|  187 | None                                     |
93|  187 | None                                     |
94|    0 | None                                     |
95|    0 | None                                     |
96|    0 | None                                     |
97|    0 | None                                     |
98|    1 | Some(Entry { key: 'H', value: 1 })       |
99|    0 | None                                     |
100|  187 | None                                     |
101|    0 | None                                     |
102|    0 | None                                     |
103|    1 | Some(Entry { key: 'e', value: 1 })       |
104|  187 | None                                     |
105|    0 | None                                     |
106+-------------------------------------------------+
107```
108
109This example illustrates how the map might look after a `rehash()`:
110
111```text
112+-------------------------------------------------+
113| ctrl | Option<Entry<K, V>>                      |
114|------+------------------------------------------|
115|    1 | Some(Entry { key: 'H', value: 1 })       |
116|    1 | Some(Entry { key: 'o', value: 2 })       |
117|    0 | None                                     |
118|    0 | None                                     |
119|    0 | None                                     |
120|    0 | None                                     |
121|    0 | None                                     |
122|    0 | None                                     |
123|    0 | None                                     |
124|    0 | None                                     |
125|    0 | None                                     |
126|    0 | None                                     |
127|    1 | Some(Entry { key: 'l', value: 2 })       |
128|    0 | None                                     |
129|    0 | None                                     |
130|    0 | None                                     |
131|    0 | None                                     |
132|    0 | None                                     |
133|    0 | None                                     |
134|    0 | None                                     |
135|    1 | Some(Entry { key: 'e', value: 1 })       |
136|    0 | None                                     |
137|    0 | None                                     |
138+-------------------------------------------------+
139```
140
141This example illustrates how the map might look after a `shrink_to_fit()`:
142
143```text
144+-------------------------------------------------+
145| ctrl | Option<Entry<K, V>>                      |
146|------+------------------------------------------|
147|    1 | Some(Entry { key: 'H', value: 1 })       |
148|    0 | None                                     |
149|    0 | None                                     |
150|    1 | Some(Entry { key: 'l', value: 2 })       |
151|    1 | Some(Entry { key: 'o', value: 2 })       |
152|    0 | None                                     |
153|    0 | None                                     |
154|    1 | Some(Entry { key: 'e', value: 1 })       |
155|    0 | None                                     |
156|    0 | None                                     |
157|    0 | None                                     |
158+-------------------------------------------------+
159```
160
161# Example
162
163```rust
164
165    use dsa_rust::associative::probing_hash_table::HashMap;
166
167    //Creates a new hash map with str keys and u8 values
168    let mut map = HashMap::<&str, u8>::new();
169
170    println!("Map stats: size: {}, capacity: {}, active entries: {}",
171        map.occupied(),
172        map.capacity(),
173        map.len()
174    );
175
176    // Puts some entries into the map
177    println!("Building the map...");
178    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
179    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
180    for (k, v) in names.iter().zip(values.into_iter()) {
181        map.put(k, v);
182    }
183
184    // Checks that the map contains what we'd expect
185    if map.contains("Peter") == false {
186        panic!()
187    };
188    let val = map.get("Peter").unwrap();
189    println!("Peter is {val}");
190
191    // Replaces a value for a given key and
192    // checks that the new value took
193    let new = 41;
194
195    //let old = map.put("Peter", new).unwrap().value;
196    let old = map.get("Peter").unwrap().clone();
197    map.put("Peter", new);
198
199    println!("[Peter's age increased from {old} to {new}]");
200    let val = map.get("Peter").unwrap();
201    println!("Uhhhhh, I meant Peter is {val}");
202
203    // Shows the map and its data
204    println!(
205        "\nMap stats: size: {}, capacity: {}, active entries: {}",
206        map.occupied(),
207        map.capacity(),
208        map.len()
209    );
210    map.contents();
211
212    // Illustrates removing entries
213    println!("\nThere can be only one!");
214    names.remove(0);
215    for e in names {
216        let removed = map.remove(&e);
217        if let Some(entry) = removed {
218            println!("Remove: {}", entry.key());
219        }
220    }
221
222    // The final result
223    println!("\nMap stats: size: {}, capacity: {}, active entries: {}",
224        map.occupied(),
225        map.capacity(),
226        map.len()
227    );
228    map.contents();
229```
230
231*/
232
233use crate::associative::hash_lib;
234//use crate::maps::traits;
235use rand::Rng;
236use std::collections::hash_map::DefaultHasher;
237use std::fmt::Debug;
238use std::hash::{Hash, Hasher};
239
240#[derive(Debug, PartialEq)]
241// Contains the actual key:value pair
242pub struct Entry<K, V> {
243    key: K,
244    value: V,
245}
246impl<K, V> Entry<K, V>
247where
248    K: Debug + Hash + PartialEq,
249    V: Debug,
250{
251    /// Constructs a new Entry
252    fn new(key: K, value: V) -> Entry<K, V> {
253        Entry { key, value }
254    }
255
256    /// Returns the key from an Entry
257    pub fn key(&self) -> &K {
258        &self.key
259    }
260
261    /// Returns the value from an Entry
262    pub fn value(&self) -> &V {
263        &self.value
264    }
265}
266impl<K, V> Default for Entry<K, V>
267where
268    K: Default + Debug + Hash + PartialEq,
269    V: Debug + Default,
270{
271    fn default() -> Self {
272        Self {
273            key: K::default(),
274            value: V::default(),
275        }
276    }
277}
278
279// Byte masks were deemed more performant, but this is valuable historic pattern
280//#[derive(Debug, PartialEq)]
281//pub enum Entry<K, V> {
282//    Live { key: K, value: V },
283//    Tombstone,
284//}
285//impl<K, V> Entry<K, V>
286//where
287//    K: Debug + Hash + PartialEq,
288//    V: PartialEq,
289//{
290//    /// Constructs a new Entry
291//    fn new(key: K, value: V) -> Entry<K, V> {
292//        Entry::Live { key, value }
293//    }
294//
295//    /// Returns the key from an Entry
296//    pub fn key(&self) -> Option<&K> {
297//        match self {
298//            Entry::Live { key, .. } => Some(key),
299//            Entry::Tombstone => None,
300//        }
301//    }
302//
303//    /// Returns the value from an Entry
304//    pub fn value(&self) -> Option<&V> {
305//        match self {
306//            Entry::Live { value, .. } => Some(value),
307//            Entry::Tombstone => None,
308//        }
309//    }
310//
311//    /// Entry adapter that takes a closure to mutate the value of an entry, if it exists.
312//    pub fn mut_val<F>(&mut self, f: F)
313//    where
314//        F: FnOnce(&mut V),
315//    {
316//        if let Entry::Live { value, .. } = self {
317//            f(value);
318//        } else {
319//            // No-op: Entry is a Tombstone and that would break the probing logic.
320//        }
321//    }
322//}
323
324#[derive(Debug)]
325pub struct HashMap<K, V> {
326    pub data: Vec<Option<Entry<K, V>>>, // The primary memory backing
327    pub ctrl: Vec<u8>,                  // A byte mask to identify available positions
328    size: usize,                        // The total number of entries in the map (live + deleted)
329    live: usize,                        // The number of "live" entries in the map
330
331    // NOTE: Prime, scale, and shift are used by the MAD compression algorithm.
332    // These randomly-generated values must remain static for the
333    // lifetime of the backing structure; The grow() operation changes these values
334    prime: usize,
335    scale: usize,
336    shift: usize,
337}
338impl<K, V> Default for HashMap<K, V>
339where
340    K: Debug + Default + Eq + Hash + PartialEq,
341    V: Default + PartialEq + std::fmt::Debug,
342{
343    fn default() -> Self {
344        Self::new()
345    }
346}
347impl<K, V> HashMap<K, V>
348where
349    K: Debug + Eq + Hash + PartialEq,
350    V: std::fmt::Debug,
351{
352    /// Constructor for an empty map with a default capacity of 2.
353    pub fn new() -> Self {
354        let new_capacity = 2;
355        let mut table = Self {
356            data: Vec::with_capacity(new_capacity),
357            ctrl: Vec::with_capacity(new_capacity),
358            prime: 13,
359            scale: 5,
360            shift: 7,
361            size: 0,
362            live: 0,
363        };
364        // Initialize storage to ensure access
365        table.data.resize_with(new_capacity, || None);
366        table.ctrl.resize_with(new_capacity, || 0x00);
367        table
368    }
369
370    /// Constructor for an empty map with a given capacity.
371    pub fn new_with_capacity(size: usize) -> Self {
372        let mut table = Self {
373            data: Vec::with_capacity(size),
374            ctrl: Vec::with_capacity(size),
375            prime: 13,
376            scale: 5,
377            shift: 7,
378            size: 0,
379            live: 0,
380        };
381        // Initialize storage to ensure access
382        table.data.resize_with(size, || None);
383        table.ctrl.resize_with(size, || 0x00);
384        table
385    }
386
387    /// Returns the number of active entries in the map in _O(1)_ time.
388    pub fn len(&self) -> usize {
389        self.live
390    }
391
392    /// Returns the total number of entries in the map (active + deleted entries) in _O(n)_ time.
393    pub fn occupied(&self) -> usize {
394        let mut occupied = 0;
395        for e in self.ctrl.iter() {
396            if *e == 1 || *e == 187 {
397                occupied += 1
398            }
399        }
400        occupied
401    }
402
403    /// Returns the total number of _deleted_ entries in the map in _O(n)_ time.
404    pub fn deleted(&self) -> usize {
405        let mut occupied = 0;
406        for e in self.ctrl.iter() {
407            if *e == 187 {
408                occupied += 1
409            }
410        }
411        occupied
412    }
413
414    /// Returns the total number of available slots in the map in _O(1)_ time.
415    ///
416    /// NOTE: The load factor is the quotient of `len() / capacity()`.
417    pub fn capacity(&self) -> usize {
418        self.data.len()
419    }
420
421    /// Returns true if the map is either empty or contains only empty slots and deleted entries.
422    pub fn is_empty(&self) -> bool {
423        self.live == 0 || self.data.is_empty()
424    }
425
426    /// Pretty-prints the map's contents.
427    pub fn contents(&self) {
428        for (e, m) in self.data.iter().zip(self.ctrl.iter()) {
429            println!("  {m:>3}: {e:?}")
430        }
431    }
432
433    /// Takes a key as a reference and returns a Boolean indicating whether its in
434    /// the map. The expected temporal complexity is _O(1)_, as the map
435    /// maintains a [laod factor](https://www.headyimage.com/cs/dsa/maps/#collision-handling-schemes) of <=.5.
436    pub fn contains<Q>(&self, key: &Q) -> bool
437    where
438        K: std::borrow::Borrow<Q>,
439        Q: Debug + Hash + Eq + ?Sized,
440    {
441        self.find_index(key) >= 0
442    }
443
444    /// Returns a reference to the value associated with a key, if the key exists.
445    pub fn get<Q>(&self, key: &Q) -> Option<&V>
446    //pub fn get(&self, key: &K) -> Option<&V> {
447    where
448        K: std::borrow::Borrow<Q>,
449        Q: Debug + Hash + Eq + ?Sized,
450    {
451        // find_index() uses quadratic probing
452        let location = self.find_index(key);
453        if location >= 0 {
454            let value = &self.data[location as usize].as_ref().unwrap().value;
455            Some(value)
456        } else {
457            None
458        }
459    }
460
461    /// Adds entry `(k, v)`, overwriting any value `v` associated with an
462    /// existing key `k`, returns old value. If a new addition increases
463    /// the map's load factor above the designated threshhold of 0.5
464    /// the map resizes.
465    pub fn put(&mut self, key: K, value: V) -> Option<Entry<K, V>>
466//where 
467    //    K: std::default::Default,
468    {
469        // Checks if the addition will bring the load factor above threshold
470        if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
471            self.grow();
472        }
473
474        let location = self.find_index(&key);
475
476        // Creates a new Entry and inserts it
477        let entry = Entry::new(key, value);
478        let mut old_entry: Option<Entry<K, V>> = None;
479        if location >= 0 {
480            // Replace an entry
481            //println!("COLLISION!!!! {:?}", &entry.key);
482            old_entry = self.data[location as usize].take();
483            self.data[location as usize] = Some(entry);
484        } else {
485            // Add a new entry
486            self.data[-(location + 1) as usize] = Some(entry);
487            self.ctrl[-(location + 1) as usize] = 0x01;
488            self.size += 1;
489            self.live += 1;
490        };
491        old_entry
492    }
493
494    // Attempts to add entry `(k, v)` to the map, if key `k` does not
495    // already exist. If a new addition increases
496    // the map's load factor above the designated threshhold of 0.5
497    // the map resizes.
498    pub fn try_put(&mut self, key: K, value: V) {
499        // Checks if the addition will bring the load factor above threshold
500        if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
501            self.grow();
502        }
503
504        let location = self.find_index(&key);
505
506        // Avoids duplicates all together
507        if location < 0 {
508            let entry = Entry::new(key, value);
509            self.data[-(location + 1) as usize] = Some(entry);
510            self.ctrl[-(location + 1) as usize] = 0x01;
511            self.size += 1;
512            self.live += 1;
513        }
514    }
515
516    /// Takes a key, a closure, and a default value. If the key is in the map, applies the closure
517    /// to the corresponding entry's value. If the key is not in the map, creates a new entry with
518    /// the provided value.
519    ///
520    /// Example:
521    /// ```rust
522    /// use dsa_rust::associative::probing_hash_table::HashMap;
523    /// let mut count: HashMap<char, u32> = HashMap::new();
524    /// let word: &str = "Hello";
525    /// for k in word.chars() {
526    ///     count.mut_val_or(k, |x| *x += 1, 1);
527    /// }
528    /// ```
529    pub fn mut_val_or<F>(&mut self, key: K, f: F, default: V)
530    where
531        F: FnOnce(&mut V),
532        K: std::default::Default,
533    {
534        // Find the appropriate index
535        let index = self.find_index(&key);
536
537        if index >= 0 {
538            // Found existing key, mutate value
539            if let Some(Entry { value, .. }) = self.data[index as usize].as_mut() {
540                f(value);
541            }
542        } else {
543            // Create new entry
544            self.put(key, default);
545        }
546    }
547
548    /// Removes and returns an entry from the map for a given key, if it exists in the map.
549    pub fn remove<Q>(&mut self, key: &Q) -> Option<Entry<K, V>>
550    where
551        K: std::borrow::Borrow<Q>,
552        Q: Debug + Hash + Eq + ?Sized,
553    {
554        //pub fn remove(&mut self, key: K) -> Option<Entry<K, V>> {
555        let location = self.find_index(key);
556
557        let mut entry = None;
558        if location >= 0 {
559            //entry = self.data[location as usize].replace(Entry::default());
560            entry = self.data[location as usize].take();
561            self.ctrl[location as usize] = 0xBB; // Mark with tombstone
562            self.live -= 1;
563        }
564        entry
565    }
566
567    /// Consumes self and returns a new map of the same size, but without any tombstones.
568    /// Works like a cleanup operation in _O(capacity)_ time because `into_iter` checks all positions.
569    pub fn rehash(self) -> Self {
570        let mut new = Self::new_with_capacity(self.data.len());
571        for (k, v) in self.into_iter() {
572            new.put(k, v);
573        }
574        new
575    }
576
577    /// Rebuilds the map to eliminate accumulated tombstones thereby reducing
578    /// spatial bloat. This operation runs in _O(n)_ time and is intended for
579    /// long-lived maps that have undergone many deletions.
580    ///
581    /// The operation consumes the existing map and returns a new `HashMap`
582    /// with the same live entries.
583    pub fn shrink_to_fit(self) -> Self {
584        let mut new = Self::new();
585        for (k, v) in self.into_iter() {
586            new.put(k, v);
587        }
588        new
589    }
590
591    /// Returns an iterator over borrowed values. The resulting values
592    /// appear in random order.
593    ///
594    /// Example use:
595    /// ```rust
596    /// use dsa_rust::associative::probing_hash_table::HashMap;
597    /// let mut count: HashMap<char, u8> = HashMap::new();
598    /// let mut v = Vec::new();
599    /// for e in count.iter() {
600    ///     v.push(*e.0);
601    /// }
602    /// ```
603    pub fn iter(&self) -> Iter<'_, K, V> {
604        Iter {
605            iter: self.data.iter(),
606        }
607    }
608
609    /// Returns an iterator over borrowed keys only. The keys
610    /// appear in random order.
611    ///
612    /// Example use:
613    /// ```text
614    /// for e in map.keys() {
615    ///     println!("{e}");
616    /// }
617    /// ```
618    pub fn keys(&self) -> Keys<'_, K, V> {
619        Keys {
620            iter: self.data.iter(),
621        }
622    }
623
624    /// Returns an iterator over borrowed values only. The values
625    /// appear in random order.
626    ///
627    /// Example use:
628    /// ```text
629    /// for e in map.values() {
630    ///     println!("{e}");
631    /// }
632    /// ```
633    pub fn values(&self) -> Values<'_, K, V> {
634        Values {
635            iter: self.data.iter(),
636        }
637    }
638
639    // UTILITY FUNCTIONS
640    ////////////////////
641
642    /// Finds the correct insertion location using probing
643    /// Searches the map for key:
644    /// if `>= 0`, overwrite the location and return the old Entry,
645    /// if `< 0`, insert new entry at that location, return None
646    fn find_index<Q>(&self, key: &Q) -> isize
647    where
648        K: std::borrow::Borrow<Q>,
649        Q: Debug + Hash + Eq + ?Sized,
650    {
651        let mut i: usize = 1;
652        let hash = Self::hash(&key);
653        let mut current_index = self.compress(hash);
654
655        // Quadratic probing logic
656        loop {
657            match &self.data[current_index] {
658                Some(val) => {
659                    if val.key.borrow() == key {
660                        return current_index as isize;
661                    }
662                }
663                None => {
664                    if self.ctrl[current_index] != 0xBB {
665                        return -(current_index as isize + 1);
666                    }
667                }
668            }
669            current_index = (current_index + i.pow(2)) % self.data.len();
670            i += 1;
671        }
672        //while let Some(entry) = &self.data[current_index] {
673        //    if entry.key.borrow() == key && self.ctrl[current_index] != 0xBB {
674        //        return current_index as isize;
675        //    } else {
676        //        current_index = (current_index + i.pow(2)) % self.data.len();
677        //        i += 1;
678        //    }
679        //}
680        //-(current_index as isize + 1)
681    }
682
683    /// Takes a reference to a type `T` and uses Rust's default hasher to return a
684    /// 64-bit digest.
685    fn hash<T: Hash + Debug + ?Sized>(key: &T) -> usize {
686        let mut hasher = DefaultHasher::new();
687        key.hash(&mut hasher); // Hash::hash()
688        hasher.finish() as usize // Hasher::finish()
689    }
690
691    // Compresses the hash using the MAD algorithm
692    fn compress(&self, hash: usize) -> usize {
693        (hash.wrapping_mul(self.scale)).wrapping_add(self.shift) % (self.prime) % (self.data.len())
694    }
695
696    /// Grows the base (storage) vector to the next prime larger than
697    /// double the length of the original vector, rehashes and compresses
698    /// hashes for new distribution.
699    fn grow(&mut self) {
700        // Create a new base vector with_capacity() and resize_with()
701        // to ensure all indexes exist, otherwise you could push to an
702        // index that doesn't exist causing a panic.
703        // NOTE: Vec::resize_with() may result in "hidden allocation"
704        // despite description that indicates that the function resizes
705        // "in place", initializes new_base with all None values.
706        let new_capacity = hash_lib::next_prime(self.data.len() * 2);
707        let mut new_base: Vec<Option<Entry<K, V>>> = Vec::with_capacity(new_capacity);
708        new_base.resize_with(new_capacity, || None);
709        let mut new_ctrl: Vec<u8> = Vec::with_capacity(new_capacity);
710        new_ctrl.resize_with(new_capacity, || 0x00);
711
712        //println!("Growing from {} to {}", self.data.len(), new_capacity);
713
714        // Reset the instance values for MAD compression
715        // Finds a prime > len, starting much larger to ensure even spread
716        self.prime = hash_lib::next_prime(new_capacity * 2);
717        let mut rng = rand::rng(); // Thread-local RNG
718        self.scale = rng.random_range(1..=self.prime - 1);
719        self.shift = rng.random_range(0..=self.prime - 1);
720
721        // Move entries from self.data into new_base
722        // For each Some entry in old table, calculate new location
723        // via hash/compression and insert the entry into new_base[location]
724        for e in &mut self.data {
725            if let Some(v) = e.take() {
726                // MAD compression algorithm
727                let mut location: usize = ((hash_lib::hash(&v.key)).wrapping_mul(self.scale))
728                    //let mut location: usize = ((hash_lib::hash(&v.key().unwrap())).wrapping_mul(self.scale))
729                    .wrapping_add(self.shift)
730                    % (self.prime)
731                    % (new_capacity);
732
733                // Handle potential collisions for new vec insert
734                let mut i: usize = 0;
735                while new_base[location].is_some() {
736                    location = (location + i.pow(2)) % new_capacity;
737                    i += 1;
738                }
739                new_base[location] = Some(v);
740                new_ctrl[location] = 0x01;
741            }
742        }
743
744        //Update the struct instance to the new collections
745        self.data = new_base;
746        self.ctrl = new_ctrl;
747    }
748}
749
750// Returns a tuple of borrowed key:value pairs (&K, &V)
751//pub struct Iter<'a, K, V> {
752//    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
753//}
754//impl<'a, K, V> Iterator for Iter<'a, K, V>
755//    where
756//        K: Debug + Hash + PartialEq,
757//        V: PartialEq,
758//{
759//
760//    type Item = (&'a K, &'a V);
761//
762//    fn next(&mut self) -> Option<Self::Item> {
763//        for opt_entry in self.iter.by_ref() {
764//            if let Some(entry) = opt_entry {
765//                return Some((&entry.key, &entry.value));
766//            }
767//        }
768//        None
769//    }
770//}
771pub struct Iter<'a, K, V> {
772    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
773}
774
775impl<'a, K, V> Iterator for Iter<'a, K, V>
776where
777    K: Debug + Hash + PartialEq,
778    V: PartialEq,
779{
780    type Item = (&'a K, &'a V);
781
782    fn next(&mut self) -> Option<Self::Item> {
783        self.iter
784            .find_map(|opt_entry| opt_entry.as_ref().map(|entry| (&entry.key, &entry.value)))
785    }
786}
787
788// Returns a tuple of owned key:value pairs (K, V)
789pub struct IntoIter<K, V> {
790    inner: std::iter::Zip<std::vec::IntoIter<Option<Entry<K, V>>>, std::vec::IntoIter<u8>>,
791}
792impl<K, V> Iterator for IntoIter<K, V> {
793    type Item = (K, V);
794
795    fn next(&mut self) -> Option<Self::Item> {
796        for (slot, tag) in self.inner.by_ref() {
797            if tag == 1 {
798                if let Some(entry) = slot {
799                    return Some((entry.key, entry.value));
800                }
801            }
802        }
803        None
804    }
805}
806
807// Allows caller to consume HashMap as an iterator
808impl<K, V> IntoIterator for HashMap<K, V> {
809    type Item = (K, V);
810    type IntoIter = IntoIter<K, V>;
811
812    fn into_iter(self) -> Self::IntoIter {
813        IntoIter {
814            inner: self.data.into_iter().zip(self.ctrl),
815        }
816    }
817}
818
819pub struct Keys<'a, K, V> {
820    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
821}
822impl<'a, K, V> Iterator for Keys<'a, K, V> {
823    type Item = &'a K;
824
825    fn next(&mut self) -> Option<Self::Item> {
826        if let Some(entry) = self.iter.by_ref().flatten().next() {
827            return Some(&entry.key);
828        }
829        None
830    }
831}
832
833pub struct Values<'a, K, V> {
834    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
835}
836impl<'a, K, V> Iterator for Values<'a, K, V> {
837    type Item = &'a V;
838
839    fn next(&mut self) -> Option<Self::Item> {
840        if let Some(entry) = self.iter.by_ref().flatten().next() {
841            return Some(&entry.value);
842        }
843        None
844    }
845}
846
847// Unit tests
848/////////////
849
850#[test]
851// Generic type test
852fn probing_hash_table_test() {
853    //Creates a new hash map
854    let mut map = HashMap::<&str, u8>::new();
855
856    assert_eq!(map.data.len(), 2);
857    assert_eq!(map.ctrl.len(), 2);
858
859    // Illustrates that put() and get() work
860    map.put("Peter", 40);
861    assert_eq!(map.size, 1);
862    assert_eq!(map.data.len(), 5);
863    assert_eq!(map.ctrl.len(), 5);
864
865    let fetch = map.get("Peter").unwrap();
866    //assert_eq!(*fetch, 40 as u8);
867    assert_eq!(*fetch, 40);
868
869    // Illustrates that the map grows correctly
870    map.put("Brain", 39); // Grows the map
871    assert_eq!(map.data.len(), 5);
872    map.put("Remus", 22);
873    map.put("Bobson", 36); // Grows the map
874    assert_eq!(map.data.len(), 11);
875    map.put("Dingus", 18);
876    map.put("Dangus", 27); // Grows the map
877    assert_eq!(map.size, 6);
878    assert_eq!(map.data.len(), 23);
879
880    // Illustrates that contains() works as intended
881    assert!(map.contains("Dingus"));
882
883    // Illustrates that put() returns old values and
884    // overwrites existing values upon collision...
885    let collision = map.put("Peter", 41).unwrap();
886    assert_eq!(collision.value, 40_u8);
887    //assert_eq!(*collision.value().unwrap(), 40_u8);
888    let new_val = map.get("Peter").unwrap();
889    //assert_eq!(*new_val, 41 as u8);
890    assert_eq!(*new_val, 41);
891    // Without increasing the list size
892    assert_eq!(map.size, 6);
893
894    // Illustrates that removes entries by key and returns the value
895    assert!(map.contains("Dangus"));
896    let removed = map.remove(&"Dangus").unwrap();
897    assert_eq!(
898        removed,
899        Entry {
900            key: "Dangus",
901            value: 27
902        }
903    );
904    assert!(!map.contains("Dangus"));
905}
906
907#[test]
908// Tests the custom update value function
909fn mut_val_test() {
910    //Creates a new hash map
911    let mut count = HashMap::<char, u8>::new();
912    let phrase: &str = "Hello, sickos";
913
914    // Seeds the map with just the characters and a default value
915    for char in phrase.chars() {
916        count.put(char, 0);
917    }
918
919    eprintln!("\nInitial state:");
920    count.contents();
921
922    // Iterates through the map again, updating each value based on its occurence
923    // NOTE: Can also be used as initial mapping operation with the same code,
924    // but was split here for illustrative purposes
925    for char in phrase.chars() {
926        count.mut_val_or(char, |x| *x += 1, 1);
927    }
928
929    // Pretty-prints the contents of the map
930    eprintln!("\nModified:");
931    count.contents();
932
933    // Uncomment to trigger debug print
934    //assert!(count.is_empty());
935}
936
937#[test]
938// Tests that the structure is iterable
939fn iter_test() {
940    //Creates a new hash map of the frequence letters in the given phrase
941    let mut map = HashMap::<char, u8>::new();
942    let phrase: &str = "Hello, sickos";
943    for char in phrase.chars() {
944        map.mut_val_or(char, |x| *x += 1, 1);
945    }
946
947    eprintln!("\nInitial map state:");
948    map.contents();
949
950    // Iterates through the map, pushing entries
951    // to a Vec as tuples returned from the iterator
952    eprintln!("\nCompact map: ");
953    let mut vec = Vec::new();
954    for e in map.iter() {
955        vec.push(*e.0);
956        eprintln!("{e:?}")
957    }
958
959    eprintln!("\nFinal map state:");
960    map.contents();
961
962    eprintln!("\nFinal vec state:");
963    for e in vec.iter() {
964        eprintln!("{:?}", *e)
965    }
966
967    // Proves that all entries are still valid
968    // Vec contains only keys
969    assert!(vec.contains(&'s'));
970    assert!(vec.contains(&'i'));
971    assert!(vec.contains(&'c'));
972    assert!(vec.contains(&'k'));
973    assert!(vec.contains(&'o'));
974
975    // Map contains whole entries
976    assert!(map.contains(&'s'));
977    assert!(map.contains(&'i'));
978    assert!(map.contains(&'c'));
979    assert!(map.contains(&'k'));
980    assert!(map.contains(&'o'));
981
982    eprintln!("\nPrinting just the keys:");
983    for key in map.keys() {
984        eprintln!("{key}");
985    }
986
987    eprintln!("\nPrinting just the values:");
988    for value in map.values() {
989        eprintln!("{value}");
990    }
991
992    // Consume the map and transfer it to a new map
993    let mut new_map = HashMap::<char, u8>::new();
994    for e in map.into_iter() {
995        new_map.put(e.0, e.1);
996    }
997    // Illegal, map is moved/consumed!
998    //assert!(map.is_empty());
999
1000    // But new_map contains data now
1001    assert!(new_map.contains(&'s'));
1002    assert!(new_map.contains(&'i'));
1003    assert!(new_map.contains(&'c'));
1004    assert!(new_map.contains(&'k'));
1005    assert!(new_map.contains(&'o'));
1006
1007    // Uncomment to trigger debug print
1008    //assert!(map.is_empty());
1009}
1010
1011#[test]
1012// Tests that into_iter consumes the map,
1013// and that rehash and shrink_to_fit delete removed entries
1014fn rehash_test() {
1015    //Creates a new hash map
1016    let mut map = HashMap::<char, u8>::new();
1017    let word: &str = "Hello, sickos!";
1018    for k in word.chars() {
1019        map.mut_val_or(k, |x| *x += 1, 1);
1020    }
1021
1022    // Prints initial state of the map
1023    // and illustrates that values exist
1024    eprintln!("\nInitial state: ");
1025    map.contents();
1026    assert!(map.contains(&','));
1027    assert!(map.contains(&' '));
1028    assert!(map.contains(&'!'));
1029    assert!(map.contains(&'s'));
1030    assert_eq!(map.len(), 11);
1031    assert_eq!(map.data.len(), 23);
1032
1033    // Removes the values
1034    map.remove(&',');
1035    map.remove(&' ');
1036    map.remove(&'!');
1037    map.remove(&'s');
1038    map.remove(&'i');
1039    map.remove(&'c');
1040    map.remove(&'k');
1041
1042    // Checks that the values are removed
1043    eprintln!("\nModified: ");
1044    map.contents();
1045    assert!(!map.contains(&','));
1046    assert!(!map.contains(&' '));
1047    assert!(!map.contains(&'!'));
1048    assert!(!map.contains(&'s'));
1049    assert_eq!(map.len(), 4);
1050    assert_eq!(map.data.len(), 23);
1051
1052    // Rehashes to get rid of deleted items
1053    // map goes out of scope here
1054    let new_map = map.rehash();
1055
1056    // Illegal operation! into_iter has consumed map!
1057    // assert!(map.contains(&'H'));
1058
1059    // Checks that the values are removed
1060    // but the capacity remains the same as before
1061    eprintln!("\nRehash: ");
1062    new_map.contents();
1063    assert!(!new_map.contains(&','));
1064    assert!(!new_map.contains(&' '));
1065    assert!(!new_map.contains(&'!'));
1066    assert!(!new_map.contains(&'s'));
1067    assert_eq!(new_map.len(), 4);
1068    assert_eq!(new_map.data.len(), 23);
1069
1070    // Repeats the process for shrink_to_fit
1071    //
1072    //Creates a new hash map
1073    let mut map = HashMap::<char, u8>::new();
1074    let word: &str = "Hello, sickos!";
1075    for k in word.chars() {
1076        map.mut_val_or(k, |x| *x += 1, 1);
1077    }
1078
1079    // Prints initial state of the map
1080    // and illustrates that values exist
1081    eprintln!("\nInitial state: ");
1082    map.contents();
1083    assert!(map.contains(&','));
1084    assert!(map.contains(&' '));
1085    assert!(map.contains(&'!'));
1086    assert!(map.contains(&'s'));
1087    assert_eq!(map.len(), 11);
1088    assert_eq!(map.data.len(), 23);
1089
1090    // Removes the values
1091    map.remove(&',');
1092    map.remove(&' ');
1093    map.remove(&'!');
1094    map.remove(&'s');
1095    map.remove(&'i');
1096    map.remove(&'c');
1097    map.remove(&'k');
1098
1099    // Checks that the values are removed
1100    eprintln!("\nModified: ");
1101    map.contents();
1102    assert!(!map.contains(&','));
1103    assert!(!map.contains(&' '));
1104    assert!(!map.contains(&'!'));
1105    assert!(!map.contains(&'s'));
1106    assert_eq!(map.len(), 4);
1107    assert_eq!(map.data.len(), 23);
1108
1109    // Rehashes to get rid of deleted items
1110    // map goes out of scope here
1111    let new_map = map.shrink_to_fit();
1112
1113    // Checks that the values are removed
1114    // but the capacity remains the same as before
1115    eprintln!("\nRehash (shrink to fit): ");
1116    new_map.contents();
1117    assert!(!new_map.contains(&','));
1118    assert!(!new_map.contains(&' '));
1119    assert!(!new_map.contains(&'!'));
1120    assert!(!new_map.contains(&'s'));
1121    assert_eq!(new_map.len(), 4);
1122    assert_eq!(new_map.data.len(), 11);
1123
1124    // Uncomment to trigger debug print
1125    //panic!("MANUAL TEST FAILURE");
1126}
1127
1128// TODO: This belongs in either an external runner or in an integration test module
1129pub fn example() {
1130    //Creates a new hash map
1131    let mut map = HashMap::<&str, u8>::new();
1132
1133    let s = format!(
1134        "Map stats: size: {}, capacity: {}, active entries: {}",
1135        map.size,
1136        map.data.len(),
1137        map.live
1138    );
1139    let l = "=".repeat(s.len());
1140    println!("{l}\n{s}");
1141
1142    // Puts some entries into the map
1143    println!("Building the map...");
1144    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
1145    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
1146    for (k, v) in names.iter().zip(values.into_iter()) {
1147        map.put(k, v);
1148    }
1149
1150    // Checks that the map contains what we'd expect
1151    if !map.contains("Peter") {
1152        panic!()
1153    };
1154    let val = map.get("Peter").unwrap();
1155    println!("Peter is {val}");
1156
1157    // Replaces a value for a given key and
1158    // checks that the new value took
1159    let new = 41;
1160    let old = map.put("Peter", new).unwrap().value;
1161    println!("[Peter's age increased from {old} to {new}]");
1162    let val = map.get("Peter").unwrap();
1163    println!("Uhhhhh, I meant Peter is {val}");
1164
1165    // Shows the map and its data
1166    println!(
1167        "\nMap stats: size: {}, capacity: {}, active entries: {}",
1168        map.size,
1169        map.data.len(),
1170        map.live
1171    );
1172    for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1173        println!("\t{m:>3}: {e:?}")
1174    }
1175
1176    // Illustrates removing entries
1177    println!("\nThere can be only one!");
1178    names.remove(0);
1179    for e in names {
1180        println!("Remove: {}", map.remove(&e).unwrap().key);
1181    }
1182
1183    // The final result
1184    let s = format!(
1185        "\nMap stats: size: {}, capacity: {}, active entries: {}",
1186        map.size,
1187        map.data.len(),
1188        map.live
1189    );
1190    println!("{s}");
1191    for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1192        println!("\t{m:>3}: {e:?}")
1193    }
1194
1195    let mut count: HashMap<char, u8> = HashMap::new();
1196    let word: &str = "The quick brown fox jumps over the lazy dog";
1197    for k in word.chars() {
1198        count.mut_val_or(k, |x| *x += 1, 1);
1199    }
1200    println!(
1201        "\n\nString: {word:?}
1202\nNOTE: There are 28 entries because T != t, and the space literal ' ' is included.
1203\nMap stats: size: {}, capacity: {}, active entries: {}",
1204        count.occupied(), // Total number of elements
1205        count.capacity(),
1206        count.len() // Active entries
1207    );
1208    for (e, m) in count.data.iter().zip(count.ctrl.iter()) {
1209        println!("\t{m:>3}: {e:?}")
1210    }
1211
1212    let l = "=".repeat(s.len());
1213    println!("\n{l}");
1214}