dsa_rust/associative/
probing_hash_table.rs

1/*! Safe, open addressing hash table with MAD compression and quadratic probing
2
3# About
4This hash map implementation uses open-addressing to take advantage of cache locality, and relies on a "multiply, add, and divide" (MAD) compression algorithm with quadratic probing to handle collisions.
5
6This structure provides the basis for several other structures in this library, including the [PriorityQueue<K, P>](crate::composite::priority_queue) and [HashSet\<T>](crate::associative::hash_set) implementations.
7
8# Design
9This structure uses Rust's [default hasher](std::hash::DefaultHasher) to hash keys. The default hasher currently represents a cryptographically strong [SipHash-1-3](https://en.wikipedia.org/wiki/SipHash) design as of the publication date (9/2025). In practice, hashing small key objects <= 128 bytes provide similar performance as ULID generation, but hashing arbitrarily large key objects may incur performance penalties.
10
11The primary [HashMap] struct uses a private byte mask (as a "control" list) to track positions with valid ("live"), empty, and removed ("tombstone") entries. Using a conventional free list and overwriting data for a marked slot would break the quadratic probing logic.
12
13```text
14+-------------------------------------------------+
15| ctrl | Option<Entry<K, V>>                      |
16|------+------------------------------------------|
17|    0 | None                                     |
18|    0 | None                                     |
19|    0 | None                                     |
20|    0 | None                                     |
21|    0 | None                                     |
22|    1 | Some(Entry { key: "Dingus", value: 18 }) |
23|    1 | Some(Entry { key: "Brain", value: 37 })  |
24|    0 | None                                     |
25|    0 | None                                     |
26|    0 | None                                     |
27|    0 | None                                     |
28|    0 | None                                     |
29|    0 | None                                     |
30|    0 | None                                     |
31|    1 | Some(Entry { key: "Remus", value: 22 })  |
32|  187 | None                                     |
33|    0 | None                                     |
34|    1 | Some(Entry { key: "Bobson", value: 36 }) |
35|    0 | None                                     |
36|    0 | None                                     |
37|    0 | None                                     |
38|  187 | None                                     |
39+-------------------------------------------------+
40```
41
42## Insertion
43The structure contains two primary ways to insert entries into the map. 
44
45The [put()](crate::associative::probing_hash_table::HashMap::put) operation overwrites values for existing keys, but does not mutate the existing key. This keeps with the standard library's implementation where two keys can be `==` without being identical. 
46
47```rust
48    use dsa_rust::associative::probing_hash_table::HashMap;
49
50    // Build a map with put()
51    let mut map = HashMap::<&str, u8>::new();
52    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
53    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
54    for (k, v) in names.iter().zip(values.into_iter()) {
55        map.put(k, v);
56    }
57```
58
59The structure also provides a [mut_val_or()](crate::associative::probing_hash_table::HashMap::mut_val_or) operation for mutating the value(s) of the map or inserting a default value. This is useful with the structures iterators to make linear passes over source data during mapping.
60
61
62```rust
63    use dsa_rust::associative::probing_hash_table::HashMap;
64
65    // Build a map with mut_val_or()
66    let mut map = HashMap::<char, u8>::new();
67    let word: &str = "Hello, sickos!";
68    for k in word.chars() {
69        map.mut_val_or(k, |x| *x += 1, 1);
70    }
71
72```
73
74## Removal & Rehashing
75The structure contains a [remove()](crate::associative::probing_hash_table::HashMap::remove) operation that returns an owned entry if it exists in the map. The `remove()` operation marks the index with a tombstone, and uses [std::mem::take] to remove and return the value. Even though `take()` swaps out a `None` variant the index remains unusable until the calling code invokes a rehash operation. This is done in order to retain probing logic. For long-lived maps with many removals/tombstones, the structure provides two re-hashing operations. The [rehash()](crate::associative::probing_hash_table::HashMap::rehash) operation rehashes the entire map (freeing up tombstones) _at the current underlying structure's capacity_. This essentially allows you to permanently remove deleted entries and reduce the map's load factor. Conversely, the [shrink_to_fit()](crate::associative::probing_hash_table::HashMap::shrink_to_fit) operation rehashes the map (freeing up tombstones) down to the minimum backing capacity that still provides a load factor of _<= .5_. Both processes necessarily require _O(n)_ time where _n_ is the number of live entries in the map.
76
77Example of a map that has had many entries removed:
78
79```text
80+-------------------------------------------------+
81| ctrl | Option<Entry<K, V>>                      |
82|------+------------------------------------------|
83|    1 | Some(Entry { key: 'o', value: 2 })       |
84|  187 | None                                     |
85|    0 | None                                     |
86|  187 | None                                     |
87|    1 | Some(Entry { key: 'l', value: 2 })       |
88|  187 | None                                     |
89|    0 | None                                     |
90|    0 | None                                     |
91|    0 | None                                     |
92|  187 | None                                     |
93|  187 | None                                     |
94|    0 | None                                     |
95|    0 | None                                     |
96|    0 | None                                     |
97|    0 | None                                     |
98|    1 | Some(Entry { key: 'H', value: 1 })       |
99|    0 | None                                     |
100|  187 | None                                     |
101|    0 | None                                     |
102|    0 | None                                     |
103|    1 | Some(Entry { key: 'e', value: 1 })       |
104|  187 | None                                     |
105|    0 | None                                     |
106+-------------------------------------------------+
107```
108
109This example illustrates how the map might look after a `rehash()`:
110
111```text
112+-------------------------------------------------+
113| ctrl | Option<Entry<K, V>>                      |
114|------+------------------------------------------|
115|    1 | Some(Entry { key: 'H', value: 1 })       |
116|    1 | Some(Entry { key: 'o', value: 2 })       |
117|    0 | None                                     |
118|    0 | None                                     |
119|    0 | None                                     |
120|    0 | None                                     |
121|    0 | None                                     |
122|    0 | None                                     |
123|    0 | None                                     |
124|    0 | None                                     |
125|    0 | None                                     |
126|    0 | None                                     |
127|    1 | Some(Entry { key: 'l', value: 2 })       |
128|    0 | None                                     |
129|    0 | None                                     |
130|    0 | None                                     |
131|    0 | None                                     |
132|    0 | None                                     |
133|    0 | None                                     |
134|    0 | None                                     |
135|    1 | Some(Entry { key: 'e', value: 1 })       |
136|    0 | None                                     |
137|    0 | None                                     |
138+-------------------------------------------------+
139```
140
141This example illustrates how the map might look after a `shrink_to_fit()`:
142
143```text
144+-------------------------------------------------+
145| ctrl | Option<Entry<K, V>>                      |
146|------+------------------------------------------|
147|    1 | Some(Entry { key: 'H', value: 1 })       |
148|    0 | None                                     |
149|    0 | None                                     |
150|    1 | Some(Entry { key: 'l', value: 2 })       |
151|    1 | Some(Entry { key: 'o', value: 2 })       |
152|    0 | None                                     |
153|    0 | None                                     |
154|    1 | Some(Entry { key: 'e', value: 1 })       |
155|    0 | None                                     |
156|    0 | None                                     |
157|    0 | None                                     |
158+-------------------------------------------------+
159```
160
161# Example
162
163```rust
164
165    use dsa_rust::associative::probing_hash_table::HashMap;
166
167    //Creates a new hash map with str keys and u8 values
168    let mut map = HashMap::<&str, u8>::new();
169
170    println!("Map stats: size: {}, capacity: {}, active entries: {}",
171        map.occupied(),
172        map.capacity(),
173        map.len()
174    );
175
176    // Puts some entries into the map
177    println!("Building the map...");
178    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
179    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
180    for (k, v) in names.iter().zip(values.into_iter()) {
181        map.put(k, v);
182    }
183
184    // Checks that the map contains what we'd expect
185    if map.contains("Peter") == false {
186        panic!()
187    };
188    let val = map.get("Peter").unwrap();
189    println!("Peter is {val}");
190
191    // Replaces a value for a given key and
192    // checks that the new value took
193    let new = 41;
194
195    //let old = map.put("Peter", new).unwrap().value;
196    let old = map.get("Peter").unwrap().clone();
197    map.put("Peter", new);
198
199    println!("[Peter's age increased from {old} to {new}]");
200    let val = map.get("Peter").unwrap();
201    println!("Uhhhhh, I meant Peter is {val}");
202
203    // Shows the map and its data
204    println!(
205        "\nMap stats: size: {}, capacity: {}, active entries: {}",
206        map.occupied(),
207        map.capacity(),
208        map.len()
209    );
210    map.contents();
211
212    // Illustrates removing entries
213    println!("\nThere can be only one!");
214    names.remove(0);
215    for e in names {
216        let removed = map.remove(&e);
217        if let Some(entry) = removed {
218            println!("Remove: {}", entry.key());
219        }
220    }
221
222    // The final result
223    println!("\nMap stats: size: {}, capacity: {}, active entries: {}",
224        map.occupied(),
225        map.capacity(),
226        map.len()
227    );
228    map.contents();
229```
230
231*/
232
233use crate::associative::hash_lib;
234//use crate::maps::traits;
235use rand::Rng;
236use std::collections::hash_map::DefaultHasher;
237use std::fmt::Debug;
238use std::hash::{Hash, Hasher};
239
240#[derive(Debug, PartialEq)]
241// Contains the actual key:value pair
242pub struct Entry<K, V> {
243    key: K,
244    value: V,
245}
246impl<K, V> Entry<K, V>
247where
248    K: Debug + Hash + PartialEq,
249    V: Debug,
250{
251    /// Constructs a new Entry
252    fn new(key: K, value: V) -> Entry<K, V> {
253        Entry { key, value }
254    }
255
256    /// Returns the key from an Entry
257    pub fn key(&self) -> &K {
258        &self.key
259    }
260
261    /// Returns the value from an Entry
262    pub fn value(&self) -> &V {
263        &self.value
264    }
265}
266impl<K, V> Default for Entry<K, V>
267where
268    K: Default + Debug + Hash + PartialEq,
269    V: Debug + Default,
270{
271    fn default() -> Self {
272        Self {
273            key: K::default(),
274            value: V::default(),
275        }
276    }
277}
278
279// Byte masks were deemed more performant, but this is valuable historic pattern
280//#[derive(Debug, PartialEq)]
281//pub enum Entry<K, V> {
282//    Live { key: K, value: V },
283//    Tombstone,
284//}
285//impl<K, V> Entry<K, V>
286//where
287//    K: Debug + Hash + PartialEq,
288//    V: PartialEq,
289//{
290//    /// Constructs a new Entry
291//    fn new(key: K, value: V) -> Entry<K, V> {
292//        Entry::Live { key, value }
293//    }
294//
295//    /// Returns the key from an Entry
296//    pub fn key(&self) -> Option<&K> {
297//        match self {
298//            Entry::Live { key, .. } => Some(key),
299//            Entry::Tombstone => None,
300//        }
301//    }
302//
303//    /// Returns the value from an Entry
304//    pub fn value(&self) -> Option<&V> {
305//        match self {
306//            Entry::Live { value, .. } => Some(value),
307//            Entry::Tombstone => None,
308//        }
309//    }
310//
311//    /// Entry adapter that takes a closure to mutate the value of an entry, if it exists.
312//    pub fn mut_val<F>(&mut self, f: F)
313//    where
314//        F: FnOnce(&mut V),
315//    {
316//        if let Entry::Live { value, .. } = self {
317//            f(value);
318//        } else {
319//            // No-op: Entry is a Tombstone and that would break the probing logic.
320//        }
321//    }
322//}
323
324#[derive(Debug)]
325pub struct HashMap<K, V> {
326    pub data: Vec<Option<Entry<K, V>>>, // The primary memory backing
327    pub ctrl: Vec<u8>,                  // A byte mask to identify available positions
328    size: usize,                     // The total number of entries in the map (live + deleted)
329    live: usize,                    // The number of "live" entries in the map
330                 
331    // NOTE: Prime, scale, and shift are used by the MAD compression algorithm.
332    // These randomly-generated values must remain static for the
333    // lifetime of the backing structure; The grow() operation changes these values
334    prime: usize,
335    scale: usize,
336    shift: usize,
337}
338impl<K, V> Default for HashMap<K, V>
339where
340    K: Debug + Default + Eq + Hash + PartialEq,
341    V: Default + PartialEq + std::fmt::Debug,
342{
343    fn default() -> Self {
344        Self::new()
345    }
346}
347impl<K, V> HashMap<K, V>
348where
349    K: Debug + Eq + Hash + PartialEq,
350    V: std::fmt::Debug,
351{
352    /// Constructor for an empty map with a default capacity of 2.
353    pub fn new() -> Self {
354        let new_capacity = 2;
355        let mut table = Self {
356            data: Vec::with_capacity(new_capacity),
357            ctrl: Vec::with_capacity(new_capacity),
358            prime: 13,
359            scale: 5,
360            shift: 7,
361            size: 0,
362            live: 0,
363        };
364        // Initialize storage to ensure access
365        table.data.resize_with(new_capacity, || None);
366        table.ctrl.resize_with(new_capacity, || 0x00);
367        table
368    }
369
370    /// Constructor for an empty map with a given capacity.
371    pub fn new_with_capacity(size: usize) -> Self {
372        let mut table = Self {
373            data: Vec::with_capacity(size),
374            ctrl: Vec::with_capacity(size),
375            prime: 13,
376            scale: 5,
377            shift: 7,
378            size: 0,
379            live: 0,
380        };
381        // Initialize storage to ensure access
382        table.data.resize_with(size, || None);
383        table.ctrl.resize_with(size, || 0x00);
384        table
385    }
386
387    /// Returns the number of active entries in the map in _O(1)_ time.
388    pub fn len(&self) -> usize {
389        self.live
390    }
391
392    /// Returns the total number of entries in the map (active + deleted entries) in _O(n)_ time.
393    pub fn occupied(&self) -> usize {
394        let mut occupied = 0;
395        for e in self.ctrl.iter() {
396            if *e == 1 || *e == 187 { occupied += 1 }
397        }
398        occupied
399    }
400
401    /// Returns the total number of _deleted_ entries in the map in _O(n)_ time.
402    pub fn deleted(&self) -> usize {
403        let mut occupied = 0;
404        for e in self.ctrl.iter() {
405            if *e == 187 { occupied += 1 }
406        }
407        occupied
408    }
409
410    /// Returns the total number of available slots in the map in _O(1)_ time. 
411    ///
412    /// NOTE: The load factor is the quotient of `len() / capacity()`.
413    pub fn capacity(&self) -> usize {
414        self.data.len()
415    }
416
417    /// Returns true if the map is either empty or contains only empty slots and deleted entries.
418    pub fn is_empty(&self) -> bool {
419        self.live == 0 || self.data.is_empty()
420    }
421
422    /// Pretty-prints the map's contents.
423    pub fn contents(&self) {
424        for (e, m) in self.data.iter().zip(self.ctrl.iter()) {
425            println!("  {m:>3}: {e:?}")
426        }
427    }
428
429    /// Takes a key as a reference and returns a Boolean indicating whether its in
430    /// the map. The expected temporal complexity is _O(1)_, as the map
431    /// maintains a [laod factor](https://www.headyimage.com/cs/dsa/maps/#collision-handling-schemes) of <=.5.
432    pub fn contains<Q>(&self, key: &Q) -> bool
433    where
434        K: std::borrow::Borrow<Q>,
435        Q: Debug + Hash + Eq + ?Sized,
436    {
437        self.find_index(key) >= 0
438    }
439
440    /// Returns a reference to the value associated with a key, if the key exists.
441    pub fn get<Q>(&self, key: &Q) -> Option<&V>
442    //pub fn get(&self, key: &K) -> Option<&V> {
443    where
444        K: std::borrow::Borrow<Q>,
445        Q: Debug + Hash + Eq + ?Sized,
446    {
447        // find_index() uses quadratic probing
448        let location = self.find_index(key);
449        if location >= 0 {
450            let value = &self.data[location as usize].as_ref().unwrap().value;
451            Some(value)
452        } else {
453            None
454        }
455    }
456
457    /// Adds entry `(k, v)`, overwriting any value `v` associated with an
458    /// existing key `k`, returns old value. If a new addition increases
459    /// the map's load factor above the designated threshhold of 0.5
460    /// the map resizes.
461    pub fn put(&mut self, key: K, value: V) -> Option<Entry<K, V>> 
462    //where 
463    //    K: std::default::Default,
464    {
465        // Checks if the addition will bring the load factor above threshold
466        if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
467            self.grow();
468        }
469
470        let location = self.find_index(&key);
471
472        // Creates a new Entry and inserts it
473        let entry = Entry::new(key, value);
474        let mut old_entry: Option<Entry<K, V>> = None;
475        if location >= 0 {
476            // Replace an entry
477            //println!("COLLISION!!!! {:?}", &entry.key);
478            old_entry = self.data[location as usize].take();
479            self.data[location as usize] = Some(entry);
480        } else {
481            // Add a new entry
482            self.data[-(location + 1) as usize] = Some(entry);
483            self.ctrl[-(location + 1) as usize] = 0x01;
484            self.size += 1;
485            self.live += 1;
486        };
487        old_entry
488    }
489
490    // Attempts to add entry `(k, v)` to the map, if key `k` does not
491    // already exist. If a new addition increases
492    // the map's load factor above the designated threshhold of 0.5
493    // the map resizes.
494    pub fn try_put(&mut self, key: K, value: V) {
495        // Checks if the addition will bring the load factor above threshold
496        if ((self.size) as f64 + 1.0) / self.data.len() as f64 >= 0.5 {
497            self.grow();
498        }
499    
500        let location = self.find_index(&key);
501    
502        // Avoids duplicates all together
503        if location < 0 {
504            let entry = Entry::new(key, value);
505            self.data[-(location + 1) as usize] = Some(entry);
506            self.ctrl[-(location + 1) as usize] = 0x01;
507            self.size += 1;
508            self.live += 1;
509        }
510    }
511
512    /// Takes a key, a closure, and a default value. If the key is in the map, applies the closure
513    /// to the corresponding entry's value. If the key is not in the map, creates a new entry with
514    /// the provided value.
515    ///
516    /// Example:
517    /// ```rust
518    /// use dsa_rust::associative::probing_hash_table::HashMap;
519    /// let mut count: HashMap<char, u32> = HashMap::new();
520    /// let word: &str = "Hello";
521    /// for k in word.chars() {
522    ///     count.mut_val_or(k, |x| *x += 1, 1);
523    /// }
524    /// ```
525    pub fn mut_val_or<F>(&mut self, key: K, f: F, default: V)
526    where
527        F: FnOnce(&mut V),
528        K: std::default::Default,
529    {
530        // Find the appropriate index
531        let index = self.find_index(&key);
532
533        if index >= 0 {
534            // Found existing key, mutate value
535            if let Some(Entry { value, .. }) = self.data[index as usize].as_mut() {
536                f(value);
537            }
538        } else {
539            // Create new entry
540            self.put(key, default);
541        }
542    }
543
544    /// Removes and returns an entry from the map for a given key, if it exists in the map.
545    pub fn remove<Q>(&mut self, key: &Q) -> Option<Entry<K, V>>
546    where
547        K: std::borrow::Borrow<Q>,
548        Q: Debug + Hash + Eq + ?Sized,
549    {
550    //pub fn remove(&mut self, key: K) -> Option<Entry<K, V>> {
551        let location = self.find_index(key);
552
553        let mut entry = None;
554        if location >= 0 {
555            //entry = self.data[location as usize].replace(Entry::default());
556            entry = self.data[location as usize].take();
557            self.ctrl[location as usize] = 0xBB; // Mark with tombstone
558            self.live -= 1;
559        }
560        entry
561    }
562
563    /// Consumes self and returns a new map of the same size, but without any tombstones.
564    /// Works like a cleanup operation in _O(capacity)_ time because `into_iter` checks all positions.
565    pub fn rehash(self) -> Self {
566        let mut new = Self::new_with_capacity(self.data.len());
567        for (k, v) in self.into_iter() {
568            new.put(k, v);
569        }
570        new
571    }
572
573    /// Rebuilds the map to eliminate accumulated tombstones thereby reducing
574    /// spatial bloat. This operation runs in _O(n)_ time and is intended for
575    /// long-lived maps that have undergone many deletions.
576    ///
577    /// The operation consumes the existing map and returns a new `HashMap`
578    /// with the same live entries.
579    pub fn shrink_to_fit(self) -> Self {
580        let mut new = Self::new();
581        for (k, v) in self.into_iter() {
582            new.put(k, v);
583        }
584        new
585    }
586
587    /// Returns an iterator over borrowed values. The resulting values
588    /// appear in random order.
589    ///
590    /// Example use:
591    /// ```rust
592    /// use dsa_rust::associative::probing_hash_table::HashMap;
593    /// let mut count: HashMap<char, u8> = HashMap::new();
594    /// let mut v = Vec::new();
595    /// for e in count.iter() {
596    ///     v.push(*e.0);
597    /// }
598    /// ```
599    pub fn iter(&self) -> Iter<'_, K, V> {
600        Iter {
601            iter: self.data.iter(),
602        }
603    }
604
605    /// Returns an iterator over borrowed keys only. The keys
606    /// appear in random order.
607    ///
608    /// Example use:
609    /// ```text
610    /// for e in map.keys() {
611    ///     println!("{e}");
612    /// }
613    /// ```
614    pub fn keys(&self) -> Keys<'_, K, V> {
615        Keys {
616            iter: self.data.iter(),
617        }
618    }
619
620    /// Returns an iterator over borrowed values only. The values
621    /// appear in random order.
622    ///
623    /// Example use:
624    /// ```text
625    /// for e in map.values() {
626    ///     println!("{e}");
627    /// }
628    /// ```
629    pub fn values(&self) -> Values<'_, K, V> {
630        Values {
631            iter: self.data.iter(),
632        }
633    }
634
635    // UTILITY FUNCTIONS
636    ////////////////////
637
638    /// Finds the correct insertion location using probing
639    /// Searches the map for key:
640    /// if `>= 0`, overwrite the location and return the old Entry,
641    /// if `< 0`, insert new entry at that location, return None
642    fn find_index<Q>(&self, key: &Q) -> isize
643    where
644        K: std::borrow::Borrow<Q>,
645        Q: Debug + Hash + Eq + ?Sized,
646    {
647        let mut i: usize = 1;
648        let hash = Self::hash(&key);
649        let mut current_index = self.compress(hash);
650    
651        // Quadratic probing logic
652        loop {
653            match &self.data[current_index] {
654                Some(val) => {
655                    if val.key.borrow() == key {
656                        return current_index as isize;
657                    }
658                },
659                None => {
660                    if self.ctrl[current_index] != 0xBB {
661                        return -(current_index as isize + 1)
662                    } 
663                }
664            }
665            current_index = (current_index + i.pow(2)) % self.data.len();
666            i += 1;
667        }
668        //while let Some(entry) = &self.data[current_index] {
669        //    if entry.key.borrow() == key && self.ctrl[current_index] != 0xBB {
670        //        return current_index as isize;
671        //    } else {
672        //        current_index = (current_index + i.pow(2)) % self.data.len();
673        //        i += 1;
674        //    }
675        //}
676        //-(current_index as isize + 1)
677    }
678
679    /// Takes a reference to a type `T` and uses Rust's default hasher to return a
680    /// 64-bit digest.
681    fn hash<T: Hash + Debug + ?Sized>(key: &T) -> usize {
682        let mut hasher = DefaultHasher::new();
683        key.hash(&mut hasher); // Hash::hash()
684        hasher.finish() as usize // Hasher::finish()
685    }
686
687    // Compresses the hash using the MAD algorithm
688    fn compress(&self, hash: usize) -> usize {
689        (hash.wrapping_mul(self.scale)).wrapping_add(self.shift) % (self.prime) % (self.data.len())
690    }
691
692    /// Grows the base (storage) vector to the next prime larger than
693    /// double the length of the original vector, rehashes and compresses
694    /// hashes for new distribution.
695    fn grow(&mut self) {
696        // Create a new base vector with_capacity() and resize_with()
697        // to ensure all indexes exist, otherwise you could push to an
698        // index that doesn't exist causing a panic.
699        // NOTE: Vec::resize_with() may result in "hidden allocation"
700        // despite description that indicates that the function resizes
701        // "in place", initializes new_base with all None values.
702        let new_capacity = hash_lib::next_prime(self.data.len() * 2);
703        let mut new_base: Vec<Option<Entry<K, V>>> = Vec::with_capacity(new_capacity);
704        new_base.resize_with(new_capacity, || None);
705        let mut new_ctrl: Vec<u8> = Vec::with_capacity(new_capacity);
706        new_ctrl.resize_with(new_capacity, || 0x00);
707
708        //println!("Growing from {} to {}", self.data.len(), new_capacity);
709
710        // Reset the instance values for MAD compression
711        // Finds a prime > len, starting much larger to ensure even spread
712        self.prime = hash_lib::next_prime(new_capacity * 2);
713        let mut rng = rand::rng(); // Thread-local RNG
714        self.scale = rng.random_range(1..=self.prime - 1);
715        self.shift = rng.random_range(0..=self.prime - 1);
716
717        // Move entries from self.data into new_base
718        // For each Some entry in old table, calculate new location
719        // via hash/compression and insert the entry into new_base[location]
720        for e in &mut self.data {
721            if let Some(v) = e.take() {
722                // MAD compression algorithm
723                let mut location: usize = ((hash_lib::hash(&v.key)).wrapping_mul(self.scale))
724                    //let mut location: usize = ((hash_lib::hash(&v.key().unwrap())).wrapping_mul(self.scale))
725                    .wrapping_add(self.shift)
726                    % (self.prime)
727                    % (new_capacity);
728
729                // Handle potential collisions for new vec insert
730                let mut i: usize = 0;
731                while new_base[location].is_some() {
732                    location = (location + i.pow(2)) % new_capacity;
733                    i += 1;
734                }
735                new_base[location] = Some(v);
736                new_ctrl[location] = 0x01;
737            }
738        }
739
740        //Update the struct instance to the new collections
741        self.data = new_base;
742        self.ctrl = new_ctrl;
743    }
744}
745
746// Returns a tuple of borrowed key:value pairs (&K, &V)
747//pub struct Iter<'a, K, V> {
748//    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
749//}
750//impl<'a, K, V> Iterator for Iter<'a, K, V>
751//    where
752//        K: Debug + Hash + PartialEq,
753//        V: PartialEq,
754//{
755//
756//    type Item = (&'a K, &'a V);
757//
758//    fn next(&mut self) -> Option<Self::Item> {
759//        for opt_entry in self.iter.by_ref() {
760//            if let Some(entry) = opt_entry {
761//                return Some((&entry.key, &entry.value));
762//            }
763//        }
764//        None
765//    }
766//}
767pub struct Iter<'a, K, V> {
768    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
769}
770
771impl<'a, K, V> Iterator for Iter<'a, K, V>
772where
773    K: Debug + Hash + PartialEq,
774    V: PartialEq,
775{
776    type Item = (&'a K, &'a V);
777
778    fn next(&mut self) -> Option<Self::Item> {
779        self.iter
780            .find_map(|opt_entry| opt_entry.as_ref().map(|entry| (&entry.key, &entry.value)))
781    }
782}
783
784// Returns a tuple of owned key:value pairs (K, V)
785pub struct IntoIter<K, V> {
786    inner: std::iter::Zip<
787        std::vec::IntoIter<Option<Entry<K, V>>>,
788        std::vec::IntoIter<u8>
789    >,
790}
791impl<K, V> Iterator for IntoIter<K, V> {
792    type Item = (K, V);
793
794    fn next(&mut self) -> Option<Self::Item> {
795        for (slot, tag) in self.inner.by_ref() {
796            if tag == 1 {
797                if let Some(entry) = slot {
798                    return Some((entry.key, entry.value));
799                }
800            }
801
802        }
803        None
804    }
805}
806
807// Allows caller to consume HashMap as an iterator
808impl<K, V> IntoIterator for HashMap<K, V> {
809    type Item = (K, V);
810    type IntoIter = IntoIter<K, V>;
811
812    fn into_iter(self) -> Self::IntoIter {
813        IntoIter {
814            inner: self.data.into_iter().zip(self.ctrl),
815        }
816    }
817}
818
819pub struct Keys<'a, K, V> {
820    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
821}
822impl<'a, K, V> Iterator for Keys<'a, K, V> {
823    type Item = &'a K;
824
825    fn next(&mut self) -> Option<Self::Item> {
826        if let Some(entry) = self.iter.by_ref().flatten().next() {
827            return Some(&entry.key);
828        }
829        None
830    }
831}
832
833pub struct Values<'a, K, V> {
834    iter: std::slice::Iter<'a, Option<Entry<K, V>>>,
835}
836impl<'a, K, V> Iterator for Values<'a, K, V> {
837    type Item = &'a V;
838
839    fn next(&mut self) -> Option<Self::Item> {
840        if let Some(entry) = self.iter.by_ref().flatten().next() {
841            return Some(&entry.value);
842        }
843        None
844    }
845}
846
847// Unit tests
848/////////////
849
850#[test]
851// Generic type test
852fn probing_hash_table_test() {
853    //Creates a new hash map
854    let mut map = HashMap::<&str, u8>::new();
855
856    assert_eq!(map.data.len(), 2);
857    assert_eq!(map.ctrl.len(), 2);
858
859    // Illustrates that put() and get() work
860    map.put("Peter", 40);
861    assert_eq!(map.size, 1);
862    assert_eq!(map.data.len(), 5);
863    assert_eq!(map.ctrl.len(), 5);
864
865    let fetch = map.get("Peter").unwrap();
866    //assert_eq!(*fetch, 40 as u8);
867    assert_eq!(*fetch, 40);
868
869    // Illustrates that the map grows correctly
870    map.put("Brain", 39); // Grows the map
871    assert_eq!(map.data.len(), 5);
872    map.put("Remus", 22);
873    map.put("Bobson", 36); // Grows the map
874    assert_eq!(map.data.len(), 11);
875    map.put("Dingus", 18);
876    map.put("Dangus", 27); // Grows the map
877    assert_eq!(map.size, 6);
878    assert_eq!(map.data.len(), 23);
879
880    // Illustrates that contains() works as intended
881    assert!(map.contains("Dingus"));
882
883    // Illustrates that put() returns old values and
884    // overwrites existing values upon collision...
885    let collision = map.put("Peter", 41).unwrap();
886    assert_eq!(collision.value, 40_u8);
887    //assert_eq!(*collision.value().unwrap(), 40_u8);
888    let new_val = map.get("Peter").unwrap();
889    //assert_eq!(*new_val, 41 as u8);
890    assert_eq!(*new_val, 41);
891    // Without increasing the list size
892    assert_eq!(map.size, 6);
893
894    // Illustrates that removes entries by key and returns the value
895    assert!(map.contains("Dangus"));
896    let removed = map.remove(&"Dangus").unwrap();
897    assert_eq!(
898        removed,
899        Entry {
900            key: "Dangus",
901            value: 27
902        }
903    );
904    assert!(!map.contains("Dangus"));
905}
906
907#[test]
908// Tests the custom update value function
909fn mut_val_test() {
910    //Creates a new hash map
911    let mut count = HashMap::<char, u8>::new();
912    let phrase: &str = "Hello, sickos";
913
914    // Seeds the map with just the characters and a default value
915    for char in phrase.chars() {
916        count.put(char, 0);
917    }
918
919    eprintln!("\nInitial state:");
920    count.contents();
921
922    // Iterates through the map again, updating each value based on its occurence
923    // NOTE: Can also be used as initial mapping operation with the same code, 
924    // but was split here for illustrative purposes
925    for char in phrase.chars() {
926        count.mut_val_or(char, |x| *x += 1, 1);
927    }
928
929    // Pretty-prints the contents of the map
930    eprintln!("\nModified:");
931    count.contents();
932
933    // Uncomment to trigger debug print
934    //assert!(count.is_empty());
935}
936
937#[test]
938// Tests that the structure is iterable
939fn iter_test() {
940    //Creates a new hash map of the frequence letters in the given phrase
941    let mut map = HashMap::<char, u8>::new();
942    let phrase: &str = "Hello, sickos";
943    for char in phrase.chars() {
944        map.mut_val_or(char, |x| *x += 1, 1);
945    }
946
947    eprintln!("\nInitial map state:");
948    map.contents();
949
950    // Iterates through the map, pushing entries 
951    // to a Vec as tuples returned from the iterator
952    eprintln!("\nCompact map: ");
953    let mut vec = Vec::new();
954    for e in map.iter() {
955        vec.push(*e.0);
956        eprintln!("{e:?}")
957    }
958
959    eprintln!("\nFinal map state:");
960    map.contents();
961
962    eprintln!("\nFinal vec state:");
963    for e in vec.iter() {
964        eprintln!("{:?}", *e)
965    }
966
967    // Proves that all entries are still valid
968    // Vec contains only keys
969    assert!(vec.contains(&'s'));
970    assert!(vec.contains(&'i'));
971    assert!(vec.contains(&'c'));
972    assert!(vec.contains(&'k'));
973    assert!(vec.contains(&'o'));
974
975    // Map contains whole entries
976    assert!(map.contains(&'s'));
977    assert!(map.contains(&'i'));
978    assert!(map.contains(&'c'));
979    assert!(map.contains(&'k'));
980    assert!(map.contains(&'o'));
981
982    eprintln!("\nPrinting just the keys:");
983    for key in map.keys() {
984        eprintln!("{key}");
985    }
986
987    eprintln!("\nPrinting just the values:");
988    for value in map.values() {
989        eprintln!("{value}");
990    }
991
992    // Consume the map and transfer it to a new map
993    let mut new_map = HashMap::<char, u8>::new();
994    for e in map.into_iter() {
995        new_map.put(e.0, e.1);
996    }
997    // Illegal, map is moved/consumed!
998    //assert!(map.is_empty());
999    
1000    // But new_map contains data now
1001    assert!(new_map.contains(&'s'));
1002    assert!(new_map.contains(&'i'));
1003    assert!(new_map.contains(&'c'));
1004    assert!(new_map.contains(&'k'));
1005    assert!(new_map.contains(&'o'));
1006
1007    // Uncomment to trigger debug print
1008    //assert!(map.is_empty());
1009}
1010
1011#[test]
1012// Tests that into_iter consumes the map, 
1013// and that rehash and shrink_to_fit delete removed entries
1014fn rehash_test() {
1015    //Creates a new hash map
1016    let mut map = HashMap::<char, u8>::new();
1017    let word: &str = "Hello, sickos!";
1018    for k in word.chars() {
1019        map.mut_val_or(k, |x| *x += 1, 1);
1020    }
1021
1022    // Prints initial state of the map
1023    // and illustrates that values exist
1024    eprintln!("\nInitial state: ");
1025    map.contents();
1026    assert!(map.contains(&','));
1027    assert!(map.contains(&' '));
1028    assert!(map.contains(&'!'));
1029    assert!(map.contains(&'s'));
1030    assert_eq!(map.len(), 11);
1031    assert_eq!(map.data.len(), 23);
1032
1033    // Removes the values
1034    map.remove(&',');
1035    map.remove(&' ');
1036    map.remove(&'!');
1037    map.remove(&'s');
1038    map.remove(&'i');
1039    map.remove(&'c');
1040    map.remove(&'k');
1041
1042    // Checks that the values are removed
1043    eprintln!("\nModified: ");
1044    map.contents();
1045    assert!(!map.contains(&','));
1046    assert!(!map.contains(&' '));
1047    assert!(!map.contains(&'!'));
1048    assert!(!map.contains(&'s'));
1049    assert_eq!(map.len(), 4);
1050    assert_eq!(map.data.len(), 23);
1051
1052    // Rehashes to get rid of deleted items
1053    // map goes out of scope here
1054    let new_map = map.rehash();
1055
1056    // Illegal operation! into_iter has consumed map!
1057    // assert!(map.contains(&'H'));
1058
1059    // Checks that the values are removed
1060    // but the capacity remains the same as before
1061    eprintln!("\nRehash: ");
1062    new_map.contents();
1063    assert!(!new_map.contains(&','));
1064    assert!(!new_map.contains(&' '));
1065    assert!(!new_map.contains(&'!'));
1066    assert!(!new_map.contains(&'s'));
1067    assert_eq!(new_map.len(), 4);
1068    assert_eq!(new_map.data.len(), 23);
1069
1070    // Repeats the process for shrink_to_fit
1071    //
1072    //Creates a new hash map
1073    let mut map = HashMap::<char, u8>::new();
1074    let word: &str = "Hello, sickos!";
1075    for k in word.chars() {
1076        map.mut_val_or(k, |x| *x += 1, 1);
1077    }
1078
1079    // Prints initial state of the map
1080    // and illustrates that values exist
1081    eprintln!("\nInitial state: ");
1082    map.contents();
1083    assert!(map.contains(&','));
1084    assert!(map.contains(&' '));
1085    assert!(map.contains(&'!'));
1086    assert!(map.contains(&'s'));
1087    assert_eq!(map.len(), 11);
1088    assert_eq!(map.data.len(), 23);
1089
1090    // Removes the values
1091    map.remove(&',');
1092    map.remove(&' ');
1093    map.remove(&'!');
1094    map.remove(&'s');
1095    map.remove(&'i');
1096    map.remove(&'c');
1097    map.remove(&'k');
1098
1099    // Checks that the values are removed
1100    eprintln!("\nModified: ");
1101    map.contents();
1102    assert!(!map.contains(&','));
1103    assert!(!map.contains(&' '));
1104    assert!(!map.contains(&'!'));
1105    assert!(!map.contains(&'s'));
1106    assert_eq!(map.len(), 4);
1107    assert_eq!(map.data.len(), 23);
1108
1109    // Rehashes to get rid of deleted items
1110    // map goes out of scope here
1111    let new_map = map.shrink_to_fit();
1112
1113    // Checks that the values are removed
1114    // but the capacity remains the same as before
1115    eprintln!("\nRehash (shrink to fit): ");
1116    new_map.contents();
1117    assert!(!new_map.contains(&','));
1118    assert!(!new_map.contains(&' '));
1119    assert!(!new_map.contains(&'!'));
1120    assert!(!new_map.contains(&'s'));
1121    assert_eq!(new_map.len(), 4);
1122    assert_eq!(new_map.data.len(), 11);
1123
1124    // Uncomment to trigger debug print
1125    //panic!("MANUAL TEST FAILURE");
1126}
1127
1128
1129// TODO: This belongs in either an external runner or in an integration test module
1130pub fn example() {
1131    //Creates a new hash map
1132    let mut map = HashMap::<&str, u8>::new();
1133
1134    let s = format!(
1135        "Map stats: size: {}, capacity: {}, active entries: {}",
1136        map.size,
1137        map.data.len(),
1138        map.live
1139    );
1140    let l = "=".repeat(s.len());
1141    println!("{l}\n{s}");
1142
1143    // Puts some entries into the map
1144    println!("Building the map...");
1145    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
1146    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
1147    for (k, v) in names.iter().zip(values.into_iter()) {
1148        map.put(k, v);
1149    }
1150
1151    // Checks that the map contains what we'd expect
1152    if !map.contains("Peter") {
1153        panic!()
1154    };
1155    let val = map.get("Peter").unwrap();
1156    println!("Peter is {val}");
1157
1158    // Replaces a value for a given key and
1159    // checks that the new value took
1160    let new = 41;
1161    let old = map.put("Peter", new).unwrap().value;
1162    println!("[Peter's age increased from {old} to {new}]");
1163    let val = map.get("Peter").unwrap();
1164    println!("Uhhhhh, I meant Peter is {val}");
1165
1166    // Shows the map and its data
1167    println!(
1168        "\nMap stats: size: {}, capacity: {}, active entries: {}",
1169        map.size,
1170        map.data.len(),
1171        map.live
1172    );
1173    for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1174        println!("\t{m:>3}: {e:?}")
1175    }
1176
1177    // Illustrates removing entries
1178    println!("\nThere can be only one!");
1179    names.remove(0);
1180    for e in names {
1181        println!("Remove: {}", map.remove(&e).unwrap().key);
1182    }
1183
1184    // The final result
1185    let s = format!(
1186        "\nMap stats: size: {}, capacity: {}, active entries: {}",
1187        map.size,
1188        map.data.len(),
1189        map.live
1190    );
1191    println!("{s}");
1192    for (e, m) in map.data.iter().zip(map.ctrl.iter()) {
1193        println!("\t{m:>3}: {e:?}")
1194    }
1195
1196    let mut count: HashMap<char, u8> = HashMap::new();
1197    let word: &str = "The quick brown fox jumps over the lazy dog";
1198    for k in word.chars() {
1199        count.mut_val_or(k, |x| *x += 1, 1);
1200    }
1201    println!(
1202        "\n\nString: {word:?}
1203\nNOTE: There are 28 entries because T != t, and the space literal ' ' is included.
1204\nMap stats: size: {}, capacity: {}, active entries: {}",
1205        count.occupied(), // Total number of elements
1206        count.capacity(),
1207        count.len() // Active entries
1208
1209    );
1210    for (e, m) in count.data.iter().zip(count.ctrl.iter()) {
1211        println!("\t{m:>3}: {e:?}")
1212    }
1213
1214    let l = "=".repeat(s.len());
1215    println!("\n{l}");
1216
1217}