Module probing_hash_table

Source
Expand description

Safe, open addressing hash table with MAD compression and quadratic probing

§About

This hash map implementation uses open-addressing to take advantage of cache locality, and relies on a “multiply, add, and divide” (MAD) compression algorithm with quadratic probing to handle collisions.

This structure provides the basis for several other structures in this library, including the PriorityQueue<K, P> and HashSet<T> implementations.

§Design

This structure uses Rust’s default hasher to hash keys. The default hasher currently represents a cryptographically strong SipHash-1-3 design as of the publication date (9/2025). In practice, hashing small key objects <= 128 bytes provide similar performance as ULID generation, but hashing arbitrarily large key objects may incur performance penalties.

The primary HashMap struct uses a private byte mask (as a “control” list) to track positions with valid (“live”), empty, and removed (“tombstone”) entries. Using a conventional free list and overwriting data for a marked slot would break the quadratic probing logic.

+-------------------------------------------------+
| ctrl | Option<Entry<K, V>>                      |
|------+------------------------------------------|
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: "Dingus", value: 18 }) |
|    1 | Some(Entry { key: "Brain", value: 37 })  |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: "Remus", value: 22 })  |
|  187 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: "Bobson", value: 36 }) |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|  187 | None                                     |
+-------------------------------------------------+

§Insertion

The structure contains two primary ways to insert entries into the map.

The put() operation overwrites values for existing keys, but does not mutate the existing key. This keeps with the standard library’s implementation where two keys can be == without being identical.

    use dsa_rust::associative::probing_hash_table::HashMap;

    // Build a map with put()
    let mut map = HashMap::<&str, u8>::new();
    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
    for (k, v) in names.iter().zip(values.into_iter()) {
        map.put(k, v);
    }

The structure also provides a mut_val_or() operation for mutating the value(s) of the map or inserting a default value. This is useful with the structures iterators to make linear passes over source data during mapping.

    use dsa_rust::associative::probing_hash_table::HashMap;

    // Build a map with mut_val_or()
    let mut map = HashMap::<char, u8>::new();
    let word: &str = "Hello, sickos!";
    for k in word.chars() {
        map.mut_val_or(k, |x| *x += 1, 1);
    }

§Removal & Rehashing

The structure contains a remove() operation that returns an owned entry if it exists in the map. The remove() operation marks the index with a tombstone, and uses std::mem::take to remove and return the value. Even though take() swaps out a None variant the index remains unusable until the calling code invokes a rehash operation. This is done in order to retain probing logic. For long-lived maps with many removals/tombstones, the structure provides two re-hashing operations. The rehash() operation rehashes the entire map (freeing up tombstones) at the current underlying structure’s capacity. This essentially allows you to permanently remove deleted entries and reduce the map’s load factor. Conversely, the shrink_to_fit() operation rehashes the map (freeing up tombstones) down to the minimum backing capacity that still provides a load factor of <= .5. Both processes necessarily require O(n) time where n is the number of live entries in the map.

Example of a map that has had many entries removed:

+-------------------------------------------------+
| ctrl | Option<Entry<K, V>>                      |
|------+------------------------------------------|
|    1 | Some(Entry { key: 'o', value: 2 })       |
|  187 | None                                     |
|    0 | None                                     |
|  187 | None                                     |
|    1 | Some(Entry { key: 'l', value: 2 })       |
|  187 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|  187 | None                                     |
|  187 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: 'H', value: 1 })       |
|    0 | None                                     |
|  187 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: 'e', value: 1 })       |
|  187 | None                                     |
|    0 | None                                     |
+-------------------------------------------------+

This example illustrates how the map might look after a rehash():

+-------------------------------------------------+
| ctrl | Option<Entry<K, V>>                      |
|------+------------------------------------------|
|    1 | Some(Entry { key: 'H', value: 1 })       |
|    1 | Some(Entry { key: 'o', value: 2 })       |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: 'l', value: 2 })       |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: 'e', value: 1 })       |
|    0 | None                                     |
|    0 | None                                     |
+-------------------------------------------------+

This example illustrates how the map might look after a shrink_to_fit():

+-------------------------------------------------+
| ctrl | Option<Entry<K, V>>                      |
|------+------------------------------------------|
|    1 | Some(Entry { key: 'H', value: 1 })       |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: 'l', value: 2 })       |
|    1 | Some(Entry { key: 'o', value: 2 })       |
|    0 | None                                     |
|    0 | None                                     |
|    1 | Some(Entry { key: 'e', value: 1 })       |
|    0 | None                                     |
|    0 | None                                     |
|    0 | None                                     |
+-------------------------------------------------+

§Example


    use dsa_rust::associative::probing_hash_table::HashMap;

    //Creates a new hash map with str keys and u8 values
    let mut map = HashMap::<&str, u8>::new();

    println!("Map stats: size: {}, capacity: {}, active entries: {}",
        map.occupied(),
        map.capacity(),
        map.len()
    );

    // Puts some entries into the map
    println!("Building the map...");
    let mut names: Vec<&str> = vec!["Peter", "Brain", "Remus", "Bobson", "Dingus", "Dangus"];
    let values: Vec<u8> = vec![39, 37, 22, 36, 18, 27];
    for (k, v) in names.iter().zip(values.into_iter()) {
        map.put(k, v);
    }

    // Checks that the map contains what we'd expect
    if map.contains("Peter") == false {
        panic!()
    };
    let val = map.get("Peter").unwrap();
    println!("Peter is {val}");

    // Replaces a value for a given key and
    // checks that the new value took
    let new = 41;

    //let old = map.put("Peter", new).unwrap().value;
    let old = map.get("Peter").unwrap().clone();
    map.put("Peter", new);

    println!("[Peter's age increased from {old} to {new}]");
    let val = map.get("Peter").unwrap();
    println!("Uhhhhh, I meant Peter is {val}");

    // Shows the map and its data
    println!(
        "\nMap stats: size: {}, capacity: {}, active entries: {}",
        map.occupied(),
        map.capacity(),
        map.len()
    );
    map.contents();

    // Illustrates removing entries
    println!("\nThere can be only one!");
    names.remove(0);
    for e in names {
        let removed = map.remove(&e);
        if let Some(entry) = removed {
            println!("Remove: {}", entry.key());
        }
    }

    // The final result
    println!("\nMap stats: size: {}, capacity: {}, active entries: {}",
        map.occupied(),
        map.capacity(),
        map.len()
    );
    map.contents();

Structs§

Entry
HashMap
IntoIter
Iter
Keys
Values

Functions§

example