dsa_rust/sequences/
indexed_skip_list.rs

1/*! A safe, indexed skip list
2
3# About
4Skip lists are sorted, probabalistic structures made up of stacked lists of varying length to allow for truncated _O(log(n))_ navigation. Canonically linked lists are built from doubly-linked lists, but this is not a defining characteristic of the ADT. Regardless of the base list representation used, the navigational algorithm results in what is essentially a logical linked list.
5
6Properly implemented skip lists provide _O(log(n))_ expected time complexity for search, insert, and removal operations. This provides a significant advantage over keeping sorted array- or link-based list invariants, which have _worst-case O(n)_ removal (average _O(n/2)_) temporal performance. Skip lists are also simpler than self-balancing tree structures, which are commonly used for sorted list and map structures. Skip lists also generally provide easier and finer-grained control when adapted for concurrent operations. There is a reason Java's `concurrentSkipListMap` is so popular.
7
8# Design
9This design uses `Vec`-backed storage for [SkipNode]s that contain a list (tower) of "next" values, and a single "previous" value that represent indexes within the backing vector. 
10
11The list features a dynamic max height _h_ that is logarithmically proportional to the number of elements in the list _n_ such that _h = log(n) in the expected case_. The logarithmic growth ensures that the average search, insertion, and deletion operations remain efficient, typically with expected _O(log(n))_ time complexity.
12
13## The Search Algorithm
14The point of the search algorithm is to find the first node (or handle/position) `p` in skip list `S` that represents the largest comparable value <= to the search key `k`. This algorithm can be broken into two steps:
15Step 1) loop`let candidate = p.peek()`, if `candidate >= k`, return `p`, otherwise move to `p.next()`. Repeat until `p.peek()` >= `k`.
16Step 2) Drop down a level: If `S.below(p) == 0` you're at the lowest level and the search terminates.
17
18## Visual Examples
19An initial, empty skip list with one level and no data:
20```text
21S0: HEAD -> None
22```
23
24Inserting the first node triggers an automatic tower level, even if it ends up empty. This provides the algorithm with a starting point:
25```text
26S1: HEAD ----------> None
27S0: HEAD -> [ 5 ] -> None
28```
29
30After inserting `['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f']`, the list's `SkipNodes` might contain the following towers. Notice that its always possible to tell the last item in the list because its tower is empty. This makes sense, because inserting the last element can only point to `None`. As you can see by the index notation on the left-hand side of the table, the backing structure retains the insertion order; the backing structure remains unsorted.
31```text
32HEAD[0]: [1, 2, 9, 7]
33a[1]: [5]
34c[2]: [4, 4]
35e[3]: [9]
36d[4]: [3, 9]
37b[5]: [2]
38i[6]: []
39g[7]: [8, 6, 6]
40h[8]: [6]
41f[9]: [7, 7, 7]
42```
43The towers appear to contain rather nonsensical values when printed as they exist in memory, which represents insertion order. However, if you follow the indexes from the `HEAD` node, and re-arrange the nodes into _lexicographically sorted order_, which is what the navigational algorithms in the skiplist achieve, you'd get the following towers.
44```text
45HEAD[0]: [1, 2, 9, 7]
46a[1]: [5]
47b[5]: [2]
48c[2]: [4, 4]
49d[4]: [3, 9]
50e[3]: [9]
51f[9]: [7, 7, 7]
52h[8]: [6]
53i[6]: []
54g[7]: [8, 6, 6]
55```
56
57When you rotate the mapping 90 degrees you can start to visualize the skip list layers as logically linked lists formed by tower index placement. Notice that the towers roughly mirror the raw tower output from the previous output.
58```text
59L3: [ g[7] ] -> None
60L2: [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
61L1: [ c[2] ] -> [ d[4] ] -> [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
62L0: [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
63```
64Finally, if you extend each index reference to align with its sorted position within the list, a classical skip list diagram emerges.
65```text
66L3: HEAD -------------------------------------------------------------------------> [ g[7] ] -------------------------> None
67L2: HEAD -------------------------------------------------------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
68L1: HEAD -------------------------> [ c[2] ] -> [ d[4] ] -------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
69L0: HEAD -> [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
70```
71
72# Example code
73```rust
74    let mut list = SkipList::<char>::new();
75
76    // Inserts 9 values into the skip list
77    // with a consuming iterator, moving values
78    // into the list
79    let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
80    for e in l1.into_iter() {
81        list.insert(e)
82    }
83
84    // Illustrates that the list exists as a sorted invariant
85    let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
86    for (i, e) in list.iter().enumerate() {
87        assert_eq!(e, &sorted[i]);
88    }
89
90    // Illustrates the Kth function in a 0-indexed list
91    assert_eq!(list.get_kth(6).unwrap(), &'g');
92
93    // Query by range using Rust's RangeBounds semantics
94    let val = ['c', 'd', 'e', 'f'];
95    for (i, e) in list.range('c'..='f').enumerate() {
96        assert_eq!(e, &val[i])
97    }
98
99```
100*/
101
102use rand::Rng; // For coin flips
103use std::borrow::Borrow; // For passing borrowed parameters
104use std::ops::{Bound, RangeBounds}; // For range iterators
105
106const MAX_HEIGHT: usize = 32;
107
108#[derive(Debug)]
109struct SkipNode<T> {
110    value: Option<T>,                  // None for sentinel
111    next: [Option<usize>; MAX_HEIGHT], // forward links
112    prev: Option<usize>, // back links at s0 for reverse iteration
113}
114
115pub struct SkipList<T> {
116    nodes: Vec<SkipNode<T>>,
117    list_height: usize,
118}
119impl<T: Ord> Default for SkipList<T> {
120    fn default() -> Self {
121        Self::new()
122    }
123}
124impl<T: Ord> SkipList<T> {
125    pub fn new() -> Self {
126        let sentinel = SkipNode {
127            value: None,
128            next: [None; MAX_HEIGHT],
129            prev: None
130        };
131
132        Self {
133            nodes: vec![sentinel],
134            list_height: 1,
135        }
136    }
137
138    pub fn len(&self) -> usize {
139        self.nodes.len() - 1 // HEAD doesn't count!
140    }
141
142    pub fn is_empty(&self) -> bool {
143        self.nodes.len() - 1 == 0
144    }
145
146    pub fn locate<Q>(&self, key: &Q) -> Option<usize>
147    where
148        Q: Ord + ?Sized,
149        T: Borrow<Q>,
150    {
151        let update = self.find_predecessors(key);
152        let candidate = self.nodes[update[0]].next[0];
153    
154        match candidate {
155            Some(idx)
156                if self.nodes[idx].value.as_ref().unwrap().borrow() == key =>
157            {
158                Some(idx)
159            }
160            _ => None,
161        }
162    }
163
164    pub fn contains<Q>(&self, key: &Q) -> bool 
165    where 
166        Q: Ord + ?Sized,
167        T: Borrow<Q>
168    {
169        self.locate(key).is_some()
170    }
171
172    pub fn insert(&mut self, value: T)
173    where
174        T: Ord,
175    {
176        let height = self.random_height();
177        let update = self.find_predecessors(&value);
178    
179        // ... height adjustment logic ...
180    
181        let new_index = self.nodes.len();
182        
183        // 1. SET NEW NODE'S PREV
184        // update[0] is the index of the node immediately before the new one at Level 0.
185        let predecessor_idx = update[0];
186        
187        // 2. GET NEW NODE'S NEXT (at Level 0)
188        // This is the node that will now need to point back to us.
189        let successor_idx = self.nodes[predecessor_idx].next[0];
190    
191        self.nodes.push(SkipNode {
192            value: Some(value),
193            next: [None; MAX_HEIGHT],
194            prev: Some(predecessor_idx), // Points back to A
195        });
196    
197        for (level, _) in update.iter().enumerate().take(height) {
198        //for level in 0..height {
199            let prev_idx = update[level];
200            self.nodes[new_index].next[level] = self.nodes[prev_idx].next[level];
201            self.nodes[prev_idx].next[level] = Some(new_index);
202        }
203    
204        // 3. UPDATE SUCCESSOR'S PREV
205        // If there is a node after us, it must now point back to the new node.
206        if let Some(next_idx) = successor_idx {
207            self.nodes[next_idx].prev = Some(new_index);
208        }
209    }
210
211    /// Removes and returns the value for a given key, if it exists in the list.
212    pub fn remove<Q>(&mut self, key: &Q) -> Option<T>
213    where
214        Q: Ord + ?Sized,
215        T: Borrow<Q>,
216    {
217        let mut update_target = self.find_predecessors(key);
218    
219        // 1. Identify the target index
220        let target = match self.nodes[update_target[0]].next[0] {
221            //Some(idx) if self.nodes[idx].value.as_ref().map_or(false, |v| v.borrow() == key) => idx,
222            Some(idx) if self.nodes[idx].value.as_ref().is_some_and(|v| v.borrow() == key) => idx,
223            _ => return None,
224        };
225    
226        // 2. PRE-FETCH predecessors for the node that WILL move
227        // We do this BEFORE swap_remove so the indices in the skip list are still valid
228        let last_idx = self.nodes.len() - 1;
229        let mut update_moved = [0usize; MAX_HEIGHT];
230        if target != last_idx {
231            let moved_val = self.nodes[last_idx].value.as_ref().unwrap().borrow();
232            update_moved = self.find_predecessors(moved_val);
233        }
234    
235        // 3. Logical Rewiring (Remove target from the chain)
236        if let Some(next_idx) = self.nodes[target].next[0] {
237            self.nodes[next_idx].prev = self.nodes[target].prev;
238        }
239        for (level, val) in update_target.iter_mut().enumerate().take(self.list_height) {
240        //for level in 0..self.list_height {
241            if self.nodes[*val].next[level] == Some(target) {
242                self.nodes[*val].next[level] = self.nodes[target].next[level];
243            }
244        }
245    
246        // 4. Physical Swap-Remove
247        let removed_node = self.nodes.swap_remove(target);
248    
249        // 5. Fix references for the node that just moved into 'target'
250        if target < self.nodes.len() {
251            // Any node in 'update_moved' that pointed to 'last_idx' must now point to 'target'
252            for level in 0..self.list_height {
253                if update_moved[level] == last_idx {
254                    // Rare case: the moved node's predecessor was itself the target
255                    // We handle this by using the updated pointers from the target's removal
256                    update_moved[level] = update_target[level];
257                }
258                
259                if self.nodes[update_moved[level]].next[level] == Some(last_idx) {
260                    self.nodes[update_moved[level]].next[level] = Some(target);
261                }
262            }
263            // Update the successor of the moved node to point back to the new index
264            if let Some(next_idx) = self.nodes[target].next[0] {
265                self.nodes[next_idx].prev = Some(target);
266            }
267        }
268    
269        // ... clean up height ...
270        removed_node.value
271    }
272
273    /// Returns the Kth value in the list, if it exists.
274    pub fn get_kth(&self, k: usize) -> Option<&T> {
275        let mut idx = self.nodes[0].next[0];
276        let mut i = 0;
277    
278        while let Some(current) = idx {
279            if i == k {
280                return self.nodes[current].value.as_ref();
281            }
282    
283            idx = self.nodes[current].next[0];
284            i += 1;
285        }
286    
287        None
288    }
289
290    /// Returns an inclusive iterator over a range of values 
291    /// in the list from `start` to `end`.
292    pub fn range<Q, R>(&self, range: R) -> RangeIter<'_, T, Q, R>
293    where
294        Q: Ord + ?Sized,
295        T: Borrow<Q>,
296        R: RangeBounds<Q>,
297    {
298        // --- FIND FRONT ---
299        let front = match range.start_bound() {
300            Bound::Included(start) => self.nodes[self.find_predecessors(start)[0]].next[0],
301            Bound::Excluded(start) => {
302                let idx = self.nodes[self.find_predecessors(start)[0]].next[0];
303                if let Some(i) = idx {
304                    if self.nodes[i].value.as_ref().unwrap().borrow() == start {
305                        self.nodes[i].next[0]
306                    } else { Some(i) }
307                } else { None }
308            }
309            Bound::Unbounded => self.nodes[0].next[0],
310        };
311    
312        // --- FIND BACK ---
313        let back = match range.end_bound() {
314            Bound::Included(end) => {
315                // Find predecessors of 'end'. 
316                // If the element at the end of the search IS 'end', that's our back.
317                // If not, the predecessor itself is our back.
318                let update = self.find_predecessors(end);
319                let candidate = self.nodes[update[0]].next[0];
320                if let Some(idx) = candidate {
321                    if self.nodes[idx].value.as_ref().unwrap().borrow() == end {
322                        Some(idx)
323                    } else {
324                        // Predicate check: Ensure we aren't returning the sentinel (idx 0)
325                        if update[0] == 0 { None } else { Some(update[0]) }
326                    }
327                } else if update[0] == 0 { None } else { Some(update[0]) }
328            }
329            Bound::Excluded(end) => {
330                let update = self.find_predecessors(end);
331                if update[0] == 0 { None } else { Some(update[0]) }
332            }
333            Bound::Unbounded => {
334                // To find the absolute end, we find predecessors for a "theoretically infinite" value
335                // or simply walk the tallest tower to the end.
336                let mut curr = 0;
337                for level in (0..self.list_height).rev() {
338                    while let Some(next_idx) = self.nodes[curr].next[level] {
339                        curr = next_idx;
340                    }
341                }
342                if curr == 0 { None } else { Some(curr) }
343            }
344        };
345    
346        RangeIter {
347            list: self,
348            front,
349            back,
350            range,
351            _marker: std::marker::PhantomData,
352        }
353    }
354
355    pub fn iter(&self) -> Iter<'_, T> {
356        // Walk the express lanes to find the very last node in O(log n)
357        let mut tail = 0;
358        for level in (0..self.list_height).rev() {
359            while let Some(next_idx) = self.nodes[tail].next[level] {
360                tail = next_idx;
361            }
362        }
363    
364        Iter {
365            list: self,
366            next: self.nodes[0].next[0], // First node after sentinel
367            prev: if tail == 0 { None } else { Some(tail) },
368        }
369    }
370
371    // Utility functions
372    ////////////////////
373
374    fn random_height(&self) -> usize {
375        let mut level = 1;
376        let mut rng = rand::rng();
377        while level < MAX_HEIGHT && rng.random::<bool>() {
378            level += 1;
379        }
380        level
381    }
382
383    fn find_predecessors<Q>(&self, key: &Q) -> [usize; MAX_HEIGHT]
384    where
385        Q: Ord + ?Sized,
386        T: Borrow<Q>,
387    {
388        let mut update = [0usize; MAX_HEIGHT];
389        let mut idx = 0;
390    
391        for level in (0..self.list_height).rev() {
392            loop {
393                match self.nodes[idx].next[level] {
394                    None => break,
395                    Some(next_idx) => {
396                        let val = self.nodes[next_idx].value.as_ref().unwrap();
397                        if val.borrow() >= key {
398                            break;
399                        }
400                        idx = next_idx;
401                    }
402                }
403            }
404            update[level] = idx;
405        }
406    
407        update
408    }
409
410    //fn fix_moved_node_references(&mut self, new_idx: usize, old_idx: usize) {
411    //    // We need to find the neighbors of the node that is now at new_idx
412    //    // so we can tell them "I moved!"
413    //    let val = self.nodes[new_idx].value.as_ref().unwrap();
414    //    let update = self.find_predecessors(val);
415    //
416    //    // Update predecessors: they currently point to old_idx, they should point to new_idx
417    //    for level in 0..MAX_HEIGHT {
418    //        if self.nodes[update[level]].next[level] == Some(old_idx) {
419    //            self.nodes[update[level]].next[level] = Some(new_idx);
420    //        } else {
421    //            // Since towers are vertical, if we don't find a match at this level, 
422    //            // we might not find them higher up, but check all to be safe.
423    //        }
424    //    }
425    //
426    //    // Update successor: the node after the moved node needs to update its 'prev'
427    //    if let Some(next_idx) = self.nodes[new_idx].next[0] {
428    //        self.nodes[next_idx].prev = Some(new_idx);
429    //    }
430    //}
431}
432
433pub struct Iter<'a, T> {
434    list: &'a SkipList<T>,
435    next: Option<usize>,
436    prev: Option<usize>
437}
438impl<'a, T> Iterator for Iter<'a, T> {
439    type Item = &'a T;
440
441    fn next(&mut self) -> Option<Self::Item> {
442        let idx = self.next?;
443        let value = self.list.nodes[idx].value.as_ref()?;
444
445        if self.next == self.prev {
446            self.next = None;
447            self.prev = None;
448        } else {
449            self.next = self.list.nodes[idx].next[0];
450        }
451        Some(value)
452    }
453}
454impl<'a, T> DoubleEndedIterator for Iter<'a, T> {
455    fn next_back(&mut self) -> Option<Self::Item> {
456        let idx = self.prev?;
457        let value = self.list.nodes[idx].value.as_ref()?;
458
459        if self.prev == self.next {
460            self.next = None;
461            self.prev = None;
462        } else {
463            let prev = self.list.nodes[idx].prev;
464            // Sentinel check: don't yield index 0
465            self.prev = if prev == Some(0) { None } else { prev };
466        }
467        Some(value)
468    }
469}
470
471pub struct RangeIter<'a, T, Q, R> 
472where 
473    Q: ?Sized,
474    R: RangeBounds<Q>,
475{
476    list: &'a SkipList<T>,
477    front: Option<usize>, // Moves forward
478    back: Option<usize>,  // Moves backward
479    range: R,
480    _marker: std::marker::PhantomData<Q>,
481}
482impl<'a, T, Q, R> Iterator for RangeIter<'a, T, Q, R>
483where
484    Q: Ord + ?Sized,
485    T: Borrow<Q>,
486    R: RangeBounds<Q>,
487{
488    type Item = &'a T;
489
490    fn next(&mut self) -> Option<Self::Item> {
491        let idx = self.front?;
492        
493        // Boundary Check
494        let value = self.list.nodes[idx].value.as_ref().unwrap();
495        if !self.range.contains(value.borrow()) {
496            self.front = None;
497            return None;
498        }
499
500        // Meet/Cross Check: If front matches back, this is the last element
501        if self.front == self.back {
502            self.front = None;
503            self.back = None;
504        } else {
505            self.front = self.list.nodes[idx].next[0];
506        }
507
508        Some(value)
509    }
510}
511impl<'a, T, Q, R> DoubleEndedIterator for RangeIter<'a, T, Q, R>
512where
513    Q: Ord + ?Sized,
514    T: Borrow<Q>,
515    R: RangeBounds<Q>,
516{
517    fn next_back(&mut self) -> Option<Self::Item> {
518        let idx = self.back?;
519
520        // Boundary Check
521        let value = self.list.nodes[idx].value.as_ref().unwrap();
522        if !self.range.contains(value.borrow()) {
523            self.back = None;
524            return None;
525        }
526
527        // Meet/Cross Check
528        if self.back == self.front {
529            self.back = None;
530            self.front = None;
531        } else {
532            // Move back, but ensure we don't land on the sentinel (idx 0)
533            let prev = self.list.nodes[idx].prev;
534            self.back = if prev == Some(0) { None } else { prev };
535        }
536
537        Some(value)
538    }
539}
540
541impl<'a, T: Ord> IntoIterator for &'a SkipList<T> {
542    type Item = &'a T;
543    type IntoIter = Iter<'a, T>;
544
545    fn into_iter(self) -> Self::IntoIter {
546        self.iter()
547    }
548}
549
550#[test]
551fn one() {
552    let mut list = SkipList::<char>::new();
553
554    // Tests basic housekeeping on empty list
555    assert_eq!(list.len(), 0);
556    assert!(list.is_empty());
557    assert!(!list.contains(&'z'));
558
559    // Inserts 9 values into the skip list
560    // with a consuming iterator, moving values
561    // into the list
562    let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
563    for e in values.into_iter() {
564        list.insert(e)
565    }
566
567    // Tests that len gets updated properly
568    assert_eq!(list.len(), 9);
569    assert!(list.contains(&'g'));
570
571    // Tests basic ordering and iteration
572    // Basic iteration with iter()
573    // Clippy wants enumerate instead of external loop counter
574    let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
575    for (i, e) in list.iter().enumerate() {
576        assert_eq!(e, &sorted[i]);
577    }
578    // Double-ended iteration with rev()
579    // Clippy wants saturating_sub instead of loop counter
580    let mut i = 8;
581    for e in list.iter().rev() {
582        assert_eq!(e, &sorted[i]);
583        if i > 0 {
584            i = i.saturating_sub(1)
585        };
586    }
587    // Or if you wanna be fancy about it
588    // zip() stops as soon as one iterator ends, 
589    // eliminating the need for an overflow check
590    for (e, i) in list.iter().rev().zip((0..=8).rev()) {
591        assert_eq!(e, &sorted[i]);
592    }
593    // Iterator inferance using the IntoIter impl
594    let mut i = 0;
595    #[allow(clippy::explicit_counter_loop)]
596    for e in &list {
597        assert_eq!(e, &sorted[i]);
598        i += 1;
599    }
600
601    // Tests the Kth function in a 0-indexed list
602    assert_eq!(list.get_kth(6).unwrap(), &'g');
603
604    // Tests the range function
605    // NOTE: char is Copy so you dont strictly need to borrow 
606    // when setting range bounds, these tests illustrate both
607    // borrowing and not; Note that each bounds must match
608    // so no (&'a'..'f'), only ('a'..'f') or (&'a'..&'f')
609    // Midlist (exclusive)
610    let val = ['c', 'd', 'e'];
611    //for (i, e) in list.range(&'c', &'f').enumerate() {
612    for (i, e) in list.range('c'..'f').enumerate() {
613        assert_eq!(e, &val[i])
614    }
615    // Midlist (inclusive)
616    let val = ['c', 'd', 'e', 'f'];
617    //for (i, e) in list.range(&'c', &'f').enumerate() {
618    for (i, e) in list.range('c'..='f').enumerate() {
619        assert_eq!(e, &val[i])
620    }
621    // Start of list
622    let val = ['a', 'b', 'c', 'd', 'e', 'f'];
623    //for (i, e) in list.range(&'a', &'f').enumerate() {
624    for (i, e) in list.range(..&'f').enumerate() {
625        assert_eq!(e, &val[i])
626    }
627    // End of list
628    let val = ['e', 'f', 'g', 'h', 'i'];
629    for (i, e) in list.range(&'e'..).enumerate() {
630        assert_eq!(e, &val[i])
631    }
632
633    // Tests removal
634    list.remove(&'e');
635    list.remove(&'a');
636    assert!(!list.contains(&'e'));
637    assert!(!list.contains(&'a'));
638    let l2 = ['b', 'c', 'd', 'f', 'g', 'h', 'i'];
639    for (val, i) in list.iter().zip(0..=6) {
640        assert_eq!(val, &l2[i]);
641    }
642
643    // Cant remove what isn't there!
644    assert!(list.remove(&'z').is_none());
645
646    // Everything below here is for display/debug purposes
647    //////////////////////////////////////////////////////
648
649    // Debug print
650    // Prints the tower contents
651    println!("Tower contents by insertion order, NOT sorted order:");
652    for (i, e) in list.nodes.iter().enumerate() {
653        // Collect only the Some values into a new Vec
654        let values: Vec<_> = e.next.iter().filter_map(|&x| x).collect();
655        if let Some(val) = e.value {
656            let v = &val.to_string();
657            println!("{v}[{i}]: {values:?}");
658        } else {
659            println!("HEAD[0]: {values:?}");
660        }
661    }
662    println!();
663
664    // Debug print
665    // Prints the sorted values in the list
666    print!("List values:\n   ");
667    for e in list.iter() {
668        print!("{e:#?} ")
669    }
670    println!();
671
672    //panic!();
673
674}