dsa_rust/sequences/
indexed_skip_list.rs

1/*! A safe, indexed skip list
2
3# About
4Skip lists are sorted, probabalistic structures made up of stacked lists of varying length to allow for truncated _O(log(n))_ navigation. Canonically linked lists are built from doubly-linked lists, but this is not a defining characteristic of the ADT. Regardless of the base list representation used, the navigational algorithm results in what is essentially a logical linked list.
5
6Properly implemented skip lists provide _O(log(n))_ expected time complexity for search, insert, and removal operations. This provides a significant advantage over keeping sorted array- or link-based list invariants, which have _worst-case O(n)_ removal (average _O(n/2)_) temporal performance. Skip lists are also simpler than self-balancing tree structures, which are commonly used for sorted list and map structures. Skip lists also generally provide easier and finer-grained control when adapted for concurrent operations. There is a reason Java's `concurrentSkipListMap` is so popular.
7
8# Design
9This design uses `Vec`-backed storage for [SkipNode]s that contain a list (tower) of "next" values, and a single "previous" value that represent indexes within the backing vector.
10
11The list features a dynamic max height _h_ that is logarithmically proportional to the number of elements in the list _n_ such that _h = log(n) in the expected case_. The logarithmic growth ensures that the average search, insertion, and deletion operations remain efficient, typically with expected _O(log(n))_ time complexity.
12
13## The Search Algorithm
14The point of the search algorithm is to find the first node (or handle/position) `p` in skip list `S` that represents the largest comparable value <= to the search key `k`. This algorithm can be broken into two steps:
15Step 1) loop`let candidate = p.peek()`, if `candidate >= k`, return `p`, otherwise move to `p.next()`. Repeat until `p.peek()` >= `k`.
16Step 2) Drop down a level: If `S.below(p) == 0` you're at the lowest level and the search terminates.
17
18## Visual Examples
19An initial, empty skip list with one level and no data:
20```text
21S0: HEAD -> None
22```
23
24Inserting the first node triggers an automatic tower level, even if it ends up empty. This provides the algorithm with a starting point:
25```text
26S1: HEAD ----------> None
27S0: HEAD -> [ 5 ] -> None
28```
29
30After inserting `['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f']`, the list's `SkipNodes` might contain the following towers. 
31```text
32HEAD[0]: [1, 2, 9, 7]
33a[1]: [5]
34c[2]: [4, 4]
35e[3]: [9]
36d[4]: [3, 9]
37b[5]: [2]
38i[6]: []
39g[7]: [8, 6, 6]
40h[8]: [6]
41f[9]: [7, 7, 7]
42```
43Note that its always possible to tell the last item in the list because its tower is empty. This makes sense, because the last element within the sorted arrangement can only point to `None`. As you can see by the index notation on the left-hand side of the table, the backing structure retains the insertion order; the backing structure remains unsorted.
44
45The structure simply appends elements to the backing structure, so when printed the list retains its insertion order, not its sorted arrangement. As a result, the towers appear to contain rather nonsensical values. However, if you follow the indexes from the `HEAD` node, and re-arrange the nodes into _lexicographically sorted order_, which is what the navigational algorithms in the skiplist achieve, you get the following towers.
46```text
47HEAD[0]: [1, 2, 9, 7]
48a[1]: [5]
49b[5]: [2]
50c[2]: [4, 4]
51d[4]: [3, 9]
52e[3]: [9]
53f[9]: [7, 7, 7]
54g[7]: [8, 6, 6]
55h[8]: [6]
56i[6]: []
57```
58
59When you rotate the mapping 90 degrees you can start to visualize the skip list layers as logically linked lists formed by "next" element indexes. 
60```text
61L3: [ g[7] ] -> None
62L2: [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
63L1: [ c[2] ] -> [ d[4] ] -> [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
64L0: [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
65```
66Finally, if you extend each "next" index reference to align with its sorted position within the list, a classical skip list diagram of towers emerges.
67```text
68L3: HEAD -------------------------------------------------------------------------> [ g[7] ] -------------------------> None
69L2: HEAD -------------------------------------------------------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
70L1: HEAD -------------------------> [ c[2] ] -> [ d[4] ] -------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
71L0: HEAD -> [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
72```
73
74# Example code
75```rust
76    let mut list = dsa_rust::sequences::indexed_skip_list::SkipList::<char>::new();
77
78    // An unsorted list of values and a sorted version to compare against
79    let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
80    let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
81
82    // Inserts unsorted values into the skip list with a consuming iterator
83    for e in values.into_iter() {
84        list.insert(e)
85    }
86
87    // Illustrates that the list exists as a sorted invariant
88    for (i, e) in list.iter().enumerate() {
89        assert_eq!(e, &sorted[i]);
90    }
91
92    // Illustrates the Kth function in a 0-indexed list.
93    // That is, e occupies the 2 index for insertion order,
94    // but is the 4th element in the 0-indexed sorted arrangement.
95    assert_eq!(list.get_kth(4).unwrap(), &'e');
96
97    // Query by range using Rust's RangeBounds semantics
98    let val = ['c', 'd', 'e', 'f'];
99    for (i, e) in list.range('c'..='f').enumerate() {
100        assert_eq!(e, &val[i])
101    }
102
103```
104*/
105
106use rand::Rng; // For coin flips
107use std::borrow::Borrow; // For passing borrowed parameters
108use std::ops::{Bound, RangeBounds}; // For range iterators
109
110const MAX_HEIGHT: usize = 32;
111
112#[derive(Debug)]
113struct SkipNode<T> {
114    value: Option<T>,                  // None for sentinel
115    next: [Option<usize>; MAX_HEIGHT], // forward links
116    prev: Option<usize>,               // back links at s0 for reverse iteration
117}
118
119pub struct SkipList<T> {
120    nodes: Vec<SkipNode<T>>,
121    list_height: usize,
122}
123impl<T: Ord> Default for SkipList<T> {
124    fn default() -> Self {
125        Self::new()
126    }
127}
128impl<T: Ord> SkipList<T> {
129    pub fn new() -> Self {
130        let sentinel = SkipNode {
131            value: None,
132            next: [None; MAX_HEIGHT],
133            prev: None,
134        };
135
136        Self {
137            nodes: vec![sentinel],
138            list_height: 1,
139        }
140    }
141
142    pub fn len(&self) -> usize {
143        self.nodes.len() - 1 // HEAD doesn't count!
144    }
145
146    pub fn is_empty(&self) -> bool {
147        self.nodes.len() - 1 == 0
148    }
149
150    pub fn locate<Q>(&self, key: &Q) -> Option<usize>
151    where
152        Q: Ord + ?Sized,
153        T: Borrow<Q>,
154    {
155        let update = self.find_predecessors(key);
156        let candidate = self.nodes[update[0]].next[0];
157
158        match candidate {
159            Some(idx) if self.nodes[idx].value.as_ref().unwrap().borrow() == key => Some(idx),
160            _ => None,
161        }
162    }
163
164    pub fn contains<Q>(&self, key: &Q) -> bool
165    where
166        Q: Ord + ?Sized,
167        T: Borrow<Q>,
168    {
169        self.locate(key).is_some()
170    }
171
172    pub fn insert(&mut self, value: T)
173    where
174        T: Ord,
175    {
176        let height = self.random_height();
177        let update = self.find_predecessors(&value);
178
179        // ... height adjustment logic ...
180
181        let new_index = self.nodes.len();
182
183        // 1. SET NEW NODE'S PREV
184        // update[0] is the index of the node immediately before the new one at Level 0.
185        let predecessor_idx = update[0];
186
187        // 2. GET NEW NODE'S NEXT (at Level 0)
188        // This is the node that will now need to point back to us.
189        let successor_idx = self.nodes[predecessor_idx].next[0];
190
191        self.nodes.push(SkipNode {
192            value: Some(value),
193            next: [None; MAX_HEIGHT],
194            prev: Some(predecessor_idx), // Points back to A
195        });
196
197        for (level, _) in update.iter().enumerate().take(height) {
198            //for level in 0..height {
199            let prev_idx = update[level];
200            self.nodes[new_index].next[level] = self.nodes[prev_idx].next[level];
201            self.nodes[prev_idx].next[level] = Some(new_index);
202        }
203
204        // 3. UPDATE SUCCESSOR'S PREV
205        // If there is a node after us, it must now point back to the new node.
206        if let Some(next_idx) = successor_idx {
207            self.nodes[next_idx].prev = Some(new_index);
208        }
209    }
210
211    /// Removes and returns the value for a given key, if it exists in the list.
212    pub fn remove<Q>(&mut self, key: &Q) -> Option<T>
213    where
214        Q: Ord + ?Sized,
215        T: Borrow<Q>,
216    {
217        let mut update_target = self.find_predecessors(key);
218
219        // 1. Identify the target index
220        let target = match self.nodes[update_target[0]].next[0] {
221            //Some(idx) if self.nodes[idx].value.as_ref().map_or(false, |v| v.borrow() == key) => idx,
222            Some(idx)
223                if self.nodes[idx]
224                    .value
225                    .as_ref()
226                    .is_some_and(|v| v.borrow() == key) =>
227            {
228                idx
229            }
230            _ => return None,
231        };
232
233        // 2. PRE-FETCH predecessors for the node that WILL move
234        // We do this BEFORE swap_remove so the indices in the skip list are still valid
235        let last_idx = self.nodes.len() - 1;
236
237        // 3. Logical Rewiring (Remove target from the chain)
238        if let Some(next_idx) = self.nodes[target].next[0] {
239            self.nodes[next_idx].prev = self.nodes[target].prev;
240        }
241        for (level, val) in update_target.iter_mut().enumerate().take(self.list_height) {
242            //for level in 0..self.list_height {
243            if self.nodes[*val].next[level] == Some(target) {
244                self.nodes[*val].next[level] = self.nodes[target].next[level];
245            }
246        }
247
248        // 4. Physical Swap-Remove
249        let removed_node = self.nodes.swap_remove(target);
250
251        // 5. Fix references for the node that just moved into 'target'
252        if target < self.nodes.len() {
253            for node in &mut self.nodes {
254                for next in node.next.iter_mut().take(self.list_height) {
255                    if *next == Some(last_idx) {
256                        *next = Some(target);
257                    }
258                }
259            }
260
261            // Fix prev pointer
262            if let Some(next_idx) = self.nodes[target].next[0] {
263                self.nodes[next_idx].prev = Some(target);
264            }
265        }
266
267        // clean up height
268        removed_node.value
269    }
270
271    /// Returns the Kth value in the list, if it exists.
272    pub fn get_kth(&self, k: usize) -> Option<&T> {
273        let mut idx = self.nodes[0].next[0];
274        let mut i = 0;
275
276        while let Some(current) = idx {
277            if i == k {
278                return self.nodes[current].value.as_ref();
279            }
280
281            idx = self.nodes[current].next[0];
282            i += 1;
283        }
284
285        None
286    }
287
288    /// Returns an inclusive iterator over a range of values
289    /// in the list from `start` to `end`.
290    pub fn range<Q, R>(&self, range: R) -> RangeIter<'_, T, Q, R>
291    where
292        Q: Ord + ?Sized,
293        T: Borrow<Q>,
294        R: RangeBounds<Q>,
295    {
296        // --- FIND FRONT ---
297        let front = match range.start_bound() {
298            Bound::Included(start) => self.nodes[self.find_predecessors(start)[0]].next[0],
299            Bound::Excluded(start) => {
300                let idx = self.nodes[self.find_predecessors(start)[0]].next[0];
301                if let Some(i) = idx {
302                    if self.nodes[i].value.as_ref().unwrap().borrow() == start {
303                        self.nodes[i].next[0]
304                    } else {
305                        Some(i)
306                    }
307                } else {
308                    None
309                }
310            }
311            Bound::Unbounded => self.nodes[0].next[0],
312        };
313
314        // --- FIND BACK ---
315        let back = match range.end_bound() {
316            Bound::Included(end) => {
317                // Find predecessors of 'end'.
318                // If the element at the end of the search IS 'end', that's our back.
319                // If not, the predecessor itself is our back.
320                let update = self.find_predecessors(end);
321                let candidate = self.nodes[update[0]].next[0];
322                if let Some(idx) = candidate {
323                    if self.nodes[idx].value.as_ref().unwrap().borrow() == end {
324                        Some(idx)
325                    } else {
326                        // Predicate check: Ensure we aren't returning the sentinel (idx 0)
327                        if update[0] == 0 {
328                            None
329                        } else {
330                            Some(update[0])
331                        }
332                    }
333                } else if update[0] == 0 {
334                    None
335                } else {
336                    Some(update[0])
337                }
338            }
339            Bound::Excluded(end) => {
340                let update = self.find_predecessors(end);
341                if update[0] == 0 {
342                    None
343                } else {
344                    Some(update[0])
345                }
346            }
347            Bound::Unbounded => {
348                // To find the absolute end, we find predecessors for a "theoretically infinite" value
349                // or simply walk the tallest tower to the end.
350                let mut curr = 0;
351                for level in (0..self.list_height).rev() {
352                    while let Some(next_idx) = self.nodes[curr].next[level] {
353                        curr = next_idx;
354                    }
355                }
356                if curr == 0 {
357                    None
358                } else {
359                    Some(curr)
360                }
361            }
362        };
363
364        RangeIter {
365            list: self,
366            front,
367            back,
368            range,
369            _marker: std::marker::PhantomData,
370        }
371    }
372
373    pub fn iter(&self) -> Iter<'_, T> {
374        // Walk the express lanes to find the very last node in O(log n)
375        let mut tail = 0;
376        for level in (0..self.list_height).rev() {
377            while let Some(next_idx) = self.nodes[tail].next[level] {
378                tail = next_idx;
379            }
380        }
381
382        Iter {
383            list: self,
384            next: self.nodes[0].next[0], // First node after sentinel
385            prev: if tail == 0 { None } else { Some(tail) },
386        }
387    }
388
389    // Utility functions
390    ////////////////////
391
392    fn random_height(&self) -> usize {
393        let mut level = 1;
394        let mut rng = rand::rng();
395        while level < MAX_HEIGHT && rng.random::<bool>() {
396            level += 1;
397        }
398        level
399    }
400
401    fn find_predecessors<Q>(&self, key: &Q) -> [usize; MAX_HEIGHT]
402    where
403        Q: Ord + ?Sized,
404        T: Borrow<Q>,
405    {
406        let mut update = [0usize; MAX_HEIGHT];
407        let mut idx = 0;
408
409        for level in (0..self.list_height).rev() {
410            loop {
411                match self.nodes[idx].next[level] {
412                    None => break,
413                    Some(next_idx) => {
414                        let val = self.nodes[next_idx].value.as_ref().unwrap();
415                        if val.borrow() >= key {
416                            break;
417                        }
418                        idx = next_idx;
419                    }
420                }
421            }
422            update[level] = idx;
423        }
424
425        update
426    }
427
428    //fn fix_moved_node_references(&mut self, new_idx: usize, old_idx: usize) {
429    //    // We need to find the neighbors of the node that is now at new_idx
430    //    // so we can tell them "I moved!"
431    //    let val = self.nodes[new_idx].value.as_ref().unwrap();
432    //    let update = self.find_predecessors(val);
433    //
434    //    // Update predecessors: they currently point to old_idx, they should point to new_idx
435    //    for level in 0..MAX_HEIGHT {
436    //        if self.nodes[update[level]].next[level] == Some(old_idx) {
437    //            self.nodes[update[level]].next[level] = Some(new_idx);
438    //        } else {
439    //            // Since towers are vertical, if we don't find a match at this level,
440    //            // we might not find them higher up, but check all to be safe.
441    //        }
442    //    }
443    //
444    //    // Update successor: the node after the moved node needs to update its 'prev'
445    //    if let Some(next_idx) = self.nodes[new_idx].next[0] {
446    //        self.nodes[next_idx].prev = Some(new_idx);
447    //    }
448    //}
449}
450
451pub struct Iter<'a, T> {
452    list: &'a SkipList<T>,
453    next: Option<usize>,
454    prev: Option<usize>,
455}
456impl<'a, T> Iterator for Iter<'a, T> {
457    type Item = &'a T;
458
459    fn next(&mut self) -> Option<Self::Item> {
460        let idx = self.next?;
461        let value = self.list.nodes[idx].value.as_ref()?;
462
463        if self.next == self.prev {
464            self.next = None;
465            self.prev = None;
466        } else {
467            self.next = self.list.nodes[idx].next[0];
468        }
469        Some(value)
470    }
471}
472impl<'a, T> DoubleEndedIterator for Iter<'a, T> {
473    fn next_back(&mut self) -> Option<Self::Item> {
474        let idx = self.prev?;
475        let value = self.list.nodes[idx].value.as_ref()?;
476
477        if self.prev == self.next {
478            self.next = None;
479            self.prev = None;
480        } else {
481            let prev = self.list.nodes[idx].prev;
482            // Sentinel check: don't yield index 0
483            self.prev = if prev == Some(0) { None } else { prev };
484        }
485        Some(value)
486    }
487}
488
489pub struct RangeIter<'a, T, Q, R>
490where
491    Q: ?Sized,
492    R: RangeBounds<Q>,
493{
494    list: &'a SkipList<T>,
495    front: Option<usize>, // Moves forward
496    back: Option<usize>,  // Moves backward
497    range: R,
498    _marker: std::marker::PhantomData<Q>,
499}
500impl<'a, T, Q, R> Iterator for RangeIter<'a, T, Q, R>
501where
502    Q: Ord + ?Sized,
503    T: Borrow<Q>,
504    R: RangeBounds<Q>,
505{
506    type Item = &'a T;
507
508    fn next(&mut self) -> Option<Self::Item> {
509        let idx = self.front?;
510
511        // Boundary Check
512        let value = self.list.nodes[idx].value.as_ref().unwrap();
513        if !self.range.contains(value.borrow()) {
514            self.front = None;
515            return None;
516        }
517
518        // Meet/Cross Check: If front matches back, this is the last element
519        if self.front == self.back {
520            self.front = None;
521            self.back = None;
522        } else {
523            self.front = self.list.nodes[idx].next[0];
524        }
525
526        Some(value)
527    }
528}
529impl<'a, T, Q, R> DoubleEndedIterator for RangeIter<'a, T, Q, R>
530where
531    Q: Ord + ?Sized,
532    T: Borrow<Q>,
533    R: RangeBounds<Q>,
534{
535    fn next_back(&mut self) -> Option<Self::Item> {
536        let idx = self.back?;
537
538        // Boundary Check
539        let value = self.list.nodes[idx].value.as_ref().unwrap();
540        if !self.range.contains(value.borrow()) {
541            self.back = None;
542            return None;
543        }
544
545        // Meet/Cross Check
546        if self.back == self.front {
547            self.back = None;
548            self.front = None;
549        } else {
550            // Move back, but ensure we don't land on the sentinel (idx 0)
551            let prev = self.list.nodes[idx].prev;
552            self.back = if prev == Some(0) { None } else { prev };
553        }
554
555        Some(value)
556    }
557}
558
559impl<'a, T: Ord> IntoIterator for &'a SkipList<T> {
560    type Item = &'a T;
561    type IntoIter = Iter<'a, T>;
562
563    fn into_iter(self) -> Self::IntoIter {
564        self.iter()
565    }
566}
567
568#[test]
569fn one() {
570    let mut list = SkipList::<char>::new();
571
572    // Tests basic housekeeping on empty list
573    assert_eq!(list.len(), 0);
574    assert!(list.is_empty());
575    assert!(!list.contains(&'z'));
576
577    // Inserts 9 values into the skip list
578    // with a consuming iterator, moving values
579    // into the list
580    let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
581    for e in values.into_iter() {
582        list.insert(e)
583    }
584
585    // Tests that len gets updated properly
586    assert_eq!(list.len(), 9);
587    assert!(list.contains(&'g'));
588
589    // Tests basic ordering and iteration
590    // Basic iteration with iter()
591    // Clippy wants enumerate instead of external loop counter
592    let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
593    for (i, e) in list.iter().enumerate() {
594        assert_eq!(e, &sorted[i]);
595    }
596    // Double-ended iteration with rev()
597    // Clippy wants saturating_sub instead of loop counter
598    let mut i = 8;
599    for e in list.iter().rev() {
600        assert_eq!(e, &sorted[i]);
601        if i > 0 {
602            i = i.saturating_sub(1)
603        };
604    }
605    // Or if you wanna be fancy about it
606    // zip() stops as soon as one iterator ends,
607    // eliminating the need for an overflow check
608    for (e, i) in list.iter().rev().zip((0..=8).rev()) {
609        assert_eq!(e, &sorted[i]);
610    }
611
612    // Iterator inferance using the IntoIter impl
613    let mut i = 0;
614    #[allow(clippy::explicit_counter_loop)]
615    for e in &list {
616        assert_eq!(e, &sorted[i]);
617        i += 1;
618    }
619
620    // Tests the Kth function in a 0-indexed list
621    assert_eq!(list.get_kth(6).unwrap(), &'g');
622
623    // Tests the range function
624    // NOTE: char is Copy so you dont strictly need to borrow
625    // when setting range bounds, these tests illustrate both
626    // borrowing and not; Note that each bounds must match
627    // so no (&'a'..'f'), only ('a'..'f') or (&'a'..&'f')
628    // Midlist (exclusive)
629    let val = ['c', 'd', 'e'];
630    //for (i, e) in list.range(&'c', &'f').enumerate() {
631    for (i, e) in list.range('c'..'f').enumerate() {
632        assert_eq!(e, &val[i])
633    }
634    // Midlist (inclusive)
635    let val = ['c', 'd', 'e', 'f'];
636    //for (i, e) in list.range(&'c', &'f').enumerate() {
637    for (i, e) in list.range('c'..='f').enumerate() {
638        assert_eq!(e, &val[i])
639    }
640    // Start of list
641    let val = ['a', 'b', 'c', 'd', 'e', 'f'];
642    //for (i, e) in list.range(&'a', &'f').enumerate() {
643    for (i, e) in list.range(..&'f').enumerate() {
644        assert_eq!(e, &val[i])
645    }
646    // End of list
647    let val = ['e', 'f', 'g', 'h', 'i'];
648    for (i, e) in list.range(&'e'..).enumerate() {
649        assert_eq!(e, &val[i])
650    }
651
652    // Tests removal
653    list.remove(&'e');
654    list.remove(&'a');
655    assert!(!list.contains(&'e'));
656    assert!(!list.contains(&'a'));
657
658    let l2 = ['b', 'c', 'd', 'f', 'g', 'h', 'i'];
659    for (val, i) in list.iter().zip(0..=6) {
660        assert_eq!(val, &l2[i]);
661    }
662
663    // Cant remove what isn't there!
664    assert!(list.remove(&'z').is_none());
665
666    // Everything below here is for display/debug purposes
667    //////////////////////////////////////////////////////
668
669    // Debug print
670    // Prints the tower contents
671    println!("Tower contents by insertion order, NOT sorted order:");
672    for (i, e) in list.nodes.iter().enumerate() {
673        // Collect only the Some values into a new Vec
674        let values: Vec<_> = e.next.iter().filter_map(|&x| x).collect();
675        if let Some(val) = e.value {
676            let v = &val.to_string();
677            println!("{v}[{i}]: {values:?}");
678        } else {
679            println!("HEAD[0]: {values:?}");
680        }
681    }
682    println!();
683
684    // Debug print
685    // Prints the sorted values in the list
686    print!("List values:\n   ");
687    for e in list.iter() {
688        print!("{e:#?} ")
689    }
690    println!();
691
692    //panic!();
693}