dsa_rust/sequences/indexed_skip_list.rs
1/*! A safe, indexed skip list
2
3# About
4Skip lists are sorted, probabalistic structures made up of stacked lists of varying length to allow for truncated _O(log(n))_ navigation. Canonically linked lists are built from doubly-linked lists, but this is not a defining characteristic of the ADT. Regardless of the base list representation used, the navigational algorithm results in what is essentially a logical linked list.
5
6Properly implemented skip lists provide _O(log(n))_ expected time complexity for search, insert, and removal operations. This provides a significant advantage over keeping sorted array- or link-based list invariants, which have _worst-case O(n)_ removal (average _O(n/2)_) temporal performance. Skip lists are also simpler than self-balancing tree structures, which are commonly used for sorted list and map structures. Skip lists also generally provide easier and finer-grained control when adapted for concurrent operations. There is a reason Java's `concurrentSkipListMap` is so popular.
7
8# Design
9This design uses `Vec`-backed storage for [SkipNode]s that contain a list (tower) of "next" values, and a single "previous" value that represent indexes within the backing vector.
10
11The list features a dynamic max height _h_ that is logarithmically proportional to the number of elements in the list _n_ such that _h = log(n) in the expected case_. The logarithmic growth ensures that the average search, insertion, and deletion operations remain efficient, typically with expected _O(log(n))_ time complexity.
12
13## The Search Algorithm
14The point of the search algorithm is to find the first node (or handle/position) `p` in skip list `S` that represents the largest comparable value <= to the search key `k`. This algorithm can be broken into two steps:
15Step 1) loop`let candidate = p.peek()`, if `candidate >= k`, return `p`, otherwise move to `p.next()`. Repeat until `p.peek()` >= `k`.
16Step 2) Drop down a level: If `S.below(p) == 0` you're at the lowest level and the search terminates.
17
18## Visual Examples
19An initial, empty skip list with one level and no data:
20```text
21S0: HEAD -> None
22```
23
24Inserting the first node triggers an automatic tower level, even if it ends up empty. This provides the algorithm with a starting point:
25```text
26S1: HEAD ----------> None
27S0: HEAD -> [ 5 ] -> None
28```
29
30After inserting `['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f']`, the list's `SkipNodes` might contain the following towers. Notice that its always possible to tell the last item in the list because its tower is empty. This makes sense, because inserting the last element can only point to `None`. As you can see by the index notation on the left-hand side of the table, the backing structure retains the insertion order; the backing structure remains unsorted.
31```text
32HEAD[0]: [1, 2, 9, 7]
33a[1]: [5]
34c[2]: [4, 4]
35e[3]: [9]
36d[4]: [3, 9]
37b[5]: [2]
38i[6]: []
39g[7]: [8, 6, 6]
40h[8]: [6]
41f[9]: [7, 7, 7]
42```
43The towers appear to contain rather nonsensical values when printed as they exist in memory, which represents insertion order. However, if you follow the indexes from the `HEAD` node, and re-arrange the nodes into _lexicographically sorted order_, which is what the navigational algorithms in the skiplist achieve, you'd get the following towers.
44```text
45HEAD[0]: [1, 2, 9, 7]
46a[1]: [5]
47b[5]: [2]
48c[2]: [4, 4]
49d[4]: [3, 9]
50e[3]: [9]
51f[9]: [7, 7, 7]
52h[8]: [6]
53i[6]: []
54g[7]: [8, 6, 6]
55```
56
57When you rotate the mapping 90 degrees you can start to visualize the skip list layers as logically linked lists formed by tower index placement. Notice that the towers roughly mirror the raw tower output from the previous output.
58```text
59L3: [ g[7] ] -> None
60L2: [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
61L1: [ c[2] ] -> [ d[4] ] -> [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
62L0: [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
63```
64Finally, if you extend each index reference to align with its sorted position within the list, a classical skip list diagram emerges.
65```text
66L3: HEAD -------------------------------------------------------------------------> [ g[7] ] -------------------------> None
67L2: HEAD -------------------------------------------------------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
68L1: HEAD -------------------------> [ c[2] ] -> [ d[4] ] -------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
69L0: HEAD -> [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
70```
71
72# Example code
73```rust
74 let mut list = SkipList::<char>::new();
75
76 // Inserts 9 values into the skip list
77 // with a consuming iterator, moving values
78 // into the list
79 let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
80 for e in l1.into_iter() {
81 list.insert(e)
82 }
83
84 // Illustrates that the list exists as a sorted invariant
85 let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
86 for (i, e) in list.iter().enumerate() {
87 assert_eq!(e, &sorted[i]);
88 }
89
90 // Illustrates the Kth function in a 0-indexed list
91 assert_eq!(list.get_kth(6).unwrap(), &'g');
92
93 // Query by range using Rust's RangeBounds semantics
94 let val = ['c', 'd', 'e', 'f'];
95 for (i, e) in list.range('c'..='f').enumerate() {
96 assert_eq!(e, &val[i])
97 }
98
99```
100*/
101
102use rand::Rng; // For coin flips
103use std::borrow::Borrow; // For passing borrowed parameters
104use std::ops::{Bound, RangeBounds}; // For range iterators
105
106const MAX_HEIGHT: usize = 32;
107
108#[derive(Debug)]
109struct SkipNode<T> {
110 value: Option<T>, // None for sentinel
111 next: [Option<usize>; MAX_HEIGHT], // forward links
112 prev: Option<usize>, // back links at s0 for reverse iteration
113}
114
115pub struct SkipList<T> {
116 nodes: Vec<SkipNode<T>>,
117 list_height: usize,
118}
119impl<T: Ord> Default for SkipList<T> {
120 fn default() -> Self {
121 Self::new()
122 }
123}
124impl<T: Ord> SkipList<T> {
125 pub fn new() -> Self {
126 let sentinel = SkipNode {
127 value: None,
128 next: [None; MAX_HEIGHT],
129 prev: None
130 };
131
132 Self {
133 nodes: vec![sentinel],
134 list_height: 1,
135 }
136 }
137
138 pub fn len(&self) -> usize {
139 self.nodes.len() - 1 // HEAD doesn't count!
140 }
141
142 pub fn is_empty(&self) -> bool {
143 self.nodes.len() - 1 == 0
144 }
145
146 pub fn locate<Q>(&self, key: &Q) -> Option<usize>
147 where
148 Q: Ord + ?Sized,
149 T: Borrow<Q>,
150 {
151 let update = self.find_predecessors(key);
152 let candidate = self.nodes[update[0]].next[0];
153
154 match candidate {
155 Some(idx)
156 if self.nodes[idx].value.as_ref().unwrap().borrow() == key =>
157 {
158 Some(idx)
159 }
160 _ => None,
161 }
162 }
163
164 pub fn contains<Q>(&self, key: &Q) -> bool
165 where
166 Q: Ord + ?Sized,
167 T: Borrow<Q>
168 {
169 self.locate(key).is_some()
170 }
171
172 pub fn insert(&mut self, value: T)
173 where
174 T: Ord,
175 {
176 let height = self.random_height();
177 let update = self.find_predecessors(&value);
178
179 // ... height adjustment logic ...
180
181 let new_index = self.nodes.len();
182
183 // 1. SET NEW NODE'S PREV
184 // update[0] is the index of the node immediately before the new one at Level 0.
185 let predecessor_idx = update[0];
186
187 // 2. GET NEW NODE'S NEXT (at Level 0)
188 // This is the node that will now need to point back to us.
189 let successor_idx = self.nodes[predecessor_idx].next[0];
190
191 self.nodes.push(SkipNode {
192 value: Some(value),
193 next: [None; MAX_HEIGHT],
194 prev: Some(predecessor_idx), // Points back to A
195 });
196
197 for (level, _) in update.iter().enumerate().take(height) {
198 //for level in 0..height {
199 let prev_idx = update[level];
200 self.nodes[new_index].next[level] = self.nodes[prev_idx].next[level];
201 self.nodes[prev_idx].next[level] = Some(new_index);
202 }
203
204 // 3. UPDATE SUCCESSOR'S PREV
205 // If there is a node after us, it must now point back to the new node.
206 if let Some(next_idx) = successor_idx {
207 self.nodes[next_idx].prev = Some(new_index);
208 }
209 }
210
211 /// Removes and returns the value for a given key, if it exists in the list.
212 pub fn remove<Q>(&mut self, key: &Q) -> Option<T>
213 where
214 Q: Ord + ?Sized,
215 T: Borrow<Q>,
216 {
217 let mut update_target = self.find_predecessors(key);
218
219 // 1. Identify the target index
220 let target = match self.nodes[update_target[0]].next[0] {
221 //Some(idx) if self.nodes[idx].value.as_ref().map_or(false, |v| v.borrow() == key) => idx,
222 Some(idx) if self.nodes[idx].value.as_ref().is_some_and(|v| v.borrow() == key) => idx,
223 _ => return None,
224 };
225
226 // 2. PRE-FETCH predecessors for the node that WILL move
227 // We do this BEFORE swap_remove so the indices in the skip list are still valid
228 let last_idx = self.nodes.len() - 1;
229 let mut update_moved = [0usize; MAX_HEIGHT];
230 if target != last_idx {
231 let moved_val = self.nodes[last_idx].value.as_ref().unwrap().borrow();
232 update_moved = self.find_predecessors(moved_val);
233 }
234
235 // 3. Logical Rewiring (Remove target from the chain)
236 if let Some(next_idx) = self.nodes[target].next[0] {
237 self.nodes[next_idx].prev = self.nodes[target].prev;
238 }
239 for (level, val) in update_target.iter_mut().enumerate().take(self.list_height) {
240 //for level in 0..self.list_height {
241 if self.nodes[*val].next[level] == Some(target) {
242 self.nodes[*val].next[level] = self.nodes[target].next[level];
243 }
244 }
245
246 // 4. Physical Swap-Remove
247 let removed_node = self.nodes.swap_remove(target);
248
249 // 5. Fix references for the node that just moved into 'target'
250 if target < self.nodes.len() {
251 // Any node in 'update_moved' that pointed to 'last_idx' must now point to 'target'
252 for level in 0..self.list_height {
253 if update_moved[level] == last_idx {
254 // Rare case: the moved node's predecessor was itself the target
255 // We handle this by using the updated pointers from the target's removal
256 update_moved[level] = update_target[level];
257 }
258
259 if self.nodes[update_moved[level]].next[level] == Some(last_idx) {
260 self.nodes[update_moved[level]].next[level] = Some(target);
261 }
262 }
263 // Update the successor of the moved node to point back to the new index
264 if let Some(next_idx) = self.nodes[target].next[0] {
265 self.nodes[next_idx].prev = Some(target);
266 }
267 }
268
269 // ... clean up height ...
270 removed_node.value
271 }
272
273 /// Returns the Kth value in the list, if it exists.
274 pub fn get_kth(&self, k: usize) -> Option<&T> {
275 let mut idx = self.nodes[0].next[0];
276 let mut i = 0;
277
278 while let Some(current) = idx {
279 if i == k {
280 return self.nodes[current].value.as_ref();
281 }
282
283 idx = self.nodes[current].next[0];
284 i += 1;
285 }
286
287 None
288 }
289
290 /// Returns an inclusive iterator over a range of values
291 /// in the list from `start` to `end`.
292 pub fn range<Q, R>(&self, range: R) -> RangeIter<'_, T, Q, R>
293 where
294 Q: Ord + ?Sized,
295 T: Borrow<Q>,
296 R: RangeBounds<Q>,
297 {
298 // --- FIND FRONT ---
299 let front = match range.start_bound() {
300 Bound::Included(start) => self.nodes[self.find_predecessors(start)[0]].next[0],
301 Bound::Excluded(start) => {
302 let idx = self.nodes[self.find_predecessors(start)[0]].next[0];
303 if let Some(i) = idx {
304 if self.nodes[i].value.as_ref().unwrap().borrow() == start {
305 self.nodes[i].next[0]
306 } else { Some(i) }
307 } else { None }
308 }
309 Bound::Unbounded => self.nodes[0].next[0],
310 };
311
312 // --- FIND BACK ---
313 let back = match range.end_bound() {
314 Bound::Included(end) => {
315 // Find predecessors of 'end'.
316 // If the element at the end of the search IS 'end', that's our back.
317 // If not, the predecessor itself is our back.
318 let update = self.find_predecessors(end);
319 let candidate = self.nodes[update[0]].next[0];
320 if let Some(idx) = candidate {
321 if self.nodes[idx].value.as_ref().unwrap().borrow() == end {
322 Some(idx)
323 } else {
324 // Predicate check: Ensure we aren't returning the sentinel (idx 0)
325 if update[0] == 0 { None } else { Some(update[0]) }
326 }
327 } else if update[0] == 0 { None } else { Some(update[0]) }
328 }
329 Bound::Excluded(end) => {
330 let update = self.find_predecessors(end);
331 if update[0] == 0 { None } else { Some(update[0]) }
332 }
333 Bound::Unbounded => {
334 // To find the absolute end, we find predecessors for a "theoretically infinite" value
335 // or simply walk the tallest tower to the end.
336 let mut curr = 0;
337 for level in (0..self.list_height).rev() {
338 while let Some(next_idx) = self.nodes[curr].next[level] {
339 curr = next_idx;
340 }
341 }
342 if curr == 0 { None } else { Some(curr) }
343 }
344 };
345
346 RangeIter {
347 list: self,
348 front,
349 back,
350 range,
351 _marker: std::marker::PhantomData,
352 }
353 }
354
355 pub fn iter(&self) -> Iter<'_, T> {
356 // Walk the express lanes to find the very last node in O(log n)
357 let mut tail = 0;
358 for level in (0..self.list_height).rev() {
359 while let Some(next_idx) = self.nodes[tail].next[level] {
360 tail = next_idx;
361 }
362 }
363
364 Iter {
365 list: self,
366 next: self.nodes[0].next[0], // First node after sentinel
367 prev: if tail == 0 { None } else { Some(tail) },
368 }
369 }
370
371 // Utility functions
372 ////////////////////
373
374 fn random_height(&self) -> usize {
375 let mut level = 1;
376 let mut rng = rand::rng();
377 while level < MAX_HEIGHT && rng.random::<bool>() {
378 level += 1;
379 }
380 level
381 }
382
383 fn find_predecessors<Q>(&self, key: &Q) -> [usize; MAX_HEIGHT]
384 where
385 Q: Ord + ?Sized,
386 T: Borrow<Q>,
387 {
388 let mut update = [0usize; MAX_HEIGHT];
389 let mut idx = 0;
390
391 for level in (0..self.list_height).rev() {
392 loop {
393 match self.nodes[idx].next[level] {
394 None => break,
395 Some(next_idx) => {
396 let val = self.nodes[next_idx].value.as_ref().unwrap();
397 if val.borrow() >= key {
398 break;
399 }
400 idx = next_idx;
401 }
402 }
403 }
404 update[level] = idx;
405 }
406
407 update
408 }
409
410 //fn fix_moved_node_references(&mut self, new_idx: usize, old_idx: usize) {
411 // // We need to find the neighbors of the node that is now at new_idx
412 // // so we can tell them "I moved!"
413 // let val = self.nodes[new_idx].value.as_ref().unwrap();
414 // let update = self.find_predecessors(val);
415 //
416 // // Update predecessors: they currently point to old_idx, they should point to new_idx
417 // for level in 0..MAX_HEIGHT {
418 // if self.nodes[update[level]].next[level] == Some(old_idx) {
419 // self.nodes[update[level]].next[level] = Some(new_idx);
420 // } else {
421 // // Since towers are vertical, if we don't find a match at this level,
422 // // we might not find them higher up, but check all to be safe.
423 // }
424 // }
425 //
426 // // Update successor: the node after the moved node needs to update its 'prev'
427 // if let Some(next_idx) = self.nodes[new_idx].next[0] {
428 // self.nodes[next_idx].prev = Some(new_idx);
429 // }
430 //}
431}
432
433pub struct Iter<'a, T> {
434 list: &'a SkipList<T>,
435 next: Option<usize>,
436 prev: Option<usize>
437}
438impl<'a, T> Iterator for Iter<'a, T> {
439 type Item = &'a T;
440
441 fn next(&mut self) -> Option<Self::Item> {
442 let idx = self.next?;
443 let value = self.list.nodes[idx].value.as_ref()?;
444
445 if self.next == self.prev {
446 self.next = None;
447 self.prev = None;
448 } else {
449 self.next = self.list.nodes[idx].next[0];
450 }
451 Some(value)
452 }
453}
454impl<'a, T> DoubleEndedIterator for Iter<'a, T> {
455 fn next_back(&mut self) -> Option<Self::Item> {
456 let idx = self.prev?;
457 let value = self.list.nodes[idx].value.as_ref()?;
458
459 if self.prev == self.next {
460 self.next = None;
461 self.prev = None;
462 } else {
463 let prev = self.list.nodes[idx].prev;
464 // Sentinel check: don't yield index 0
465 self.prev = if prev == Some(0) { None } else { prev };
466 }
467 Some(value)
468 }
469}
470
471pub struct RangeIter<'a, T, Q, R>
472where
473 Q: ?Sized,
474 R: RangeBounds<Q>,
475{
476 list: &'a SkipList<T>,
477 front: Option<usize>, // Moves forward
478 back: Option<usize>, // Moves backward
479 range: R,
480 _marker: std::marker::PhantomData<Q>,
481}
482impl<'a, T, Q, R> Iterator for RangeIter<'a, T, Q, R>
483where
484 Q: Ord + ?Sized,
485 T: Borrow<Q>,
486 R: RangeBounds<Q>,
487{
488 type Item = &'a T;
489
490 fn next(&mut self) -> Option<Self::Item> {
491 let idx = self.front?;
492
493 // Boundary Check
494 let value = self.list.nodes[idx].value.as_ref().unwrap();
495 if !self.range.contains(value.borrow()) {
496 self.front = None;
497 return None;
498 }
499
500 // Meet/Cross Check: If front matches back, this is the last element
501 if self.front == self.back {
502 self.front = None;
503 self.back = None;
504 } else {
505 self.front = self.list.nodes[idx].next[0];
506 }
507
508 Some(value)
509 }
510}
511impl<'a, T, Q, R> DoubleEndedIterator for RangeIter<'a, T, Q, R>
512where
513 Q: Ord + ?Sized,
514 T: Borrow<Q>,
515 R: RangeBounds<Q>,
516{
517 fn next_back(&mut self) -> Option<Self::Item> {
518 let idx = self.back?;
519
520 // Boundary Check
521 let value = self.list.nodes[idx].value.as_ref().unwrap();
522 if !self.range.contains(value.borrow()) {
523 self.back = None;
524 return None;
525 }
526
527 // Meet/Cross Check
528 if self.back == self.front {
529 self.back = None;
530 self.front = None;
531 } else {
532 // Move back, but ensure we don't land on the sentinel (idx 0)
533 let prev = self.list.nodes[idx].prev;
534 self.back = if prev == Some(0) { None } else { prev };
535 }
536
537 Some(value)
538 }
539}
540
541impl<'a, T: Ord> IntoIterator for &'a SkipList<T> {
542 type Item = &'a T;
543 type IntoIter = Iter<'a, T>;
544
545 fn into_iter(self) -> Self::IntoIter {
546 self.iter()
547 }
548}
549
550#[test]
551fn one() {
552 let mut list = SkipList::<char>::new();
553
554 // Tests basic housekeeping on empty list
555 assert_eq!(list.len(), 0);
556 assert!(list.is_empty());
557 assert!(!list.contains(&'z'));
558
559 // Inserts 9 values into the skip list
560 // with a consuming iterator, moving values
561 // into the list
562 let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
563 for e in values.into_iter() {
564 list.insert(e)
565 }
566
567 // Tests that len gets updated properly
568 assert_eq!(list.len(), 9);
569 assert!(list.contains(&'g'));
570
571 // Tests basic ordering and iteration
572 // Basic iteration with iter()
573 // Clippy wants enumerate instead of external loop counter
574 let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
575 for (i, e) in list.iter().enumerate() {
576 assert_eq!(e, &sorted[i]);
577 }
578 // Double-ended iteration with rev()
579 // Clippy wants saturating_sub instead of loop counter
580 let mut i = 8;
581 for e in list.iter().rev() {
582 assert_eq!(e, &sorted[i]);
583 if i > 0 {
584 i = i.saturating_sub(1)
585 };
586 }
587 // Or if you wanna be fancy about it
588 // zip() stops as soon as one iterator ends,
589 // eliminating the need for an overflow check
590 for (e, i) in list.iter().rev().zip((0..=8).rev()) {
591 assert_eq!(e, &sorted[i]);
592 }
593 // Iterator inferance using the IntoIter impl
594 let mut i = 0;
595 #[allow(clippy::explicit_counter_loop)]
596 for e in &list {
597 assert_eq!(e, &sorted[i]);
598 i += 1;
599 }
600
601 // Tests the Kth function in a 0-indexed list
602 assert_eq!(list.get_kth(6).unwrap(), &'g');
603
604 // Tests the range function
605 // NOTE: char is Copy so you dont strictly need to borrow
606 // when setting range bounds, these tests illustrate both
607 // borrowing and not; Note that each bounds must match
608 // so no (&'a'..'f'), only ('a'..'f') or (&'a'..&'f')
609 // Midlist (exclusive)
610 let val = ['c', 'd', 'e'];
611 //for (i, e) in list.range(&'c', &'f').enumerate() {
612 for (i, e) in list.range('c'..'f').enumerate() {
613 assert_eq!(e, &val[i])
614 }
615 // Midlist (inclusive)
616 let val = ['c', 'd', 'e', 'f'];
617 //for (i, e) in list.range(&'c', &'f').enumerate() {
618 for (i, e) in list.range('c'..='f').enumerate() {
619 assert_eq!(e, &val[i])
620 }
621 // Start of list
622 let val = ['a', 'b', 'c', 'd', 'e', 'f'];
623 //for (i, e) in list.range(&'a', &'f').enumerate() {
624 for (i, e) in list.range(..&'f').enumerate() {
625 assert_eq!(e, &val[i])
626 }
627 // End of list
628 let val = ['e', 'f', 'g', 'h', 'i'];
629 for (i, e) in list.range(&'e'..).enumerate() {
630 assert_eq!(e, &val[i])
631 }
632
633 // Tests removal
634 list.remove(&'e');
635 list.remove(&'a');
636 assert!(!list.contains(&'e'));
637 assert!(!list.contains(&'a'));
638 let l2 = ['b', 'c', 'd', 'f', 'g', 'h', 'i'];
639 for (val, i) in list.iter().zip(0..=6) {
640 assert_eq!(val, &l2[i]);
641 }
642
643 // Cant remove what isn't there!
644 assert!(list.remove(&'z').is_none());
645
646 // Everything below here is for display/debug purposes
647 //////////////////////////////////////////////////////
648
649 // Debug print
650 // Prints the tower contents
651 println!("Tower contents by insertion order, NOT sorted order:");
652 for (i, e) in list.nodes.iter().enumerate() {
653 // Collect only the Some values into a new Vec
654 let values: Vec<_> = e.next.iter().filter_map(|&x| x).collect();
655 if let Some(val) = e.value {
656 let v = &val.to_string();
657 println!("{v}[{i}]: {values:?}");
658 } else {
659 println!("HEAD[0]: {values:?}");
660 }
661 }
662 println!();
663
664 // Debug print
665 // Prints the sorted values in the list
666 print!("List values:\n ");
667 for e in list.iter() {
668 print!("{e:#?} ")
669 }
670 println!();
671
672 //panic!();
673
674}