dsa_rust/sequences/indexed_skip_list.rs
1/*! A safe, indexed skip list
2
3# About
4Skip lists are sorted, probabalistic structures made up of stacked lists of varying length to allow for truncated _O(log(n))_ navigation. Canonically linked lists are built from doubly-linked lists, but this is not a defining characteristic of the ADT. Regardless of the base list representation used, the navigational algorithm results in what is essentially a logical linked list.
5
6Properly implemented skip lists provide _O(log(n))_ expected time complexity for search, insert, and removal operations. This provides a significant advantage over keeping sorted array- or link-based list invariants, which have _worst-case O(n)_ removal (average _O(n/2)_) temporal performance. Skip lists are also simpler than self-balancing tree structures, which are commonly used for sorted list and map structures. Skip lists also generally provide easier and finer-grained control when adapted for concurrent operations. There is a reason Java's `concurrentSkipListMap` is so popular.
7
8# Design
9This design uses `Vec`-backed storage for [SkipNode]s that contain a list (tower) of "next" values, and a single "previous" value that represent indexes within the backing vector.
10
11The list features a dynamic max height _h_ that is logarithmically proportional to the number of elements in the list _n_ such that _h = log(n) in the expected case_. The logarithmic growth ensures that the average search, insertion, and deletion operations remain efficient, typically with expected _O(log(n))_ time complexity.
12
13## The Search Algorithm
14The point of the search algorithm is to find the first node (or handle/position) `p` in skip list `S` that represents the largest comparable value <= to the search key `k`. This algorithm can be broken into two steps:
15Step 1) loop`let candidate = p.peek()`, if `candidate >= k`, return `p`, otherwise move to `p.next()`. Repeat until `p.peek()` >= `k`.
16Step 2) Drop down a level: If `S.below(p) == 0` you're at the lowest level and the search terminates.
17
18## Visual Examples
19An initial, empty skip list with one level and no data:
20```text
21S0: HEAD -> None
22```
23
24Inserting the first node triggers an automatic tower level, even if it ends up empty. This provides the algorithm with a starting point:
25```text
26S1: HEAD ----------> None
27S0: HEAD -> [ 5 ] -> None
28```
29
30After inserting `['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f']`, the list's `SkipNodes` might contain the following towers.
31```text
32HEAD[0]: [1, 2, 9, 7]
33a[1]: [5]
34c[2]: [4, 4]
35e[3]: [9]
36d[4]: [3, 9]
37b[5]: [2]
38i[6]: []
39g[7]: [8, 6, 6]
40h[8]: [6]
41f[9]: [7, 7, 7]
42```
43Note that its always possible to tell the last item in the list because its tower is empty. This makes sense, because the last element within the sorted arrangement can only point to `None`. As you can see by the index notation on the left-hand side of the table, the backing structure retains the insertion order; the backing structure remains unsorted.
44
45The structure simply appends elements to the backing structure, so when printed the list retains its insertion order, not its sorted arrangement. As a result, the towers appear to contain rather nonsensical values. However, if you follow the indexes from the `HEAD` node, and re-arrange the nodes into _lexicographically sorted order_, which is what the navigational algorithms in the skiplist achieve, you get the following towers.
46```text
47HEAD[0]: [1, 2, 9, 7]
48a[1]: [5]
49b[5]: [2]
50c[2]: [4, 4]
51d[4]: [3, 9]
52e[3]: [9]
53f[9]: [7, 7, 7]
54g[7]: [8, 6, 6]
55h[8]: [6]
56i[6]: []
57```
58
59When you rotate the mapping 90 degrees you can start to visualize the skip list layers as logically linked lists formed by "next" element indexes.
60```text
61L3: [ g[7] ] -> None
62L2: [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
63L1: [ c[2] ] -> [ d[4] ] -> [ f[9] ] -> [ g[7] ] -> [ i[6] ] -> None
64L0: [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
65```
66Finally, if you extend each "next" index reference to align with its sorted position within the list, a classical skip list diagram of towers emerges.
67```text
68L3: HEAD -------------------------------------------------------------------------> [ g[7] ] -------------------------> None
69L2: HEAD -------------------------------------------------------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
70L1: HEAD -------------------------> [ c[2] ] -> [ d[4] ] -------------> [ f[9] ] -> [ g[7] ] -------------> [ i[6] ] -> None
71L0: HEAD -> [ a[1] ] -> [ b[5] ] -> [ c[2] ] -> [ d[4] ] -> [ e[3] ] -> [ f[9] ] -> [ g[7] ] -> [ h[8] ] -> [ i[6] ] -> None
72```
73
74# Example code
75```rust
76 let mut list = dsa_rust::sequences::indexed_skip_list::SkipList::<char>::new();
77
78 // An unsorted list of values and a sorted version to compare against
79 let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
80 let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
81
82 // Inserts unsorted values into the skip list with a consuming iterator
83 for e in values.into_iter() {
84 list.insert(e)
85 }
86
87 // Illustrates that the list exists as a sorted invariant
88 for (i, e) in list.iter().enumerate() {
89 assert_eq!(e, &sorted[i]);
90 }
91
92 // Illustrates the Kth function in a 0-indexed list.
93 // That is, e occupies the 2 index for insertion order,
94 // but is the 4th element in the 0-indexed sorted arrangement.
95 assert_eq!(list.get_kth(4).unwrap(), &'e');
96
97 // Query by range using Rust's RangeBounds semantics
98 let val = ['c', 'd', 'e', 'f'];
99 for (i, e) in list.range('c'..='f').enumerate() {
100 assert_eq!(e, &val[i])
101 }
102
103```
104*/
105
106use rand::Rng; // For coin flips
107use std::borrow::Borrow; // For passing borrowed parameters
108use std::ops::{Bound, RangeBounds}; // For range iterators
109
110const MAX_HEIGHT: usize = 32;
111
112#[derive(Debug)]
113struct SkipNode<T> {
114 value: Option<T>, // None for sentinel
115 next: [Option<usize>; MAX_HEIGHT], // forward links
116 prev: Option<usize>, // back links at s0 for reverse iteration
117}
118
119pub struct SkipList<T> {
120 nodes: Vec<SkipNode<T>>,
121 list_height: usize,
122}
123impl<T: Ord> Default for SkipList<T> {
124 fn default() -> Self {
125 Self::new()
126 }
127}
128impl<T: Ord> SkipList<T> {
129 pub fn new() -> Self {
130 let sentinel = SkipNode {
131 value: None,
132 next: [None; MAX_HEIGHT],
133 prev: None,
134 };
135
136 Self {
137 nodes: vec![sentinel],
138 list_height: 1,
139 }
140 }
141
142 pub fn len(&self) -> usize {
143 self.nodes.len() - 1 // HEAD doesn't count!
144 }
145
146 pub fn is_empty(&self) -> bool {
147 self.nodes.len() - 1 == 0
148 }
149
150 pub fn locate<Q>(&self, key: &Q) -> Option<usize>
151 where
152 Q: Ord + ?Sized,
153 T: Borrow<Q>,
154 {
155 let update = self.find_predecessors(key);
156 let candidate = self.nodes[update[0]].next[0];
157
158 match candidate {
159 Some(idx) if self.nodes[idx].value.as_ref().unwrap().borrow() == key => Some(idx),
160 _ => None,
161 }
162 }
163
164 pub fn contains<Q>(&self, key: &Q) -> bool
165 where
166 Q: Ord + ?Sized,
167 T: Borrow<Q>,
168 {
169 self.locate(key).is_some()
170 }
171
172 pub fn insert(&mut self, value: T)
173 where
174 T: Ord,
175 {
176 let height = self.random_height();
177 let update = self.find_predecessors(&value);
178
179 // ... height adjustment logic ...
180
181 let new_index = self.nodes.len();
182
183 // 1. SET NEW NODE'S PREV
184 // update[0] is the index of the node immediately before the new one at Level 0.
185 let predecessor_idx = update[0];
186
187 // 2. GET NEW NODE'S NEXT (at Level 0)
188 // This is the node that will now need to point back to us.
189 let successor_idx = self.nodes[predecessor_idx].next[0];
190
191 self.nodes.push(SkipNode {
192 value: Some(value),
193 next: [None; MAX_HEIGHT],
194 prev: Some(predecessor_idx), // Points back to A
195 });
196
197 for (level, _) in update.iter().enumerate().take(height) {
198 //for level in 0..height {
199 let prev_idx = update[level];
200 self.nodes[new_index].next[level] = self.nodes[prev_idx].next[level];
201 self.nodes[prev_idx].next[level] = Some(new_index);
202 }
203
204 // 3. UPDATE SUCCESSOR'S PREV
205 // If there is a node after us, it must now point back to the new node.
206 if let Some(next_idx) = successor_idx {
207 self.nodes[next_idx].prev = Some(new_index);
208 }
209 }
210
211 /// Removes and returns the value for a given key, if it exists in the list.
212 pub fn remove<Q>(&mut self, key: &Q) -> Option<T>
213 where
214 Q: Ord + ?Sized,
215 T: Borrow<Q>,
216 {
217 let mut update_target = self.find_predecessors(key);
218
219 // 1. Identify the target index
220 let target = match self.nodes[update_target[0]].next[0] {
221 //Some(idx) if self.nodes[idx].value.as_ref().map_or(false, |v| v.borrow() == key) => idx,
222 Some(idx)
223 if self.nodes[idx]
224 .value
225 .as_ref()
226 .is_some_and(|v| v.borrow() == key) =>
227 {
228 idx
229 }
230 _ => return None,
231 };
232
233 // 2. PRE-FETCH predecessors for the node that WILL move
234 // We do this BEFORE swap_remove so the indices in the skip list are still valid
235 let last_idx = self.nodes.len() - 1;
236
237 // 3. Logical Rewiring (Remove target from the chain)
238 if let Some(next_idx) = self.nodes[target].next[0] {
239 self.nodes[next_idx].prev = self.nodes[target].prev;
240 }
241 for (level, val) in update_target.iter_mut().enumerate().take(self.list_height) {
242 //for level in 0..self.list_height {
243 if self.nodes[*val].next[level] == Some(target) {
244 self.nodes[*val].next[level] = self.nodes[target].next[level];
245 }
246 }
247
248 // 4. Physical Swap-Remove
249 let removed_node = self.nodes.swap_remove(target);
250
251 // 5. Fix references for the node that just moved into 'target'
252 if target < self.nodes.len() {
253 for node in &mut self.nodes {
254 for next in node.next.iter_mut().take(self.list_height) {
255 if *next == Some(last_idx) {
256 *next = Some(target);
257 }
258 }
259 }
260
261 // Fix prev pointer
262 if let Some(next_idx) = self.nodes[target].next[0] {
263 self.nodes[next_idx].prev = Some(target);
264 }
265 }
266
267 // clean up height
268 removed_node.value
269 }
270
271 /// Returns the Kth value in the list, if it exists.
272 pub fn get_kth(&self, k: usize) -> Option<&T> {
273 let mut idx = self.nodes[0].next[0];
274 let mut i = 0;
275
276 while let Some(current) = idx {
277 if i == k {
278 return self.nodes[current].value.as_ref();
279 }
280
281 idx = self.nodes[current].next[0];
282 i += 1;
283 }
284
285 None
286 }
287
288 /// Returns an inclusive iterator over a range of values
289 /// in the list from `start` to `end`.
290 pub fn range<Q, R>(&self, range: R) -> RangeIter<'_, T, Q, R>
291 where
292 Q: Ord + ?Sized,
293 T: Borrow<Q>,
294 R: RangeBounds<Q>,
295 {
296 // --- FIND FRONT ---
297 let front = match range.start_bound() {
298 Bound::Included(start) => self.nodes[self.find_predecessors(start)[0]].next[0],
299 Bound::Excluded(start) => {
300 let idx = self.nodes[self.find_predecessors(start)[0]].next[0];
301 if let Some(i) = idx {
302 if self.nodes[i].value.as_ref().unwrap().borrow() == start {
303 self.nodes[i].next[0]
304 } else {
305 Some(i)
306 }
307 } else {
308 None
309 }
310 }
311 Bound::Unbounded => self.nodes[0].next[0],
312 };
313
314 // --- FIND BACK ---
315 let back = match range.end_bound() {
316 Bound::Included(end) => {
317 // Find predecessors of 'end'.
318 // If the element at the end of the search IS 'end', that's our back.
319 // If not, the predecessor itself is our back.
320 let update = self.find_predecessors(end);
321 let candidate = self.nodes[update[0]].next[0];
322 if let Some(idx) = candidate {
323 if self.nodes[idx].value.as_ref().unwrap().borrow() == end {
324 Some(idx)
325 } else {
326 // Predicate check: Ensure we aren't returning the sentinel (idx 0)
327 if update[0] == 0 {
328 None
329 } else {
330 Some(update[0])
331 }
332 }
333 } else if update[0] == 0 {
334 None
335 } else {
336 Some(update[0])
337 }
338 }
339 Bound::Excluded(end) => {
340 let update = self.find_predecessors(end);
341 if update[0] == 0 {
342 None
343 } else {
344 Some(update[0])
345 }
346 }
347 Bound::Unbounded => {
348 // To find the absolute end, we find predecessors for a "theoretically infinite" value
349 // or simply walk the tallest tower to the end.
350 let mut curr = 0;
351 for level in (0..self.list_height).rev() {
352 while let Some(next_idx) = self.nodes[curr].next[level] {
353 curr = next_idx;
354 }
355 }
356 if curr == 0 {
357 None
358 } else {
359 Some(curr)
360 }
361 }
362 };
363
364 RangeIter {
365 list: self,
366 front,
367 back,
368 range,
369 _marker: std::marker::PhantomData,
370 }
371 }
372
373 pub fn iter(&self) -> Iter<'_, T> {
374 // Walk the express lanes to find the very last node in O(log n)
375 let mut tail = 0;
376 for level in (0..self.list_height).rev() {
377 while let Some(next_idx) = self.nodes[tail].next[level] {
378 tail = next_idx;
379 }
380 }
381
382 Iter {
383 list: self,
384 next: self.nodes[0].next[0], // First node after sentinel
385 prev: if tail == 0 { None } else { Some(tail) },
386 }
387 }
388
389 // Utility functions
390 ////////////////////
391
392 fn random_height(&self) -> usize {
393 let mut level = 1;
394 let mut rng = rand::rng();
395 while level < MAX_HEIGHT && rng.random::<bool>() {
396 level += 1;
397 }
398 level
399 }
400
401 fn find_predecessors<Q>(&self, key: &Q) -> [usize; MAX_HEIGHT]
402 where
403 Q: Ord + ?Sized,
404 T: Borrow<Q>,
405 {
406 let mut update = [0usize; MAX_HEIGHT];
407 let mut idx = 0;
408
409 for level in (0..self.list_height).rev() {
410 loop {
411 match self.nodes[idx].next[level] {
412 None => break,
413 Some(next_idx) => {
414 let val = self.nodes[next_idx].value.as_ref().unwrap();
415 if val.borrow() >= key {
416 break;
417 }
418 idx = next_idx;
419 }
420 }
421 }
422 update[level] = idx;
423 }
424
425 update
426 }
427
428 //fn fix_moved_node_references(&mut self, new_idx: usize, old_idx: usize) {
429 // // We need to find the neighbors of the node that is now at new_idx
430 // // so we can tell them "I moved!"
431 // let val = self.nodes[new_idx].value.as_ref().unwrap();
432 // let update = self.find_predecessors(val);
433 //
434 // // Update predecessors: they currently point to old_idx, they should point to new_idx
435 // for level in 0..MAX_HEIGHT {
436 // if self.nodes[update[level]].next[level] == Some(old_idx) {
437 // self.nodes[update[level]].next[level] = Some(new_idx);
438 // } else {
439 // // Since towers are vertical, if we don't find a match at this level,
440 // // we might not find them higher up, but check all to be safe.
441 // }
442 // }
443 //
444 // // Update successor: the node after the moved node needs to update its 'prev'
445 // if let Some(next_idx) = self.nodes[new_idx].next[0] {
446 // self.nodes[next_idx].prev = Some(new_idx);
447 // }
448 //}
449}
450
451pub struct Iter<'a, T> {
452 list: &'a SkipList<T>,
453 next: Option<usize>,
454 prev: Option<usize>,
455}
456impl<'a, T> Iterator for Iter<'a, T> {
457 type Item = &'a T;
458
459 fn next(&mut self) -> Option<Self::Item> {
460 let idx = self.next?;
461 let value = self.list.nodes[idx].value.as_ref()?;
462
463 if self.next == self.prev {
464 self.next = None;
465 self.prev = None;
466 } else {
467 self.next = self.list.nodes[idx].next[0];
468 }
469 Some(value)
470 }
471}
472impl<'a, T> DoubleEndedIterator for Iter<'a, T> {
473 fn next_back(&mut self) -> Option<Self::Item> {
474 let idx = self.prev?;
475 let value = self.list.nodes[idx].value.as_ref()?;
476
477 if self.prev == self.next {
478 self.next = None;
479 self.prev = None;
480 } else {
481 let prev = self.list.nodes[idx].prev;
482 // Sentinel check: don't yield index 0
483 self.prev = if prev == Some(0) { None } else { prev };
484 }
485 Some(value)
486 }
487}
488
489pub struct RangeIter<'a, T, Q, R>
490where
491 Q: ?Sized,
492 R: RangeBounds<Q>,
493{
494 list: &'a SkipList<T>,
495 front: Option<usize>, // Moves forward
496 back: Option<usize>, // Moves backward
497 range: R,
498 _marker: std::marker::PhantomData<Q>,
499}
500impl<'a, T, Q, R> Iterator for RangeIter<'a, T, Q, R>
501where
502 Q: Ord + ?Sized,
503 T: Borrow<Q>,
504 R: RangeBounds<Q>,
505{
506 type Item = &'a T;
507
508 fn next(&mut self) -> Option<Self::Item> {
509 let idx = self.front?;
510
511 // Boundary Check
512 let value = self.list.nodes[idx].value.as_ref().unwrap();
513 if !self.range.contains(value.borrow()) {
514 self.front = None;
515 return None;
516 }
517
518 // Meet/Cross Check: If front matches back, this is the last element
519 if self.front == self.back {
520 self.front = None;
521 self.back = None;
522 } else {
523 self.front = self.list.nodes[idx].next[0];
524 }
525
526 Some(value)
527 }
528}
529impl<'a, T, Q, R> DoubleEndedIterator for RangeIter<'a, T, Q, R>
530where
531 Q: Ord + ?Sized,
532 T: Borrow<Q>,
533 R: RangeBounds<Q>,
534{
535 fn next_back(&mut self) -> Option<Self::Item> {
536 let idx = self.back?;
537
538 // Boundary Check
539 let value = self.list.nodes[idx].value.as_ref().unwrap();
540 if !self.range.contains(value.borrow()) {
541 self.back = None;
542 return None;
543 }
544
545 // Meet/Cross Check
546 if self.back == self.front {
547 self.back = None;
548 self.front = None;
549 } else {
550 // Move back, but ensure we don't land on the sentinel (idx 0)
551 let prev = self.list.nodes[idx].prev;
552 self.back = if prev == Some(0) { None } else { prev };
553 }
554
555 Some(value)
556 }
557}
558
559impl<'a, T: Ord> IntoIterator for &'a SkipList<T> {
560 type Item = &'a T;
561 type IntoIter = Iter<'a, T>;
562
563 fn into_iter(self) -> Self::IntoIter {
564 self.iter()
565 }
566}
567
568#[test]
569fn one() {
570 let mut list = SkipList::<char>::new();
571
572 // Tests basic housekeeping on empty list
573 assert_eq!(list.len(), 0);
574 assert!(list.is_empty());
575 assert!(!list.contains(&'z'));
576
577 // Inserts 9 values into the skip list
578 // with a consuming iterator, moving values
579 // into the list
580 let values = ['a', 'c', 'e', 'd', 'b', 'i', 'g', 'h', 'f'];
581 for e in values.into_iter() {
582 list.insert(e)
583 }
584
585 // Tests that len gets updated properly
586 assert_eq!(list.len(), 9);
587 assert!(list.contains(&'g'));
588
589 // Tests basic ordering and iteration
590 // Basic iteration with iter()
591 // Clippy wants enumerate instead of external loop counter
592 let sorted = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'];
593 for (i, e) in list.iter().enumerate() {
594 assert_eq!(e, &sorted[i]);
595 }
596 // Double-ended iteration with rev()
597 // Clippy wants saturating_sub instead of loop counter
598 let mut i = 8;
599 for e in list.iter().rev() {
600 assert_eq!(e, &sorted[i]);
601 if i > 0 {
602 i = i.saturating_sub(1)
603 };
604 }
605 // Or if you wanna be fancy about it
606 // zip() stops as soon as one iterator ends,
607 // eliminating the need for an overflow check
608 for (e, i) in list.iter().rev().zip((0..=8).rev()) {
609 assert_eq!(e, &sorted[i]);
610 }
611
612 // Iterator inferance using the IntoIter impl
613 let mut i = 0;
614 #[allow(clippy::explicit_counter_loop)]
615 for e in &list {
616 assert_eq!(e, &sorted[i]);
617 i += 1;
618 }
619
620 // Tests the Kth function in a 0-indexed list
621 assert_eq!(list.get_kth(6).unwrap(), &'g');
622
623 // Tests the range function
624 // NOTE: char is Copy so you dont strictly need to borrow
625 // when setting range bounds, these tests illustrate both
626 // borrowing and not; Note that each bounds must match
627 // so no (&'a'..'f'), only ('a'..'f') or (&'a'..&'f')
628 // Midlist (exclusive)
629 let val = ['c', 'd', 'e'];
630 //for (i, e) in list.range(&'c', &'f').enumerate() {
631 for (i, e) in list.range('c'..'f').enumerate() {
632 assert_eq!(e, &val[i])
633 }
634 // Midlist (inclusive)
635 let val = ['c', 'd', 'e', 'f'];
636 //for (i, e) in list.range(&'c', &'f').enumerate() {
637 for (i, e) in list.range('c'..='f').enumerate() {
638 assert_eq!(e, &val[i])
639 }
640 // Start of list
641 let val = ['a', 'b', 'c', 'd', 'e', 'f'];
642 //for (i, e) in list.range(&'a', &'f').enumerate() {
643 for (i, e) in list.range(..&'f').enumerate() {
644 assert_eq!(e, &val[i])
645 }
646 // End of list
647 let val = ['e', 'f', 'g', 'h', 'i'];
648 for (i, e) in list.range(&'e'..).enumerate() {
649 assert_eq!(e, &val[i])
650 }
651
652 // Tests removal
653 list.remove(&'e');
654 list.remove(&'a');
655 assert!(!list.contains(&'e'));
656 assert!(!list.contains(&'a'));
657
658 let l2 = ['b', 'c', 'd', 'f', 'g', 'h', 'i'];
659 for (val, i) in list.iter().zip(0..=6) {
660 assert_eq!(val, &l2[i]);
661 }
662
663 // Cant remove what isn't there!
664 assert!(list.remove(&'z').is_none());
665
666 // Everything below here is for display/debug purposes
667 //////////////////////////////////////////////////////
668
669 // Debug print
670 // Prints the tower contents
671 println!("Tower contents by insertion order, NOT sorted order:");
672 for (i, e) in list.nodes.iter().enumerate() {
673 // Collect only the Some values into a new Vec
674 let values: Vec<_> = e.next.iter().filter_map(|&x| x).collect();
675 if let Some(val) = e.value {
676 let v = &val.to_string();
677 println!("{v}[{i}]: {values:?}");
678 } else {
679 println!("HEAD[0]: {values:?}");
680 }
681 }
682 println!();
683
684 // Debug print
685 // Prints the sorted values in the list
686 print!("List values:\n ");
687 for e in list.iter() {
688 print!("{e:#?} ")
689 }
690 println!();
691
692 //panic!();
693}