Collections
Collections group multiple elements of a similar or heterogeneous type into one unit. The collections library represents Rust-specific data structures for convenience.
- Sequences
Vec<T>
VecDeque<T>
LinkedList<T>
- Maps
- HashMap
- BTreeMap
- Sets
- HashSet
- BTreeSet
- Misc
- BinaryHeap
Vectors
Vectors are variable-length sets of values of the same type (any), though if described within enums they can take a mix of types. The only major caveat is that all types must be specified at compile time. Unlike tuples and arrays, vectors can grow or shrink in size. Vectors can also be sized with variable identifiers, which means they are the only option if we dont know how many indexes we need at compile time. Vectors are implemented with generics which is covered elsewhere.
Vectors can be declared either with the new()
method,
let v: Vec<i32> = Vec::new();
or directly with a macro. The macro can infer type so they do not need to be explicitly stated. In this example, the value types are inferred to be of type i32
because it is the default.
let v = vec![1, 2, 3];
The macro can also be used with explicit type declaration.
let v: Vec<usize> = vec![1, 2, 3];
Vectors, just like most things in Rust, are immutable by default. Mutable vectors can be updated using appropriate methods. Notice in the example that the push()
method can only handle one index at a time.
let mut v = vec![1, 2, 3];v.push(5);v.push(6);
It is possible to retrieve values from vectors using two mechanism; indexing and the get()
method. Indexing names references to values and the get method is, well, a method to get the value at a specified index. Vectors are 0-indexed, meaning that the first index in a vector is 0. The major difference in these mechanisms is how the program responds to requests for out-of-bounds indexes. Attempting to access an out-of-bounds index with the reference mechanism will panic the program, shutting it down. The get()
method implements the Option<&T>
type which allows us to gracefully handle out-of-bounds requests.
let v = vec![1, 2, 3]; //Vector declaration & instantiationlet oob1: Option<&i32> = v.get(3); //Graceful OOB handlingmatch oob1 { Some(oob1) => println!("{oob1}"), None => println!("Out-of-bounds!"),}let oob2: &i32 = &v[3]; //Direct index reference
This results in one graceful message, followed by a panic and program shutdown. The panic message provides a helpful hint that 3 is out of bounds on a vector of length 3. There are three indexes named 0-2, so index 3 is out-of-bounds.
Out-of-bounds!thread 'main' panicked at src/cncpt/types/compound/collections.rs:121:24:index out of bounds: the len is 3 but the index is 3note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Generally speaking its best if our program doesn’t panic, so it may be wise to stick with the methods. This example takes an integer n
, creates a vector of n
size, and then returns the value at the (approximation) of the middle index. There is no 11.5 index, so the program returns 12, which is at the 13th index because vectors are 0-indexed.
pub fn vec_test_4(n: usize) { let mut i = n; let mut v: Vec<usize> = Vec::new(); while i >= 1 { v.push(i.try_into().unwrap()); i -= 1; } let half = v.get(&n/2); //Gets the "middle" index match half { Some(half) => println!("{}", half), None => println!("I guess there wasn't anything available!"), }}fn main() { vec_test_4(23);}
12
Vectors are allocated to the heap, so the same ownership rules apply. Take the following example. First we create an immutable borrow with &v[2]
, then we create a mutable borrow with the push()
method. This mutable borrow could potentially deallocate the first immutable borrow (if there is not enough room at the current heap location) resulting in the println!()
statement trying to reference a deallocated memory location. To prevent this mess, the borrow checker simply makes this process illegal.
pub fn vector_ref() { let mut v = vec![1, 2, 3]; let a = &v[2]; //Immutable borrow v.push(4); //Mutable borrow causes immutable borrow to get deallocated println!("{}", a) //Variable a has been dropped!}
To solve the issue we have some options. We can clone the vector to use the immutable borrow later, but we already clearned that cloning can be expensive. We can also reorder the statements so that there is no interference.
pub fn vector_ref() { let mut v = vec![1, 2, 3]; v.push(4); //Mutable borrow let a = &v[2]; //Immutable borrow println!("{}", a) //Legal use of immutable borrow}
But what if the statement order is critical? It is also possible to avoid all of the reordering if we simply stick to purely mutable borrows throughout the function. Just make sure the value you want is the value you get with mutable references!
pub fn vector_ref() { let mut v = vec![1, 2, 3]; let a = v[2]; //Mutable borrow v.push(4); //Mutable borrow println!("{}", a) //Legal use of mutable borrow}
Hash Maps
Hash maps are less commonly used data structures in Rust. Their type is not common enough to be included in the standard prelude, so we must incorporate a use
statement to include them.
use std::collections::HashMap;
Hash maps consist of key-value pairs. Any type can be used as a key or value pair so long as the type implements the Eq
and Hash
trait for the key, and Eq
trait for the value. By default, this includes integer, bool
, char
, String
, and &str
types, as well as some compound types. Creating a new hash map is just as easy as any of the other types in this collections list.
let mut scores = HashMap::new();
Inserting values is equally easy. This example uses String
types for keys, and i32
types for values.
scores.insert(String::from("Democrat"), 37); scores.insert(String::from("Republican"), 48); scores.insert(String::from("Independent"), 15);
However, retrieving values is a little wild. The get()
method returns an Option<&V>
, which returns None
if there are no more values. This example handles the Option
by calling the unwrap_or()
method. The method is given a zero value in case None
is found. This example also uses the copied()
method to skirt ownership on the Option
handler.
//Names a key and retrieves its value let party = String::from("Democrat"); let score = scores.get(&party).copied().unwrap_or(0); println!("The D's got: {}% of the vote.", score);
//Iterates over the whole map and returns all values in arbitrary order for (key, value) in &scores { println!("The {} party has {}% of the vote.", key, value); }
Naturally everything is owned. The hash map takes ownership of inserted values that do not implement the Copy
trait.
let mut hash = HashMap::new(); let key = String::from("Democrats"); let value = String::from("dirty"); hash.insert(k1, v1); //Passes ownership of key and value println!("{}", &key); //Illegal borrow after move println!("{}", &value); //Illegal borrow after move
References can be passed to hash maps, but they must be valid as long as the hash map is. This is covered in the section on lifetimes.
Updating the hash map
It is possible to overwrite, ignore an update, or add to a value in a hash map.
To overwrite a value simply use the insert()
method with the same key. This is sort of like shadowing, but instead the program simply overwrites the first insert. Retrieving the value from the independent
key after these two inserts would result in a value of 19.
scores.insert(String::from("Independent"), 15); scores.insert(String::from("Independent"), 12);
To prevent overwriting values we can use the or_insert()
method with the entry
API. The entry
takes a key argument and checks if it exists and if it has a value, if it doesn’t have a value, it adds the argument we pass to the or_insert()
method. If the key already has an associated value, the new value will not be written. The or_insert()
method returns a mutable reference to the new value.
scores.entry(String::from("Independent")).or_insert(12); scores.entry(String::from("Socialist")).or_insert(3);
The following example uses a hash map as a counter to count how many times each word appears in the supplied text.
use std::collections::HashMap;fn main() { let text = "hello world wonderful world"; let mut map = HashMap::new(); for word in text.split_whitespace() { let count = map.entry(word).or_insert(0); *count += 1; } println!("{:?}", map);}