Rust Reference

The Rust team has made the decision to keep the language’s standard library relatively small. This keeps things lightweight but also means you’ll have to rely on third-party crates for functionality that is included in many other languages’ standard libraries.

The Standard Library

This section covers some general-purpose elements that the rest of the Rust material doesn’t cover.

Mapping

There are several prominent map functions in the standard library that follow functional programming concepts. Here, map means transforming elements rather than associating values with addresses or keys in the classical data structure sense.

Iterator::map

The map function defined on the Iterator trait, “takes a closure and creates an iterator which calls that closure on each element.” This allows you to transform each element yielded by the iterator.

let numbers = vec![1, 2, 3];
let doubled: Vec<i32> = numbers.iter().map(|x| x * 2).collect();
assert_eq!(doubled, vec![2, 4, 6]);

The map method on Iterator is commonly used with the collect trait method. The collect method turns the mapped iterator back into an explicitly typed collection. In this example the collect method returns a Vec<i32> as specified in the doubled type annotation.

See the official documentation for more information.

Option::map

The map function defined on the Option enum, “maps an Option<T> to Option<U> by applying a function to a contained value (if Some) or returns None (if None).” This can be used to change the type and/or value within an Option<T>. Unlike a cast, this method transforms the underlying data and is much more flexible. The map method takes Option<T> by value, which means the original Option is consumed. What happens inside the closure determines whether the contained data is passed through unchanged, moved somewhere else, or modified.

let greeting_option = Some("Hello, World!".to_string());
assert_eq!(Some("Hello, World!".to_string()), greeting_option);
let len = greeting_option.map(|s| s.len());
assert_eq!(len, Some(13));
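
Because map takes the Option by value, greeting_option cannot be used again after the call above. A minimal sketch of two ways to deal with that, assuming you sometimes want to keep the original: borrow the contents with as_ref() before mapping, or deliberately consume the Option and return the modified value. The helper names greeting_len and shout are hypothetical and exist only for illustration.

```rust
/// Returns the length of the contained string without consuming the Option.
/// `greeting_len` is a hypothetical helper used only for illustration.
fn greeting_len(greeting: &Option<String>) -> Option<usize> {
    // as_ref() turns &Option<String> into Option<&String>,
    // so map() only consumes the borrowed view
    greeting.as_ref().map(|s| s.len())
}

/// Consumes the Option and modifies the owned String inside it.
/// `shout` is also a hypothetical helper.
fn shout(greeting: Option<String>) -> Option<String> {
    greeting.map(|mut s| {
        s.push('!');
        s
    })
}
```

The caller can keep using its Option after greeting_len, while anything passed to shout is moved into it.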

See the official documentation for more information.

Result::map

The map function associated with the Result enum is very similar to the one on the Option enum (described above). This mapping function only affects the Ok variant of the sum type and leaves the Err result alone.

1   let line = "1\n2\n3\n4\n";
2   for (i, val) in line.lines().enumerate() {
3       match val.parse::<usize>().map(|i| i * 3) {
4           Ok(n) => {
5               assert_eq!(n, (i + 1) * 3);
6               println!("Ok({n})")
7           },
8           Err(_) => {}
9       }
10  }

Line 3 attempts to parse each &str into a usize to match the index type yielded by enumerate. The assertion at line 5 adds 1 to i because the index is 0-based while the values start at 1 (and off-by-one errors are the most common). Both match arms return the unit type, because why not?
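
The example above never hits the Err arm. As a small sketch of the "leaves the Err result alone" behavior, the hypothetical helper below (parse_and_triple is not a standard library function) returns the mapped Result directly so both outcomes can be inspected.

```rust
use std::num::ParseIntError;

/// Parses a string slice and triples the value.
/// On a parse failure the closure never runs and the original error is returned as-is.
/// `parse_and_triple` is a hypothetical helper for illustration.
fn parse_and_triple(input: &str) -> Result<usize, ParseIntError> {
    input.parse::<usize>().map(|n| n * 3)
}
```

Calling it with "4" yields Ok(12), while calling it with "four" yields the untouched ParseIntError.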

See the official documentation for more information.

Memory management

The std::mem module has all sorts of useful functions for dealing with memory management. Paraphrasing the standard library documentation:

The replace function moves a passed value into a mutable location, returning the old value.
The swap function swaps the values at two mutable locations, without deinitializing either one.
The take function replaces a value with its type’s Default value, returning the old value. The type must implement Default.
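
The three functions are easiest to tell apart side by side. This is a minimal sketch; mem_demo is a hypothetical function name and the values are arbitrary.

```rust
use std::mem;

/// Demonstrates replace, swap, and take on simple values.
/// Returns what each location holds afterwards so the results can be inspected.
/// `mem_demo` is a hypothetical name used only for illustration.
fn mem_demo() -> (String, String, (i32, i32), Vec<u8>, Vec<u8>) {
    // replace: moves a new value in and hands back the old one
    let mut name = String::from("old");
    let previous = mem::replace(&mut name, String::from("new"));

    // swap: exchanges two values in place
    let (mut a, mut b) = (1, 2);
    mem::swap(&mut a, &mut b);

    // take: leaves Default::default() behind (an empty Vec here)
    // and returns the old value
    let mut buffer = vec![1u8, 2, 3];
    let taken = mem::take(&mut buffer);

    (name, previous, (a, b), buffer, taken)
}
```

These come in handy when you only have a mutable reference to a value but need to move the value out of it.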

HTTP Functionality

The choice of HTTP libraries depends on your use case and how low-level you want to get. For HTTP clients, the reqwest crate represents a happy medium of complexity and features. It is built on top of the low-level hyper crate, providing simple, ergonomic APIs for common functionality. For servers, the axum crate is a web framework built on tokio and hyper. The tokio base gives axum a strong async offering. Concurrency is discussed in more depth later on in this section.

Passwords

Use rpassword to take passwords from console.

Serialization

This section examines some common JSON operations using both the serde and serde_json crates. The serde name is a portmanteau of the operations the crates deal with, namely serialization (ser) and deserialization (de). Serialization and deserialization are processes commonly used in data exchange and storage.

Serialization is the process of converting data structures or objects into a format that can be easily stored or transmitted. This format is typically a sequence of bytes that can be written to disk or sent over a network. Serialization allows complex data structures to be represented in a compact, platform-independent way. Serialized data can later be deserialized back into its original form. JSON, YAML, or bincode are considered serialized data formats.
Deserialization involves converting serialized data back into its original data structure or object. Deserialization is essential for reading data that has been previously serialized, allowing it to be used within a program. Deserialization involves parsing, validation, mapping, and finally conversion of data to an object.

In the context of programming, crates like serde provide tools for serialization and deserialization of data structures into common formats. More specialized crates like serde_json focus specifically on JSON, and other similar crates like serde_yaml handle YAML, bincode handles bincode, and… you get the idea. These libraries automate the process, making it easier for developers to work with serialized data in their applications. While they perform similar actions, it’s important to distinguish that serde is a general object serializer/deserializer, and serde_json is specifically used with JSON strings.

JSON

JSON has taken over the internet… and my heart. This section focuses on the serde_json crate. At the heart of the serde_json crate is the Value enum type. The Value type provides a convenient intermediary type used to process JSON data. You don’t always need to use the Value type, but it is helpful for passing JSON data internally in your program as well as extracting and mutating JSON structures. For example, instead of having to check and parse an incoming JSON-formatted string with every operation, all you need to do is deserialize the data once into an intermediate Value to work with it. Easy!

According to the JSON standard detailed in RFC 8259, “An object is an unordered collection of zero or more name/value pairs…” In situations where you need to preserve order, such as when mutating data within the structure, include the preserve_order feature in the project’s Cargo.toml file. With the feature enabled, object keys keep their insertion order.

[dependencies]
serde_json = { version = "1.0.116", features = ["preserve_order"] }

JSON marshaling

One of the most common operations is marshaling data structures to JSON and back to structs. This example starts with two structs that create a nested structure. To serialize the struct(s) you need to implement the serde::Serialize trait for the struct. The opposite is true for deserializing JSON to your data structures with the serde::Deserialize trait. This simple example uses the convenient #[derive()] macro for both trait implementations (which requires serde’s derive feature). In rare and complex situations you may need to manually implement these traits, but the convenient macro is often enough. The actual JSON serialization is handled by one of the functions in the serde_json crate, which can serialize to strings with to_string() and to_string_pretty(), to byte vectors with to_vec(), and to I/O streams with to_writer(). Which of these functions you choose depends on what you intend to do with the serialized data. This example simply serializes to a pretty-formatted JSON string for printing.

    use serde::{Serialize, Deserialize};

    // Declares, implements, and instantiates a nested structure
    #[derive(Debug, Serialize, Deserialize)]
    struct Person {
        name: Option<String>,
        age: Option<u8>,
        birthplace: Option<Birthplace>,
    }
    #[derive(Debug, Serialize, Deserialize)]
    struct Birthplace {
        city: Option<String>,
        state: Option<String>,
    }
    impl Person {
        // Takes state as a String to match the Birthplace field and the call below
        pub fn new(name: String, age: u8, city: String, state: String) -> Person {
            let birthplace = Birthplace {
                city: Some(city),
                state: Some(state),
            };
            Person {
                name: Some(name),
                age: Some(age),
                birthplace: Some(birthplace),
            }
        }
    }

    let some_guy = Person::new(
        "Peter".to_string(),
        40,
        "Iowa City".to_string(),
        "Iowa".to_string(),
    );

    // Serialize the struct instance to a string
    let person: String = serde_json::to_string_pretty(&some_guy)
        .unwrap_or_else(|e| panic!("Error serializing struct: {e}"));

    println!("{person}");

{
  "name": "Peter",
  "age": 40,
  "birthplace": {
    "city": "Iowa City",
    "state": "Iowa"
  }
}

This toy example only pretty-prints the struct (with some weak error handling). It’s possible to convert raw structs to the intermediary serde_json::Value with the serde_json::to_value(&T) function to extract and insert data as JSON internally in the program, however it’s probably more common to only deal with JSON conversions at the edge of the program. In cases where you need to convert a struct to JSON for transmission it’s probably better to convert it directly to a byte vector with to_vec() or write it to an I/O stream with to_writer(). The following example converts a raw struct to a serde_json::Value two different ways and also into a byte vector. All three styles are recoverable (they do not panic): on a serialization error the first two fall back to a Null value and the third falls back to an empty vector. The design decision to panic if serialization fails is up to you. A panic is actually simpler because you can use unwrap() instead. Each statement uses an instantiated struct called person like the Person struct in the previous code snippet.

    // Converts an instantiated struct named "person" to a serde_json::Value object
    let json_value: serde_json::Value = match serde_json::to_value(&person) {
        Ok(json) => json,
        Err(e) => {
            eprintln!("Error: {e}");
            serde_json::Value::Null
        }
    };
    // Does the same thing but more idiomatically
    let json_value: serde_json::Value = serde_json::to_value(&person).unwrap_or_else(|e| {
        eprintln!("Error serializing struct: {e}");
        serde_json::Value::Null
    });

    // Converts the instantiated struct to a byte vector instead of a serde_json::Value
    let json_value: Vec<u8> = serde_json::to_vec(&person).unwrap_or_else(|e| {
        eprintln!("Error serializing struct: {e}");
        vec![]
    });

JSON data processing

Ok, so you know how to JSON-serialize a struct object as both a string and a Value object. Great! But what about the other side of things? You’re mostly going to run into JSON from something like a web API. How do you extract and/or re-insert values at specific nodes within the JSON data? Luckily the Value type has some convenient ways to perform these operations.

This example illustrates how to extract and substitute data from some JSON-formatted string value as commonly received from a web API. To start, assume the incoming data is processed by an HTTP function and converted to a String or string (slice). To parse incoming JSON literals you can use the serde_json::from_str() function to deserialize either a &str or String to a Value. This example illustrates a recoverable statement and a statement that panics. Again, the decision whether to panic is up to you!

    // Simulates a JSON-formatted string literal you might get as an input
    let json_string: String = String::from(
        r#"
        {
            "name": "Peter",
            "age": 40,
            "birthplace": {
              "city": "Iowa City",
              "state": "Iowa"
            }
        }
        "#,
    );

    // Recoverable statement to parse and deserialize the string to serde_json::Value
    let val: serde_json::Value = serde_json::from_str(&json_string).unwrap_or_else(|e| {
        eprintln!("Error parsing JSON string: {e}");
        serde_json::Value::Null
    });
    // Does the same thing, but panics
    let val: serde_json::Value = serde_json::from_str(&json_string).unwrap();

After the data is converted to a Value you can use get to extract data or get_mut to replace data within the structure. It’s possible to get pretty creative with extractions and insertions. The next two examples illustrate getting and mutating the JSON value respectively.

This example covers a couple possible ways to extract data from a serialized Value object. Note that you must get each level within the JSON structure to retrieve the data nested inside. Because name is a top-level value, it only needs one get, but because city is nested in birthplace you need two get operations.

    // Extracts data from the serde_json::Value
    // As &str
    let name: &str = json_value
        .get("name")
        .and_then(|v| v.as_str())
        .unwrap_or_else(|| panic!("Error extracting name"));
    assert_eq!(name, "Peter");

    let hometown: &str = json_value
        .get("birthplace")
        .unwrap_or_else(|| panic!("Error extracting birthplace"))
        .get("city")
        .and_then(|v| v.as_str())
        .unwrap_or_else(|| panic!("Error extracting city"));
    assert_eq!(hometown, "Iowa City");

    // As Number
    let age: &serde_json::Number = json_value
        .get("age")
        .and_then(|v| v.as_number())
        .unwrap_or_else(|| panic!("Error extracting age"));
    let parsed: u64 = age.as_u64().unwrap();
    assert_eq!(parsed, 40);

    // As a specific type of number
    let age: u64 = json_value
        .get("age")
        .and_then(|v| v.as_u64())
        .unwrap_or_else(|| panic!("Error extracting age"));
    assert_eq!(age, 40);

This example illustrates how to mutate a value in the JSON Value object using the get_mut() function. To do this you need to start with a mutable Value type. All examples thus far have used immutable JSON objects, but this operation is as easy as adding a mut to the initial object creation. The next step involves mutably borrowing the node value you want to change. The example includes two approaches to first illustrate the typing and then presents a more ergonomic approach. The first approach creates a mutable borrow of the JSON object, then creates the new value, and finally binds it to the proper node which involves de-referencing the original mutable borrow to access the object’s memory location. The second approach does the same thing much more compactly and elegantly. The second approach combines the mutable borrow, value creation, and assignment into one compact statement. The biggest change is how the second approach creates the Value by using the convenience json! macro. Both approaches prove that they work with an assertion statement. The second approach reduces the LOC by 25%, but may not be quite as flexible or readable.

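    // Creates a mutable JSON object
    // Assumes `use serde_json::{json, Value};` is in scope and that `me` is a struct
    // instance whose state field serializes to the number 69 instead of a string
    let mut mutable_json_value: serde_json::Value = serde_json::to_value(&me).unwrap_or_else(|e| {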
        eprintln!("Error serializing struct: {e}");
        serde_json::Value::Null
    });
    // Ensures that the JSON is indeed borked (with some super gross typing)
    let state: &Value = mutable_json_value
        .get("birthplace")
        .unwrap_or_else(|| panic!("Error extracting birthplace"))
        .get("state")
        .unwrap_or_else(|| panic!("Error extracting state"));
    assert_eq!(state, &serde_json::Value::Number(serde_json::Number::from(69)));

    // First approach corrects the type
    // Creates a mutable borrow of the specific JSON node to replace
    let mutable_borrow: &mut Value = mutable_json_value
        .get_mut("birthplace")
        .unwrap_or_else(|| panic!("Error extracting birthplace"))
        .get_mut("state")
        .unwrap_or_else(|| panic!("Error extracting state"));
    // Creates a new JSON-formatted String value and converts it to Value type
    let new_state_string: String = "\"California\"".to_string();
    let new_state: serde_json::Value = serde_json::from_str(&new_state_string).unwrap();
    // Assigns the new Value to the mutable borrow
    *mutable_borrow = new_state;

    // Checks that the change took
    let state: &Value = mutable_json_value
        .get("birthplace")
        .unwrap_or_else(|| panic!("Error extracting birthplace"))
        .get("state")
        .unwrap_or_else(|| panic!("Error extracting state"));
    assert_eq!(state, &serde_json::to_value("California").unwrap());

    // Second, easier approach corrects the value and does so the easy way
    // by using the json! macro instead of creating a Value
    *mutable_json_value
        .get_mut("birthplace")
        .unwrap_or_else(|| panic!("Error extracting birthplace"))
        .get_mut("state")
        .unwrap_or_else(|| panic!("Error extracting state")) = json!("Iowa");

    // Checks that the change took
    let state: &Value = mutable_json_value
        .get("birthplace")
        .unwrap_or_else(|| panic!("Error extracting birthplace"))
        .get("state")
        .unwrap_or_else(|| panic!("Error extracting state"));
    assert_eq!(state, &serde_json::to_value("Iowa").unwrap());

From here it’s easy to mix and match operations to process the data as necessary. Congratulations! You know how to work with JSON in Rust!

Custom pretty-formatting ordered JSON

The serde_json crate does not guarantee ordering by default. To preserve ordering with pretty-formatting we need to add the preserve_order feature on the serde_json crate. The default pretty formatter indents with 2 spaces. To dictate the level of indentation you can use the PrettyFormatter. This example specifies 3 spaces, which I find to be about perfect to my eye in most situations. I have left all crate paths intact in the code to show what is being used where. In production, you’ll probably want to move these into a use block for cleaner presentation.

[dependencies]
serde_json = { version = "1.0.116", features = ["preserve_order"] }

//use serde_json::ser::{PrettyFormatter, Serializer};
use serde::Serialize; // brings the serialize() method used below into scope

/** Simple pretty-formatter for JSON responses;
* Guarantees ordering with the optional preserve_order feature in serde_json */
pub fn pretty_format(raw_json: &str) -> String {

    // Parses the raw JSON into a serde_json::Value object
    // Without assignment this requires additional turbofish type annotation
    match serde_json::from_str::<serde_json::Value>(raw_json) {
        Ok(json_value) => {

            // Creates a Vec<u8> buffer to hold the formatted string
            let mut buffer = Vec::new();

            // Instantiates a PrettyFormatter object with custom indents
            let formatter = serde_json::ser::PrettyFormatter::with_indent(b"   ");

            // Instantiates a Serializer object that writes to the buffer
            let mut serializer = serde_json::ser::Serializer::with_formatter(&mut buffer, formatter);

            // Serializes the serde_json::Value object to the buffer
            if let Err(e) = json_value.serialize(&mut serializer) {
                return format!("Error serializing JSON: {e}");
            }

            // Converts the buffer into a String
            match String::from_utf8(buffer) {
                Ok(formatted_string) => formatted_string,
                Err(e) => format!("Error converting buffer to String: {}", e)
            }
        }
        Err(e) => format!("Error parsing JSON: {}", e)
    }
}

Concurrency

[an attempt to explain concurrency vs parallelism]

Concurrency and/or parallelism involve the simultaneous execution of code (or the appearance thereof). Concurrency is implemented with concurrency primitives that make up the basic building blocks of concurrent and/or parallel operations. They are called primitives because they provide low-level operations that manage the coordination, synchronization, and communication between multiple threads or processes. These operations often interact closely with the operating system or runtime environment to ensure safe and efficient execution of concurrent tasks. Concurrency primitives should not be confused with language primitives. In fact there are very few actual Rust language features that deal with concurrency. Most concurrent operations are handled in either the standard library or external crates.

Examples of concurrency primitives include:

Threads: The smallest unit of execution in a program. See the Systems section for more information on threads and processes.
Channels: Channels facilitate communication between threads by passing data as messages between transmitters and receivers in a single-ownership model.
Mutexes & Locks: Similar to handles and ownership flags. These primitives ensure that only one thread can access a resource at a time.
Semaphores: Control access to a resource by multiple threads.
Atomics: Atomic types provide primitive shared-memory communication between threads, and are the building blocks of other concurrent types.
Condition Variables: Allow threads to wait for certain conditions to be met.
Futures and Promises: Represent asynchronous computations.
Barriers: Synchronize multiple threads at a certain point of execution.
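
Two of the primitives from the list, atomics and barriers, fit into one short sketch using only the standard library. The function name count_with_atomics and the choice of SeqCst ordering are assumptions made for the example, not recommendations.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

/// Spawns `workers` threads that each wait at a barrier and then
/// increment a shared atomic counter. Returns the final count.
/// `count_with_atomics` is a hypothetical name used for illustration.
fn count_with_atomics(workers: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let barrier = Arc::new(Barrier::new(workers));

    // collect() forces every thread to be spawned before any join() call
    let handles: Vec<thread::JoinHandle<()>> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Blocks until every worker has reached this point
                barrier.wait();
                // Atomically adds 1 without needing a lock
                counter.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.load(Ordering::SeqCst)
}
```

Calling count_with_atomics(4) returns 4 no matter how the threads interleave, because every increment is atomic.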

Memory models

The Rust book likes to highlight a slogan from the Go language documentation.

“Do not communicate by sharing memory; instead, share memory by communicating.”

Concurrency can be handled with single or multiple owners of data. In single ownership models you use channels to pass messages between threads with transmitters and receivers. In shared-state concurrency you use mutual exclusion (mutex) operations. Threads request access to the data by acquiring the mutex’s lock before performing operations on it. In this way data is said to be guarded from multiple concurrent mutations.

Language features

There are only two concurrency features baked into the Rust language: the Sync and Send marker traits, both of which live in the std::marker module.

A type that implements the Send trait indicates that ownership of its values can be transferred between threads safely. Most types are Send, but not all; the Rc<T> type (discussed in the mutex section below) is a notable exception. Any type composed entirely of Send types is automatically marked as Send as well. Almost all primitive types are Send, aside from raw pointers.

The Sync marker trait indicates that it is safe for the type implementing Sync to be referenced from multiple threads. In other words, any type T is Sync if &T (an immutable reference to T) is Send, meaning the reference can be sent safely to another thread. Similar to Send, primitive types are Sync, and types composed entirely of types that are Sync are also Sync.
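
Because Send and Sync are marker traits, the compiler checks them at compile time. A minimal sketch of what that looks like in practice: assert_send, assert_sync, and marker_checks are hypothetical helpers written for this page, not part of the standard library.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// These functions do nothing at runtime. A call only compiles when the
// type argument has the named marker trait. Both are hypothetical helpers.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// Runs a few compile-time marker checks, then moves a shared value
/// into another thread and returns the result.
fn marker_checks() -> i32 {
    assert_send::<i32>();
    assert_sync::<i32>();
    assert_send::<Arc<Mutex<i32>>>();
    assert_sync::<Arc<Mutex<i32>>>();
    // assert_send::<std::rc::Rc<i32>>(); // does not compile: Rc<i32> is not Send

    // Because Arc<Mutex<i32>> is Send, a clone can be moved into another thread
    let shared = Arc::new(Mutex::new(41));
    let clone = Arc::clone(&shared);
    thread::spawn(move || *clone.lock().unwrap() += 1)
        .join()
        .expect("thread panicked");

    let value = *shared.lock().unwrap();
    value
}
```

Uncommenting the Rc line turns the mistake into a compiler error rather than a data race at runtime.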

Concurrency with Rust’s std lib

Rust provides some basic primitives for concurrent programming. The standard library moves much slower than community development. To get the latest and greatest from all the hard-working nerds out there it’s probably best to use an outside crate such as the low level mio or Tokio which is a fully-featured and higher-level library based on mio.

⚠️ As of 5/24 this page only covers the standard library.

Basic thread management

In order to use a thread you must spawn it first. Spawn threads using the std::thread::spawn() function. The spawn() function takes a closure that captures data and contains the operations to execute on the new thread.

    let mut v1 = vec![1, 2, 3, 4, 5, 6, 7, 8, 9];

    let t_1 = thread::spawn(|| {
        for e in v1.iter_mut() {
            *e *= *e;
            println!("{e} ");
        }
    });

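The snippet above does not compile: the closure only borrows v1, and the compiler cannot prove that v1 outlives the new thread. Use the move keyword to transfer ownership of the captured values to the newly spawned thread.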

        let mut v1 = vec![1, 2, 3, 4, 5, 6, 7, 8, 9];

        let t_1 = thread::spawn(move || {
            for e in v1.iter_mut() {
                *e *= *e;
                println!("{e} ");
            }
        });

Use join() to ensure that all threads complete before the calling function returns.

use std::thread;
use std::time::Duration;

pub fn concurrency_1() {

    // Data from main thread
    let mut v1 = vec![1, 2, 3, 4, 5, 6, 7, 8, 9];
    let v2 = vec!["a", "b", "c", "d", "e"];
    let v3 = vec!["one", "two", "three", "four", "five"];

    // Thread 1
    let t_1 = thread::spawn(move || {
        for e in v1.iter_mut() {
            *e *= *e;
            println!("Thread 1: {e} ");
            thread::sleep(Duration::from_secs(1))
        }
    });

    // Thread 2
    let t_2 = thread::spawn(move || {
        for e in v3.iter() {
            println!("Thread 2: {e} ");
            thread::sleep(Duration::from_secs(1));
        }
    });

    // Main thread logic
    for e in v2.iter() {
        println!("Main: {e} ");
        thread::sleep(Duration::from_secs(1))
    }

    // Ensures that all threads finish before exiting the function
    t_1.join().expect("Uh oh");
    t_2.join().expect("Uh oh");

    println!("All done!")
}
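
The join() call also hands back whatever the closure returned, which is one way to recover data that was moved into a thread. A small sketch, with square_in_thread as a hypothetical function name:

```rust
use std::thread;

/// Moves a vector into a new thread, squares each element there,
/// and gets the vector back through the JoinHandle.
/// `square_in_thread` is a hypothetical name used for illustration.
fn square_in_thread(mut values: Vec<i32>) -> Vec<i32> {
    let handle = thread::spawn(move || {
        for e in values.iter_mut() {
            *e = *e * *e;
        }
        // The closure's return value becomes the Ok value of join()
        values
    });

    // join() returns Err only if the spawned thread panicked
    handle.join().expect("thread panicked")
}
```

This keeps a single owner for the data at all times: first the caller, then the thread, then the caller again.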

Message passing with channels

Rust implements channels for message passing between threads. Channels are directional and consist of transmitters and receivers. Channels are considered closed when the code drops either the transmitter or receiver.

Channels, in any language, work much like single ownership. Once a value is passed down a channel you should no longer use that value.
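
A minimal sketch of message passing with the standard library’s std::sync::mpsc channel. The function name relay_words is hypothetical; the point is that send() moves each value into the channel and that dropping the transmitter closes it.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends each word from a producer thread to the caller over a channel
/// and collects what arrives, in order.
/// `relay_words` is a hypothetical name used for illustration.
fn relay_words(words: Vec<String>) -> Vec<String> {
    // tx is the transmitter, rx is the receiver
    let (tx, rx) = mpsc::channel();

    let producer = thread::spawn(move || {
        for word in words {
            // send() moves `word` into the channel; it can no longer be used here
            tx.send(word).expect("receiver was dropped");
        }
        // tx is dropped when the closure ends, which closes the channel
    });

    // Iterating over the receiver blocks until the channel is closed
    let received: Vec<String> = rx.iter().collect();
    producer.join().expect("producer thread panicked");
    received
}
```

Trying to use word after the send() call is a compile error, which is how Rust enforces the single-ownership rule described above.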

Shared-state concurrency

This approach offers an alternative to channels. This model looks a bit like multiple ownership because multiple threads can access the same memory location at the same time.

Mutex

Mutex is a portmanteau of mutual exclusion. Mutexes use locks to determine which thread can access data and prevent more than one thread from accessing it at any given time. Calling lock blocks the thread until it can obtain the lock. Mutex<T> provides interior mutability, and sharing one between threads requires a thread-safe smart pointer like Arc<T>. You can use Mutex<T> to mutate values inside an Arc<T> type the same way you can use RefCell<T> to mutate the values inside an Rc<T> type.

The Arc<T> type is a smart pointer like Rc<T> except that its reference count is updated atomically, which makes it thread-safe. Arc<T> lets multiple threads own the same value, while the Mutex<T> inside it ensures that only one thread accesses the data at a time. Using this strategy, you can divide a calculation into independent parts, split those parts across threads, and then use a Mutex<T> to have each thread update the final result with its part. Using Mutex<T> carries the risk of a deadlock: when an operation needs two locks and two threads have each acquired one of them, both threads wait forever.

use std::sync::{Arc, Mutex};
use std::thread;

pub fn mutexes() {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}

std::net

Building a server

This section is ripped straight out of chapter 20 of the Rust book.

Create a TCP listener: A single stream represents an open connection (the full request and response process) between the client and the server. A specific address and port combination, known as a socket address, can have a number of connections depending on OS limits, network configurations, system resources like CPU and memory limitations, as well as application design. The duration of the connection is contextual based on application design, but an open connection allows for stateful communication. Only one server instance can listen for connections on a specified socket address at a time.
Handle successful connections with a thread pool. The pool has a finite number of threads, which helps guard against a denial of service (DoS) attack by preventing a client from locking up the server with an absurd number of requests. The thread pool uses a queue (FIFO) structure. Alternative approaches include the fork/join, single-threaded async I/O, and multi-threaded async I/O models.

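use std::fs;
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;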

fn server() {
    // Creates a TCP listener
    let listener = TcpListener::bind("127.0.0.1:7878").unwrap();

    // Iterates over connection attempts;
    // Sends successful streams to the connection handler
    for stream in listener.incoming() {
        let stream = stream.unwrap();
        connection_handler(stream);
    }
}
fn connection_handler(mut stream: TcpStream) {
    // Creates a buffered reader from the mutable reference to the stream
    let buf_reader = BufReader::new(&mut stream);

    // Reads only the first line of the request
    let request_line = buf_reader.lines().next().unwrap().unwrap();

    // Matches the first line of the request
    let (status_line, filename) = match &request_line[..] {
        "GET / HTTP/1.1" => {
            ("HTTP/1.1 200 OK", "html/hello.html")
        },
        "GET /sleep HTTP/1.1" => {
            thread::sleep(Duration::from_secs(5)); // lol
            ("HTTP/1.1 200 OK", "html/hello.html")
        },
        _ => ("HTTP/1.1 404 NOT FOUND", "html/404.html"),
    };

    // Reads in the appropriate HTML file to serve and computes its length
    let contents = fs::read_to_string(filename).unwrap();
    let length = contents.len();

    // Formats and issues HTTP response as byte slice to the mutable stream
    let response =
        format!("{status_line}\r\nContent-Length: {length}\r\n\r\n{contents}");
    stream.write_all(response.as_bytes()).unwrap();
}
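
The server above handles one connection at a time. The thread pool step described earlier could be sketched roughly as follows. This is a simplified take on the approach from the Rust book rather than a production implementation; the ThreadPool type, its execute method, and the Job alias are names defined here, not standard library items.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A boxed closure that a worker thread can run exactly once
type Job = Box<dyn FnOnce() + Send + 'static>;

/// A minimal fixed-size thread pool fed by a FIFO channel of jobs.
pub struct ThreadPool {
    workers: Vec<thread::JoinHandle<()>>,
    sender: Option<mpsc::Sender<Job>>,
}

impl ThreadPool {
    /// Creates a pool with `size` worker threads. Panics if `size` is zero.
    pub fn new(size: usize) -> ThreadPool {
        assert!(size > 0);
        let (sender, receiver) = mpsc::channel::<Job>();
        // All workers share one receiver, so it is wrapped in Arc<Mutex<...>>
        let receiver = Arc::new(Mutex::new(receiver));

        let workers: Vec<thread::JoinHandle<()>> = (0..size)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is released at the end of this statement,
                    // before the job itself runs
                    let message = receiver.lock().unwrap().recv();
                    match message {
                        Ok(job) => job(),
                        Err(_) => break, // channel closed: shut the worker down
                    }
                })
            })
            .collect();

        ThreadPool { workers, sender: Some(sender) }
    }

    /// Queues a closure to run on the next free worker.
    pub fn execute<F>(&self, f: F)
    where
        F: FnOnce() + Send + 'static,
    {
        self.sender.as_ref().unwrap().send(Box::new(f)).unwrap();
    }
}

impl Drop for ThreadPool {
    fn drop(&mut self) {
        // Dropping the sender closes the channel so the workers leave their loops
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}
```

In the server loop, a pool created before the loop could take each stream with something like pool.execute(move || connection_handler(stream)). Because the number of workers is fixed, extra requests wait in the queue instead of spawning unlimited threads.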