Unsafe Rust

The Rust compiler is conservative by design: it would rather reject some correct code than accept code that might be problematic. To cover the cases the compiler cannot verify, Rust provides the unsafe keyword, which permits the following operations that aren’t allowed otherwise:

Dereference a raw pointer
Call an unsafe function or method
Access or modify a mutable static variable
Implement an unsafe trait
Access fields of unions

Unsafe Rust isn’t magic, but it does require attention to detail. Rust recommends keeping unsafe scopes to a minimum to make it easier to debug them later. Furthermore, Rust recommends encapsulating any unsafe code within a safe abstraction to guarantee safe APIs. Smart.

Raw Pointers

For a refresher on pointers in Rust see the Reference Types and Ownership sections.

Raw pointers, like references, can be either immutable or mutable, written *const T and *mut T respectively. You can write to data through a mutable raw pointer but not through an immutable one. A raw pointer created from a reference simply holds the address of the value that reference points to. To access the underlying data, raw pointers must be dereferenced with the dereference operator (*) inside an unsafe block. Unlike references and smart pointers, raw pointers:

Are allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
Aren’t guaranteed to point to valid memory
Are allowed to be null
Don’t implement any automatic cleanup
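Here is a minimal sketch of the first and third points. The function names aliasing_demo and null_is_allowed are made up for illustration; the point is only that two raw pointers to the same location may coexist and that a null raw pointer is a legal value as long as it is never dereferenced.

```rust
use std::ptr;

/// Creates a mutable and an immutable raw pointer to the same `i32`,
/// writes through one and reads through both.
fn aliasing_demo() -> (i32, i32) {
    let mut num = 5;
    let p_mut: *mut i32 = &mut num;
    let p_const = p_mut as *const i32; // second pointer to the same location
    unsafe {
        *p_mut += 1; // write through the mutable raw pointer
        (*p_const, *p_mut) // both pointers observe the new value
    }
}

/// Raw pointers may be null; creating and inspecting one is safe.
fn null_is_allowed() -> bool {
    let p: *const i32 = ptr::null();
    p.is_null() // dereferencing `p` would be undefined behavior
}

fn main() {
    println!("{:?}", aliasing_demo()); // (6, 6)
    println!("{}", null_is_allowed()); // true
}
```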

You can declare raw pointers anywhere. You can even access the addresses they point to in safe code.

    let mut num = 5;
    let r1 = &num as *const i32; // Raw pointer!
    let r2 = &mut num as *mut i32; // Raw pointer!

    println!("{:#?}", r1);
    println!("{:#?}", r2);

0x00007ff7b59e77dc
0x00007ff7b59e77dc

However, if you want to actually touch the data, you need to dereference the raw pointers, which requires an unsafe block.

    unsafe {
        println!("{:?}", *r1);
        println!("{:?}", *r2);
    }

5
5

This example does the same thing with a slightly more complex data type, String.

    let val = String::from("Peter"); // Creates heap data and stack object
    let val_ptr: *const String = &val; // Creates immutable raw pointer from object reference
    unsafe {
        println!("{}", *val_ptr) // Dereferences raw pointer to access underlying data
    }

Immutable raw pointers can’t be used to reassign the data they point to. The following does not compile.

    let val = "Peter".to_string();
    let p: *const String = &val;
    unsafe {
        // Compile error: cannot assign to data through a *const pointer
        *p = "".to_string();
    }
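For contrast, here is a sketch of the version that does compile, using a *mut String instead. The function name overwrite_through_mut_pointer and the replacement value "Paul" are assumptions made for the example.

```rust
/// Replaces a String by assigning through a mutable raw pointer.
fn overwrite_through_mut_pointer() -> String {
    let mut val = String::from("Peter");
    let p: *mut String = &mut val;
    unsafe {
        // Assigning through a *mut pointer drops the old String
        // and stores the new one in its place.
        *p = String::from("Paul");
    }
    val
}

fn main() {
    println!("{}", overwrite_through_mut_pointer()); // Paul
}
```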

Did you get all that? No? Still confused about what a raw pointer actually does? How about a deep dive that explains what’s happening in more specific terms. It walks through both an immutable reference and an immutable raw pointer to illustrate the difference.

The example starts by creating a String and binding it to val on line 3. This operation involves the two memory regions described in the Ownership section. First, the string data itself is allocated on the heap, which yields a pointer to that data. Second, the String object is stored in the stack frame; it holds the pointer to the heap data along with the string’s length and capacity.

Next, the example creates an immutable reference &val on line 6 and binds it to val_ref. The reference points to the String object on the stack. Through that object you can reach the underlying data, for example to print the string on line 8, because the object records the data’s address and length.

The example then creates another immutable reference on line 12 and converts it to a raw pointer bound to val_ptr. The raw pointer type is declared with the *const T syntax. The raw pointer points to the same String object on the stack as the reference on line 6. Printing the raw pointer on line 14 shows the stack address of the String object (64 bits wide, because this was run on a 64-bit operating system), not the underlying heap data. This is safe because it doesn’t touch the data.

Actually accessing the data is unsafe, so it requires the unsafe block on line 15. Inside it, line 17 dereferences the raw pointer to get the String object and follows that object’s pointer to the heap data. In this way the String object on the stack acts like a smart pointer that contains a raw pointer to the heap address where the string data is stored.

1 pub fn references_and_pointers() {
2     // Heap-allocated value
3     let val = String::from("Peter");
4
5     // Creates an immutable reference to the value
6     let val_ref: &String = &val;
7     // Prints the referenced object
8     println!("Reference: {}", val_ref);
9
10    // Creates a raw pointer to the value
11    // Points to the underlying String object named in the reference &val
12    let val_ptr: *const String = &val;
13    // Pretty prints the raw pointer as a memory address
14    println!("Raw pointer address: {:#?}", val_ptr);
15    unsafe {
16        // Dereferences the raw pointer to print the value at the address
17        println!("Dereferenced raw pointer value: {}", *val_ptr)
18    }
19 }

Reference: Peter
Raw pointer address: 0x00007ff7b0b5e688
Dereferenced raw pointer value: Peter
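To make the stack-object versus heap-data distinction concrete, here is a small sketch that prints both addresses. The helper name addresses is hypothetical; as_ptr is the standard method that returns a pointer to the string’s bytes. The actual addresses vary from run to run, so none are shown.

```rust
/// Returns the address of the String object itself and the address of
/// the heap buffer that holds its bytes.
fn addresses(val: &String) -> (*const String, *const u8) {
    (val as *const String, val.as_ptr())
}

fn main() {
    let val = String::from("Peter");
    let (object_ptr, data_ptr) = addresses(&val);

    // Two different addresses: one for the String object, one for its data.
    println!("String object: {:p}", object_ptr);
    println!("Heap data:     {:p}", data_ptr);

    // SAFETY: `val` is alive and non-empty, so `data_ptr` points to its first byte.
    let first = unsafe { *data_ptr };
    println!("First byte: {}", first as char); // P
}
```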

Unsafe Functions

As with dereferencing raw pointers, calling an unsafe function requires an unsafe block.

    unsafe fn foot_gun() {
       ...
    }

    unsafe {
        foot_gun();
    }

With the `unsafe` block, we’re asserting to Rust that we’ve read the function’s documentation, we understand how to use it properly, and we’ve verified that we’re fulfilling the contract of the function.

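If you have to call unsafe functions, it is best to wrap them in safe abstractions that can be called from safe Rust. The wrapper checks the unsafe function’s preconditions first, for example with an assertion that panics with a clear message, so the unsafe call only runs when its contract is met.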
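A minimal sketch of that pattern follows. Both functions are hypothetical: get_unchecked_copy stands in for any unsafe function with a documented precondition, and get_copy is the safe wrapper that enforces it with a panic check.

```rust
/// Hypothetical unsafe function.
/// Contract: the caller must guarantee `index < slice.len()`.
unsafe fn get_unchecked_copy(slice: &[i32], index: usize) -> i32 {
    // SAFETY: the caller promises that `index` is in bounds.
    unsafe { *slice.as_ptr().add(index) }
}

/// Safe wrapper: verifies the contract, then makes the unsafe call.
fn get_copy(slice: &[i32], index: usize) -> i32 {
    assert!(
        index < slice.len(),
        "index {} out of bounds for length {}",
        index,
        slice.len()
    );
    // SAFETY: the assertion above guarantees `index < slice.len()`.
    unsafe { get_unchecked_copy(slice, index) }
}

fn main() {
    let data = [10, 20, 30];
    println!("{}", get_copy(&data, 1)); // 20
}
```

Callers of get_copy never write unsafe themselves, and a bad index produces a panic instead of undefined behavior.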

Foreign Function Interfaces

So you wanna call functions in languages that aren’t Rust. Sounds like a blast. To do that you go through a foreign function interface (FFI). The extern keyword opens a block that declares the signatures of the external functions. Rust can’t check that foreign code upholds its guarantees, so every call to one of these functions must be wrapped in an unsafe block. In this example the "C" part names the application binary interface (ABI) that the external function uses. The ABI defines how the call is made at the assembly level. Note that the 2024 edition requires the block itself to be written as unsafe extern "C".

extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}

Accessing Mutable Static Variables

Rust has two kinds of global values. Constants are inlined at compile time: the compiler replaces every use of a named constant with its computed value, so the value is effectively duplicated wherever it’s used. Constants are always immutable, and changing one means recompiling the program.

Rust also has static variables. Both constants and static variables have the 'static lifetime, which means they are available for the entire run of the program. Unlike constants, static variables aren’t inlined. Each one occupies a single, fixed memory location, and every use refers to that same location.

Static variables can be immutable or mutable. Accessing an immutable static is safe because the value never changes. A mutable static, however, is shared by every part of the program, so unsynchronized access can cause data races, and reading or writing one requires an unsafe block. Mutable statics are sometimes needed in low-level code or FFIs to track global state.

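This code compiles and prints COUNTER: 3 because it’s single-threaded, but it may misbehave (data races) when several threads touch COUNTER. Recent compilers also lint against taking a reference to a mutable static, as the println! here does, and the 2024 edition denies it by default. Where possible, use thread-safe concurrency patterns and smart pointers!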

static mut COUNTER: u32 = 0;

fn add_to_count(i: u32) {
    unsafe {
        COUNTER += i;
    }
}

fn main() {
    add_to_count(3);
    unsafe {
        println!("COUNTER: {COUNTER}");
    }
}
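One thread-safe alternative, sketched here under the assumption that the global state is a simple counter, is an atomic static. It needs no mut and no unsafe. The thread count of 4 is arbitrary.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// An immutable static whose contents can still change, atomically.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(i: u32) {
    // Atomic read-modify-write, so concurrent calls cannot race.
    COUNTER.fetch_add(i, Ordering::Relaxed);
}

fn main() {
    // Four threads each add 3 to the shared counter.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("COUNTER: {}", COUNTER.load(Ordering::Relaxed)); // COUNTER: 12
}
```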

Unsafe Traits

A trait is considered unsafe when at least one of its methods has some invariant that the compiler cannot verify. That’s cool, just slap an unsafe on that bad boy and call it a day. Whoever writes unsafe impl takes on the job of upholding that invariant.

unsafe trait Foo {
    // methods go here
}

unsafe impl Foo for i32 {
    // method implementations go here
}
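For a more concrete picture, here is a sketch with a made-up trait named Zeroable. Its invariant, which the compiler cannot check, is that a value made entirely of zero bits is valid for the implementing type. The trait and its implementations are illustrative assumptions; only std::mem::zeroed is a real standard library function.

```rust
/// Hypothetical unsafe trait.
/// Invariant: the all-zero bit pattern is a valid value of `Self`.
unsafe trait Zeroable: Sized {
    fn zeroed() -> Self {
        // SAFETY: implementors promise that all-zero bits are a valid `Self`.
        unsafe { std::mem::zeroed() }
    }
}

// Each `unsafe impl` is a promise that the invariant holds for that type.
unsafe impl Zeroable for i32 {}
unsafe impl Zeroable for [u8; 4] {}

fn main() {
    let n = <i32 as Zeroable>::zeroed();
    let bytes = <[u8; 4] as Zeroable>::zeroed();
    println!("{} {:?}", n, bytes); // 0 [0, 0, 0, 0]
}
```

Implementing Zeroable for a reference type would break the promise, because a reference must never be null, and the compiler would not catch it. That is why the impl has to be marked unsafe.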

Unions

Did you know Rust has unions? Well, it does. But they’re mostly for interoperability with C. This section is so short that it just quotes the book’s entire description.

The final action that works only with unsafe is accessing fields of a union. A union is similar to a struct, but only one declared field is used in a particular instance at one time. Unions are primarily used to interface with unions in C code. Accessing union fields is unsafe because Rust can’t guarantee the type of the data currently being stored in the union instance.
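As a small sketch of what that looks like, the hypothetical union IntOrFloat below stores an f32 and reads the same four bytes back as a u32. Writing a field is safe; reading one is not, because Rust does not track which field was written last.

```rust
/// Hypothetical union: both fields share the same four bytes.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

/// Stores a float and reads its bytes back as an integer.
fn float_bits(x: f32) -> u32 {
    let u = IntOrFloat { f: x }; // initializing a union is safe
    // SAFETY: both fields are 4 bytes and every bit pattern is a valid u32.
    unsafe { u.i }
}

fn main() {
    println!("{:#010x}", float_bits(1.0)); // 0x3f800000
}
```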