
Unsafe Rust

The Rust compiler is conservative by design. It is better to reject potentially correct code than to allow potentially problematic code. As a result, Rust includes an option to declare “unsafe” blocks that allow for the following operations that aren’t allowed otherwise:

  • Dereference a raw pointer
  • Call an unsafe function or method
  • Access or modify a mutable static variable
  • Implement an unsafe trait
  • Access fields of union types

Unsafe Rust isn’t magic, but it does require special attention to detail. Unfortunately, those details are incredibly deep. This page only covers some basics. For a true look into the complex nature of safety in Rust, head over to the Rustonomicon.

As a general rule, it’s wise to keep unsafe scopes as small as possible to make them easier to debug later. In some cases this means an unsafe block that wraps just a single line. Furthermore, Rust recommends encapsulating all unsafe code within a safe abstraction so that callers only ever see a safe API.

Raw Pointers

For a refresher on pointers in Rust see the Reference Types and Ownership sections.

Raw pointers, like references, come in immutable and mutable forms, written *const T and *mut T, respectively. A mutable raw pointer can be used to mutate the data it points to; an immutable raw pointer cannot. A raw pointer created from a reference simply holds the address of the value that reference points to (in the examples below, an object in the current stack frame). To access the underlying data, a raw pointer must be dereferenced with the dereference operator (*) inside an unsafe block. Raw pointers differ from references in the following ways:

  • Allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
  • Aren’t guaranteed to point to valid memory
  • Are allowed to be null
  • Don’t implement any automatic cleanup
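The characteristics above can be sketched in a few lines. This is a minimal illustration only; the function and variable names are made up for the example.

```rust
use std::ptr;

// Hypothetical helper: mutates a value through one raw pointer and
// reads it back through an aliasing one.
fn bump_through_alias() -> i32 {
    let mut num = 5;

    // The borrowing rules are not enforced on raw pointers: a mutable and
    // an immutable raw pointer to the same location can coexist.
    let p_mut: *mut i32 = &mut num;
    let p_const: *const i32 = p_mut as *const i32;

    // A raw pointer is allowed to be null, and checking for null is safe.
    let nothing: *const i32 = ptr::null();
    assert!(nothing.is_null());
    assert!(!p_const.is_null());

    unsafe {
        // SAFETY: both pointers were derived from `num`, which is still
        // alive and is not borrowed anywhere else.
        *p_mut += 1;
        *p_const
    }
}

fn main() {
    println!("{}", bump_through_alias());
}
```

Only the two dereferences need the unsafe block. Creating the pointers, casting them, and checking for null are all safe operations.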

You can declare raw pointers anywhere. You can even print the addresses they hold in safe code.

let mut num = 5;
let r1 = &num as *const i32; // Raw pointer!
let r2 = &mut num as *mut i32; // Raw pointer!
println!("{:#?}", r1);
println!("{:#?}", r2);
0x00007ff7b59e77dc
0x00007ff7b59e77dc

However, if you want to actually touch the data, you need to dereference the raw pointers, which requires an unsafe block.

unsafe {
    println!("{:?}", *r1);
    println!("{:?}", *r2);
}
5
5

This example illustrates a slightly more complex data object, a String.

let val = String::from("Peter"); // Creates heap data and stack object
let val_ptr: *const String = &val; // Creates immutable raw pointer from object reference
unsafe {
    println!("{}", *val_ptr) // Dereferences raw pointer to access underlying data
}

Immutable raw pointers can’t be used to reassign the data they point to.

let val = "Peter".to_string();
let p: *const String = &val;
unsafe {
    // Illegal: can't assign to data through an immutable raw pointer
    *p = "".to_string();
}
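For contrast, here is a sketch of the same assignment through a mutable raw pointer, which does compile. The function name and the replacement value are made up for illustration.

```rust
// Hypothetical helper: replaces a String through a *mut String.
fn rename() -> String {
    let mut val = String::from("Peter");
    let p: *mut String = &mut val;
    unsafe {
        // SAFETY: `p` points to `val`, which is alive, initialized, and
        // not borrowed anywhere else. The assignment drops the old String.
        *p = String::from("Paul");
    }
    val
}

fn main() {
    println!("{}", rename());
}
```

Note that the binding itself has to be declared with let mut, because a *mut String created this way starts life as a &mut reference.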

Did you get all that? No? Still confused about what a raw pointer actually does? How about a deep dive that explains what’s happening in more specific terms, walking through both an immutable reference and an immutable raw pointer to illustrate the difference.

This example starts by creating a String and binding it to val on line 3. The operation actually involves two pieces of memory, as described in the Ownership section. The first is the string data itself, which is allocated on the heap; the allocation returns a pointer to that data. The second is the String object, which lives in the function’s stack frame and holds the returned pointer to the heap data along with information about the string’s size.

Next, the example creates an immutable reference &val on line 6 and binds it to val_ref. This reference points to the String object on the stack. Through it you can reach the underlying data for things like printing the string, because the String object records the data’s address and length (line 8).

Line 12 then creates another immutable reference, which is coerced to a raw pointer and bound to val_ptr. The raw pointer type is declared with the *const T syntax. The raw pointer points to the same String object on the stack as the reference on line 6 did. If you print the raw pointer (line 14) you get the stack object’s 64-bit memory address (because this was run on a 64-bit operating system), not the underlying heap data. This is safe because you’re not actually touching the data. Actually accessing the data is unsafe, so it requires the unsafe block on line 15. Inside the block, line 17 dereferences the raw pointer and follows the String object’s own pointer to reach the heap data. In this way the String object on the stack is like a smart pointer that contains a raw pointer to the heap address where the string data is stored.

1 pub fn references_and_pointers() {
2     // Heap-allocated value
3     let val = String::from("Peter");
4
5     // Creates an immutable reference to the value
6     let val_ref: &String = &val;
7     // Prints the referenced object
8     println!("Reference: {}", val_ref);
9
10     // Creates a raw pointer to the value
11     // Points to the underlying String object named in the reference &val
12     let val_ptr: *const String = &val;
13     // Pretty prints the raw pointer as a memory address
14     println!("Raw pointer address: {:#?}", val_ptr);
15     unsafe {
16         // Dereferences the raw pointer to print the value at the address
17         println!("Dereferenced raw pointer value: {}", *val_ptr)
18     }
19 }
Reference: Peter
Raw pointer address: 0x00007ff7b0b5e688
Dereferenced raw pointer value: Peter

The NonNull Type

Rust includes a NonNull type which provides additional guarantees over raw pointers, while still requiring unsafe handling.

There are two primary reasons to choose NonNull over *mut T: size optimization and variance.

Size Optimization

This is the most immediate practical difference. Real-world code often wraps raw pointers in an Option, allowing you to check for None to avoid a null pointer error. However, every bit pattern of a *mut T is a valid pointer value, so Option<*mut T> has to store a separate flag alongside the pointer to track whether it is Some or None. The flag itself needs only a byte, but alignment padding typically rounds the whole type up to two pointer widths (16 bytes on a 64-bit machine). Because Rust knows that a NonNull can never be 0 (null), it can use the “0” value to represent None. The result is that Option<NonNull<T>> is exactly the same size as a raw pointer (e.g. 64 bits on a 64-bit machine). The savings per pointer are small, but they add up, especially in large data structures or on memory-constrained embedded systems.
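You can check the size difference yourself with std::mem::size_of. The sketch below is only an illustration: the sizes() helper is made up for the example, and the exact numbers printed depend on the target platform.

```rust
use std::mem::size_of;
use std::ptr::{self, NonNull};

// Hypothetical helper returning the sizes of a raw pointer,
// an Option around a raw pointer, and an Option around a NonNull.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<*mut u64>(),
        size_of::<Option<*mut u64>>(),
        size_of::<Option<NonNull<u64>>>(),
    )
}

fn main() {
    let (raw, opt_raw, opt_non_null) = sizes();
    println!("*mut u64:             {raw} bytes");
    println!("Option<*mut u64>:     {opt_raw} bytes");
    println!("Option<NonNull<u64>>: {opt_non_null} bytes");

    // NonNull::new returns None for a null pointer, so the Option
    // takes over the job of the null check.
    assert!(NonNull::new(ptr::null_mut::<u64>()).is_none());

    let mut x = 7u64;
    assert!(NonNull::new(&mut x as *mut u64).is_some());
}
```

Creating a NonNull with NonNull::new is safe. It is dereferencing the pointer afterwards that still needs an unsafe block.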

Subtyping, Variance, and Lifetimes

This section presents a rather simplistic view of subtypes, variance, and the general use of NonNull. For a deeper dive on variance see the Rustonomicon’s section on Subtyping and Variance.

The official Rust documentation informs us that NonNull<T> is covariant over T, while *mut T is invariant over T. Because Rust does not include the concept of inheritance common in OOP languages, subtyping in Rust mostly deals with lifetimes: a longer lifetime is a subtype of a shorter one, so a &'static str can be used wherever a &'a str is expected. A generic type is covariant over T if that relationship carries through it, meaning that a NonNull<&'static str> is also accepted where a NonNull<&'a str> is expected. For context, both &T and Box<T> are also covariant over T. Covariance is typically desirable when writing data structures like Vec or some sort of List type that owns its data, as it allows your structures to be more flexible with lifetimes. In short, covariance allows references inside T to be safely treated as having shorter lifetimes.

Invariance, on the other hand, does not allow you to provide either a subtype or a supertype to code that requires a specific type. Both *mut T and &mut T are invariant over T. To borrow an analogy from languages with inheritance: if your function takes a pointer to an Animal, you MUST provide a pointer to an Animal. A pointer to a Cat is not accepted. The rule is strict because if you could pass a *mut Animal that actually pointed to Cat storage, it would be possible to write a Dog into that memory location, which would cause memory corruption. In Rust terms, the Dog is a reference with a shorter lifetime written into a slot that is supposed to hold a longer-lived one. Invariance may be necessary or desirable in niche situations, such as types that provide interior mutability through the pointer, where covariance would be unsound.
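The difference in variance shows up at compile time. The following is a minimal sketch with made-up function names: the NonNull version compiles, while the equivalent *mut version (left commented out) is rejected by the compiler.

```rust
use std::ptr::NonNull;

// Compiles: NonNull<T> is covariant over T, so a pointer to a
// `&'static str` is accepted where a pointer to a shorter-lived
// `&'a str` is expected.
fn shorten<'a>(p: NonNull<&'static str>) -> NonNull<&'a str> {
    p
}

// Rejected if uncommented: `*mut T` is invariant over T, so the
// lifetime inside the pointer type cannot be shortened.
// fn shorten_raw<'a>(p: *mut &'static str) -> *mut &'a str {
//     p
// }

// Hypothetical helper that reads a value back through the shortened pointer.
fn read_name() -> String {
    let mut name: &'static str = "Peter";
    let p = NonNull::from(&mut name);
    let q = shorten(p);
    // SAFETY: `q` points to `name`, which is alive for this whole
    // function, and nothing writes through the pointer.
    let s: &str = unsafe { *q.as_ref() };
    s.to_string()
}

fn main() {
    println!("{}", read_name());
}
```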

Unsafe Functions

Similar to raw pointers, you must call unsafe functions from unsafe blocks.

unsafe fn foot_gun() {
    // ...
}
// SAFETY: explain here why this call upholds foot_gun's contract
unsafe {
    foot_gun();
}
With the `unsafe` block, we’re asserting to Rust that we’ve read the function’s documentation, we understand how to use it properly, and we’ve verified that we’re fulfilling the contract of the function.

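If you have to call unsafe functions, it is best to wrap them in safe abstractions that can be called from safe Rust. To do this, have the safe wrapper check every precondition the unsafe function relies on, and panic or return an error (for example with assert! or an Option) before the unsafe call whenever a check fails.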
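As a rough sketch of that pattern, assume a hypothetical unsafe function read_unchecked whose contract is that the index must be in bounds. The safe wrapper read_checked (also a made-up name) verifies the contract before calling it.

```rust
/// Hypothetical unsafe function.
/// Contract: the caller must guarantee that `idx < data.len()`.
unsafe fn read_unchecked(data: &[i32], idx: usize) -> i32 {
    // SAFETY: the caller guarantees that `idx` is in bounds.
    unsafe { *data.as_ptr().add(idx) }
}

/// Safe abstraction: checks the precondition, then makes the unsafe call.
fn read_checked(data: &[i32], idx: usize) -> Option<i32> {
    if idx < data.len() {
        // SAFETY: `idx < data.len()` was checked on the line above.
        Some(unsafe { read_unchecked(data, idx) })
    } else {
        None
    }
}

fn main() {
    let data = [10, 20, 30];
    println!("{:?}", read_checked(&data, 1));
    println!("{:?}", read_checked(&data, 3));
}
```

Returning an Option keeps the failure recoverable. Swapping the if for an assert! gives the panicking variant instead.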

Foreign Function Interfaces

So you wanna call functions in languages that aren’t Rust. Sounds like a blast. To do that you’re gonna have to go through a foreign function interface (FFI). You declare the foreign function’s signature inside an extern block, and you must wrap the actual function call in an unsafe block, because Rust can’t check that foreign code upholds its guarantees. In this example the "C" part defines the application binary interface (ABI) that the external function uses. The ABI defines how the call is made at the assembly level. Note that as of the 2024 edition the block itself must also be written as unsafe extern "C".

extern "C" {
    fn abs(input: i32) -> i32;
}
fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}

Accessing Mutable Static Variables

Rust has a couple of different global variable options. First, there are constants, which are inlined at compile time: the compiler replaces each use of a named constant with its computed value, so the value is effectively duplicated wherever it’s used. Constants are always immutable, and the program must be recompiled to change them.

Rust also has static variables. Both constants and static variables have a 'static lifetime, which means they are available for the entire run of the program. Unlike constants, static variables aren’t inlined at compile time. Instead, each one occupies a single fixed memory location, and every mention of it refers to that same location.

Rust allows for both immutable and mutable static variables. Accessing an immutable static is safe, but because every access goes to the same memory location, accessing a mutable static from more than one thread can cause data races. This means that reading or writing a mutable static requires an unsafe block. Mutable statics may be required for low-level programming or for FFI code that needs to track global state.

This code compiles and prints COUNTER: 3 because it’s single-threaded, but the same pattern may result in abnormalities (data races) in a multithreaded situation. Where possible, use thread-safe concurrency patterns and smart pointers!

static mut COUNTER: u32 = 0;
fn add_to_count(i: u32) {
    unsafe {
        COUNTER += i;
    }
}
fn main() {
    add_to_count(3);
    // Copy the value out first so no reference to the mutable static is created
    let count = unsafe { COUNTER };
    println!("COUNTER: {count}");
}
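One thread-safe alternative is an atomic integer, which needs no unsafe at all. The sketch below is one possible approach rather than the only one; the add_from_threads helper and the thread count are made up for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// An immutable static holding an atomic: safe to share between threads.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(i: u32) {
    // fetch_add is a single atomic read-modify-write, so concurrent
    // calls cannot lose updates.
    COUNTER.fetch_add(i, Ordering::Relaxed);
}

// Hypothetical helper: adds 3 to the counter from each of four threads.
fn add_from_threads() {
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}

fn main() {
    add_from_threads();
    println!("COUNTER: {}", COUNTER.load(Ordering::Relaxed));
}
```

For global state that doesn’t fit in an atomic, a static Mutex is the usual next step.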

Unsafe Traits

A trait is considered unsafe when at least one of its methods has some invariant that the compiler cannot verify. That’s cool, just slap an unsafe on that bad boy and call it a day. The unsafe on the impl is your promise that the implementation upholds that invariant.

unsafe trait Foo {
    // methods go here
}
unsafe impl Foo for i32 {
    // method implementations go here
}
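To make that a little more concrete, here is a sketch built around a hypothetical Zeroable trait (not a standard library trait). The invariant the compiler can’t check is that an all-zero bit pattern is a valid value of the implementing type.

```rust
/// Hypothetical unsafe trait.
/// Contract: implementors promise that a value made entirely of
/// zero bytes is a valid value of the type.
unsafe trait Zeroable: Sized {
    fn zeroed() -> Self {
        // SAFETY: every implementor has promised, via `unsafe impl`,
        // that the all-zero bit pattern is valid for Self.
        unsafe { std::mem::zeroed() }
    }
}

// Sound: zero is a valid u32, and an array of zero bytes is a valid [u8; 4].
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for [u8; 4] {}

// An impl for a reference type such as &str would compile, but it would
// break the contract, because a reference must never be null.

fn main() {
    println!("{}", <u32 as Zeroable>::zeroed());
    println!("{:?}", <[u8; 4] as Zeroable>::zeroed());
}
```

Calling zeroed() is safe for users of the trait. The responsibility sits with whoever writes the unsafe impl.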

Unions

Did you know Rust has unions? Well, it does. But they’re mostly for interoperability with C. This section is so short that it just quotes the book’s entire description.

The final action that works only with unsafe is accessing fields of a union. A union is similar to a struct, but only one declared field is used in a particular instance at one time. Unions are primarily used to interface with unions in C code. Accessing union fields is unsafe because Rust can’t guarantee the type of the data currently being stored in the union instance.
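A short sketch of what that looks like in practice follows. The IntOrFloat union and the float_bits helper are made up for illustration.

```rust
// Hypothetical union with two fields that share the same 4 bytes.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

// Writes the value as an f32 and reads the same bytes back as a u32.
fn float_bits(value: f32) -> u32 {
    // Initializing a union is safe: exactly one field is written.
    let u = IntOrFloat { f: value };
    // SAFETY: both fields are 4 bytes wide and every bit pattern is a
    // valid u32, so reading `i` after writing `f` is well defined.
    unsafe { u.i }
}

fn main() {
    println!("{:#010x}", float_bits(1.0));
}
```

Only the read needs unsafe, since Rust has no way of knowing which field was written last.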