Unsafe Rust
The Rust compiler is conservative by design: it would rather reject some correct code than accept code that might be problematic. As an escape hatch, Rust provides the `unsafe` keyword, which permits five operations that are otherwise disallowed:
- Dereference a raw pointer
- Call an unsafe function or method
- Access or modify a mutable static variable
- Implement an unsafe trait
- Access fields of unions
Unsafe Rust isn’t magic, but it does require attention to detail. Rust recommends keeping `unsafe` blocks as small as possible so they are easier to audit and debug later. Furthermore, Rust recommends encapsulating any unsafe code within a safe abstraction so that callers only ever see a safe API. Smart.
Raw Pointers
For a refresher on pointers in Rust see the Reference Types and Ownership sections.
Raw pointers, similar to references, can be either immutable or mutable, written as `*const T` and `*mut T`, respectively. Mutable raw pointers can mutate the data they point to. Naturally, immutable raw pointers cannot. A raw pointer created from a reference simply holds the address of the object that reference pointed to. To access the underlying data, raw pointers must be dereferenced with the dereference operator inside an `unsafe` block. Unlike references, raw pointers:
- Allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
- Aren’t guaranteed to point to valid memory
- Are allowed to be null
- Don’t implement any automatic cleanup
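The first and third characteristics in the list above can be sketched as follows. The function names are made up for illustration. Creating aliasing or null raw pointers is safe, and only the dereference needs `unsafe`.

```rust
/// Writes through two mutable raw pointers that alias the same integer,
/// then reads the result back through the first pointer.
/// (Hypothetical helper, for illustration only.)
fn write_through_aliases(value: &mut i32) -> i32 {
    // Two mutable raw pointers to one location: the borrow checker does not object.
    let first: *mut i32 = value;
    let second: *mut i32 = first;

    // Dereferencing is the unsafe part. Both pointers are valid here because
    // they were derived from a live, exclusive reference.
    unsafe {
        *first += 1;
        *second += 10;
        *first
    }
}

/// Creates a null raw pointer in safe code and checks it without dereferencing it.
fn null_pointer_is_allowed() -> bool {
    let nothing: *const i32 = std::ptr::null();
    nothing.is_null()
}
```

Calling `write_through_aliases(&mut n)` with `n = 5` leaves `n` at 16. The null pointer is never dereferenced, which is why the second function needs no `unsafe` at all.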
You can declare raw pointers anywhere. You can even access the addresses they point to in safe code.
let mut num = 5;
let r1 = &num as *const i32; // Raw pointer!
let r2 = &mut num as *mut i32; // Raw pointer!

println!("{:#?}", r1);
println!("{:#?}", r2);

0x00007ff7b59e77dc
0x00007ff7b59e77dc
However, if you want to actually touch the data you need to dereference the raw pointers, which requires an `unsafe` block.

unsafe {
    println!("{:?}", *r1);
    println!("{:?}", *r2);
}

5
5
This example illustrates the same idea with a slightly more complex data object, a `String`.

let val = String::from("Peter"); // Creates heap data and stack object
let val_ptr: *const String = &val; // Creates immutable raw pointer from object reference
unsafe {
    println!("{}", *val_ptr) // Dereferences raw pointer to access underlying data
}
Immutable raw pointers can’t be used to reassign data.

let val = "Peter".to_string();
let p: *const String = &val;
unsafe {
    // Compile error: cannot assign to data through an immutable (*const) raw pointer
    *p = "".to_string();
}
Did you get all that? No? Still confused about what a raw pointer actually does? How about a deep dive that explains what’s happening in more specific terms, walking through both an immutable reference and an immutable raw pointer to illustrate the difference.
This example starts by creating a `String` object and binding it to `val` on line 3. The operation actually involves two pieces of memory, described in the Ownership section. The first is a heap allocation that holds the actual string data. The second is the `String` object itself, which lives in the function’s stack frame and contains a pointer to the heap data along with the string’s length and capacity.

The example next creates an immutable reference `&val` on line 6 and binds it to `val_ref`. This reference points to the `String` object on the stack. Through it you can reach the underlying data for things like printing the string, because the object records the data’s address and length (line 8). Next, the example creates another immutable reference on line 12 and coerces it into a raw pointer, `val_ptr`. The raw pointer type is declared with the `*const T` syntax. The raw pointer points to the same `String` object on the stack as the reference did on line 6. If you try printing the raw pointer (line 14) you get the stack object’s 64-bit memory address (because this was run on a 64-bit operating system), not the underlying heap data. This is safe because you’re not actually touching the data. Actually accessing the data is unsafe, so it requires the `unsafe` block on line 15. The `unsafe` block dereferences the raw pointer on line 17, then follows the stack object’s internal pointer to reach the data. In this way the `String` object on the stack is like a smart pointer that contains a raw pointer to the heap address where the string data is stored.
 1 pub fn references_and_pointers() {
 2     // Heap-allocated value
 3     let val = String::from("Peter");
 4
 5     // Creates an immutable reference to the value
 6     let val_ref: &String = &val;
 7     // Prints the referenced object
 8     println!("Reference: {}", val_ref);
 9
10     // Creates a raw pointer to the value
11     // Points to the underlying String object named in the reference &val
12     let val_ptr: *const String = &val;
13     // Pretty prints the raw pointer as a memory address
14     println!("Raw pointer address: {:#?}", val_ptr);
15     unsafe {
16         // Dereferences the raw pointer to print the value at the address
17         println!("Dereferenced raw pointer value: {}", *val_ptr);
18     }
19 }
Reference: Peter
Raw pointer address: 0x00007ff7b0b5e688
Dereferenced raw pointer value: Peter
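To make the two pieces of memory visible, here is a small sketch that returns both addresses side by side. `string_parts` is a hypothetical helper, not something from the standard library. It relies only on `String::as_ptr`, `len`, and `capacity`, and it runs entirely in safe code because nothing is dereferenced.

```rust
/// Returns (address of the String object, address of its heap buffer, length, capacity).
/// Hypothetical helper, for illustration only.
fn string_parts(val: &String) -> (*const String, *const u8, usize, usize) {
    // Where the String object itself lives (the stack, for a local variable).
    let object_ptr: *const String = val;
    // Where the text bytes live (the heap buffer owned by the String).
    let heap_ptr: *const u8 = val.as_ptr();
    (object_ptr, heap_ptr, val.len(), val.capacity())
}
```

For `String::from("Peter")` the length is 5 and the two addresses differ: the first is the object the raw pointer in the listing points to, and the second is the data that the dereference eventually reaches.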
Unsafe Functions
As with dereferencing raw pointers, you must call unsafe functions from inside an `unsafe` block.

unsafe fn foot_gun() {
    // ...
}

unsafe {
    foot_gun();
}
With the `unsafe` block, we’re asserting to Rust that we’ve read the function’s documentation, we understand how to use it properly, and we’ve verified that we’re fulfilling the contract of the function.
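If you have to call unsafe functions, it is best to wrap them in a safe abstraction that safe Rust can call. To do this, have the wrapper check the unsafe function’s preconditions up front and panic gracefully (for example with `assert!`) when they don’t hold, so the unsafe call only ever runs on valid input.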
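As a sketch of that pattern, the function below wraps the unsafe `std::slice::from_raw_parts_mut` in a safe API, in the spirit of the standard library’s `split_at_mut`. It is a simplified illustration for `i32` slices, not the standard library’s actual implementation.

```rust
use std::slice;

/// Splits a mutable slice into two non-overlapping halves at `mid`.
/// Panics if `mid` is greater than the slice length.
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // The graceful panic check: reject a bad `mid` before any unsafe code runs.
    assert!(mid <= len, "mid is out of bounds");

    // Safety: `mid <= len`, so both ranges lie inside the original slice
    // and do not overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers never write `unsafe` themselves. The `assert!` is what makes the wrapper sound: without it, an out-of-range `mid` would produce slices that point outside the original data.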
Foreign Function Interfaces
So you wanna call functions in languages that aren’t Rust. Sounds like a blast. To do that you’re gonna need a foreign function interface (FFI). Declare the foreign function’s signature inside an `extern` block, then wrap the actual function call in an `unsafe` block, because Rust can’t check that foreign code upholds its safety rules. In this example the `"C"` part names the application binary interface (ABI) that the external function uses. The ABI defines how the call is made at the assembly level. (In the 2024 edition the block itself must be written `unsafe extern "C"`.)
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}
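The ABI string is part of a function’s type, which a small sketch can show without linking against any C code. Both function names below are hypothetical. The first is an ordinary Rust function compiled with the C calling convention, and the second accepts any function pointer that uses that convention.

```rust
/// A Rust function compiled with the C calling convention.
/// (Hypothetical name, for illustration only.)
pub extern "C" fn c_abi_double(input: i32) -> i32 {
    input.wrapping_mul(2)
}

/// Calls a function pointer that follows the C ABI.
/// The `extern "C"` in the parameter type is what fixes the calling convention.
fn call_c_abi(callback: extern "C" fn(i32) -> i32, input: i32) -> i32 {
    callback(input)
}
```

Calling `call_c_abi(c_abi_double, -3)` needs no `unsafe`, because both sides are Rust code the compiler can check. The `unsafe` block only becomes necessary when the function body lives outside Rust, as with `abs` above.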
Accessing Mutable Static Variables
Rust has a couple of different global variable options. First there are constants, which are inlined at compile time. This means that the compiler replaces every use of a named constant with its computed value, so the value is duplicated wherever it is used. Constants are always immutable, and the program must be recompiled when they are changed.

Rust also has `static` variables. Both constants and static variables have the `'static` lifetime, which means they are available for the entire run of the program. Unlike constants, static variables aren’t inlined at compile time. Instead, each one occupies a single memory location, and every use of it refers to that same location. Rust allows for both immutable and mutable static variables. Because that one location is shared, accessing immutable `static` variables is safe, but accessing mutable `static` variables can cause data races. This means that accessing mutable static variables requires `unsafe` blocks. Mutable static values may be required for low-level programming or for FFIs that track global state.
This code compiles and prints `COUNTER: 3` because it’s single-threaded, but it may misbehave (data races) in a multithreaded program. Where possible, use thread-safe concurrency patterns and smart pointers!
static mut COUNTER: u32 = 0;

fn add_to_count(i: u32) {
    unsafe {
        COUNTER += i;
    }
}

fn main() {
    add_to_count(3);

    // Copy the value out first so no reference to the mutable static is created.
    let count = unsafe { COUNTER };
    println!("COUNTER: {count}");
}
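One thread-safe alternative is an atomic counter. The sketch below reuses the `COUNTER` and `add_to_count` names from the example above but swaps the mutable static for `std::sync::atomic::AtomicU32`. The `count_in_parallel` helper is hypothetical and exists only to exercise the counter from several threads.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// A global counter that needs no `unsafe`: atomics are safe to share across threads.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(i: u32) {
    COUNTER.fetch_add(i, Ordering::Relaxed);
}

/// Spawns `threads` workers that each add 1 to the counter `per_thread` times,
/// then returns the counter's value once every worker has finished.
fn count_in_parallel(threads: u32, per_thread: u32) -> u32 {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..per_thread {
                    add_to_count(1);
                }
            })
        })
        .collect();

    // Joining guarantees all increments are visible before the final load.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    COUNTER.load(Ordering::Relaxed)
}
```

Starting from zero, `count_in_parallel(4, 1000)` returns 4000 on every run, because `fetch_add` makes each increment a single indivisible step. The `static mut` version offers no such guarantee once more than one thread is involved.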
Unsafe Traits
A trait is considered unsafe when at least one of its methods has some invariant that the compiler cannot verify. That’s cool, just slap an `unsafe` on that bad boy (both the trait declaration and every `impl`) and call it a day.
unsafe trait Foo {
    // methods go here
}

unsafe impl Foo for i32 {
    // method implementations go here
}
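To give the `Foo` skeleton something concrete, here is a sketch with an actual invariant. `Zeroable` is a hypothetical trait invented for this example: implementing it is a promise that the all-zero bit pattern is a valid value of the type, which the compiler has no way to check.

```rust
/// Hypothetical marker trait: implementors promise that the all-zero
/// bit pattern is a valid value of `Self`. The compiler cannot verify this.
unsafe trait Zeroable: Sized {
    fn zeroed() -> Self {
        // Safety: every implementor has promised that all-zero bytes are valid.
        unsafe { std::mem::zeroed() }
    }
}

// Sound: 0 is a valid i32, and four zero bytes are a valid [u8; 4].
unsafe impl Zeroable for i32 {}
unsafe impl Zeroable for [u8; 4] {}
```

The `unsafe impl` is where the author takes responsibility. Implementing `Zeroable` for a reference type, for example, would be unsound because a reference must never be null, and the compiler would not stop you.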
Unions
Did you know Rust has unions? Well, it does. But it’s mostly for interoperability with C. The book’s description is short enough to quote in full:
The final action that works only with unsafe is accessing fields of a union. A union is similar to a struct, but only one declared field is used in a particular instance at one time. Unions are primarily used to interface with unions in C code. Accessing union fields is unsafe because Rust can’t guarantee the type of the data currently being stored in the union instance.
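A minimal sketch of what that looks like in practice follows. The `IntOrFloat` union and the `float_bits` function are hypothetical names chosen for this example. Both fields share the same four bytes, so writing one and reading the other reinterprets the bits.

```rust
/// A hypothetical C-compatible union: both fields occupy the same four bytes.
#[repr(C)]
union IntOrFloat {
    int: u32,
    float: f32,
}

/// Reinterprets the bits of an `f32` as a `u32` by writing one field
/// and reading the other.
fn float_bits(value: f32) -> u32 {
    // Constructing a union is safe; only reading a field requires `unsafe`.
    let data = IntOrFloat { float: value };
    // Safety: every 4-byte bit pattern is a valid u32, so reading `int` is sound.
    unsafe { data.int }
}
```

The result matches the safe standard-library method `f32::to_bits`, which is the better choice in ordinary Rust code. The union form mainly earns its keep when a C API hands you a union type.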