Ownership in Rust

Rust does not have a garbage collector, nor does it force you to manage memory manually. Instead, Rust has a unique system called ownership. Ownership has many rules and concepts, but the result can be summarized with two primary goals: to ensure that data stays in memory no longer than we need it, and to ensure that no two sources can write to the same memory address at the same time.

Memory Model

To understand ownership we must understand how memory is handled. This section covers the two main memory segments that we deal with in Rust programs: the stack and the heap. These segments become available to our programs at runtime. There are actually four types of memory that are handled by the operating system’s loader, but the other two (the code and static segments) will be discussed later.

The stack

Think of the stack as a stack of paper. The stack grows and shrinks as the program progresses. Sticking with the paper stack analogy, each piece of paper is referred to as a “stack frame”. A stack frame holds the information for one function call and is created when that function is called. Like the stack of paper, the stack data structure manages space in a last in, first out (LIFO) way. Each new frame is added to the top of the stack, and when a function returns its frame is removed, freeing those memory addresses. Adding data is called pushing onto the stack and removing data is called popping off the stack. All data on the stack must have a known, fixed size. This makes sense if we think about it. We cannot allocate x bytes of memory for variable i and then later assign x + n bytes to that location without causing problems. As a result, stack frames include things like variable values with known, fixed sizes (such as primitives) and heap memory addresses for data of unknown size. Each primitive occupies the same space regardless of its value. A signed 32-bit integer (i32) always takes up exactly 32 bits (4 bytes), whether its value is -2,147,483,648, 2,147,483,647, or anything in between.
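As a small sketch of the fixed-size point, the standard library’s std::mem::size_of reports how many bytes a type occupies, whatever value is stored in it. The helper name primitive_sizes is made up for this illustration.

```rust
use std::mem::size_of;

// Hypothetical helper: returns the size in bytes of a few primitive types.
fn primitive_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<i32>(),  // 4 bytes, whatever the value
        size_of::<u32>(),  // 4 bytes
        size_of::<char>(), // 4 bytes (one Unicode scalar value)
        size_of::<f64>(),  // 8 bytes
    )
}

fn main() {
    let (i, u, c, f) = primitive_sizes();
    println!("i32: {i}, u32: {u}, char: {c}, f64: {f}");
}
```

Because these sizes are known at compile time, the compiler can lay out each stack frame before the program ever runs.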

The heap

Data with unknown or variable size is stored on the heap. The heap is less organized, as the name suggests. When you need heap memory, the memory allocator finds a piece of memory big enough and returns a pointer which represents a memory address for the given data. This process is called allocating on the heap. The pointer is a known, fixed size (generally the same bit width as the target architecture), so it can be stored on the stack. For example, on a 64-bit target, your memory addresses and pointers will be 64-bit numbers. Actually accessing the memory requires following the pointer to the heap memory address. The memory allocator does take some time, so using stack memory is faster because no searching is necessary; the computer just pushes data onto the stack.
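The following sketch uses Box<T>, the standard library’s simplest heap-allocating pointer type, to illustrate the idea. Box is not discussed in this section, so treat its use here as an assumption made for the example; the helper name boxed_roundtrip is also made up.

```rust
use std::mem::size_of;

// Hypothetical helper: puts a value on the heap and reads it back through the pointer.
fn boxed_roundtrip(value: i32) -> i32 {
    let boxed: Box<i32> = Box::new(value); // allocates space on the heap
    *boxed // follow the pointer to read the value back
}

fn main() {
    // The Box itself is just a pointer with a fixed size, so it can live on the stack.
    println!("Box<i32> is {} bytes", size_of::<Box<i32>>());
    println!("usize is {} bytes", size_of::<usize>());
    println!("value = {}", boxed_roundtrip(42));
}
```

On a 64-bit target both sizes print as 8, matching the pointer width described above.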

Using Memory

Only the heap requires explicit memory allocation. The stack is managed automatically based on the fixed-size elements the program encounters. The process of managing heap memory involves first allocating, then freeing the memory. For systems languages this has traditionally involved first calling some function to allocate the memory (or construct an object). After we use the memory/data we must manually deallocate the memory. The deallocation is the tricky part. If the deallocation is never performed, we leak memory. If we deallocate too early, we are left with a dangling pointer that may crash the program. Deallocating the memory more than once also results in undefined behavior (UB). The hard part is knowing when and where to free the memory. Alternatively, programming languages with garbage collection (GC) use a runtime system that automatically frees memory. This typically happens in a non-deterministic fashion: the collector periodically scans memory for allocations that are no longer reachable. This can take time and result in a performance hit. It also does not necessarily allow the programmer to specify when to deallocate memory (destroy objects). Rust takes a third path wherein the heap memory owned by a given variable is automatically deallocated when the variable goes out of scope. This is similar to the Resource Acquisition Is Initialization (RAII) pattern used in the C++ language.
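One way to watch scope-based cleanup happen is to implement the Drop trait, which runs code when a value goes out of scope. The sketch below is only an illustration: the Noisy type and drop_order function are made up, and the RefCell used for the shared log is a detail that is only explained at the end of this document.

```rust
use std::cell::RefCell;

// Hypothetical type that records its own cleanup in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    // Called automatically when a Noisy value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Returns the order in which events happened.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // _inner goes out of scope here and is dropped
        log.borrow_mut().push("inner scope ended".to_string());
    } // _outer goes out of scope here and is dropped
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```

The log shows that cleanup happens at a predictable point, the closing brace of the owning scope, rather than whenever a collector decides to run.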

Allocating memory

When you declare a String (or any type with a dynamic size, or a type that needs to outlive the scope it’s declared in) you actually create two different segments of data. Together, these segments form the complete set of data required to use your string (or other heap-allocated data). Consider the following declaration:

let val = String::from("Peter");
// Can also be expressed as converting a string literal to a string object
let val = "Peter".to_string();

This declaration creates two segments. The first segment contains a pointer, the string’s length, and the string’s capacity. This segment is stored on the stack as a single object. The object’s pointer holds the heap memory address where the contents of the string are actually stored. The object’s length is the number of bytes that the contents of the string are currently using. The object’s capacity is the total number of bytes that the string has received from the allocator.

name	value
ptr (pointer)	(memory address)
len (length)	5
capacity	5

The second segment of data represents the string contents itself (along with some memory allocator metadata). The string’s contents are stored on the heap as a contiguous sequence of UTF-8 bytes, which we can picture as a set of indexes and values.

index	value
0	P
1	e
2	t
3	e
4	r
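As a rough sketch, the two segments can be inspected with methods that String provides in the standard library: len(), capacity(), as_ptr(), and as_bytes(). The helper name string_parts is made up for this example. Note that the exact capacity is an allocator detail; the only safe assumption is that it is at least the length.

```rust
// Hypothetical helper: reports the length and capacity stored in the stack part of a String.
fn string_parts(s: &String) -> (usize, usize) {
    (s.len(), s.capacity())
}

fn main() {
    let val = String::from("Peter");
    let (len, capacity) = string_parts(&val);
    println!("ptr = {:p}", val.as_ptr()); // address of the heap buffer; differs between runs
    println!("len = {len}, capacity = {capacity}");
    println!("bytes = {:?}", val.as_bytes()); // the contiguous UTF-8 bytes stored on the heap
}
```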

In this example there is only one owner of the heap-allocated memory. Unless you transfer ownership of val, the system deallocates the heap memory when val goes out of scope. The stack portion (pointer, length, and capacity) is also released when the function’s stack frame is popped.

Bitwise copies

Consider the following example.

let a: i32 = 12;
let b: i32 = a;
println!("a = {}, and b = {}.", a, b);

a = 12, and b = 12.

The first statement declares a variable called a and binds the value 12. The second statement declares another variable called b and binds a copy of the first variable’s value to it. This is possible for a couple of reasons. The most pressing is that the i32 type implements the Copy trait. We’ll talk about traits later; for now just know that the i32 type, and any type that implements the Copy trait, can be copied cheaply and implicitly. These copies occupy separate slots in the stack frame and are represented by different memory addresses. The deeper logical reason is that the i32 type has a known size at compile time: 32 bits. Regardless of the values assigned to them, a and b together occupy 64 bits. This allows us to easily store fixed-width values on the stack because we know they will never need any more memory.

We can prove that these are two different elements by changing one of them.

    let mut a: i32 = 12;
    let b: i32 = a;
    println!("a = {}, and b = {}.", a, b);
    a = 23;
    println!("a = {}, and b = {}.", a, b);

a = 12, and b = 12.
a = 23, and b = 12.

This example hard codes the value, but works the same with unknown input. It also works on all other fixed-width primitive types that implement the Copy trait, such as bool, the integer types, the floating-point types, and char. Shared references such as &str are also Copy; copying one duplicates the reference, not the string data it points to. Tuples may also be copied if all of their component types are Copy.

fn str_type() {
    let s: &str = "This is a str type";
    let t: &str = s;
    let r: &str = t;
    println!("{r}");
}

This is a str type
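The tuple case can be sketched the same way. In this hypothetical copy_tuple function, every field of the tuple is a Copy type, so the assignment copies the whole tuple and later changes to a do not affect b.

```rust
// Hypothetical helper: copies a tuple of Copy types, then changes the original.
fn copy_tuple() -> ((i32, bool), (i32, bool)) {
    let mut a: (i32, bool) = (12, true);
    let b: (i32, bool) = a; // implicit copy: every field is Copy
    a.0 = 23; // changing a does not affect b
    (a, b)
}

fn main() {
    let (a, b) = copy_tuple();
    println!("a = {:?}, and b = {:?}.", a, b);
}
```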

Moving references

A String (covered later) is handled differently because its contents live on the heap and the variable holds only the pointer, length, and capacity that refer to that data. The String type doesn’t implement Copy, so using the same pattern as our previous two examples with heap-allocated types does not work.

let x: String = String::from("Peter");
let y: String = x;
println!("x = {}, and y = {}.", x, y);

error[E0382]: borrow of moved value: `x`
  --> src/bin/ownership.rs:31:37
   |
29 |     let x: String = String::from("Peter");
   |         - move occurs because `x` has type `String`, which does not implement the `Copy` trait
30 |     let y: String = x;
   |                     - value moved here
31 |     println!("x = {}, and y = {}.", x, y);
   |                                     ^ value borrowed here after move
   |
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
   |
30 |     let y: String = x.clone();
   |                      ++++++++

Look at the compiler error. The error indicates that the value x was borrowed after it had been moved, and highlights the move at line 30. But what does this mean? Because the String type doesn’t implement the Copy trait, it cannot be copied implicitly. Instead, the compiler moves ownership of the underlying data at line 30, where we assign x to y. Later, on line 31, we attempt to read x, but cannot do so because x no longer owns the data. The ownership has been transferred to y.

This is where Rust’s dataflow analysis and famous “borrow checker” come in. These are static (compile time) analyses that create and update a (borrow) graph structure that represents the relationships between variables and their references to track ownership throughout the logical flow of the program.

Making copies of these complex types can be more resource/time intensive because each copy requires a new heap allocation. Instead of blindly copying both the variable’s stack and heap data when creating y, the compiler copies only the stack data (pointer, length, and capacity) to y, transfers ownership of the heap data from x to y, and invalidates x. This results in only one stack-allocated owner of the actual heap-allocated values. The heap-allocated data doesn’t actually move anywhere. In this way it is said that the value has moved, or changed ownership. It might be more appropriate to say that the pointer has moved, but c’est la vie.
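To check the claim that the heap data stays put, a minimal sketch can compare the address of the heap buffer before and after a move. The function name heap_address_survives_move is made up; as_ptr() is the standard String method that returns the buffer’s address.

```rust
// Hypothetical helper: returns true if the heap buffer address is unchanged after a move.
fn heap_address_survives_move() -> bool {
    let x = String::from("Peter");
    let before = x.as_ptr(); // address of the heap buffer while x owns it
    let y = x; // ownership moves to y; x can no longer be used
    let after = y.as_ptr(); // address of the heap buffer now that y owns it
    before == after
}

fn main() {
    println!("same heap address after move: {}", heap_address_survives_move());
}
```

Only the three stack values were copied into y; no allocation or copying of the string contents took place.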

To get rid of the compiler error we can either remove the later use of the original variable x or explicitly copy the data to y with a clone() operation. For a String, clone() is not a simple bitwise copy of the stack data; it allocates new heap memory and copies the contents too. This is known as a “deep” copy. The clone() operation must always be called explicitly.

let x: String = String::from("Peter");
let mut y: String = x.clone();
y.push_str(" Schmitz");
println!("x = {}, and y = {}.", x, y);

x = Peter, and y = Peter Schmitz.

Ownership And Scope

We’ve already learned that Rust tracks ownership through the logical flow of the program with the dataflow analysis and the borrow checker. As we saw with both shadowing and the copy vs clone issue, Rust invalidates variables and references based on the logical path of the program. Consider the following example.

fn main() {
    let s: String = String::from("Hello");
    println!("Modified greeting: {}", var_scope_complex(s));
    println!("Original greeting: {s}");
}
fn var_scope_complex(i: String) -> String {
    let mut a: String = i;
    a.push_str(" world");
    return a;
}

One might think the output is two lines: Hello world followed by Hello. This is not the case in Rust. The above example produces a compile error.

error[E0382]: borrow of moved value: `s`
  --> src/bin/ownership.rs:76:34
   |
73 |     let s: String = String::from("Hello");
   |         - move occurs because `s` has type `String`, which does not implement the `Copy` trait
74 |     //let cloned_var: String = s.clone();
75 |     println!("Modified greeting: {}", var_scope_complex(s));
   |                                                         - value moved here
76 |     println!("Original greeting: {s}");
   |                                  ^^^ value borrowed here after move
   |
note: consider changing this parameter type in function `var_scope_complex` to borrow instead if owning the value isn't necessary
  --> src/bin/ownership.rs:49:25
   |
49 | fn var_scope_complex(i: String) -> String {
   |    -----------------    ^^^^^^ this parameter takes ownership of the value
   |    |
   |    in this function
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
   |
75 |     println!("Modified greeting: {}", var_scope_complex(s.clone()));
   |                                                          ++++++++

The compiler gives us some fantastic tips on how to remedy the issue. The first hint is that the type of the original variable s, String, does not implement the Copy trait. Next we see where s is moved. In the first print macro we pass the String variable s to the var_scope_complex() function. The function takes ownership, but does not give it back to s. Instead, the value is moved again when the new variable a is bound, and the function returns a as a new value that main() never binds to a name. This is all fine, but the real trouble comes in the next line (76), which indicates that we tried to use a value after it was moved! At this point the s variable in main() has been invalidated, so the second println! has nothing valid to borrow.

The error message points out two options: change the var_scope_complex() function to borrow the value (for example, as &String) instead of owning it, or clone the value before passing it. An immutable reference is not enough to call push_str() (because it’s a mutator), and a mutable reference would change the original greeting, so here we clone the string.

fn main() {
    let s: String = String::from("Hello");
    let cloned_var: String = s.clone();
    println!("Modified greeting: {}", var_scope_complex(cloned_var));
    println!("Original greeting: {s}");
}
fn var_scope_complex(i: String) -> String {
    let mut a: String = i;
    a.push_str(" world");
    return a;
}

Now the program compiles, runs, and prints the following expected output.

Modified greeting: Hello world
Original greeting: Hello
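For comparison, the compiler’s other suggestion (borrowing) might look like the sketch below. The function name var_scope_borrowed is hypothetical, and taking &str rather than &String is a choice made for this example. The function copies the borrowed text into a new String, so an allocation still happens, but the caller keeps ownership of s without having to clone it first.

```rust
// Hypothetical borrowing variant: takes a string slice and returns a new String.
fn var_scope_borrowed(i: &str) -> String {
    let mut a: String = i.to_owned(); // copy the borrowed text into a new owned String
    a.push_str(" world");
    a
}

fn main() {
    let s: String = String::from("Hello");
    println!("Modified greeting: {}", var_scope_borrowed(&s));
    println!("Original greeting: {s}"); // s is still valid: it was only borrowed
}
```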

All of this is due to the way non-Copy types such as String are handled in Rust. To prove this, we can write a similar program but use primitive types instead. The following example works just fine because the i32 type implements the Copy trait, so passing a primitive to a function implicitly copies it into the new stack frame. No ownership is transferred, so the original variable stays valid.

fn main() {
    let n: i32 = 30;
    println!("Modified number: {}", var_scope_primitive(n));
    println!("Original number: {n}");
}
fn var_scope_primitive(i: i32) -> i32 {
    let a: i32 = i + 10;
    return a;
}

Modified number: 40
Original number: 30

This principle also extends to returning values from functions. A function can hand ownership back to its caller through its return value. Consider the following example to illustrate the concept of moving again.

fn main() {
    let s1 = String::from("Hello"); //1.
    let s2 = takes_and_gives_back(s1); //2.
    println!("{s2}"); //3.
}
fn takes_and_gives_back(a_string: String) -> String {
    a_string
}

1. The variable s1 comes into scope.
2. The variable s1 is passed to the takes_and_gives_back() function, which moves ownership into the function and invalidates s1. The function returns the String, and ownership moves to the new variable s2.
3. The variable s2 is borrowed by the println! macro for output.

This principle can lead to some rather inconvenient issues. For example, what if we want to calculate the length of a string?

// INVALID!
fn main() {
    let s1: String = String::from("Peter"); //1.
    let s2: usize = calc_len(s1); //2.
    println!("The string {s1} is {s2} characters long.");  //3.
}
fn calc_len(s: String) -> usize {
    let i: usize = s.len();
    return i
}

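This might work in another language, but not in Rust! Let’s go through the steps that lead to the compiler’s “borrow of moved value” error.

1. The variable s1 comes into scope.
2. s1 is moved into calc_len(), which returns the length as s2. The String itself is dropped when calc_len() ends.
3. s1 is no longer valid, so println! cannot use it.

One way to fix this is to call clone() on s1.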

//let s2: usize = calc_len(s1);
let s2: usize = calc_len(s1.clone()); //Fixes the issue, but it's expensive

It is also possible to solve this situation by returning a tuple that hands the String back along with its length. This requires a bit more code.

fn main() {
    let s1: String = String::from("Peter");
    let tup: (String, usize) = calc_len(s1); //Creates a tuple
    let s2: String = tup.0; //These two lines access the tuple's values
    let len: usize = tup.1;
    println!("The string {s2} is {len} characters long.");
}
fn calc_len(s: String) -> (String, usize) {
    let i: usize = s.len();
    return (s, i)
}

It is possible to simplify the tuple a bit.

    let tup: (String, usize) = calc_len(s1);
    let s2: String = tup.0;
    let len: usize = tup.1;
    println!("The string {s2} is {len} characters long.");

Becomes

    let (s2, len) = calc_len(s1);  //Does the same as above
    println!("The string {s2} is {len} characters long.");

But this is still a pain because we have to return the String to the calling function to use it afterwards, since the String itself was moved into the function. Essentially calc_len() takes ownership and gives it back to the let statement, which binds it for use in the println! macro. To solve this properly we need to introduce references.

References and Borrowing

A reference is like a pointer. References are indicated by the ampersand (&) character. A reference is a fixed-size value that holds the address of some other data (on the heap or on the stack) and does not own that data. References are how we avoid moving or cloning (“deep” copying) values while keeping ownership clean in Rust. At any given time we can have either any number of immutable references or exactly one mutable reference to a value.

Let’s take another look at a program we’ve seen before. This program prints the length of a String using references rather than the expensive clone() solution.

fn main() {
    let s1: String = String::from("Peter");
    let s2: usize = calc_len(&s1);     //Create a reference type with &
    println!("The string {s1} is {s2} characters long.");
}
fn calc_len(s: &String) -> usize {     //The function takes a reference type
    let i: usize = s.len();  //More explicit than simple (and valid) s.len()
    return i
}

The difference here is that we are now creating and passing a reference instead of moving or cloning the original s1 value. The calc_len() function must also expect a reference type instead of either a moved or cloned String. This is exactly what reference types allow: referring to a value without taking ownership. The idea of taking a reference instead of ownership is called borrowing because the function uses the value without actually owning it.

Conversely, we can dereference a value with the asterisk (*) character.
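A minimal sketch of dereferencing follows. The function name double is made up for the example. Method calls such as s.len() dereference automatically, which is why the earlier calc_len() example never needed an explicit asterisk.

```rust
// Hypothetical helper: reads the value behind a reference and doubles it.
fn double(n: &i32) -> i32 {
    *n * 2 // * follows the reference to the underlying i32
}

fn main() {
    let x: i32 = 21;
    let r: &i32 = &x; // r borrows x
    println!("r points at {}", *r); // explicit dereference to read the value
    println!("doubled: {}", double(r));
}
```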

Reference mutability

So what happens if we try to mutate (change) a referenced (borrowed) value? Well, you probably already know because everything is immutable by default in Rust, but let’s find out.

fn main() {
    let s: String = String::from("Hello");
    change(&s);
}
fn change(a: &String) {
    a.push_str(", world");
}

error[E0596]: cannot borrow `*a` as mutable, as it is behind a `&` reference
  --> src/bin/ownership.rs:65:5
   |
65 |     a.push_str(", world");
   |     ^ `a` is a `&` reference, so the data it refers to cannot be borrowed as mutable
   |
help: consider changing this to be a mutable reference
   |
64 | fn change(a: &mut String) {
   |               +++

For more information about this error, try `rustc --explain E0596`.
error: could not compile `learning_rust` (bin "ownership") due to previous error

Surprise! This doesn’t work because references, like variables, are immutable by default. To mutate a value through a reference, the variable being referenced must be declared mutable and the reference itself must be a mutable reference (&mut). Because Rust is more responsible than we are, we must now update the variable declaration, the calling argument, and the function parameter to match what we’re trying to mutate.

fn main() {
    let mut s: String = String::from("Hello"); //Makes s mutable
    change(&mut s); //Mutable argument
}
fn change(a: &mut String) {  //Mutable parameter
    a.push_str(", world");
}

By the same logic it is also not legal to have a mutable reference while immutable references to the same value are still in use. This makes sense because we don’t want data that we borrowed as immutable to suddenly change through another reference.

    let mut s = String::from( "Hello" );
    let r1 = &s;
    let r2 = &s;
    let r3 = &mut s;
    println!( "{r1} and {r2} and {r3} " );

error[E0502]: cannot borrow `s` as mutable because it is also borrowed as immutable
  --> src/bin/ownership.rs:70:14
   |
68 |     let r1 = &s;
   |              -- immutable borrow occurs here
69 |     let r2 = &s;
70 |     let r3 = &mut s;
   |              ^^^^^^ mutable borrow occurs here
71 |     println!( "{r1} and {r2} and {r3} " );
   |                ---- immutable borrow later used here

We can include multiple immutable references within the same scope because the value they point to is not expected to change. However, if we had more than one mutable reference in scope, each could change the value out from under the others. Therefore, only ONE mutable reference to a value can be live at a time. Allowing only one mutable reference at a time helps prevent undefined behavior (UB): the compiler refuses to compile code in which a data race could occur.

    let a: String = String::from("IDK");
    let b: &String = &a;
    let c: &String = &a;
    println!("{b} and {c}"); //Works just fine!

    let mut x: String = String::from("Sup");
    let y: &String = &mut x;
    let z: &String = &mut x;
    println!("{y} and {z}");  //Uh oh!

error[E0499]: cannot borrow `x` as mutable more than once at a time
  --> src/bin/ownership.rs:71:22
   |
70 |     let y: &String = &mut x;
   |                      ------ first mutable borrow occurs here
71 |     let z: &String = &mut x;
   |                      ^^^^^^ second mutable borrow occurs here
72 |     println!("{y} and {z}");
   |               --- first borrow later used here

For more information about this error, try `rustc --explain E0499`.
error: could not compile `learning_rust` (bin "ownership") due to previous error

It’s possible to have multiple mutable references, but they must exist within different scopes.

    let mut s = String::from( "Hell" );
    {
        let r1 = &mut s;
        r1.push_str("o");
        println!("{r1}");
    }
    {
        let r2 = &mut s;
        r2.push_str(", world");
        println!("{r2}");
    }

Hello
Hello, world

Note that a reference’s scope starts where it’s introduced and continues through the last time it is used, rather than to the end of the enclosing block as with a variable. The following code is equally valid because the last use of the first reference r1 comes before s is mutably borrowed again.

    let mut s = String::from("Hell");

    let r1 = &mut s;
    r1.push_str("o");
    println!("{r1}");

    let r2 = &mut s;
    r2.push_str(", world");
    println!("{r2}");

Dangling pointers

Dangling pointers occur when the memory a pointer refers to is freed while the pointer is still in use. In safe Rust there are no dangling references. The compiler ensures that the data a reference points to will not go out of scope before the reference does.

fn dangle() -> &String {
    let s: String = String::from("Peter");  //Creates String
    &s  //Creates String reference
}  //String s goes out of scope and memory is dropped

fn main() {
    let ref_to_nowhere: &String = dangle(); //Expected reference doesn't exist!
}

error[E0106]: missing lifetime specifier
  --> src/bin/ownership.rs:65:16
   |
65 | fn dangle() -> &String {
   |                ^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime
   |
65 | fn dangle() -> &'static String {
   |                 +++++++

For more information about this error, try `rustc --explain E0106`.
error: could not compile `learning_rust` (bin "ownership") due to previous error

Lifetimes and static memory will be explained later. For now, we can read the compiler error’s first help message, which says that the function returns a borrowed value but there is no value for it to be borrowed from. Why is that? If we look at the dangle() function we can see that s goes out of scope at the end of the function, so there is nothing left for the returned reference to point to. To fix this we need to make the returned value owned instead of referenced. That way ownership passes to the calling function, which becomes responsible for the data. We can modify the example to remove the reference from the function signature and pass the ownership to ref_to_nowhere.

fn dangle() -> String {
    let s: String = String::from("Peter");
    s
}

fn main() {
    let ref_to_nowhere: String = dangle();
}

One side note is that if this function had parameters, the error would indicate that the function returns a value referencing data owned by the current function instead. The solution is the same, unless we want to specify lifetimes. See the Lifetimes section for more information.

Memory Safety & Leaks

Rust bills itself as a memory-safe language, but that does not guarantee that programs will always be free from memory leaks. Memory leaks can happen in Rust when programs create reference cycles. For example, it is possible to use smart pointers like Rc<T> and RefCell<T> to create situations in which reference counts never reach zero.

“Creating reference cycles is not easily done, but it’s not impossible either. If you have RefCell values that contain Rc values or similar nested combinations of types with interior mutability and reference counting, you must ensure that you don’t create cycles; you can’t rely on Rust to catch them.”
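A reference cycle of the kind described above could be sketched as follows. The Node type and make_cycle function are made up for this illustration; Rc::clone, Rc::strong_count, and RefCell::borrow_mut are standard library calls. The sketch leaks two small allocations on purpose.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical node type: each node may point at another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

// Builds two nodes that point at each other and returns their strong counts.
fn make_cycle() -> (usize, usize) {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // a -> b -> a: a cycle
    (Rc::strong_count(&a), Rc::strong_count(&b))
    // When a and b go out of scope, each count only drops to 1, so neither
    // Node is ever freed: the memory is leaked.
}

fn main() {
    let (a_count, b_count) = make_cycle();
    println!("strong counts: a = {a_count}, b = {b_count}");
}
```

The program compiles without complaint, which is the point of the quotation: the compiler does not detect the cycle.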

Strong vs weak references

A “strong reference” prevents the garbage collector from deallocating memory held by an object or reference still in use/scope. Rust does not have a garbage collector, but it does use strong references to keep data alive until it can be deallocated. Strong references are how you can share ownership in smart pointers like Rc<T>. A “weak reference” does not prevent the system from deallocating memory, even if there is a reference to the data/object still live. Weak references do not express ownership and their count does not affect when data can be deallocated/cleaned up.
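The difference can be sketched with Rc<T> and its companion type Weak<T>. The function name weak_demo is made up; Rc::downgrade, Weak::upgrade, Rc::strong_count, and Rc::weak_count are standard library calls.

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper: returns (strong count, weak count, whether the data
// is still reachable through the weak reference after the owner is dropped).
fn weak_demo() -> (usize, usize, bool) {
    let strong: Rc<String> = Rc::new(String::from("Peter"));
    let weak: Weak<String> = Rc::downgrade(&strong); // does not express ownership
    let counts = (Rc::strong_count(&strong), Rc::weak_count(&strong));
    drop(strong); // the last strong reference is gone, so the String is deallocated
    let still_alive = weak.upgrade().is_some(); // upgrade() returns None once the data is gone
    (counts.0, counts.1, still_alive)
}

fn main() {
    let (strong, weak, alive) = weak_demo();
    println!("strong = {strong}, weak = {weak}, alive after drop = {alive}");
}
```

Replacing one direction of a would-be cycle with a Weak<T> is the usual way to keep reference counts able to reach zero.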

Ownership Rule Summary

We’ve explored ownership and borrowing with some depth. Let’s recap some rules we learned.

A variable that owns its data holds it by value, and ownership can be passed by value. If an owned value is passed to a function, it is no longer valid (it has been moved) within the scope it was passed from. To return ownership to the original scope, the function must return the value and the caller must bind it again.
A given scope can include multiple immutable references. An immutable reference looks like &x where x is the variable name. For shared, read-only ownership of data from multiple places in the same thread (multiple ownership), use smart pointers like Rc<T>.
A given scope can only include ONE live mutable reference to a value. Mutable references look like &mut x where x is the variable name. It is common to see &mut self parameters in methods where you want to mutate the struct instance. It’s possible to have multiple mutable references to data, but they must exist within different scopes, ensuring that only one routine mutates the value at a time. You cannot typically have multiple mutable references to the same value in a single scope without either the interior mutability pattern or unsafe Rust. Interior mutability suspends the compile time borrow checks, but still enforces the borrow rules at runtime. Unsafe Rust, invoked within unsafe {} blocks, also allows multiple mutable access to variables through raw pointers.
You generally cannot mix immutable and mutable references. In situations where we can ensure that the borrowing rules will be followed at runtime we can skirt this borrow check at compile time with types like RefCell<T>. It is important to mention that this doesn’t break the ownership rules. If multiple mutable borrows occur at runtime the program will still panic, but at least it will compile. This uses the interior mutability pattern. See the cell module documentation for details about interior mutability. RefCell<T> has borrow() and borrow_mut() methods that we can use on the same value. Weak reference functionality comes from Rc<T>, through Rc::downgrade() and the Weak<T> type, and is often combined with RefCell<T>.
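The runtime enforcement mentioned in the last rule can be sketched with RefCell<T>. The function names are made up for this example. It uses try_borrow_mut(), the non-panicking counterpart of borrow_mut(), so that the conflict can be observed without crashing the program.

```rust
use std::cell::RefCell;

// Hypothetical helper: returns whether a second mutable borrow is allowed
// while the first one is still active.
fn second_mut_borrow_allowed() -> bool {
    let cell = RefCell::new(String::from("Hello"));
    let first = cell.borrow_mut(); // first mutable borrow is active
    let second = cell.try_borrow_mut(); // Err: the value is already mutably borrowed
    let allowed = second.is_ok();
    drop(second);
    drop(first);
    allowed
}

// Hypothetical helper: mutates through a RefCell that was not declared mut.
fn append_world() -> String {
    let cell = RefCell::new(String::from("Hello"));
    cell.borrow_mut().push_str(", world"); // the borrow ends at the end of this statement
    cell.into_inner()
}

fn main() {
    println!("second mutable borrow allowed: {}", second_mut_borrow_allowed());
    println!("{}", append_world());
}
```

Had the second borrow used borrow_mut() instead, the program would have compiled and then panicked at that line.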