Lifetimes
Lifetimes are a compile-time construct, loosely comparable to sub-types, that track how long references to data remain valid. They prevent “use after free” or “dangling pointer” errors, where a program deallocates memory before it’s done using it. Lifetimes are a distinguishing feature of Rust and help ensure memory safety without a garbage collector. Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.
Lifetime Rules
In many cases lifetimes are implicit and inferred, just as types are inferred, because there is only one available (or logical) choice. However, when dealing with references there are cases where the compiler’s static analysis can’t tell how the lifetimes of different references relate to each other, and so it cannot be certain that every reference is valid whenever it is used. This happens mostly in functions and impl blocks, though lifetimes are also necessary in structs whose fields are references. Similar to how we only have to annotate types when multiple types are possible, we only have to annotate lifetimes when the relationship between references is ambiguous.
There are certain common patterns where we don’t have to annotate lifetimes. As Rust developed, the team recognized situations in which explicit lifetimes amounted to boilerplate because the patterns were ubiquitous and predictable. The language team built those patterns into the compiler’s analysis so that we don’t need to state the lifetimes explicitly. These patterns are called the “lifetime elision rules”. If the compiler applies the rules and still can’t work out the lifetime of every reference in a signature, it stops with an error and usually a hint about where annotations are needed. There are three rules that the compiler follows when there aren’t explicit lifetime annotations. The first rule applies to input lifetimes; the second and third apply to output lifetimes.
1: The compiler assigns a distinct lifetime to each reference parameter of a function. For example, if a function takes one reference argument the compiler assigns one lifetime. If the function takes three reference arguments the compiler assigns three lifetimes, etc.
2: If there is exactly one input lifetime, that lifetime is assigned to all references in the return type.
3: If there are multiple input lifetimes, but one of them belongs to &self or &mut self, the lifetime of self is assigned to all references in the return type.
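As a rough sketch of the three rules in action, the functions below all compile without any explicit annotations. The names (same_length, first_word, Parser, rest) are hypothetical and chosen only for illustration.

```rust
// Rule 1 only: two reference parameters get two separate lifetimes.
// No reference is returned, so nothing else needs to be resolved.
fn same_length(a: &str, b: &str) -> bool {
    a.len() == b.len()
}

// Rules 1 and 2: there is exactly one input lifetime, so the
// returned reference is given that same lifetime.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Hypothetical type used to show rule 3.
struct Parser {
    input: String,
}

impl Parser {
    // Rule 1 gives `&self` and `prefix` separate lifetimes.
    // Rule 3 assigns the lifetime of `&self` to the returned reference.
    fn rest(&self, prefix: &str) -> &str {
        self.input
            .strip_prefix(prefix)
            .unwrap_or(self.input.as_str())
    }
}
```

In each case the compiler fills in the lifetimes itself; writing them out by hand would produce an equivalent signature.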
Lifetime Syntax
Lifetimes need to be both declared and assigned. The overall syntax rules are similar to generics. The lifetime is declared (named) in angle brackets with a leading apostrophe ' (not a back-tick), and is typically given a short name, often a single lowercase letter. In functions the declaration comes after the function name and before the parameter list. In structs the lifetime is declared after the struct name. In implementation blocks the lifetime is declared after the impl keyword and then used after the name of the type being implemented.
fn function<'a>() {
struct Structure<'a> {
impl<'a> Method<'a> {
impl<'a> Trait for Method<'a> {
Lifetime assignment is similar to the declaration. The lifetime name directly follows the reference symbol &. If the reference is mutable, the lifetime sits between the & and the mut keyword, separated from mut by a single space.
&i32 // reference
&'a i32 // reference with an explicit lifetime
&'a mut i32 // mutable reference with an explicit lifetime
Lifetimes are typically used either as a single generic identifier shared by several references, such as all arguments in a function, or in groups of differently named lifetimes that tell the compiler which reference is which. Because of this, a lifetime rarely appears on just one reference in a signature. Note that in every position the lifetime identifier comes immediately after the & and before anything else, such as mut or the type.
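To see the declaration and assignment syntax in compilable form, here is a minimal sketch. Both functions (bump and pick_first) are hypothetical helpers invented for this illustration.

```rust
// `'a` is declared in angle brackets after the function name,
// then placed after `&` and before `mut` on each reference.
fn bump<'a>(counter: &'a mut i32) -> &'a mut i32 {
    *counter += 1;
    counter
}

// Two differently named lifetimes used as a group: the return
// value is tied to `first` only, never to `_second`.
fn pick_first<'a, 'b>(first: &'a i32, _second: &'b i32) -> &'a i32 {
    first
}
```

The second signature is the kind of case where more than one name is useful, because it tells the compiler which argument the result borrows from.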
The static lifetime
There is a reserved lifetime called 'static which says that the reference can live for the entire duration of the program. All string literals implicitly have the 'static lifetime. The text of a string literal is stored directly in the program’s binary, which is always available to the program. Before adding a 'static lifetime to fix a compiler error, consider the intended use of the reference. Many times a 'static declaration is not necessary.
let s = "Hello world"; // String literal with implied type and lifetime
let s: &str = "Hello world"; // String literal with implied lifetime
let s: &'static str = "Hello world"; // String literal with explicit type and lifetime
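One place where 'static is appropriate is a function that only ever returns string literals. The sketch below assumes a hypothetical greeting function and APP_NAME item, neither of which comes from the text above.

```rust
// In a `static` item, an elided reference lifetime is `'static`.
static APP_NAME: &str = "demo";

// Every branch returns a string literal, so the result does not
// borrow from any argument and can be declared `&'static str`.
fn greeting(formal: bool) -> &'static str {
    if formal {
        "Good day"
    } else {
        "Hi"
    }
}
```

Because the returned reference is not tied to any input, the caller can keep it for as long as it likes.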
Lifetimes In Functions
This section starts with a toy example and moves on to a real-world example.
Toy example
Let’s use a simplified example to illustrate how the elision rules work. Spoiler: this example doesn’t actually need explicit lifetime annotations!
fn simple(a: &str) -> &str {
Under the hood, the compiler applies the elision rules in order. The first rule adds lifetimes for each parameter. The process of declaring a lifetime is similar to generics in that the lifetime is declared in angle brackets between the function name and the parameter list. The lifetime is then assigned within the type declaration of each parameter according to the syntax rules defined in the lifetime syntax section.
fn simple<'a>(a: &'a str) -> &str {
Next the compiler applies the second rule: there is exactly one input lifetime, so it is assigned to the return value, which is also a reference. The third rule doesn’t apply because there is no self parameter. In this toy function the compiler can accurately figure out what we mean, so we don’t need to explicitly annotate any lifetimes. The original code works just fine and is simpler and more readable!
fn simple<'a>(a: &'a str) -> &'a str {
//same as
fn simple(a: &str) -> &str {
Practical example
Let’s write a function that compares the lengths of two string slices and returns the longer one. Spoiler: attempting to compile this function without lifetime annotations results in an error.
// Error!
pub fn lifetimes(x: &str, y: &str) -> &str {
    if x.len() > y.len() { x } else { y }
}
So what’s going on here? This function takes two string slice references and returns a string slice reference. The body of our function compares the arguments and returns one of them. If we run through the three rules we can tell why it fails. The first rule creates two lifetimes, one for x and one for y. The second rule does not apply: there is more than one input lifetime, so we don’t know which to assign to the returned reference. The function takes no self reference, so the third rule does not apply either. The result is that the output lifetime is still unknown after all three rules. We get a compiler error because, without a lifetime on the returned reference, the borrow checker can’t verify that the reference will still be valid when the caller uses it.
Remember that lifetime annotations don’t change how long any of the references live. Rather, they describe relationships between the lifetimes of multiple references. This gives the borrow checker the information it needs to accept or reject code that uses them. Annotations never alter when a value is dropped; they only determine which uses of a reference the compiler will allow.
In our example the compiler isn’t the only one who doesn’t know which lifetime to use. We may need the reference to either argument depending on the function logic (depending on which argument is longer). The solution in this case is to declare a generic lifetime and assign it to all the references. A generic lifetime shared this way is no different from any other named lifetime; it is simply applied to more than one reference.
// Good!
pub fn lifetimes<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
This updated code tells Rust that for some lifetime 'a the function takes two arguments that live at least as long as 'a and returns a value that lives at least as long as 'a. In practice this means that the lifetime of the return value is the shorter of the two argument lifetimes, because the compiler can’t know which argument will be returned and so must assume the more restrictive one. The borrow checker then rejects any calling code that doesn’t adhere to these constraints.
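The effect of the shared lifetime is easiest to see from the calling side. The following is a minimal sketch; the caller function longest_len and its variables are hypothetical, and the lifetimes function is repeated so the example stands alone.

```rust
pub fn lifetimes<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

// Hypothetical caller whose two arguments live for different scopes.
pub fn longest_len() -> usize {
    let outer = String::from("a fairly long string");
    let len;
    {
        let inner = String::from("short");
        // `'a` can be no longer than the borrow of `inner`,
        // so `result` is only usable inside this block.
        let result = lifetimes(outer.as_str(), inner.as_str());
        len = result.len();
    } // `inner` is dropped here
    len
}
```

Moving the use of result to after the inner block would be rejected by the borrow checker, even though at run time the returned slice happens to come from outer.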
Returning references
Rust requires lifetime annotations on any function that returns a reference when the elision rules can’t determine the output lifetime, even if it’s obvious to the programmer what we’re after.
// Error!
pub fn sample(a: &str, b: &str) -> &str {
    println!("{b}");
    a
}
// Good!
pub fn sample<'a>(a: &'a str, b: &str) -> &'a str {
    println!("{b}");
    a
}
Specifying lifetimes for return values that originate in argument data indicates that the data returned by the function lives as long as the data passed into the function. The data referenced by a slice needs to be valid for the reference to be valid; if the compiler assumed we were slicing the wrong argument’s contents, it would not perform its safety checks correctly. Connecting arguments with return values in this way may not feel natural coming from other languages, but I’m told it gets easier over time.
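Tying the return value to only one argument has a practical payoff for callers. In this sketch the sample function is repeated from above, while the caller function and its variables are hypothetical.

```rust
pub fn sample<'a>(a: &'a str, b: &str) -> &'a str {
    println!("{b}");
    a
}

// Hypothetical caller: `b` goes out of scope before the result is used.
pub fn caller() -> String {
    let a = String::from("kept");
    let result;
    {
        let b = String::from("temporary");
        result = sample(&a, &b);
    } // `b` is dropped here, but `result` only borrows from `a`
    result.to_string()
}
```

If both parameters shared 'a, this caller would not compile, because the result would then be assumed to borrow from b as well.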
When a returned reference doesn’t refer to any of the function’s parameters, it must refer to a value created within the function. Take the following example. The result value gets cleaned up at the end of the function, but the return value tries to reference it, which would be a dangling reference and is therefore a compiler error.
pub fn dangling_ref<'a>(x: &str, y: &str) -> &'a str {
    let result = String::from("really long string");
    result.as_str()
}
The simplest way around this is to return an owned value instead of a reference, which transfers ownership to the caller and leaves the calling code responsible for the memory.
pub fn transfer(x: &str, y: &str) -> String {
    String::from("really long string")
}
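The same pattern applies when the new value is built from the arguments. As a small sketch, the hypothetical combined function below borrows its inputs but hands back an owned String, so no lifetime annotations are needed.

```rust
// Hypothetical example: the inputs are only borrowed for the
// duration of the call, and the caller owns the returned String.
pub fn combined(x: &str, y: &str) -> String {
    let mut result = String::with_capacity(x.len() + y.len());
    result.push_str(x);
    result.push_str(y);
    result
}
```

The returned String is dropped whenever the caller’s binding goes out of scope, independently of x and y.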
Lifetimes In Structs
Structs can contain fields for both owned and referenced data. As we’ve already covered, the compiler requires all referenced data to have known lifetimes to prevent dangling pointers. Consider the following example. We have a struct called Borrowed that has two fields, one owned and one referenced. This will not compile without a lifetime specifier declared after the struct name and used after the reference character & on every reference field. This is similar to the notation we use with generic types in structs, except that lifetimes are not types themselves; they merely describe how long the referenced data must remain valid. This annotation means an instance of Borrowed can’t outlive the reference it holds in its borrowed field.
struct Borrowed<'a> {
    owned: String,
    borrowed: &'a str,
}
pub fn calling_code() {
    let s = String::from("world");
    let example = Borrowed {
        owned: String::from("Hello"),
        borrowed: &s,
    };
    println!("Together they make: {} {}", example.owned, example.borrowed);
}
Together they make: Hello world
Lifetimes need to be declared for implementations of structs that themselves have declared lifetimes. The elision rules often make it unnecessary to annotate lifetimes in method signatures within properly declared implementation blocks. If a method took &self plus other reference parameters, each parameter would get its own lifetime under rule one, and any returned reference would be given the lifetime of self under rule three.
struct Borrowed<'a> {
    owned: String,
    borrowed: &'a str,
}
// Illegal
impl Borrowed {
    fn method(&self) -> i32 { ... }
}
// Legal
impl<'a> Borrowed<'a> {
    fn method(&self) -> i32 { ... }
}
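For methods that return references, the difference between the elided lifetime of self and the struct’s own 'a can matter. The sketch below reuses the Borrowed struct; the method names owned_part and borrowed_part are hypothetical.

```rust
struct Borrowed<'a> {
    owned: String,
    borrowed: &'a str,
}

impl<'a> Borrowed<'a> {
    // Rule three: the elided output lifetime is that of `&self`,
    // not of `_label`.
    fn owned_part(&self, _label: &str) -> &str {
        self.owned.as_str()
    }

    // Naming `'a` explicitly lets the returned reference outlive
    // the struct instance, since it points at the original data.
    fn borrowed_part(&self) -> &'a str {
        self.borrowed
    }
}
```

A reference obtained from borrowed_part stays usable after the Borrowed value itself is gone, as long as the string it was built from is still alive.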
Mixing Types, Traits, & Lifetimes
Let’s rewrite a previously used example to illustrate how all of these concepts come together.
pub fn lifetimes<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
Say we want to add a parameter of generic type T that implements Display. First we add the generic type T to the declaration and parameter list.
pub fn lifetimes<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str {
Next we need to constrain the parameter so that it only accepts types that implement Display. To do this, add a where clause to the signature.
pub fn lifetimes<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str
where
    T: Display,
{
Finally, we can use our generic in the function body.
use std::fmt::Display;

pub fn lifetimes_2<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str
where
    T: Display,
{
    println!("{}", ann);
    if x.len() > y.len() { x } else { y }
}
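A brief usage sketch shows that T is inferred separately at each call site while 'a behaves as before. The demo function and its arguments are hypothetical; lifetimes_2 is repeated so the block compiles on its own.

```rust
use std::fmt::Display;

pub fn lifetimes_2<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str
where
    T: Display,
{
    println!("{}", ann);
    if x.len() > y.len() { x } else { y }
}

// Hypothetical caller: `ann` is a &str in the first call and an
// integer in the second, since both types implement Display.
pub fn demo() -> &'static str {
    let first = lifetimes_2("apple", "fig", "comparing fruit");
    lifetimes_2(first, "kiwi", 42)
}
```

Since every string argument here is a literal, 'a resolves to 'static and the result can be returned from demo.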