Core Data Types

As you may have already noticed, Rust handles data a little differently than other programming languages. There is an emphasis on safety but also on identity. Because Rust is a statically typed language, it must know the types (and sizes) of all variables at compile time. The compiler can often infer the intended type from context, but it can be smart to add a type annotation to declare the data type explicitly.

    let x = 12; // inferred as default i32 type
    let y: i32 = 23; // explicitly typed as i32

In Rust, primitives encompass scalar types, compound types, and reference types.

Scalar Types

Scalar types represent single values. In Rust there are four scalar types: Booleans, integers, floating-point numbers, and characters. Scalar types in Rust are a little different from the scalars you may have encountered in mathematics. The two usages are not directly related, but both use the term to denote simplicity or indivisibility.

Boolean type

The Boolean type is designated bool and has exactly two values, true and false. A bool occupies a single byte.

Integers

Integers can be either signed (i) or unsigned (u). Signed and unsigned refer to whether or not the type can hold negative values. Think of the unary “negative” operator - as the sign. Unsigned types hold only zero and positive values. Signed numbers are stored using two’s complement representation.

Integers come in five fixed sizes ranging from 8-bit to 128-bit, plus a pointer-sized option (isize and usize) whose width depends on the target architecture. The pointer-sized types are handy for indexing collections. A signed type can store values ranging from -2^(n - 1) to 2^(n - 1) - 1, where n is the number of bits. An unsigned type can store values from 0 to 2^n - 1. An i8 variable can therefore store values from -2^7 to 2^7 - 1, which works out to -128 through 127. A u8 variable can store values ranging from 0 to 2^8 - 1, or 255. Rust defaults to the 32-bit signed type i32, which is a good place to start.

Length	signed	unsigned
8-bit (1-byte)	i8	u8
16-bit (2-byte)	i16	u16
32-bit (4-byte)	i32	u32
64-bit (8-byte)	i64	u64
128-bit (16-byte)	i128	u128
arch	isize	usize
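
The range formulas above can be checked against the constants that the standard library provides for each integer type. The following is a minimal sketch; signed_range and unsigned_max are hypothetical helper names introduced only for this illustration.

```rust
// Hypothetical helper: range of a signed integer with `bits` bits,
// computed as -2^(n - 1) to 2^(n - 1) - 1.
fn signed_range(bits: u32) -> (i128, i128) {
    let half = 2_i128.pow(bits - 1);
    (-half, half - 1)
}

// Hypothetical helper: largest value of an unsigned integer with `bits` bits,
// computed as 2^n - 1.
fn unsigned_max(bits: u32) -> u128 {
    2_u128.pow(bits) - 1
}

fn main() {
    // The formulas agree with the constants the standard library provides.
    assert_eq!(signed_range(8), (i8::MIN as i128, i8::MAX as i128));
    assert_eq!(signed_range(32), (i32::MIN as i128, i32::MAX as i128));
    assert_eq!(unsigned_max(8), u8::MAX as u128);

    println!("i8 range: {} to {}", i8::MIN, i8::MAX);
    println!("u8 range: {} to {}", u8::MIN, u8::MAX);
    // The width of usize depends on the compilation target.
    println!("usize is {} bits wide on this target", usize::BITS);
}
```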

Integer literals can be written in the following formats. Number literals can include a _ for visual separation to make the value easier to read. In this case 1_000 has the same value as 1000.

Literal type	Example
Decimal	98_222
Hex	0xff
Octal	0o77
Binary	0b1111_0000
Byte (u8 only)	b'A'
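
As a quick check of the table, the sketch below writes one literal in each format and collects the resulting values. The function name literal_values is an assumption made for this example.

```rust
// Hypothetical helper: one literal per row of the table, widened to u32.
fn literal_values() -> [u32; 5] {
    [
        98_222,      // decimal with a visual separator
        0xff,        // hexadecimal
        0o77,        // octal
        0b1111_0000, // binary with a visual separator
        b'A' as u32, // byte literal (a u8), widened to u32 here
    ]
}

fn main() {
    // The underscore is purely visual and does not change the value.
    assert_eq!(1_000, 1000);

    println!("{:?}", literal_values());
}
```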

Integer overflow happens when a computed value falls outside the range its type can store. Rust has two behaviors associated with integer overflow, and both of them are considered “errors” to be avoided. In debug mode the program will “panic” at runtime, which causes the program to exit with an error. In a --release build the overflow checks are omitted by default, so the value undergoes two’s complement wrapping and cycles around the type’s range of values. If you add 2 to a u8 variable with a value of 255, the program will panic (exit with error) in debug mode and the value will become 1 in --release mode. Rust includes methods designed to handle integer overflow explicitly. There are wrapping_* methods that cycle through a type’s values, checked_* methods that return None if the operation overflows, overflowing_* methods that return the wrapped value along with a Boolean that indicates whether overflow occurred, and more.
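
The explicit overflow methods can be compared side by side using the 255 + 2 case from the paragraph above. This is a minimal sketch; overflow_demo is a hypothetical helper, and saturating_add is one of the further standard library methods that the “and more” refers to.

```rust
// Hypothetical helper: applies four explicit overflow strategies to `start + add`.
fn overflow_demo(start: u8, add: u8) -> (u8, Option<u8>, (u8, bool), u8) {
    (
        start.wrapping_add(add),    // wraps around the type's range
        start.checked_add(add),     // None if the addition overflows
        start.overflowing_add(add), // wrapped value plus an overflow flag
        start.saturating_add(add),  // clamps at the type's maximum
    )
}

fn main() {
    let (wrapped, checked, overflowing, saturated) = overflow_demo(255, 2);
    println!("wrapping_add:    {wrapped}");
    println!("checked_add:     {checked:?}");
    println!("overflowing_add: {overflowing:?}");
    println!("saturating_add:  {saturated}");
}
```

Because these methods state the intended behavior outright, they act the same way in debug and --release builds.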

Integer division truncates toward zero, discarding any fractional part. So 24 / 5 = 4, not 4.8.
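
A short sketch of the truncation rule, including a negative operand to show that the result moves toward zero rather than toward negative infinity. The helper name divide is an assumption for this example.

```rust
// Hypothetical helper: returns the quotient and remainder of integer division.
fn divide(a: i32, b: i32) -> (i32, i32) {
    (a / b, a % b)
}

fn main() {
    let (quotient, remainder) = divide(24, 5);
    println!("24 / 5 = {quotient} remainder {remainder}");

    // With a negative dividend the quotient is -4, not -5.
    let (quotient, remainder) = divide(-24, 5);
    println!("-24 / 5 = {quotient} remainder {remainder}");

    // Floating-point operands keep the fractional part.
    let exact = 24.0 / 5.0;
    println!("24.0 / 5.0 = {exact}");
}
```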

Floating point types

Rust includes two floating-point types, both designated with an f prefix. As with the integer types, the f32 and f64 designations refer to the number of bits each type occupies. Rust defaults to f64 because modern CPUs handle both types at roughly the same speed and f64 offers about double the precision of the single-precision f32.
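
One way to see the precision difference is to push a value through f32 and check whether it comes back unchanged. This is a sketch under that assumption; survives_f32 is a hypothetical helper, not a standard library function.

```rust
use std::mem::size_of;

// Hypothetical helper: true if `value` survives a round trip through f32 unchanged.
fn survives_f32(value: f64) -> bool {
    (value as f32) as f64 == value
}

fn main() {
    let default_float = 2.5; // inferred as f64
    let single: f32 = 2.5; // explicitly typed as f32
    println!("{default_float} {single}");

    println!("f32: {} bytes, f64: {} bytes", size_of::<f32>(), size_of::<f64>());
    println!(
        "guaranteed decimal digits: f32 = {}, f64 = {}",
        f32::DIGITS,
        f64::DIGITS
    );

    // 2^24 fits in f32 exactly, but 2^24 + 1 does not.
    assert!(survives_f32(16_777_216.0));
    assert!(!survives_f32(16_777_217.0));
}
```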

The character type

The char type is Rust’s most primitive text type. Character literals use single quotes ', whereas string literals use double quotes ". A char is four bytes and represents a single Unicode scalar value. Unicode scalar values range from U+0000 to U+D7FF and from U+E000 to U+10FFFF (inclusive). There is more to the char type than meets the eye: a char is not always what a reader would call a “character”, because some accented letters and emoji are built from several scalar values.
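
The sketch below checks the four-byte size and the valid ranges described above, using the standard library’s char::from_u32 conversion. The wrapper name to_char is an assumption for this example.

```rust
// Hypothetical wrapper: converts a code point to a char if it is a
// valid Unicode scalar value.
fn to_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

fn main() {
    // Every char occupies four bytes, regardless of the character it holds.
    assert_eq!(std::mem::size_of::<char>(), 4);

    // The same characters take between one and four bytes when encoded as UTF-8.
    let letter = 'A';
    let accented = '\u{E9}';
    let crab = '\u{1F980}';
    println!("{letter}: {} byte(s) in UTF-8", letter.len_utf8());
    println!("{accented}: {} byte(s) in UTF-8", accented.len_utf8());
    println!("{crab}: {} byte(s) in UTF-8", crab.len_utf8());

    // Code points between the two valid ranges, or above U+10FFFF, are rejected.
    assert_eq!(to_char(0x41), Some('A'));
    assert_eq!(to_char(0xD800), None);
    assert_eq!(to_char(0x110000), None);
}
```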

Compound Types

The two compound types covered in this section could easily be considered collections, or even data structures, but are explicitly declared as primitives within Rust’s core language structure. Both the array and the tuple discussed here can take any types as their values.

Tuples

Tuples are used to group values of similar or mixed type. Tuples have a fixed set of elements, and once a tuple is declared, elements cannot be added or removed. The example below first creates a tuple and binds it to the variable “idk”. Rust uses a pattern called “destructuring” to turn the tuple into separate values. You can either unpack the entire tuple at once or access an individual element by following the variable name with the . character and the element’s index. Notice that the tuple is 0-indexed.

fn tuple() {
    let idk: (i32, f64, u8) = (32, 6.4, 8);
    let (a, b, c) = idk; //access all elements
    let d = idk.1; //access specific elements by index
    println!("The whole tuple: {a}, {b}, {c}");
    println!("Accessing a tuple index: {d}");
}

fn tuple_two() {
    let t: (String, u8) = (String::from("Peter"), 40);
    let name: String = t.0; // moves the String out of the tuple
    let age: u8 = t.1; // u8 is Copy, so the value is copied
    println!("{name} is {age} years old.");
}

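In Rust there is a difference between parenthesized values and tuples. The distinction only really matters for 1-ary (single-element) tuples, which require a trailing comma. Without the comma the compiler treats the parentheses as ordinary grouping. For example,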

    let oneary = (1,); // 1-ary tuple
    let binary = (1, 2); // 2-ary (binary) tuple
    let parenthesized_1 = (1); // parenthesized value of inferred type i32
    let parenthesized_2 = 1; // same as above but without parentheses

There are also things called “tuple structs” that are kind of half-way between tuples and structs. Tuple structs are effectively named tuples. See the section on structs and enums for more information.

Arrays

Arrays in Rust are fixed-length, contiguously stored groupings of values of the same type. Similar to tuples they cannot grow or shrink, and their length must be known at compile time. Like all bindings, arrays are immutable by default: unless an array is declared with mut, its elements can only be read after it is initialized. Arrays are useful when you want data allocated on the stack rather than the heap or when you want a fixed number of elements in a collection.

Annotating the array type includes two declarations within a set of square brackets that are separated by a semicolon (;). The first declaration represents the array base type and the second declaration represents the number of array elements. Array initializations are also encased in a set of square brackets. Accessing an array is similar to other languages whereby the array variable is named with an index listed in square brackets.

fn array_one() {
    let array: [f64; 3] = [32.0, 6.4, 8.0];
    let a: f64 = array[0]; //access specific elements by index
    println!("Accessing an array index: {a}");
}
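
As with any other binding, an array declared with let mut can have its elements overwritten in place, although its length stays fixed. The following is a minimal sketch; the helper scale and the variable names are assumptions made for this illustration.

```rust
// Hypothetical helper: multiplies every element of a three-element array in place.
fn scale(values: &mut [f64; 3], factor: f64) {
    for value in values.iter_mut() {
        *value *= factor;
    }
}

fn main() {
    // `mut` allows the elements to change; the length is still fixed at 3.
    let mut readings = [32.0, 6.4, 8.0];

    // Overwrites a single element by index.
    readings[0] = 16.0;

    // Updates every element through a mutable reference.
    scale(&mut readings, 2.0);

    println!("{readings:?} still has {} elements", readings.len());
}
```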

It is also possible to initialize an array that contains the same value at every index. The initializer on the right side of the assignment operator mirrors the type annotation pattern: instead of naming the type you name the value you want, followed by a semicolon and the number of elements, and Rust infers the element type from the supplied value. In this example Rust assumes you want an array of five elements of the default integer type i32, with every element set to 23. Remember that counting elements starts at 1 while indexing starts at 0, so the five elements have index values 0 through 4.

fn array_two() {
    let array_two = [23; 5];
    let b: i32 = array_two[3];
    println!("Let's print a hastily initialized array index: {b}");
}

If you want to be specific about the type you can name it in the declaration and initialize all values with the same number using the previously mentioned shorthand notation. This example declares an array of type u8 with 8 elements and initializes every element to zero.

fn array_three() {
    let array_three: [u8; 8] = [0; 8];
}

Rust includes a memory safety feature that prevents the program from accessing array indexes that are greater than or equal to the array length (out of bounds). Each index is checked at runtime, and an out-of-bounds access causes a panic instead of reading invalid memory. This represents a refreshing departure from some other lower-level languages.
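
When an index is only known at runtime, the standard library’s get method offers a way to handle the out-of-bounds case without a panic. This is a minimal sketch; element_at is a hypothetical helper name.

```rust
// Hypothetical helper: returns the element at `index`, or None if the
// index is out of bounds.
fn element_at(values: &[i32; 5], index: usize) -> Option<i32> {
    values.get(index).copied()
}

fn main() {
    let values = [23; 5];

    // Index 4 is the last valid index of a five-element array.
    assert_eq!(element_at(&values, 4), Some(23));

    // Index 5 is out of bounds. `get` reports this as None, whereas
    // `values[index]` with the same runtime index would panic.
    assert_eq!(element_at(&values, 5), None);

    println!("bounds checks passed");
}
```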

Vectors (see below) allow for a dynamic number of elements, but arrays are more useful when you already know the number of elements you’re dealing with. For example, you know all the days of the week, or months of the year, so an array is a better choice for dealing with those sets.

Reference Types

Rust does not allow you to use pointers as freely as you can in C/C++. Working with pointers directly mostly defeats the purpose of Rust’s safety guarantees. Instead you can create references, which carry additional safety guarantees. There are some situations where you cannot avoid using raw pointers, such as within foreign function interfaces (FFIs). As a result Rust allows you to create raw pointers from references, and raw pointers are not subject to the ownership and borrowing rules. Creating a raw pointer is safe, but dereferencing one requires an unsafe block because the compiler performs no checks on the validity of the data involved. Unsafe operations are discussed in more detail in the Unsafe section.
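
The following sketch shows a raw pointer being created from a reference and then read back. The function name read_through_raw_pointer is an assumption for this example, and the pattern is only sound here because the pointer never outlives the value it was created from.

```rust
// Hypothetical helper: reads an i32 through a raw pointer derived from a reference.
fn read_through_raw_pointer(value: &i32) -> i32 {
    // Coercing a reference to a raw pointer does not require `unsafe`.
    let ptr: *const i32 = value;

    // SAFETY: `ptr` comes from a valid reference that is live for this whole call.
    unsafe { *ptr }
}

fn main() {
    let number = 42;
    println!("Read through raw pointer: {}", read_through_raw_pointer(&number));

    let mut counter = 0;
    let counter_ptr = &mut counter as *mut i32;

    // SAFETY: `counter_ptr` points at `counter`, which is still in scope,
    // and no other reference to `counter` is in use here.
    unsafe {
        *counter_ptr += 1;
    }
    println!("Counter after raw write: {counter}");
}
```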

References are similar to pointers but come with additional safety guarantees. The syntax and rules for references allow Rust to enforce ownership and borrowing constraints at compile time, which helps to ensure memory safety. At their heart, references are just pointers that the compiler tracks with extra compile-time information, such as lifetimes and mutability, to ensure safety. It may help to think of references as “safe pointers” or “pointer objects”, or it may not. IDK, I’m not your dad. References have a known, fixed size at compile time, even if they point to data of unknown or dynamically sized types.

To declare an immutable reference use the &T syntax. To declare a mutable reference use the &mut T syntax. You can only create a mutable reference if the referenced variable is itself declared mutable with mut.

pub fn immutable_reference() {
    // Heap-allocated value
    let val = String::from("Peter");

    // Creates an immutable reference to the value
    let val_ref: &String = &val;
    //let val_ref = &val; // Alternative phrasing with inferred type

    // Prints the referenced object
    println!("Reference: {}", val_ref);
}

Reference: Peter

pub fn mutable_reference() {
    // Creates a mutable heap-allocated value
    let mut val = String::from("Peter");

    // Creates a mutable reference to the value
    let val_ref: &mut String = &mut val;
    //let val_ref = &mut val; // Alternative phrasing with inferred type

    // Modifies the original value through a mutable reference
    val_ref.push_str(" is pretty OK");

    // Prints the referenced object
    println!("Mutated reference: {}", val_ref);
}

Mutated reference: Peter is pretty OK
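
References are most often seen as function parameters. The sketch below passes the same String to one function that only reads it and another that changes it; char_count and add_suffix are hypothetical helpers named for this example.

```rust
// Hypothetical helper: reads the value through an immutable reference.
fn char_count(text: &String) -> usize {
    text.chars().count()
}

// Hypothetical helper: modifies the value through a mutable reference.
fn add_suffix(text: &mut String, suffix: &str) {
    text.push_str(suffix);
}

fn main() {
    // The variable must be declared `mut` before `&mut name` is allowed.
    let mut name = String::from("Peter");

    // Any number of immutable references may exist at the same time.
    let first = &name;
    let second = &name;
    println!("{first} and {second} refer to the same value");

    // The immutable references are no longer used, so a mutable one is allowed.
    add_suffix(&mut name, " is pretty OK");
    println!("{name} has {} characters", char_count(&name));
}
```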

Dynamically sized types

Rust needs to know how much memory to allocate for any value of a particular type, and all values of a type must use the same amount of memory. You cannot know a dynamically sized type’s size at compile time. As a result Rust requires all dynamically sized types (DSTs) to be used behind a pointer of some kind, most commonly a reference. The reference provides a fixed-size object that points to the start of the DST in (stack, heap, or static) memory and carries extra metadata, such as the length of a slice or a pointer to a trait object’s method table. Examples of DSTs include slices, the str type, and trait objects, among others. For more information on these types, see the next section, the Strings section, and the Traits section respectively.
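
The extra metadata shows up in the size of the reference itself. The sketch below compares references to a sized type and to three DSTs. The helper name reference_sizes is an assumption, and the doubled size is what current compilers produce in practice rather than something this document specifies.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Hypothetical helper: sizes in bytes of references to a sized type,
// a slice, a string slice, and a trait object.
fn reference_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<&u8>(),        // plain pointer
        size_of::<&[u8]>(),      // pointer plus length
        size_of::<&str>(),       // pointer plus length
        size_of::<&dyn Debug>(), // pointer plus method table pointer
    )
}

fn main() {
    let word = size_of::<usize>();
    let (sized, slice, string_slice, trait_object) = reference_sizes();

    println!("pointer width: {word} bytes");
    println!("&u8: {sized}, &[u8]: {slice}, &str: {string_slice}, &dyn Debug: {trait_object}");
}
```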

Rust uses the Sized trait to determine whether a type’s size can be known at compile time. The Sized trait is automatically implemented for every type whose size is known at compile time. Additionally, Rust adds an implicit Sized bound to every generic type parameter.

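// A generic function written with an explicit Sized bound:
fn generic<T: Sized>(t: T) {
    // --snip--
}
// is treated exactly like the same function written without it,
// because the Sized bound is implied:
fn generic<T>(t: T) {
    // --snip--
}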

Generic functions work only with Sized types by default. To relax this requirement, use the syntax in the following example. This syntax roughly translates to, “T may or may not be Sized”. The ?Trait syntax is only available for the Sized trait. Also note that the parameter type changes from T to &T: because T might not be Sized, it has to be used behind a pointer.

fn generic<T: ?Sized>(t: &T) {
    // --snip--
}
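
A relaxed bound lets one function accept both sized values and DSTs. This is a minimal sketch; describe is a hypothetical function, and the Debug bound is added only so that the example has something to print.

```rust
use std::fmt::Debug;

// Hypothetical function: accepts a reference to any Debug type, sized or not.
fn describe<T: ?Sized + Debug>(value: &T) -> String {
    format!("{value:?}")
}

fn main() {
    let numbers = [1, 2, 3];

    // T = i32, a sized type
    println!("{}", describe(&42));
    // T = str, a dynamically sized type
    println!("{}", describe("hello"));
    // T = [i32], a dynamically sized type
    println!("{}", describe(&numbers[..]));
}
```

Without ?Sized the last two calls would not compile, because str and [i32] do not implement Sized.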

Slices

The general concept of a slice is a data structure that represents a view into a collection or contiguous region of memory. In Rust a slice [T] is an unowned view into a contiguous sequence of elements, and it is accessed through either a shared or a mutable reference. Contiguous means the elements are laid out one after another in memory, each the same distance from its neighbors. You can take a slice of any contiguous collection, such as an array or a vector, as either a shared (&[T]) or mutable (&mut [T]) reference. The slice reference just stores a pointer to the first element and the length of the slice.

    // Illustrates slicing a vector as a shared reference
    // Base collection instance
    let x = vec![1, 2, 3, 4, 5];

    // Creates a slice of the whole vector
    let whole_slice = &x[..];
    assert_eq!(x, whole_slice);

    // Creates a subset of the vector
    let partial_slice = &x[0..3];
    assert_eq!(partial_slice, vec![1, 2, 3]);

    // Illustrates slicing an array as a mutable reference
    // Instantiates a mutable array
    let mut y = [1, 2, 3];

    // Shadows y to create a mutable slice of the whole original array
    let y = &mut y[..];

    // Mutates the value at index 1
    y[1] = 7;

    // Compares y to a reference to an array literal
    assert_eq!(y, &[1, 7, 3]);
    // Compares y to an array literal
    // No explicit dereference is needed because the standard library
    // implements the comparison between slice references and arrays
    assert_eq!(y, [1, 7, 3]);
    // Compares an explicitly dereferenced slice to an array literal
    assert_eq!(*y, [1, 7, 3]);
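
Because a slice is only a view, a function that takes a slice parameter works with arrays, vectors, and parts of either. The following is a minimal sketch; sum and double_all are hypothetical helpers named for this example.

```rust
// Hypothetical helper: reads the elements through a shared slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: changes the elements through a mutable slice.
fn double_all(values: &mut [i32]) {
    for value in values.iter_mut() {
        *value *= 2;
    }
}

fn main() {
    let array = [1, 2, 3];
    let vector = vec![4, 5, 6];

    // References to arrays and vectors both coerce to &[i32].
    println!("array total: {}", sum(&array));
    println!("vector total: {}", sum(&vector));
    // A range selects part of the collection.
    println!("partial total: {}", sum(&vector[1..]));

    // Only the first two elements are visible to the function.
    let mut data = [1, 2, 3];
    double_all(&mut data[..2]);
    println!("{data:?}");
}
```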

The string slice in Rust, identified by the str type, is a special kind of slice that expands on the general definition of a slice. String slices are dynamically sized, are usually borrowed immutably as &str, and are always guaranteed to contain valid UTF-8 encoded data. These characteristics are what make working with raw string slices difficult and are why Rust includes the String wrapper for manipulating text. Before talking about the String type, let’s explore the implications for the raw string slice type str.