Rust Project Structures

A crate is the smallest unit of code that the Rust compiler builds at one time: a tree of source files that compiles into a single program or library. Think of the term “crate” as the program (or library) that we’re building. A crate may be made up of several *.rs source files, and compilation starts at its crate root, such as the hello world main.rs that comes with a new Cargo project. Crates are of either binary or library type. Binary crates produce executables, while library crates produce code for other crates to reuse. Crates also contain modules, which may be defined in the crate root or in separate files. Modules are organizational constructs that contain functions, structs, enums, etc. so that we can work with our code efficiently.

Crates

Let’s look at crates as a structure first. There are two fundamental types of crates: library and binary crates.

Library crates
- Have src/lib.rs as the crate root by convention
- Don’t have a main() function
- Designed to encapsulate and export functionality to be reused by other crates
- Functions, structs, enums, and traits marked pub in a library crate are part of its public API
- Don’t compile to an executable
- Commonly referred to as “crates” or “libraries”
- Created using cargo new --lib project_name

Binary crates
- Have src/main.rs as the crate root by convention
- Have a main() function as the program’s entry point
- Compile to an executable
- Designed to be standalone applications/tools
- Do not expose functions that other crates can use directly
- Created using cargo new --bin project_name

Packages

Packages are bundles of one or more crates. A package can contain any number of binary crates, but at most ONE library crate, and it must contain at least one crate of either kind. Packages contain a Cargo.toml file that describes how to build the crates within the package. Cargo itself is an example of a package.

The easiest way to create packages is with the Cargo tool. In fact, we’ve already shown this with the cargo new command in the Intro. Using Cargo to create a new project creates a package, as evidenced by the top-level Cargo.toml file that the command creates. If we inspect its contents we will find a definition similar to the following excerpt.

[package]
name = "learning_rust"
version = "0.1.0"
edition = "2021"

If we look at the newly created package structure we see that a src directory is created with a simple main.rs inside it. This main.rs file is our crate root (the “root crate”), and it is where the compiler will start.

~ $ tree MyProjects/new_project/
MyProjects/new_project/
├── Cargo.toml
└── src
    └── main.rs

By default cargo new creates a binary package. We can specify that we want to create either a binary or library package by adding the --bin or --lib option to the command. The result is similar to the structure created with the default command, but a --lib option creates a lib.rs instead of a main.rs as the default file in the src directory.

~ $ cargo new project_name --lib
     Created library `/Users/me/MyProjects/project_name` package
~ $ tree MyProjects/project_name
MyProjects/project_name
├── Cargo.toml
└── src
    └── lib.rs

Modules

Modules allow us to create organized file structures with referenceable code within our project. Modules are namespaces that we declare in designated file locations and access (reference) through paths, much like directories in a file system. To talk about modules, let’s first take the wrong path, and then show why there is a better way. This next section explains why the process works. If you just want to know the right way to create projects/crates, skip ahead to “The right way” section.

The wrong way

From the default binary (“program”) package structure you may notice that we only have one Rust file named main.rs that contains a main() function. What if we want to add more “programs” (individual crates) to the project in addition to this root crate? Simply adding more programs (*.rs files with main() functions) to the src folder will not result in properly built binaries. We can see this by creating another Rust file called hello.rs in the src directory next to the main.rs file.

~/Project/src (main) $ tree
.
├── hello.rs
└── main.rs

If we write similar contents to each file (with a different println!() literal) and run the cargo build command we will get a successful build. So what’s the problem? By default the build creates a single binary executable from the crate root, which by convention is the main.rs file, and names the executable after the project (package). That means that if we want to run the binary, 1) we’re looking for a project_name file, not “main”, and 2) the executable does not contain any code from our hello.rs file, so there is no way to execute it.

~/IdeaProjects/project_name $ tree
.
├── Cargo.lock
├── Cargo.toml
├── src
│   ├── hello.rs
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── debug
        ├── build
        ├── deps
        ├── ...
        ├── project_name
        └── project_name.d

To explain how this works, let’s look at the Cargo.toml file that was created when we ran the cargo new command. The main.rs file is not mentioned in the Cargo.toml file. So how does the compiler know what code is part of our program? In binary crates, the crate root is src/main.rs by convention, and in library crates the root is src/lib.rs. These crate roots are where the compiler starts. Our main.rs file makes no mention of our hello.rs file, so the compiler won’t know it exists and thus won’t build anything that includes it.

The workaround (also wrong)

Does this all mean that we cannot add standalone crates and build simple one-off binary executables in our project? No. We can add standalone crates (individual one-off programs) by creating a bin folder in the src directory.

~/Project/src (main) $ tree
.
├── bin
│   ├── falkens_maze.rs
│   ├── checkers.rs
│   ├── poker.rs
│   ├── chess.rs
│   ├── guerrilla_engagement.rs
│   └── global_thermonuclear_war.rs
└── main.rs

This works for collections of small, one-off programs, but what if we want access to all of these little programs or functions from somewhere else, like a “parent” crate (program)? It’s been a long way to it, but this is where modules come in.

The right way

Think of modules like nodes in a file tree. Modules can be declared either as nested directories/files or within a single file. Let’s start by creating modules in a binary crate using a “physical” structure. The “physical” structure uses nested files and directories, as opposed to a “logical” module structure that is declared all within one file. Creating a physical module structure is a three-step process.

Create a file structure to identify our modules. To create a module, we need some form of file structure within our src directory. Let’s say that instead of a list of games we want to explore some Rust concepts. We have a bunch of “programs” that each explore one topic of Rust. Consider the following list.

~/MyProjects/rust_project (main) $ tree /src
src/
├── control_flow.rs
├── enums.rs
├── guessing_game.rs
├── ownership.rs
├── rng_range.rs
├── strings.rs
├── struct_test.rs
├── structures.rs
├── time.rs
├── variables.rs
└── main.rs

Let’s say we want to create a module for Rust concepts, a module for examples we’ve ripped from the book, and a module for one-off programs and utilities we write ourselves. We could restructure the project like this.

~/IdeaProjects/learning_rust (main) $ tree /src
src/
├── cncpt
│   ├── control_flow.rs
│   ├── enums.rs
│   ├── ownership.rs
│   ├── rng_range.rs
│   ├── strings.rs
│   ├── struct_test.rs
│   ├── structures.rs
│   └── variables.rs
├── exmpl
│   └── guessing_game.rs
├── main.rs
└── util
    └── time
        └── time.rs

It’s starting to look a little more organized. The directory structure of Rust’s module system is the first big step to creating modules.

Declare the module. In the src/main.rs crate root we need to declare our new module(s) using the mod keyword at the top of the file. Let’s start with a simple, single-tier exmpl module. If we want to add the cncpt and util modules we also need to declare them the same way on separate lines.
```
mod exmpl;

fn main() {
...
```
With the module declared, the compiler will now look for either a file named src/exmpl.rs (in the same directory as our main.rs file), or a mod.rs file in a src/exmpl directory.
Implement the module. We have some options here. To access our exmpl module as defined in the structure we created at the top of the section we should create a src/exmpl/mod.rs file. We could also create a src/exmpl.rs file next to the src/exmpl directory, which the compiler treats the same way (only one of the two may exist); this walkthrough uses mod.rs.
```
~/IdeaProjects/learning_rust (main) $ tree /src
src/
├── cncpt
│   ├── control_flow.rs
│   ├── enums.rs
│   ├── ownership.rs
│   ├── rng_range.rs
│   ├── strings.rs
│   ├── struct_test.rs
│   ├── structures.rs
│   └── variables.rs
├── exmpl
│   ├── mod.rs //Module file
│   └── guessing_game.rs
├── main.rs
└── util
   └── time
       └── time.rs
```
The guessing_game.rs file is pretty simple, with only one defined function: main(). We could simply add its contents to the mod.rs file and rename the main() function to something like game(). We could also skip the mod.rs route entirely, create a src/exmpl.rs file, and put our refactored guessing game inside it. Neither of these options will scale very well as we write more, and potentially more complex, functions for our exmpl module. Instead, let’s keep the src/exmpl/guessing_game.rs file with the contents of the game inside it. We still have to rename main() to something else, such as game(). We also have to modify the src/exmpl/mod.rs file to look for the guessing game. Within the src/exmpl/mod.rs file, declare a submodule that points to the guessing game file.
```
mod guessing_game;
```
At this point we can call our new game() function from the src/main.rs file using the module path structure. This workflow just lists the path. See the path section for more information.
```
mod exmpl;

fn main() {
   crate::exmpl::guessing_game::game(); //absolute path
   exmpl::guessing_game::game(); //relative path
}
```
But wait! We have errors! Why do we have errors?! Well, as you may have noticed by now, Rust is private by default. We need to add the pub keyword to every module and function we want to reference from outside the module it is declared in. In src/exmpl/mod.rs we need:
```
pub mod guessing_game;
```
In src/exmpl/guessing_game.rs we need:
```
pub fn game() {
...
```
Now when we compile, the references should all be valid and we should be good to go!
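
The same three steps can be tried in a single file before creating any directories. The following is only a sketch: the inline mod blocks stand in for src/exmpl/mod.rs and src/exmpl/guessing_game.rs, and the signature and body of game() are assumptions made up for illustration (the real game reads user input), chosen so the result is deterministic.

```rust
// Single-file stand-in for src/exmpl/mod.rs and src/exmpl/guessing_game.rs.
// With the physical layout, `mod exmpl;` would replace this inline block.
mod exmpl {
    // `pub` lets code outside `exmpl` name this submodule.
    pub mod guessing_game {
        // Hypothetical body: compares a guess against a fixed secret
        // instead of reading input, so the sketch stays deterministic.
        pub fn game(guess: u32) -> &'static str {
            let secret = 42;
            if guess == secret {
                "You win!"
            } else if guess < secret {
                "Too small!"
            } else {
                "Too big!"
            }
        }
    }
}

fn main() {
    // Absolute path, starting at the crate root.
    println!("{}", crate::exmpl::guessing_game::game(10));
    // Relative path, starting at the current module (here also the crate root).
    println!("{}", exmpl::guessing_game::game(42));
}
```

Note that exmpl itself does not need pub here: main() lives in the crate root, which is the module that declares exmpl, so it can already see it. Removing either of the two pub keywords inside exmpl brings the privacy errors back.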

What about our util module and its time submodule? The pattern/structure works the same way as we just explored. At each level of the structure we’ll need a mod.rs file that declares its public submodules, and module files that declare the public functions.

~/IdeaProjects/learning_rust (main) $ tree /src
src/
├── cncpt
│   ├── mod.rs
│   ├── ctrl_flow.rs
│   ├── enums.rs
│   ├── ownership.rs
│   ├── rng_range.rs
│   ├── strings.rs
│   ├── struct_test.rs
│   ├── structures.rs
│   └── variables.rs
├── exmpl
│   ├── mod.rs
│   └── guessing_game.rs
├── main.rs
└── util
    ├── mod.rs
    └── time
        ├── mod.rs
        └── time.rs

It is also possible to nest modules within the same file. For example, our control flow file contains functions for if statements and loops. We can declare pub mod ifs {} to group the functions that demonstrate if logic and pub mod loops {} to group the functions demonstrating loop logic. Because the file is now a module rather than a crate root, it no longer needs its own fn main().

pub mod ifs {
    pub fn if_statements(n: f64) {...}
    pub fn lets_if(n: i32) {...}
}
pub mod loops {
    pub fn age_calculator() {...}
    pub fn lets_loop() {...}
}

These functions are called from the main() function in our main.rs file the same way as if they were in separately declared module files. Each tier of module declaration is named in the path. cncpt::ctrl_flow::ifs::lets_if(6)
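
To see how each tier of nesting shows up in the path, here is a minimal single-file sketch. The module names follow the tree above, but the function bodies (and the parameter given to lets_loop) are invented placeholders, since the original bodies are elided.

```rust
// Inline stand-in for src/cncpt/mod.rs and src/cncpt/ctrl_flow.rs.
mod cncpt {
    pub mod ctrl_flow {
        pub mod ifs {
            // Hypothetical body: classify a number with an if/else.
            pub fn lets_if(n: i32) -> &'static str {
                if n % 2 == 0 {
                    "even"
                } else {
                    "odd"
                }
            }
        }

        pub mod loops {
            // Hypothetical body: sum the numbers 1..=n with a for loop.
            pub fn lets_loop(n: u32) -> u32 {
                let mut total = 0;
                for i in 1..=n {
                    total += i;
                }
                total
            }
        }
    }
}

fn main() {
    // One path segment per module tier, then the function name.
    println!("{}", cncpt::ctrl_flow::ifs::lets_if(6));
    println!("{}", cncpt::ctrl_flow::loops::lets_loop(4));
}
```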

Paths

Now that we have modules and sub-modules, how do we reference them? The answer is paths. We’ve already seen paths with the familiar :: syntax. Paths are similar to directory paths but use :: as the separator, and every module and item named along the path must be visible (public where necessary) from the place we use it.

Absolute vs relative paths

Consider the following crate structure.

~/IdeaProjects/learning_rust (main) $ tree /src
src/
├── exmpl
│   ├── mod.rs
│   └── guessing_game.rs
└── main.rs

Let’s say we want to call a game() function from our main.rs file, and the game() function is located in the src/exmpl/guessing_game.rs file. The crate root is known to the compiler as crate, so we could call the game function with crate::exmpl::guessing_game::game(). This is the absolute path of the function, similar to a filesystem path that starts at /. We don’t actually need the crate:: prefix here: a relative path starts at the current module, and because main.rs is the crate root, exmpl::guessing_game::game() resolves to the same function (once we’ve declared the module with mod exmpl, of course).

mod exmpl;

fn main() {
    exmpl::guessing_game::game();
}
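
Relative paths behave differently once the caller is no longer in the crate root. The sketch below, which again uses inline modules in place of files, shows the same function reached through an absolute path, a plain relative path, self::, and super::. The menu(), replay(), and scores names are hypothetical additions for illustration only.

```rust
mod exmpl {
    pub mod guessing_game {
        pub fn game() -> &'static str {
            "guess a number"
        }

        pub fn replay() -> &'static str {
            // `self::` starts at the current module (guessing_game).
            self::game()
        }
    }

    pub fn menu() -> &'static str {
        // Relative path: guessing_game is a child of the current module.
        guessing_game::game()
    }

    pub mod scores {
        pub fn last_game() -> &'static str {
            // `super::` climbs one level, to `exmpl`, before descending.
            super::guessing_game::game()
        }
    }
}

fn main() {
    // Absolute path from the crate root; valid from anywhere in the crate.
    println!("{}", crate::exmpl::guessing_game::game());
    println!("{}", exmpl::menu());
    println!("{}", exmpl::scores::last_game());
    println!("{}", exmpl::guessing_game::replay());
}
```

An absolute path keeps working if the calling code moves to another module, while a relative path keeps working if caller and callee move together, so the choice mostly depends on which of the two is more likely to be reorganized.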

Access

Just like most things in Rust, modules and their contents are (mostly) private, which is to say inaccessible, by default. So far we’ve just used functions within module trees as examples, but we may also need to access enums, structs, and their implementations (methods and associated functions). The variants of a public enum are public automatically, whereas the fields of a public struct stay private unless each one is marked pub.

Privacy rules

In Rust a descendant (child) module has access to its ancestors’ contents, including private items, but an ancestor (parent) does not have access to the private contents of its children. Sibling modules (modules declared at the same level) can name each other, but the functions, methods, structs, and struct fields inside them must be public to be accessed. To make a module or sub-module and any of its contents referenceable throughout the crate you must declare them public with the pub keyword.

Consider the following structure of modules and functions.

~/IdeaProjects/learning_rust (main) $ tree /src
src/
├── lib.rs
└── main.rs

The contents of lib.rs resemble a simple module tree. The following tree is legal, though a little nonsensical.

mod grandparent {
    fn hello_world() {
        println!("Hey! I'm a library!");
    }
    mod sibling {
        pub fn test_3() {
            super::hello_world(); //Works even though hello_world() is not public
        }
    }
    mod child {
        fn test_1() {
            //grandchild::test_2(); //Illegal access: test_2() is private to grandchild
            //super::sibling::what(); //Illegal access
            super::sibling::test_3(); //Works: test_3() is public and sibling is visible inside grandparent
        }
        mod grandchild {
            fn test_2() {
                super::super::hello_world();
            }
        }
    }
}

From our main.rs file, we are unable to access any modules or functions because they are all private. All nodes in the access tree need to be public for the path to work. If we want to run the function under the grandparent module, both the grandparent module and the hello_world() function must be marked public. Then we can import the library and access the function using a relative path. If we want to access test_1() function the grandparent and child modules will have to be public, as well as the test_1() function. You get the idea.

use library_name;

fn main() {
    library_name::grandparent::hello_world();
    library_name::grandparent::child::test_1();
}
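
A compilable version of this idea fits in one file. This is a sketch under assumptions: the inline grandparent module stands in for the library’s lib.rs, the functions return strings instead of printing, and private_helper() is a made-up name used to show what stays hidden.

```rust
// Inline stand-in for the library crate; in the two-file layout this
// module tree would live in src/lib.rs.
pub mod grandparent {
    pub fn hello_world() -> &'static str {
        "Hey! I'm a library!"
    }

    pub mod child {
        pub fn test_1() -> &'static str {
            // A child may call its parent's items, public or not.
            super::private_helper()
        }
    }

    // Private: reachable from `child` (a descendant) but not from outside.
    fn private_helper() -> &'static str {
        "called through child"
    }
}

fn main() {
    // Every segment of these paths is `pub`, so both calls compile.
    println!("{}", grandparent::hello_world());
    println!("{}", grandparent::child::test_1());
    // grandparent::private_helper(); // compile error: the function is private
}
```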

In addition to granting access to functions, we can also use the pub keyword to grant access to structs, their fields, and their methods.

pub mod privacy {
    pub struct structure {
        first: i32, //Struct fields are private by default
        pub second: String //Explicitly made public
    }
    impl structure {
        pub fn constructor(f: i32, s: String) -> Self {
            Self {
                first: f,
                second: s,
            }
        }
    }
}

...

fn main() {
    let s: String = String::from("Peter");
    let neato = project::privacy::structure::constructor(23, s); //Module located in the same project as lib.rs
    let a: &i32 = &neato.first; //Illegal because this field is private by default
    let b: &String = &neato.second; //It's all good, this is a public struct field
...
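
One common way to keep a field private while still letting callers read it is a public getter method. The sketch below assumes that approach; the first() getter is not part of the original example, and the struct is renamed to Structure only to follow Rust’s CamelCase convention for type names.

```rust
mod privacy {
    pub struct Structure {
        first: i32,         // private: only code inside `privacy` can touch it
        pub second: String, // public: readable and writable from outside
    }

    impl Structure {
        pub fn constructor(f: i32, s: String) -> Self {
            Self { first: f, second: s }
        }

        // Hypothetical getter exposing the private field read-only.
        pub fn first(&self) -> i32 {
            self.first
        }
    }
}

fn main() {
    let neato = privacy::Structure::constructor(23, String::from("Peter"));
    // let a = neato.first; // compile error: the field is private
    println!("{} {}", neato.first(), neato.second);
}
```

Because first is private, a constructor function like constructor() is also the only way for outside code to build a Structure at all; a struct literal would need access to every field.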

Shortcuts and `use`

We have been writing out full paths to refer to external modules and their contents so far. Rust makes it easy to create something like a symbolic link with the use keyword. A use statement is only relevant to the scope it’s declared in. Unlike import statements in Java, for example, this means we can place use statements in hyper-local scopes, such as inside a single function. Taking the example we just saw, we can shorten the path with a use statement.

use project::privacy;

fn main() {
    let s: String = String::from("Peter");
    let neato = privacy::structure::constructor(23, s);
...
}

The path in the use statement can get more specific, down to the individual item we’re referencing. The last element named in the use statement is the name we then use in code. For functions, it is idiomatic to bring the parent module into scope rather than the function itself, so that each call site shows where the function is defined. By contrast, for structs, enums, and other items the convention is to bring the item itself into scope with its full path. The following example is considered idiomatic Rust because structure is a struct that can now be named directly, and its associated functions are called through the struct’s name.

use project::privacy::structure;

fn main() {
    let s: String = String::from("Peter");
    let neato = structure::constructor(23, s);
    structure::printer(neato);
...
}
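
The two conventions can be seen side by side with standard library items. This is just an illustrative sketch: largest_count() is a made-up helper, HashMap is a struct and so is imported directly, while max is a function and so is called through its parent module cmp.

```rust
use std::cmp; // functions: import the parent module
use std::collections::HashMap; // structs: import the item itself

// Hypothetical helper: returns how often the most frequent word occurs.
fn largest_count(words: &[&str]) -> usize {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for word in words {
        *counts.entry(*word).or_insert(0) += 1;
    }
    // `cmp::max` shows at the call site that `max` is not a local function.
    counts.values().fold(0, |best, &n| cmp::max(best, n))
}

fn main() {
    println!("{}", largest_count(&["a", "b", "a"]));
}
```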

We can also restructure the way code is used from the way it’s implemented by the use of something called “re-exporting”. Re-exporting makes a use statement public to cut out a tier of the organization for programmers calling the function, while keeping the original implementation intact for organization. This example makes it so that we can still organize (separate) front_of_house and back_of_house modules, but call functions using simpler restaurant::hosting::add_to_waitlist() syntax.

//original implementation
mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}

//re-export for cleaner interaction
pub use crate::front_of_house::hosting;

pub fn eat_at_restaurant() {
    hosting::add_to_waitlist();
}

Because Cargo allows us to produce documents from our code, we can re-export items to make a public structure that’s different from our private structures. To do this we use pub use. Re-exporting takes a public item in one location and makes it public in another location, as if it were defined in the other location instead.
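
To check what callers actually see after a re-export, the library can be simulated with an inline module. In this sketch the restaurant module stands in for the library crate of the same name, and add_to_waitlist() returns a string (an assumption) so the result can be inspected.

```rust
// Inline stand-in for a library crate named `restaurant`.
mod restaurant {
    // Internal organization, kept private.
    mod front_of_house {
        pub mod hosting {
            pub fn add_to_waitlist() -> &'static str {
                "added to waitlist"
            }
        }
    }

    // Re-export: callers see `restaurant::hosting` and never need to know
    // that `front_of_house` exists.
    pub use self::front_of_house::hosting;

    pub fn eat_at_restaurant() -> &'static str {
        hosting::add_to_waitlist()
    }
}

fn main() {
    println!("{}", restaurant::hosting::add_to_waitlist());
    println!("{}", restaurant::eat_at_restaurant());
    // restaurant::front_of_house::hosting::add_to_waitlist(); // compile error: `front_of_house` is private
}
```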

If two elements have the same name, they should be referred to through their parent modules for disambiguation. We can also use the as keyword to rename an element. Both approaches are considered idiomatic, so we can choose whichever we like more.

use std::fmt::Result;
use std::io::Result as IoResult;

fn function1() -> Result {
    // --snip--
}

fn function2() -> IoResult<()> {
    // --snip--
}
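
Here is a runnable sketch with the snipped bodies filled in by hypothetical ones. It shows both styles together: function1 names its return type through the parent module (fmt::Result), and function2 uses the renamed IoResult. The parameters and bodies are assumptions added only so the example compiles and does something observable.

```rust
use std::fmt;
use std::fmt::Write;
use std::io::Read;
use std::io::Result as IoResult;

// Parent-module style: `fmt::Result` cannot be confused with `io::Result`.
fn function1(out: &mut String) -> fmt::Result {
    write!(out, "formatted {}", 1)
}

// Alias style: `IoResult` is `std::io::Result` under a local name.
// Reads the whole input from an in-memory byte slice and returns its length.
fn function2(input: &str) -> IoResult<usize> {
    let mut buf = String::new();
    let mut reader = input.as_bytes();
    reader.read_to_string(&mut buf)
}

fn main() {
    let mut text = String::new();
    function1(&mut text).expect("writing to a String should not fail");
    let bytes_read = function2("hello").expect("reading from memory should not fail");
    println!("{text} / {bytes_read} bytes");
}
```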

Rust also allows us to nest use statements that share a common prefix. The syntax dictates that we only list the common elements once. The path then includes ::{} with the unique path elements listed in the curly braces. In cases where the common element itself is also used, we can list self to indicate that no further elements are required. By convention self is listed first among the elements in the braces.

use std::io;
use std::io::Write;
->
use std::io::{self, Write};

use std::cmp::Ordering;
use std::io;
use std::io::Write;
->
use std::{io, io::Write, cmp::Ordering};

This can also provide some confusion if too many elements are included. It may be wise to sort things out by functionality. This example separates out the compare and I/O elements into two use statements for the standard library.

use std::cmp::Ordering;
use std::io;
use std::io::Write;
->
use std::cmp::Ordering;
use std::io::{self, Write};
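
The following sketch uses all three imported names so it is clear what each one brings into scope: Ordering for the match arms, Write for the writeln! call, and io (via self) for io::Result and io::stdout(). The report() function is a made-up example, not something from the project above.

```rust
use std::cmp::Ordering;
use std::io::{self, Write};

// Hypothetical helper: writes a comparison verdict to any writer.
// `io::Result` is available because `self` imported `std::io` itself.
fn report(out: &mut impl Write, guess: u32, secret: u32) -> io::Result<()> {
    let verdict = match guess.cmp(&secret) {
        Ordering::Less => "too small",
        Ordering::Greater => "too big",
        Ordering::Equal => "correct",
    };
    // `writeln!` on a writer needs the `Write` trait in scope.
    writeln!(out, "{guess} is {verdict}")
}

fn main() {
    let mut stdout = io::stdout();
    report(&mut stdout, 3, 7).expect("writing to stdout failed");
    report(&mut stdout, 7, 7).expect("writing to stdout failed");
}
```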

Finally, we can use the “glob operator” to bring all elements for a particular path into scope. This can lead to confusion about what is actually being used, so it is generally reserved for special use cases like testing functionality.

use std::io::*;
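
The testing use case usually looks like use super::*; inside a test module. The sketch below shows that pattern next to an ordinary glob import; the shapes module and its functions are hypothetical names chosen for the example.

```rust
mod shapes {
    pub fn square_area(side: u32) -> u32 {
        side * side
    }

    pub fn rect_area(width: u32, height: u32) -> u32 {
        width * height
    }

    // Test modules are the common home of the glob: `super::*` pulls in
    // everything from the parent module, private items included.
    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn areas_match() {
            assert_eq!(square_area(3), rect_area(3, 3));
        }
    }
}

// Glob import: every public item of `shapes` is now in scope by bare name.
use shapes::*;

fn main() {
    println!("{}", square_area(4));
    println!("{}", rect_area(2, 5));
}
```

The test module only compiles under cargo test, so the glob there costs nothing in a normal build; the top-level use shapes::* is the kind of import that can make it hard to tell where square_area comes from in a larger file.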

Workspaces

Workspaces are a way to develop projects that contain several packages, possibly with different dependencies, that all share a single Cargo.lock file and output directory. Workspaces are useful when you want to develop discrete packages in a “monorepo” style project, if you’re developing parallel libraries that may have different dependency requirements or versions, or if you plan on writing procedural macros and still want to retain testing across the entire project. For example, workspaces may be beneficial when developing large projects that require some integration and testing between microservices. Cargo also allows you to specify more than one binary executable with a special project setup, for situations where you just want to tinker with a project and don’t need granular control over each microservice.

Discrete Executables

If you just want multiple discrete binary executables in the same project you can create a <project>/src/bin/ directory and add files that contain a main() function. If you want to import functions from the wider project you’ll need to set up a top-level src/lib.rs and declare the modules as public with the pub mod <module> syntax. Setting up multiple binary executables also means that you have to specify which one to run with the cargo run command, because Cargo cannot choose between several binaries on its own. To specify a binary, run cargo run --bin <target>, where <target> is the name of the file with the main() function (without the .rs extension) inside the bin/ directory. As always, you have customization options by modifying the project’s Cargo.toml file. For example, you can specify a default by adding a default-run field to the package. This example makes the default the src/main.rs file:

[package]
name = "example"
...
default-run = "example"

You can also give any binary in the project an alias with a top-level [[bin]] definition:

[[bin]]
name = "file"
path = "src/bin/file_output_builder.rs"

Putting it all together you might come up with something like the following:

[package]
name = "example"
version = "0.1.0"
edition = "2021"
default-run = "example"

[dependencies]
...

[[bin]]
name = "file"
path = "src/bin/file_output_builder.rs"

For this setup running cargo run executes the src/main.rs binary, and running cargo run --bin file runs the file output builder binary.

Workspace Creation

You can set up and configure workspaces in a variety of “non-standard” ways, so there is no convenient function/command to create them. Instead, we create them manually by structuring our project directory.

In this example we’ll create a workspace with the following structure. The workspace we’ll set up has two members. Before we get into how, I’d just like to point out that this workspace contains more than one main.rs file and more than one Cargo.toml file. Each Cargo.toml is configured slightly differently than in a standalone binary/library package. The workspace still only has one Cargo.lock file, which ensures that all member packages are using the same dependency versions.

$ cargo new add_one --lib
$ cargo build
$ tree
.
├── Cargo.lock
├── Cargo.toml
├── add_one
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
├── adder
│   ├── Cargo.toml
│   └── src
│       └── main.rs
├── src
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── debug
        ├── adder
        ├── adder.d
        ├── build
        ...

Create the parent workspace directory and add a Cargo.toml file manually. For a project workspace called “add” we can manually mkdir add instead of cargo new add.
```
$ mkdir add
$ cd add
$ touch Cargo.toml
$ nvim Cargo.toml
```
Define the workspace in the top-level Cargo.toml file. The default contents of the Cargo.toml when we run cargo new <project name> contains a [package] section.
```
[package]
name = "add"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
```
However, for a workspace we don’t want to define a package. Instead we’ll define a [workspace].
```
[workspace]

members = [
    "adder",
]
resolver = "2"
```
NOTE: Cargo relies on a resolver to choose the versions and features of the dependencies used across the project. You’ll need to set the resolver in the top-level Cargo.toml or you will see a resolver warning when building. For more information see the Cargo documentation on the default resolver.

At this point, still from the /add directory, we can run the cargo new adder command to create a new workspace member. Remember that by default the cargo new command creates binaries. After running the new command, the project/workspace can be built. If we build and check the structure it should look something like this.

$ cargo new adder
$ cargo build
$ tree
.
├── Cargo.lock
├── Cargo.toml
├── adder
│   ├── Cargo.toml
│   └── src
│       └── main.rs
├── src
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── debug
        ├── adder
        ├── adder.d
        ├── build
        ...

Notice that there are multiple main.rs files, but only one /target directory.

We can add a library member by editing the top-level Cargo.toml file we manually created.
```
[workspace]
members = [
    "adder",
    "add_one",
]
```

After the new member is added to the top-level Cargo.toml file run another cargo new command to create a member. In this example we’re adding a library called add_one.

$ cargo new add_one --lib
$ cargo build
$ tree
.
├── Cargo.lock
├── Cargo.toml
├── add_one
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
├── adder
│   ├── Cargo.toml
│   └── src
│       └── main.rs
├── src
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── debug
        ├── adder
        ├── adder.d
        ├── build
        ...

Now that we have our basic structure we can start to use it. Cargo doesn’t make any assumptions about the relationships of the packages within a workspace. This means we need to be explicit. Say we want to access functions defined in our library add/add_one/src/lib.rs from some other workspace member. Once we write the code in the lib, we can add it as a dependency in one of our other workspace members. For example, we can add the add_one package to our add/adder/Cargo.toml file as follows. The dependency name must match the original.
```
[dependencies]
add_one = {path = "../add_one"}
```
Now all we have to do to access the functions across packages is include an appropriate use statement (or write out the full path) in the code we want to access them from. For example, in the add/adder/src/main.rs file we can write something like the following:
```
use add_one; // Imported workspace member (module)

fn main() {
    let x = add_one::addotron(2);
    println!("This main() accesses \"add_one::addotron()\" from the \"adder\" module");
    println!("addotron(2) = {}", x);
}
```
NOTE: I was getting errors in Neovim after changing function names in one module and trying to use them in another module. I couldn’t figure out why, because the function names, use statements/paths were correct, and the code compiled and ran. The errors did not appear in Rust Rover. I eventually solved the issue by running cargo clean which deletes the /target directory. Apparently Neovim was using build artifacts to index function names across modules, and for some reason wasn’t updating with subsequent compilations.

Running code in workspaces

We can specify which package we want to run with the -p option.

$ cargo run -p adder
...
This main() accesses "add_one::addotron()" from the "adder" module
addotron(2) = 4

Workspace dependencies

For projects that have more than one Cargo.toml, such as workspaces with multiple members, adding dependencies works similarly to projects with only one Cargo.toml file. All dependencies used in a member package must be declared in that member’s Cargo.toml file. If multiple members use the same dependency, each of them must declare it in its own Cargo.toml. That is to say that dependencies are not shared implicitly across members of the workspace. If we have multiple dependencies declared across our workspace, how do we know which version gets used? Cargo automatically resolves dependency versions across members and, where the requirements are compatible, records a single version in the top-level Cargo.lock file.