We saw a couple of functions, in passing, in the last chapter when we looked at the automatically generated boilerplate code created by cargo new. What were we actually seeing, though?
In Rust, a function starts off with the fn keyword. A keyword is a sequence of letters or symbols which has a fixed meaning in the language. Nothing we do in our program can change the meaning of a keyword, and the libraries we use can't change it either. Keywords occasionally have different meanings in clearly different contexts, but they always mean the same thing when used in the same way. Keywords are the solid foundation that everything else is built on.
So, the fn keyword tells the Rust compiler that we're about to define a new function. After that, separated by a space, comes the function's name. There are rules for what the function name can look like:
- It must be made up of the following:
  - English letters (the letters A through Z, in their lowercase or CAPITAL forms)
  - Arabic numerals (the digits 0 through 9)
  - Underscores (_)
- It can't start with a number (so 7samurai is not a valid name)
- If it starts with an underscore, it must have at least one further character (_ by itself has a special meaning)
Then comes an open parenthesis ( and a close parenthesis ), with a list of parameters between them. We're going to gloss over the parameter list for now and come back to it later. There doesn't have to be anything between the parentheses if the function does not need parameters, and that's how we'll do it for now.
After the close parenthesis of the parameter list, we can optionally include a -> symbol followed by a return type, another thing which we'll go into in more detail later.
Next comes a { symbol, which tells Rust that we're about to begin a sequence of commands, followed by as many commands as we need in order to tell Rust how to do what we want the function to do, and then finally a } symbol to mark the end.
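As a rough sketch of the naming rules and the overall shape described above, the following file declares a few functions. The names seven_samurai, samurai7, and _samurai are hypothetical, invented only for this illustration, and the bodies are placeholders.

```rust
// Hypothetical names, chosen only to illustrate the naming rules.

// Lowercase letters and an underscore, with an empty parameter list.
fn seven_samurai() {
    println!("seven samurai");
}

// Digits are allowed anywhere except the first position.
fn samurai7() {
    println!("samurai 7");
}

// A leading underscore is fine as long as more characters follow it.
fn _samurai() {
    println!("underscore samurai");
}

// fn 7samurai() {}   // rejected by the compiler: the name starts with a digit

// A runnable program still needs a main function.
// This one leaves the other functions unused.
fn main() {
    println!("All of the names above are valid");
}
```

Since nothing in this sketch uses seven_samurai or samurai7, the compiler is likely to warn that they are never used. That is only a warning, and the program should still build and run.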
Going back to the boilerplate code, let's take a look at the automatically generated main function again:
fn main() {
    println!("Hello, world!");
}
Here, we can see the fn keyword, function name, and empty parameter list. The optional return type has been omitted. Then, between the { and }, we see a single instruction, which tells the computer that we want it to print out Hello, world! whenever we tell it to run the main function.
There's not a lot more to say about functions until we have some understanding of what kinds of instructions we can give the computer, between those { and } symbols. The main idea is that we can bundle up many instructions into a function, and then use a single instruction elsewhere in the program to tell the computer to do all that stuff.
Modules give us a way to organize our functions (and other items that have names, such as data structures) into categories. This helps us keep things organized, and allows us to use the same name more than once, as long as we only use it once per module. It also lets us use shorter versions of a thing's name most of the time, but gives us a longer version we can use when those short names might be confusing or ambiguous.
Defining a module is easy. In any .rs file which the compiler is going to be looking at, we can use the mod keyword to start a new module. There are two different ways to use that keyword, though, depending on whether we want to define the module as a section of the current file or as a separate file.
To define a module as a section of a file, we use the mod keyword followed by a name and then a { symbol, then the contents of the module, and then a } symbol to finish it up.
So, if we define a new module containing a couple of functions, it would look something like this:
pub mod module_a {
    pub fn a_thing() {
        println!("This is a thing");
    }

    pub fn a_second_thing() {
        a_thing();
        println!("This is another thing");
    }
}
We've created a module named module_a and put the a_thing and a_second_thing functions inside of it. We haven't seen it previously, but the line in a_second_thing that says a_thing(); is an instruction to the computer to run the a_thing function. So, when a_second_thing runs, the first thing it does is run a_thing, and then it prints out its own message afterwards.
The pub keyword means that module_a is part of the public interface of the current module, rather than just being internal data. We'll talk more about that soon.
More often than not, we're going to want to give our modules their own files. It's just nicer to keep things separated and contained as much as possible, because it helps keep the code manageable. Fortunately, this is just as easy. In our .rs file, we can just write something like the following:
pub mod module_b;
That looks a lot like the previous example, except that it doesn't have the module contents right there between { and }. Instead, the Rust compiler goes looking for a file called either module_b.rs or module_b/mod.rs, and uses the whole file as the contents of the module_b module. So, if the file contains a couple of functions similar to the ones we saw previously:
pub fn a_thing() {
    println!("This is a module_b thing");
}

pub fn a_second_thing() {
    a_thing();
    println!("This is another module_b thing");
}
Then module_b will contain two functions named a_thing and a_second_thing. It's not a problem that those functions have the same names as functions in the module_a module from before, because they're in a different module.
Why did the compiler look in two places for the source code of module_b? This allows us to be more flexible in how we lay out our directory structure for our program's source code.
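The two layouts can be pictured roughly as follows. This assumes, purely for illustration, that the pub mod module_b; line lives in src/main.rs, as it would in a project created by cargo new. Only one of the two files should exist for a given module.

```text
Layout 1: the module is a single file

src/
    main.rs          (contains: pub mod module_b;)
    module_b.rs

Layout 2: the module is a directory

src/
    main.rs          (contains: pub mod module_b;)
    module_b/
        mod.rs
```

The second layout tends to be convenient when module_b declares further modules of its own, since their files can sit alongside mod.rs in the module_b directory.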
In the earlier example of a module defined as a section of a file, the a_second_thing function is part of the same module as a_thing, so it's automatically allowed to use the short version of the other function's name to refer to it. However, code outside of the module needs to use the full name to refer to items inside the module. There are two ways to do this. We can write the full name directly, which is a good choice if we don't expect to refer to the item often, or we can tell Rust that we want to use the short name for an item in a different module, which is a good choice if we're going to use that item often in our code.
An item's full name consists of the module name, a :: symbol, and then the item's short name. If we have several layers of modules that we need to get through before we find the item we want, we list those modules' names in order, with a :: between each name. For example, we might refer to std::path::Path to get the Path item from the path module of the std module.
We can use the full name anywhere and be completely unambiguous as to what item we're talking about.
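A minimal sketch of full names in use might look like this. It assumes both example modules are written inline so that everything fits in one file, and the Cargo.toml file name is just an arbitrary example value.

```rust
// Inline versions of the two example modules, so the sketch fits in one file.
mod module_a {
    pub fn a_thing() {
        println!("This is a thing");
    }
}

mod module_b {
    pub fn a_thing() {
        println!("This is a module_b thing");
    }
}

fn main() {
    // Two functions share a short name; the full names tell them apart.
    module_a::a_thing();
    module_b::a_thing();

    // Several layers of modules: std, then path, then the Path item.
    // new is a function attached to Path; the named file need not exist.
    let manifest = std::path::Path::new("Cargo.toml");
    println!("{}", manifest.display());
}
```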
We can also use the use keyword to tell Rust that we want to refer to an item in a different module by its short name. This is done by just writing use followed by the full name of the item we want to use. For example, use std::path::Path; allows us to use just the short name for that item (Path in this example) in the following instructions, until we come to the } that closes the section of code where our use keyword was written (or we come to the end of the module file, which amounts to the same thing).
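The sketch below shows one way this might look, with one use at the top of the file and another inside a single function. The function names uses_short_name and uses_full_name are hypothetical, as is the inline copy of module_a.

```rust
// Bring the Path item into scope for the code in this file.
use std::path::Path;

mod module_a {
    pub fn a_thing() {
        println!("This is a thing");
    }
}

fn uses_short_name() {
    // This use only applies up to the closing } of this function.
    use module_a::a_thing;
    a_thing();
}

fn uses_full_name() {
    // There is no use here, so the full name is required.
    module_a::a_thing();
}

fn main() {
    // The file name is an arbitrary example; the file need not exist.
    let manifest = Path::new("Cargo.toml");
    println!("{}", manifest.display());

    uses_short_name();
    uses_full_name();
}
```

Placing a use inside a function keeps the short name from leaking into the rest of the file, which can help when two modules offer items with the same short name.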
We can use the same syntax to tell Rust that we want to use the name of a module, rather than an item in a module. For example, use std::path; is a valid command. That would allow us to use path::Path as the name of the Path item in subsequent code. This is frequently convenient, since it still keeps the external items boxed up and separate, while providing reasonably short and informative names to work with.
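As a small sketch of importing a module rather than an item, the following program reaches two different items through the short module name path. The file and directory names are arbitrary examples, and PathBuf is simply a second item from std::path picked for illustration.

```rust
// Import the module itself rather than a single item from it.
use std::path;

fn main() {
    // Both items are reached through the short module name.
    // The file and directory names are arbitrary examples.
    let source = path::Path::new("src/main.rs");
    let mut other = path::PathBuf::from("src");
    other.push("module_b.rs");

    println!("{}", source.display());
    println!("{}", other.display());
}
```

Writing path::Path and path::PathBuf keeps it obvious where each item comes from, at the cost of a few extra characters per use.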
In many of the preceding examples, we saw a pub keyword. That keyword makes the item it's attached to public, meaning that it is available to code that is not part of the same module. If we omit the pub keyword on an item, that item is private, meaning that it can only be accessed within the module where it is defined. Private is the default, so we need to explicitly mark those items that we want to have as part of the mod...