Heads up This blog post series has been updated and published as an eBook by FP Complete. I'd recommend reading that version instead of these posts. If you're interested, please check out the Rust Crash Course eBook.
Last time, we finished off with a bouncy ball implementation with some downsides: lackluster error handling and ugly buffering. It also had another limitation: a static frame size. Today, we're going to address all of these problems, starting with that last one: let's get some command line arguments to control the frame size.
This post is part of a series based on teaching Rust at FP Complete. If you're reading this post outside of the blog, you can find links to all posts in the series at the top of the introduction post. You can also subscribe to the RSS feed.
Like last time, I'm going to expect you, the reader, to be making changes to the source code along with me. Make sure to actually type in the code while reading!
Command line arguments
We're going to modify our application as follows:
- Accept two command line arguments: the width and the height
- Both must be valid u32s
- Too many or too few command line arguments will result in an error message
Sounds easy enough. In a real application, we would use a proper argument-handling library, like clap. But for now, we're going lower level. Like we did for the sleep function, let's start by searching the standard library docs for the word args. The first two entries both look relevant.
- std::env::Args: An iterator over the arguments of a process, yielding a String value for each argument.
- std::env::args: Returns the arguments which this program was started with (normally passed via the command line).
Now's a good time to mention that, by strong convention:
- Module names (like std and env) and function names (like args) are snake_cased
- Types (like Args) are PascalCased
  - Exception: primitives like u32 and str are lower case
The std module has an env module. The env module has both an Args type and an args function. Why do we need both? Even more strangely, let's look at the type signature for the args function:
pub fn args() -> Args
The args function returns a value of type Args. If Args was a type synonym for, say, a vector of Strings, this would make sense. But that's not the case. And if you check out its docs, there aren't any fields or methods exposed on Args, only trait implementations!
The extra datatype pattern
Maybe there's a proper term for this in Rust, but I haven't seen it myself yet. (If someone has, please let me know so I can use the proper term.) There's a pervasive pattern in the Rust ecosystem, which in my experience starts with iterators and continues to more advanced topics like futures and async I/O.
- We want to have composable interfaces
- We also want high performance
- Therefore, we define lots of helper data types that allow the compiler to perform some great optimizations
- And we define traits as an interface to let these types compose nicely with each other
Sound abstract? Don't worry, we'll make that concrete in a bit. Here's the practical outcome of all of this:
- We end up programming quite a bit against traits, which provide common abstractions and lots of helper functions
- We get a matching data type for many common functions
- Oftentimes, our type signatures will end up being massive, representing all of the different compositions we performed (though the new-ish -> impl Iterator style helps with that significantly, see the announcement blog post for more details)
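To make that last point a little more concrete, here is a minimal sketch comparing a fully spelled-out iterator type with the impl Iterator style. The function names (last_two_explicit, last_two_impl) are made up for this illustration, and rev and take are standard Iterator methods that this post doesn't otherwise cover.

```rust
use std::iter::{Rev, Take};
use std::vec::IntoIter;

// Spelled out: every adaptor we apply shows up in the return type.
// (Hypothetical helper, not part of bouncy.)
fn last_two_explicit(items: Vec<u32>) -> Take<Rev<IntoIter<u32>>> {
    items.into_iter().rev().take(2)
}

// Same body, but the signature only promises "some iterator of u32".
fn last_two_impl(items: Vec<u32>) -> impl Iterator<Item = u32> {
    items.into_iter().rev().take(2)
}

fn main() {
    let a: Vec<u32> = last_two_explicit(vec![1, 2, 3]).collect();
    let b: Vec<u32> = last_two_impl(vec![1, 2, 3]).collect();
    // Both hold the last two items, in reverse order.
    println!("{:?} {:?}", a, b);
}
```

Each extra adaptor adds another layer to the explicit type, while the impl Iterator signature stays the same.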
Alright, with that out of the way, let's get back to command line arguments!
CLI args via iterators
Let's play around in an empty file before coming back to bouncy. (Either use cargo new and cargo run, or use rustc directly, your call.) If I click on the expand button next to the Iterator trait on the Args docs page, I see this function:
fn next(&mut self) -> Option<String>
Let's play with that a bit:
use std::env::args;

fn main() {
    let mut args = args(); // Yes, that name shadowing works
    println!("{:?}", args.next());
    println!("{:?}", args.next());
    println!("{:?}", args.next());
    println!("{:?}", args.next());
}
Notice that we had to use let mut, since the next method will mutate the value. Now I'm going to run this with cargo run foo bar:
$ cargo run foo bar
Compiling args v0.1.0 (/Users/michael/Desktop/tmp/args)
Finished dev [unoptimized + debuginfo] target(s) in 1.60s
Running `target/debug/args foo bar`
Some("target/debug/args")
Some("foo")
Some("bar")
None
Nice! It gives us the name of our executable, followed by the command line arguments, returning None when there's nothing left. (For pedants out there: command line arguments aren't technically required to have the command name as the first argument, it's just a really strong convention most tools follow.)
Let's play with this some more. Can you write a loop that prints out all of the command line arguments and then exits? Take a minute, and then I'll provide some answers.
Alright, done? Cool, let's see some examples! First, we'll loop with return.
use std::env::args;

fn main() {
    let mut args = args();
    loop {
        match args.next() {
            None => return,
            Some(arg) => println!("{}", arg),
        }
    }
}
We also don't need to use return here. Instead of returning from the function, we can just break out of the loop:
use std::env::args;

fn main() {
    let mut args = args();
    loop {
        match args.next() {
            None => break,
            Some(arg) => println!("{}", arg),
        }
    }
}
Or, if you want to save on some indentation, you can use if let.
use std::env::args;

fn main() {
    let mut args = args();
    loop {
        if let Some(arg) = args.next() {
            println!("{}", arg);
        } else {
            break;
            // return would work too, but break is nicer
            // here, as it is more narrowly scoped
        }
    }
}
You can also use while let. Try to guess what that would look like before checking the next example:
use std::env::args;

fn main() {
    let mut args = args();
    while let Some(arg) = args.next() {
        println!("{}", arg);
    }
}
Getting better! Alright, one final example:
use std::env::args;

fn main() {
    for arg in args() {
        println!("{}", arg);
    }
}
Whoa, what?!? Welcome to one of my favorite aspects of Rust. Iterators are a concept built into the language directly, via for loops. A for loop will automate the calling of next(). It also hides away the fact that there's some mutable state at play, at least to some extent. This is a powerful concept, and allows a lot of code to end up with a more functional style, something I happen to be a big fan of.
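To see that the for loop really is doing the same job as our hand-written next() calls, here is a small sketch that collects items both ways and compares the results. A Vec of Strings stands in for the real command line arguments (an assumption made so the example gives the same result on every run), and both function names are made up for this illustration.

```rust
// Collect items by calling next() by hand, as in the while let version.
fn collect_manually(items: Vec<String>) -> Vec<String> {
    let mut iter = items.into_iter(); // mutable state is visible here
    let mut out = Vec::new();
    while let Some(item) = iter.next() {
        out.push(item);
    }
    out
}

// Collect the same items with a for loop; next() is called for us.
fn collect_with_for(items: Vec<String>) -> Vec<String> {
    let mut out = Vec::new();
    for item in items {
        out.push(item);
    }
    out
}

fn main() {
    // Stand-in for real arguments, so the result is deterministic.
    let fake_args = vec!["foo".to_string(), "bar".to_string()];
    assert_eq!(
        collect_manually(fake_args.clone()),
        collect_with_for(fake_args)
    );
    println!("both loops saw the same items");
}
```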
Skipping
It's all well and good that the first argument is the name of the executable. But we typically don't care about that. Can we somehow skip that in our output? Well, here's one approach:
use std::env::args;

fn main() {
    let mut args = args();
    let _ = args.next(); // drop it on the floor
    for arg in args {
        println!("{}", arg);
    }
}
That works, but it's a bit clumsy, especially compared to our previous version that had no mutable variables. Maybe there's some other way to skip things. Let's search the standard library again. I see the first results as std::iter::Skip and std::iter::Iterator::skip. The former is a data type, and the latter is a method on the Iterator trait. Since our Args type implements the Iterator trait, we can use it. Nice!
Side note for Haskellers: skip is like drop in most Haskell libraries, like Data.List or vector. drop has a totally different meaning in Rust (dropping owned data), so skip is a better name in Rust.
Let's look at some signatures from the docs above:
pub struct Skip<I> { /* fields omitted */ }
fn skip(self, n: usize) -> Skip<Self>
Hmm... deep breaths. Skip is a data type that is parameterized over some data type, I. This is a common pattern in iterators: Skip wraps around an existing data type and adds some new functionality to how it iterates. The skip method will consume an existing iterator, take the number of arguments to skip, and return a new Skip<OrigDataType> value. How do I know it consumes the original iterator? The first parameter is self, not &self or &mut self.
That seemed like a lot of concepts. Fortunately, usage is pretty easy:
use std::env::args;

fn main() {
    for arg in args().skip(1) {
        println!("{}", arg);
    }
}
Nice!
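If the "consumes the original iterator" part feels abstract, here is a small sketch of what it means in practice. It uses a Vec of Strings as a stand-in for real command line arguments (an assumption, to keep the output predictable), and skip_command_name is a made-up helper name.

```rust
// Hypothetical helper: drop the first item, keep the rest.
fn skip_command_name(fake_args: Vec<String>) -> Vec<String> {
    let all = fake_args.into_iter();

    // skip takes self by value, so `all` is moved into the new iterator.
    let rest = all.skip(1);

    // Using `all` after this point is a compile error, because it has
    // been moved. Uncomment the next line to see the message:
    // let _count = all.count();

    rest.collect()
}

fn main() {
    let fake_args = vec![
        "target/debug/args".to_string(),
        "foo".to_string(),
        "bar".to_string(),
    ];
    println!("{:?}", skip_command_name(fake_args));
}
```

Skipping more items than the iterator holds is not an error; the resulting iterator is simply empty.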
Exercise 1 Type inference lets the program above work just fine without any type annotations. However, it's a good idea to get used to the generated types, since you'll see them all too often in error messages. Get the program below to compile by fixing the type signature. Try to do it without using the compiler at first, since the error messages will almost give the answer away.
use std::env::{args, Args};
use std::iter::Skip;

fn main() {
    let args: Args = args().skip(1);
    for arg in args {
        println!("{}", arg);
    }
}
This layering-of-datatypes approach, as mentioned above, is a real boon to performance. Iterator-heavy code will often compile down to an efficient loop, comparable with the best hand-rolled loop you could have written. However, iterator code is much higher level, more declarative, and easy to maintain and extend.
There's a lot more to iterators, but we're going to stop there for the moment, since we still want to process our command line parameters, and we need to learn one more thing first.
Parsing integers
If you search the standard library for parse, you'll find the str::parse method. The documentation does a good job of explaining things, so I won't repeat that here. Please go read that now.
OK, you're back? Turbofish is a funny name, right?
Take a crack at writing a program that prints the result of parsing each command line argument as a u32, then check my version:
fn main() {
    for arg in std::env::args().skip(1) {
        println!("{:?}", arg.parse::<u32>());
    }
}
And let's try running it:
$ cargo run one 2 three four 5 6 7
Err(ParseIntError { kind: InvalidDigit })
Ok(2)
Err(ParseIntError { kind: InvalidDigit })
Err(ParseIntError { kind: InvalidDigit })
Ok(5)
Ok(6)
Ok(7)
When the parse is successful, we get the Ok variant of the Result enum. When the parse fails, we get the Err variant, with a ParseIntError telling us what went wrong. (The type signature on parse itself uses some associated types to indicate this type; we're not going to get into that right now.)
This is a common pattern in Rust. Rust has no runtime exceptions, so we track potential failure at the type level with actual values.
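Since failure is just a value, we deal with it using ordinary pattern matching. Here's a minimal sketch; the describe function is a made-up helper for illustration, not something from the standard library.

```rust
// Hypothetical helper: report whether a string parses as a u32.
fn describe(input: &str) -> String {
    match input.parse::<u32>() {
        // Success: the parsed number is inside the Ok variant.
        Ok(n) => format!("{} is a valid u32", n),
        // Failure: the Err variant carries a ParseIntError, which
        // can be displayed with {}.
        Err(e) => format!("{:?} is not a valid u32: {}", input, e),
    }
}

fn main() {
    for input in ["2", "one", "-1"] {
        println!("{}", describe(input));
    }
}
```

Note that "-1" fails too: it's a perfectly good integer, but not a valid u32.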
Side note You may think of panics as similar to runtime exceptions, and to some extent they are. However, you're not able to properly recover from panics, making them different in practice from how runtime exceptions are used in other languages like Python.
Parse our command line
We're finally ready to get started on our actual command line parsing! We're going to be overly tedious in our implementation, especially with our data types. After we finish implementing this in a blank file, we'll move the code into the bouncy implementation itself. First, let's define a data type to hold a successful parse, which will contain the width and the height.
Challenge Will this be a struct or an enum? Can you try implementing this yourself first?
Since we want to hold onto multiple values, we'll be using a struct. I'd like to use named fields, so we have:
struct Frame {
    width: u32,
    height: u32,
}
Next, let's define an error type to represent all of the things that can go wrong during this parse. We have:
- Too few arguments
- Too many arguments
- Invalid integer
Challenge Are we going to use a struct or an enum this time?
This time, we'll use an enum, because we'll only detect one of these problems (whichever we notice first). Aficionados of web forms and applicative parsing may scoff at this and say we should detect all errors, but we're going to be lazy.
enum ParseError {
    TooFewArgs,
    TooManyArgs,
    InvalidInteger(String),
}
Notice that the InvalidInteger variant takes a payload, the String it failed parsing. This is what makes enums in Rust so much more powerful than enumerations in most other languages.
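To get a feel for what that payload buys us, here is a small sketch that pattern matches on the error and pulls the String back out. The explain function and its messages are made up for this illustration.

```rust
#[derive(Debug)]
enum ParseError {
    TooFewArgs,
    TooManyArgs,
    InvalidInteger(String),
}

// Hypothetical helper: turn an error into a message for the user.
fn explain(err: &ParseError) -> String {
    match err {
        ParseError::TooFewArgs => "not enough arguments".to_string(),
        ParseError::TooManyArgs => "too many arguments".to_string(),
        // The pattern binds the payload, so we can show the bad input.
        ParseError::InvalidInteger(s) => format!("not a valid integer: {}", s),
    }
}

fn main() {
    let errors = vec![
        ParseError::TooFewArgs,
        ParseError::TooManyArgs,
        ParseError::InvalidInteger("three".to_string()),
    ];
    for err in &errors {
        println!("{:?}: {}", err, explain(err));
    }
}
```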
Challenge We're going to write a parse_args helper function. Can you guess what its type signature will be?
Combining all of the knowledge we established above, here's an implementation:
#[derive(Debug)]
struct Frame {
    width: u32,
    height: u32,
}

#[derive(Debug)]
enum ParseError {
    TooFewArgs,
    TooManyArgs,
    InvalidInteger(String),
}

fn parse_args() -> Result<Frame, ParseError> {
    use self::ParseError::*; // bring variants into our namespace

    let mut args = std::env::args().skip(1);

    match args.next() {
        None => Err(TooFewArgs),
        Some(width_str) => {
            match args.next() {
                None => Err(TooFewArgs),
                Some(height_str) => {
                    match args.next() {
                        Some(_) => Err(TooManyArgs),
                        None => {
                            match width_str.parse() {
                                Err(_) => Err(InvalidInteger(width_str)),
                                Ok(width) => {
                                    match height_str.parse() {
                                        Err(_) => Err(InvalidInteger(height_str)),
                                        Ok(height) => Ok(Frame {
                                            width,
                                            height,
                                        }),
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

fn main() {
    println!("{:?}", parse_args());
}
Holy nested blocks Batman, that is a lot of indentation! The pattern is pretty straightforward:
- Pattern match
- If we got something bad, stop with an Err
- If we got something good, keep going
Haskellers at this point are screaming about do notation and monads. Ignore them. We're in the land of Rust, we don't take kindly to those things around here. (Someone please yell at me for that terrible pun.)
Exercise 2 Why didn't we need to use the turbofish on the call to parse above?
What we want to do is return early from our function. You know what keyword can help with that? That's right: return!
fn parse_args() -> Result<Frame, ParseError> {
    use self::ParseError::*;

    let mut args = std::env::args().skip(1);

    let width_str = match args.next() {
        None => return Err(TooFewArgs),
        Some(width_str) => width_str,
    };

    let height_str = match args.next() {
        None => return Err(TooFewArgs),
        Some(height_str) => height_str,
    };

    match args.next() {
        Some(_) => return Err(TooManyArgs),
        None => (),
    }

    let width = match width_str.parse() {
        Err(_) => return Err(InvalidInteger(width_str)),
        Ok(width) => width,
    };

    let height = match height_str.parse() {
        Err(_) => return Err(InvalidInteger(height_str)),
        Ok(height) => height,
    };

    Ok(Frame {
        width,
        height,
    })
}
Much nicer to look at! However, it's still a bit repetitive, and littering those returns everywhere is subjectively not very nice. In fact, while typing this up, I accidentally left off a few of the returns and got to stare at some long error messages. (Try that for yourself.)
Question mark
Side note The trailing question mark we're about to introduce used to be the try! macro in Rust. If you're confused about the seeming overlap: it's simply a transition to new syntax.
The pattern above is so common that Rust has built in syntax for it. If you put a question mark after an expression, it basically does the whole match/return-on-Err thing for you. It's more powerful than we'll demonstrate right now, but we'll get to that extra power a bit later.
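As a rough sketch of that equivalence, here are two functions that should behave the same way, one written with an explicit match and early return, the other with the question mark. Both function names are made up for this illustration, and the sketch deliberately ignores the extra power mentioned above.

```rust
use std::num::ParseIntError;

// Written out by hand: match, and return early on Err.
fn double_with_match(s: &str) -> Result<u64, ParseIntError> {
    let n = match s.parse::<u32>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    // Widen to u64 so that doubling can never overflow.
    Ok(u64::from(n) * 2)
}

// The same logic, letting the question mark do the match and return.
fn double_with_question_mark(s: &str) -> Result<u64, ParseIntError> {
    let n = s.parse::<u32>()?;
    Ok(u64::from(n) * 2)
}

fn main() {
    for input in ["21", "oops"] {
        println!("{:?}", double_with_match(input));
        println!("{:?}", double_with_question_mark(input));
    }
}
```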
To start off, we're going to define some helper functions to:
- Require another argument
- Require that there are no more arguments
- Parse a u32
All of these need to return Result values, and we'll use a ParseError for the error case in all of them. The first two functions need to take a mutable reference to our arguments. (As a side note, I'm going to stop using the skip method now, because if I do it will give away the solution to exercise 1.)
use std::env::Args;

fn require_arg(args: &mut Args) -> Result<String, ParseError> {
    match args.next() {
        None => Err(ParseError::TooFewArgs),
        Some(s) => Ok(s),
    }
}

fn require_no_args(args: &mut Args) -> Result<(), ParseError> {
    match args.next() {
        Some(_) => Err(ParseError::TooManyArgs),
        // I think this looks a little weird myself.
        // But we're wrapping up the unit value ()
        // with the Ok variant. You get used to it
        // after a while, I guess
        None => Ok(()),
    }
}

fn parse_u32(s: String) -> Result<u32, ParseError> {
    match s.parse() {
        Err(_) => Err(ParseError::InvalidInteger(s)),
        Ok(x) => Ok(x),
    }
}
Now that we have these helpers defined, our parse_args function is much easier to look at:
fn parse_args() -> Result<Frame, ParseError> {
    let mut args = std::env::args();

    // skip the command name
    let _command_name = require_arg(&mut args)?;

    let width_str = require_arg(&mut args)?;
    let height_str = require_arg(&mut args)?;
    require_no_args(&mut args)?;
    let width = parse_u32(width_str)?;
    let height = parse_u32(height_str)?;

    Ok(Frame { width, height })
}
Beautiful!
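One thing this version makes hard is trying out different inputs without re-running the program. As a sketch of one possible variation (not the approach this post takes), the helpers can be written against any iterator of Strings instead of Args specifically. The parse_args_from and fake names are made up here, and PartialEq is derived only so the results can be compared.

```rust
#[derive(Debug, PartialEq)]
struct Frame {
    width: u32,
    height: u32,
}

#[derive(Debug, PartialEq)]
enum ParseError {
    TooFewArgs,
    TooManyArgs,
    InvalidInteger(String),
}

fn require_arg<I: Iterator<Item = String>>(args: &mut I) -> Result<String, ParseError> {
    match args.next() {
        None => Err(ParseError::TooFewArgs),
        Some(s) => Ok(s),
    }
}

fn require_no_args<I: Iterator<Item = String>>(args: &mut I) -> Result<(), ParseError> {
    match args.next() {
        Some(_) => Err(ParseError::TooManyArgs),
        None => Ok(()),
    }
}

fn parse_u32(s: String) -> Result<u32, ParseError> {
    match s.parse() {
        Err(_) => Err(ParseError::InvalidInteger(s)),
        Ok(x) => Ok(x),
    }
}

// Hypothetical variant of parse_args: the arguments are passed in, so
// std::env::args() and a hand-built list both work.
fn parse_args_from<I: Iterator<Item = String>>(mut args: I) -> Result<Frame, ParseError> {
    require_arg(&mut args)?; // skip the command name
    let width_str = require_arg(&mut args)?;
    let height_str = require_arg(&mut args)?;
    require_no_args(&mut args)?;
    let width = parse_u32(width_str)?;
    let height = parse_u32(height_str)?;
    Ok(Frame { width, height })
}

// Hypothetical helper: build a fake argument list from string slices.
fn fake(args: &[&str]) -> std::vec::IntoIter<String> {
    args.iter()
        .map(|s| s.to_string())
        .collect::<Vec<String>>()
        .into_iter()
}

fn main() {
    println!("{:?}", parse_args_from(fake(&["bouncy", "5", "7"])));
    println!("{:?}", parse_args_from(fake(&["bouncy", "5"])));
    println!("{:?}", parse_args_from(fake(&["bouncy", "five", "7"])));
}
```

In a real program, the call would be parse_args_from(std::env::args()).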
Forgotten question marks
What do you think happens if you forget the question mark on the let width_str line? If you do so:
- width_str will contain a Result<String, ParseError> instead of a String
- The call to parse_u32 will not type check
error[E0308]: mismatched types
--> src/main.rs:50:27
|
50 | let width = parse_u32(width_str)?;
| ^^^^^^^^^ expected struct `std::string::String`, found enum `std::result::Result`
|
= note: expected type `std::string::String`
found type `std::result::Result<std::string::String, ParseError>`
That's nice. But what will happen if we forget the question mark on the require_no_args call? We never use the output value there, so it will type check just fine. Now we have the age old problem of C: we're accidentally ignoring error codes!
Well, not so fast. Check out this wonderful warning from the compiler:
warning: unused `std::result::Result` which must be used
--> src/main.rs:49:5
|
49 | require_no_args(&mut args);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: #[warn(unused_must_use)] on by default
= note: this `Result` may be an `Err` variant, which should be handled
That's right: Rust will detect if you've ignored a potential failure. There is a hole in this in the current code sample:
let _command_name = require_arg(&mut args);
That doesn't trigger the warning, since in let _name = blah;, the leading underscore says "I know what I'm doing, I don't care about this value." Instead, it's better to write the code without the let:
require_arg(&mut args);
Now we get a warning, which can be solved with the added trailing question mark.
Exercise 3
It would be more convenient to use method call syntax. Let's define a helper data type to make this possible. Fill in the implementation of the code below.
#[derive(Debug)]
struct Frame {
    width: u32,
    height: u32,
}

#[derive(Debug)]
enum ParseError {
    TooFewArgs,
    TooManyArgs,
    InvalidInteger(String),
}

struct ParseArgs(std::env::Args);

impl ParseArgs {
    fn new() -> ParseArgs {
        unimplemented!()
    }

    fn require_arg(&mut self) -> Result<String, ParseError> {
        match self.0.next() {
            // exercise: add the match arms here
        }
    }

    fn require_no_args(&mut self) -> Result<(), ParseError> {
        unimplemented!()
    }
}

fn parse_args() -> Result<Frame, ParseError> {
    let mut args = ParseArgs::new();

    // skip the command name
    args.require_arg()?;

    let width_str = args.require_arg()?;
    let height_str = args.require_arg()?;
    args.require_no_args()?;
    let width = parse_u32(width_str)?;
    let height = parse_u32(height_str)?;

    Ok(Frame { width, height })
}

fn main() {
    println!("{:?}", parse_args());
}
Updating bouncy
This next bit should be done as a Cargo project, not with rustc. Let's start a new empty project:
$ cargo new bouncy-args --bin
$ cd bouncy-args
Next, let's get the old code and place it in src/main.rs. You can copy-paste manually, or run:
$ curl https://gist.githubusercontent.com/snoyberg/5307d493750d7b48c1c5281961bc31d0/raw/8f467e87f69a197095bda096cbbb71d8d813b1d7/main.rs > src/main.rs
Run cargo run and make sure it works. You can use Ctrl-C to kill the program.
We already wrote fully usable argument parsing code above. Instead of putting it in the same source file, let's put it in its own file. In order to do so, we're going to have to play with modules in Rust.
For convenience, you can view the full source code as a Gist. We need to put this in src/parse_args.rs:
$ curl https://gist.githubusercontent.com/snoyberg/568899dc3ae6c82e54809efe283e4473/raw/2ee261684f81745b21e571360b1c5f5d77b78fce/parse_args.rs > src/parse_args.rs
If you run cargo build now, it won't even look at parse_args.rs. Don't believe me? Add some invalid content to the top of that file and run cargo build again. Nothing happens, right? We need to tell the compiler that we've got another module in our project. We do that by modifying src/main.rs. Add the following line to the top of your file:
mod parse_args;
If you put in that invalid line before, running cargo build should now result in an error message. Perfect! Go ahead and get rid of that invalid line and make sure everything compiles and runs. We won't be accepting command line arguments yet, but we're getting closer.
Use it!
We're currently getting some dead code warnings, since we aren't using anything from the new module:
warning: struct is never constructed: `Frame`
--> src/parse_args.rs:2:1
|
2 | struct Frame {
| ^^^^^^^^^^^^
|
= note: #[warn(dead_code)] on by default
warning: enum is never used: `ParseError`
--> src/parse_args.rs:8:1
|
8 | enum ParseError {
| ^^^^^^^^^^^^^^^
warning: function is never used: `parse_args`
--> src/parse_args.rs:14:1
|
14 | fn parse_args() -> Result<Frame, ParseError> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Let's fix that. To start off, add the following to the top of your main function, just to prove that we can, in fact, use our new module:
println!("{:?}", parse_args::parse_args());
return; // don't start the game, our output will disappear
Also, add a pub in front of the items we want to access from the main.rs file, namely:
- struct Frame
- enum ParseError
- fn parse_args
Running this gets us:
$ cargo run
Compiling bouncy-args v0.1.0 (/Users/michael/Desktop/tmp/bouncy-args)
warning: unreachable statement
--> src/main.rs:115:5
|
115 | let mut game = Game::new();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: #[warn(unreachable_code)] on by default
warning: variable does not need to be mutable
--> src/main.rs:115:9
|
115 | let mut game = Game::new();
| ----^^^^
| |
| help: remove this `mut`
|
= note: #[warn(unused_mut)] on by default
Finished dev [unoptimized + debuginfo] target(s) in 0.67s
Running `target/debug/bouncy-args`
Err(TooFewArgs)
It's nice that we get an unreachable statement warning. It's also a bit weird that game is no longer required to be mutable. Strange. But most importantly: our argument parsing is working!
Let's try to use this. We'll modify the Game::new() method to accept a Frame as input:
impl Game {
    fn new(frame: Frame) -> Game {
        let ball = Ball {
            x: 2,
            y: 4,
            vert_dir: VertDir::Up,
            horiz_dir: HorizDir::Left,
        };
        Game { frame, ball }
    }
    ...
}
And now we can rewrite our main function as:
fn main() {
    match parse_args::parse_args() {
        Err(e) => {
            // prints to stderr instead of stdout
            eprintln!("Error parsing args: {:?}", e);
        },
        Ok(frame) => {
            let mut game = Game::new(frame);
            let sleep_duration = std::time::Duration::from_millis(33);
            loop {
                println!("{}", game);
                game.step();
                std::thread::sleep(sleep_duration);
            }
        }
    }
}
Mismatched types
We're good, right? Not quite:
error[E0308]: mismatched types
--> src/main.rs:114:38
|
114 | let mut game = Game::new(frame);
| ^^^^^ expected struct `Frame`, found struct `parse_args::Frame`
|
= note: expected type `Frame`
found type `parse_args::Frame`
We now have two different definitions of Frame: in our parse_args module, and in main.rs. Let's fix that. First, delete the Frame declaration in main.rs. Then add the following after our mod parse_args; statement:
use self::parse_args::Frame;
self says we're finding a module that's a child of the current module.
Public and private
Now everything will work, right? Wrong again! cargo build will vomit a bunch of these errors:
error[E0616]: field `height` of struct `parse_args::Frame` is private
--> src/main.rs:85:23
|
85 | for row in 0..self.frame.height {
|
By default, identifiers are private in Rust. In order to expose them from one module to another, you need to add the pub keyword. For example:
pub width: u32,
Go ahead and add pub as needed. Finally, if you run cargo run, you should see Error parsing args: TooFewArgs. And if you run cargo run 5 5, you should see a much smaller frame than before. Hurrah!
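If you'd like to poke at the visibility rules without juggling two files, here is a single-file sketch that uses an inline module in place of src/parse_args.rs. The default_frame and area functions, and the 60 by 30 size, are made up for this illustration.

```rust
mod parse_args {
    // pub on the struct makes the type visible outside the module...
    #[derive(Debug)]
    pub struct Frame {
        // ...and pub on each field makes that field usable from outside.
        pub width: u32,
        pub height: u32,
    }

    // Hypothetical constructor, here only to keep the example
    // self-contained. The size is an arbitrary choice.
    pub fn default_frame() -> Frame {
        Frame { width: 60, height: 30 }
    }
}

use self::parse_args::Frame;

// Lives outside the module, so it only compiles because the fields are
// pub. Try removing a pub above to see error E0616 for yourself.
fn area(frame: &Frame) -> u32 {
    frame.width * frame.height
}

fn main() {
    let frame = parse_args::default_frame();
    println!("{:?} has area {}", frame, area(&frame));
}
```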
Exercise 4
What happens if you run cargo run 0 0? How about cargo run 1 1? Put in some better error handling in parse_args.
Exit code
Alright, one final irritation. Let's provide some invalid arguments and inspect the exit code of the process:
$ cargo run 5
Error parsing args: TooFewArgs
$ echo $?
0
For those not familiar: a 0 exit code means everything went OK. That's clearly not the case here! If we search the standard library, it seems that std::process::exit can be used to address this. Go ahead and try using that to solve the problem here.
However, we've got one more option: we can return a Result straight from main!
fn main() -> Result<(), self::parse_args::ParseError> {
    match parse_args::parse_args() {
        Err(e) => {
            return Err(e);
        },
        Ok(frame) => {
            let mut game = Game::new(frame);
            let sleep_duration = std::time::Duration::from_millis(33);
            loop {
                println!("{}", game);
                game.step();
                std::thread::sleep(sleep_duration);
            }
        }
    }
}
Exercise 5 Can you do something to clean up the nesting a bit here?
Better error handling
The error handling problem we had in the last lesson involved the call to top_bottom. I've already included a solution to that in the download of the code provided. Guess what I changed since last time and then check the code to confirm that you're right.
If you're following very closely, you may be surprised that there aren't more warnings about unused Result values coming from other calls to write!. As far as I can tell, this is in fact a bug in the Rust compiler.
Still, it would be good practice to fix up those calls to write!. Take a stab at doing so.
Next time
We still didn't fix our double buffering problem; we'll get to that next time. We're also going to introduce some more error handling from the standard library. And maybe we'll get to play a bit more with iterators as well.