Introduction to Rust

February 15, 2020

Many of my recent projects are written in Rust, a new-ish systems programming language with the stated goal of “empowering everyone to build reliable and efficient software.” I’m a big fan, and so are others. I think Rust gets a lot of things right on the tooling and language side that make it fun to program in.

There are a ton of cool Rust libraries out there, and if you want to use any of them, you’ll have to learn at least a bit of the language. Lucky for you, Rust offers a variety of ways to learn, including documentation, a book, and an interactive course. There are also myriad other blog posts out there on learning Rust, many of which are probably better than this one.

This post is targeted at people who have experience in several programming languages and just want to start typing something soon. You should definitely read the book, but maybe this will get you off the ground faster. We’ll implement a simple arithmetic interpreter and explore some of the more important types and concepts you’ll see when interacting with Rust code.

First Steps

First off, you’ve got to install Rust. The install page should do nicely.

You almost definitely want to use this method as opposed to what’s packaged with your OS. This installs rustup, Rust’s toolchain management tool, as well as the latest stable toolchain which includes the compiler (rustc) and build tool (cargo). You’ll really only ever interact with cargo in the regular course of working with Rust.

Now let’s create a new project with cargo new --lib demo. This makes a new directory called demo and sets it up to make a Rust library.

Inside demo, you should find two files: Cargo.toml, which describes the package, and src/lib.rs, which holds the library’s code.

Notice how lib.rs is already set up for you with a test marked with the #[test] attribute. This test lives in a module called tests that is conditionally compiled only in test mode.

There are a couple of ways to build your project: cargo build compiles the library, and cargo test compiles it and runs the tests.

By default, cargo build and cargo test compile your code without optimizations. Add the --release flag to compile with optimizations.

Arithmetic

Ok, let’s get started. At the top of lib.rs, outside of the tests module, add the following:

enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

fn eval(expr: Expr) -> i32 {
    match expr {
        Expr::Num(i) => i,
        Expr::Add(l, r) => eval(*l) + eval(*r),
        Expr::Mul(l, r) => eval(*l) * eval(*r),
    }
}

This should look mostly familiar if you’ve done much functional programming. I’ll try to cover some Rust-isms that may be less familiar.

In Rust, enums are algebraic data types, so variants can carry data or even be generic. To create or match on an enum, you must use the :: syntax to refer to its variants. Alternatively, you could add use Expr::*; to bring the variants into scope.
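As a small sketch of that alternative (assuming the same Expr as above, repeated here so the snippet stands alone), the import can live inside the function body, which keeps the short variant names local to eval:

```rust
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

fn eval(expr: Expr) -> i32 {
    // Bring the variants into scope for this function only.
    use Expr::*;
    match expr {
        Num(i) => i,
        Add(l, r) => eval(*l) + eval(*r),
        Mul(l, r) => eval(*l) * eval(*r),
    }
}
```

Constructing values outside eval still needs the Expr:: prefix, since the use only applies inside the function.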

Our eval function contains expressions rather than statements. Rust has both, and they come together in the very common block syntactic form. A block consists of statements followed by an optional expression, and it evaluates to the value of that expression (or to () if there wasn’t one). For example, we could equivalently write the Add arm of eval:

Expr::Add(l, r) => {
    let i1 = eval(*l);
    let i2 = eval(*r);
    i1 + i2
}
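Blocks work the same way outside of match arms. Here is a minimal sketch (the function names are made up for illustration) showing a block used as the right-hand side of a let, and a block that evaluates to () because it has no final expression:

```rust
fn block_value() -> i32 {
    // The block's final expression (no trailing semicolon) is its value.
    let product = {
        let a = 2;
        let b = 3;
        a * b
    };
    product
}

fn block_unit() {
    // Every line inside ends in a semicolon, so this block evaluates to ().
    let unit: () = {
        let _ignored = 1;
    };
    unit
}
```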

Note the need for Box in the recursive parts of the enum declaration. Rust gives you great control over things like memory layout; types are all “in line” by default. This is normally great for performance, but it can cause trouble with recursive data structures. If you removed the Boxes, you’d get a (pretty helpful!) error like the following:

error[E0072]: recursive type `Expr` has infinite size
 --> src/lib.rs:1:1
  |
1 | enum Expr {
  | ^^^^^^^^^ recursive type has infinite size
2 |     Num(i32),
3 |     Add(Expr, Expr),
  |         ----  ---- recursive without indirection
  |         |
  |         recursive without indirection
  |
  = help: insert indirection (e.g., a `Box`, `Rc`, or `&`)
          at some point to make `Expr` representable

At this point you should be getting warnings about unused types and functions. Rust sees you’re building a library, and there is code with no public path to it. You can get rid of these by prefixing enum Expr and fn eval with pub to export them from the library.

Borrowing and Ownership

Borrowing and ownership are likely to be the most confusing concepts if you’re coming from another programming language. Again, the chapter in the Rust book is the gold standard here, but this quick run-through will suffice for what we’re doing.

In Rust, pretty much anything you do—passing arguments, making assignments, and so on—consumes the given values. So the following variation on our eval function fails to compile:

Expr::Add(l, r) => {
    let i1 = eval(*l);
    let i2 = eval(*r);
    let _  = eval(*r);
    i1 + i2
}

We’ve consumed *r, but then we try to use it again! Once more, the compiler steps in with a helpful message:

error[E0382]: use of moved value: `*r`
  --> src/lib.rs:16:27
   |
15 |             let i2 = eval(*r);
   |                           -- value moved here
16 |             let _  = eval(*r);
   |                           ^^ value used here after move
   |
   = note: move occurs because `*r` has type `Expr`, which does not implement the `Copy` trait

It mentions the Copy trait which gives some types copy semantics instead of move semantics. Primitives (i32, u8, f64, bool, etc.) and immutable references (which we’ll see in a moment) implement Copy.
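A quick sketch of the difference, using an i32 (which is Copy) and a String (which is not). The function name is just for illustration:

```rust
fn copy_and_move() -> (i32, i32, String) {
    let a = 5;
    let b = a; // i32 is Copy: `a` is still usable afterwards

    let s = String::from("hi");
    let t = s; // String is not Copy: `s` has been moved into `t`
    // Uncommenting the next line would fail with error[E0382]:
    // println!("{}", s);

    (a, b, t)
}
```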

Conceptually, we’ve designed the eval function to take ownership of its argument, but we could instead implement it to work on a borrowed Expr. This would be more permissive to the caller and more idiomatic; it expresses the fact that I don’t need to take an Expr to eval it, I just need to look at it.

fn eval(expr: &Expr) -> i32 {
    match expr {
        Expr::Num(i) => *i,
        Expr::Add(l, r) => eval(l) + eval(r),
        Expr::Mul(l, r) => eval(l) * eval(r),
    }
}

The differences here are subtle and worth unpacking. We’ve changed the type of the parameter expr from an owned Expr to an immutably borrowed &Expr. We can still match on a borrowed value, but this time a little bit of magic is happening. When you match on a reference, Rust basically gives you borrowed versions of the variant’s fields. For example, the i in the arm Expr::Num(i) is of type &i32, so we must dereference it to get an i32. This works because i32 implements Copy, so deref-ing copies instead of moves. In the other arms, Rust cannot give you the Box<Expr>s for l and r, because you don’t own the expr. Instead, l and r are of type &Box<Expr>. That looks awkward, since it’s essentially a pointer to a pointer, but Rust’s deref coercion automatically turns a &Box<Expr> into a &Expr wherever a &Expr is expected, allowing us to make the recursive call without further modification.
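If you want to check the binding types for yourself, one trick is to annotate them explicitly and see what the compiler accepts. In this sketch the annotations are only there for illustration. Strictly speaking, l and r come out as &Box<Expr>, and the recursive calls compile because a &Box<Expr> coerces to a &Expr where one is expected:

```rust
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

fn eval(expr: &Expr) -> i32 {
    match expr {
        Expr::Num(i) => {
            // Matching through a reference binds the field by reference.
            let i: &i32 = i;
            *i
        }
        Expr::Add(l, r) => {
            let l: &Box<Expr> = l;
            let r: &Box<Expr> = r;
            // Deref coercion turns each &Box<Expr> into the &Expr that eval expects.
            eval(l) + eval(r)
        }
        Expr::Mul(l, r) => eval(l) * eval(r),
    }
}
```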

We can write a test that shows we can indeed eval the same Expr twice:

#[cfg(test)]
mod tests {
    // import everything from the parent of the tests module
    // this is very common in tests
    use super::*;

    #[test]
    fn take_a_borrow_twice() {
        let e = Expr::Add(
            Box::new(Expr::Num(1)),
            Box::new(Expr::Num(3)),
        );
        assert_eq!(eval(&e), 4);
        assert_eq!(eval(&e), 4);
    }
}

Impl

Our eval is a free-standing function, but it’s better suited as a method of Expr. Methods are nice because you can use the foo.bar() syntax, and you don’t have to import them (Rust can figure out what you’re calling based on the receiver type).

Let’s convert eval to a method:

impl Expr {
    fn eval(&self) -> i32 {
        match self {
            Expr::Num(i) => *i,
            Expr::Add(l, r) => l.eval() + r.eval(),
            Expr::Mul(l, r) => l.eval() * r.eval(),
        }
    }
}

We’ve changed eval’s single parameter to be its receiver, using the special syntax for defining a method that takes its receiver by immutable reference. Methods can also take a mutable reference (&mut self) or even consume the value (self). We also changed the recursive calls to use the method syntax.
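Here is a sketch of all three receiver kinds side by side. Counter is a hypothetical type invented for this example; it is not part of our interpreter:

```rust
// Hypothetical type, only here to show the three receiver kinds.
struct Counter {
    count: u32,
}

impl Counter {
    // &self: read-only access, the caller keeps the value.
    fn get(&self) -> u32 {
        self.count
    }

    // &mut self: may modify the value in place.
    fn bump(&mut self) {
        self.count += 1;
    }

    // self: consumes the value, so it cannot be used afterwards.
    fn into_count(self) -> u32 {
        self.count
    }
}

fn receiver_demo() -> (u32, u32) {
    let mut c = Counter { count: 0 };
    c.bump();
    c.bump();
    let seen = c.get();
    let last = c.into_count(); // `c` is moved here and is gone afterwards
    (seen, last)
}
```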

Note that you can still use the function-style syntax, it’s just namespaced:

let e = Expr::Add(
    Box::new(Expr::Num(1)),
    Box::new(Expr::Num(3)),
);
assert_eq!(e.eval(), 4);
assert_eq!(Expr::eval(&e), 4);

See how e is an owned Expr, and eval takes its argument by reference, but we don’t need to borrow to call the method? The method syntax also buys you automatic borrowing of the receiver.

This is called an inherent method of Expr, as opposed to a trait method that belongs to a trait that Expr happens to implement. Let’s implement the extremely common Clone trait for Expr:

impl Clone for Expr {
    fn clone(&self) -> Self {
        match self {
            Expr::Num(i) => Expr::Num(*i),
            Expr::Add(l, r) => Expr::Add(l.clone(), r.clone()),
            Expr::Mul(l, r) => Expr::Mul(l.clone(), r.clone()),
        }
    }
}

#[test]
fn clone_it() {
    let two = Box::new(Expr::Num(2));
    // don't need to clone the second time,
    // moving is fine as long as we don't use it after
    let e = Expr::Add(two.clone(), two);
    assert_eq!(e.eval(), 4);
}

The Clone trait allows you to explicitly (unlike Copy, which is implicit) copy things. The copy is apparent at the type level: clone takes a borrowed value and gives you the real deal. This is an extremely boring trait implementation, and thankfully Rust has features to let us automatically derive many common traits. Let’s go ahead and derive Clone for Expr (deleting our implementation), as well as the very common Debug, PartialEq, and Eq traits:

#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}
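A short sketch of what the derives buy us (the function name is made up): Clone gives us clone, PartialEq gives us ==, and Debug gives us the {:?} format specifier.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

fn derived_demo() -> (bool, String) {
    let e = Expr::Add(Box::new(Expr::Num(1)), Box::new(Expr::Num(2)));
    let copy = e.clone(); // from Clone
    let same = e == copy; // from PartialEq
    let shown = format!("{:?}", e); // from Debug
    (same, shown)
}
```

The derived Debug output for this value is Add(Num(1), Num(2)), which is also what assert_eq! prints when a comparison fails.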

Error Handling

Let’s make our little language more complex, so it’s possible for eval to fail. We’ll see how Rust uses the Result type to handle errors.

First, let’s do it the easy way and just crash the program on an error. Let’s evaluate Exprs down to Values, which are either numbers or booleans. We’ll replace Mul with LessThan to make things more interesting.

#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    Value(Value),
    Add(Box<Expr>, Box<Expr>),
    LessThan(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Value {
    Num(i32),
    Bool(bool),
}

impl Expr {
    fn eval(&self) -> Value {
        match self {
            Expr::Value(v) => v.clone(),
            Expr::Add(l, r) => {
                let nl = l.eval().to_num();
                let nr = r.eval().to_num();
                Value::Num(nl + nr)
            }
            Expr::LessThan(l, r) => {
                let nl = l.eval().to_num();
                let nr = r.eval().to_num();
                Value::Bool(nl < nr)
            }
        }
    }
}

We’ll need some helper methods on Value to forcibly get the right types out. These will crash the program if called on the wrong Value.

impl Value {
    fn to_num(&self) -> i32 {
        match self {
            Value::Num(i) => *i,
            Value::Bool(b) => panic!("Expected num, got bool {}", b),
        }
    }

    fn to_bool(&self) -> bool {
        match self {
            Value::Num(i) => panic!("Expected bool, got num {}", i),
            Value::Bool(b) => *b,
        }
    }
}

And finally, let’s check to make sure that everything is still working. Writing tests is a little verbose now, but we’ll fix that once we get to generics.

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn do_it() {
        let two = Box::new(Expr::Value(Value::Num(2)));
        let zero = Box::new(Expr::Value(Value::Num(0)));

        let e1 = Expr::Add(two.clone(), zero.clone());
        assert_eq!(e1.eval().to_num(), 2);

        let e2 = Expr::LessThan(two.clone(), zero.clone());
        assert_eq!(e2.eval().to_bool(), false);
    }
}

So we’ve introduced some errors, now let’s handle them. The first step is to stop crashing the program! We’ll need to use the Result type. A Result<T, E> is generic over T, the type of the value if things went ok, and E, the error type. For now we’ll just use String as the error type. Let’s make our to_num and to_bool methods return Results.

impl Value {
    fn to_num(&self) -> Result<i32, String> {
        match self {
            Value::Num(i) => Ok(*i),
            Value::Bool(b) => Err(format!("Expected num, got bool {}", b)),
        }
    }

    fn to_bool(&self) -> Result<bool, String> {
        match self {
            Value::Num(i) => Err(format!("Expected bool, got num {}", i)),
            Value::Bool(b) => Ok(*b),
        }
    }
}

Great, now the helper methods don’t crash. Instead they return Results, which can tell us if they worked properly or not. Now we have to update our eval method. It should return Result<Value, String> now, since it can fail as well.

In the recursive cases, we want to propagate any errors returned by the recursive call or the to_num or to_bool conversions. We could do this with a bunch of nested matches, but that would be pretty nasty. You could also use some of the many combinators on the Result type, but instead, we’ll use the handy ? operator:

impl Expr {
    fn eval(&self) -> Result<Value, String> {
        match self {
            Expr::Value(v) => Ok(v.clone()),
            Expr::Add(l, r) => {
                let nl = l.eval()?.to_num()?;
                let nr = r.eval()?.to_num()?;
                Ok(Value::Num(nl + nr))
            }
            Expr::LessThan(l, r) => {
                let nl = l.eval()?.to_num()?;
                let nr = r.eval()?.to_num()?;
                Ok(Value::Bool(nl < nr))
            }
        }
    }
}

Spot the difference? The ? might seem like magic, but it’s actually a pretty straightforward transformation. Essentially, x? desugars to:

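match x {
    Ok(ok) => ok,
    // the error is wrapped back up in Err (and converted
    // to the function's error type, if needed)
    Err(err) => return Err(err.into()),
}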

The ? is inserting early returns into our function, bailing if the given expression is an Err. Note that this only works on expressions of type Result inside functions that return Result (? works analogously for Option, which we won’t need here). This gives us a very concise way to propagate errors.
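For comparison, here is a sketch of the addition logic with the early returns written out by hand, next to the ? version. To keep it short, the two helpers (whose names are mine) take operands that have already been evaluated instead of calling eval:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Value {
    Num(i32),
    Bool(bool),
}

impl Value {
    fn to_num(&self) -> Result<i32, String> {
        match self {
            Value::Num(i) => Ok(*i),
            Value::Bool(b) => Err(format!("Expected num, got bool {}", b)),
        }
    }
}

// Hypothetical helper: propagates errors with nested matches.
fn add_by_hand(l: Result<Value, String>, r: Result<Value, String>) -> Result<Value, String> {
    let nl = match l {
        Ok(v) => match v.to_num() {
            Ok(n) => n,
            Err(e) => return Err(e),
        },
        Err(e) => return Err(e),
    };
    let nr = match r {
        Ok(v) => match v.to_num() {
            Ok(n) => n,
            Err(e) => return Err(e),
        },
        Err(e) => return Err(e),
    };
    Ok(Value::Num(nl + nr))
}

// The same logic, with `?` doing the early returns.
fn add_with_question_mark(
    l: Result<Value, String>,
    r: Result<Value, String>,
) -> Result<Value, String> {
    let nl = l?.to_num()?;
    let nr = r?.to_num()?;
    Ok(Value::Num(nl + nr))
}
```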

Swell, now we can test to make sure the error handling is doing the right thing:

let two = Box::new(Expr::Value(Value::Num(2)));
let zero = Box::new(Expr::Value(Value::Num(0)));
let tru = Box::new(Expr::Value(Value::Bool(true)));

let e1 = Expr::Add(two.clone(), zero.clone());
assert_eq!(e1.eval().unwrap().to_num(), Ok(2));

let e2 = Expr::LessThan(two.clone(), zero);
assert_eq!(e2.eval().unwrap().to_bool(), Ok(false));

let e3 = Expr::LessThan(two, tru);
assert!(e3.eval().is_err());

Note the use of the unwrap method to get a T out of a Result<T, E>, crashing if it’s an error. Also, we can check to see if a Result is an error (like in the e3 assertion) with the is_err method.
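Outside of tests you will usually want something gentler than unwrap. This sketch uses a made-up safe_div function (not part of the interpreter) to show two common options: matching on the Result directly, and supplying a fallback with unwrap_or.

```rust
// Hypothetical fallible function, unrelated to the interpreter.
fn safe_div(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err(String::from("division by zero"))
    } else {
        Ok(a / b)
    }
}

// Handle both cases explicitly instead of crashing on an error.
fn describe(r: &Result<i32, String>) -> String {
    match r {
        Ok(n) => format!("ok: {}", n),
        Err(e) => format!("error: {}", e),
    }
}

// Fall back to a default value when the division fails.
fn div_or_default(a: i32, b: i32) -> i32 {
    safe_div(a, b).unwrap_or(-1)
}
```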

Generics and the From Trait

You may have noticed that actually making expressions is a little verbose. For example, making the Expr corresponding to 1 + (2 + 3) in a single Rust expression looks like this:

let e: Expr = Expr::Add(
    Box::new(Expr::Value(Value::Num(1))),
    Box::new(Expr::Add(
        Box::new(Expr::Value(Value::Num(2))),
        Box::new(Expr::Value(Value::Num(3))),
    )),
);

We can make this a lot better by taking advantage of the important From and Into traits to do conversion between types.

Let’s start with values. We want to make it so i32s and bools know how to make themselves into Values. In other words, Value should know how to make one of itself from an i32 or bool. Here’s the needed impl for From<i32>; the one for From<bool> is analogous:

impl From<i32> for Value {
    fn from(i: i32) -> Value {
        Value::Num(i)
    }
}
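For completeness, here is a sketch with both impls spelled out, plus the two ways to call them. Implementing From gets you the matching Into for free, so both Value::from(x) and x.into() work. The to_values function is just for illustration:

```rust
#[derive(Debug, PartialEq, Eq)]
enum Value {
    Num(i32),
    Bool(bool),
}

impl From<i32> for Value {
    fn from(i: i32) -> Value {
        Value::Num(i)
    }
}

impl From<bool> for Value {
    fn from(b: bool) -> Value {
        Value::Bool(b)
    }
}

fn to_values() -> (Value, Value) {
    let n = Value::from(7); // calling From directly
    let b: Value = true.into(); // Into comes for free with From
    (n, b)
}
```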

In a similar vein, we should know how to get an Expr from a Value:

impl From<Value> for Expr {
    fn from(v: Value) -> Expr {
        Expr::Value(v)
    }
}

But we can actually do one better. Not only can we get an Expr from a Value, but we can get an Expr from anything that knows how to convert itself into a Value. So instead of the above impl From<Value>, we can do this with generics:

impl<V: Into<Value>> From<V> for Expr {
    fn from(v: V) -> Expr {
        Expr::Value(v.into())
    }
}

In plain English, I would read this as “for all Vs that implement Into<Value>, we’re going to implement From<V> for Expr.”
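A sketch of what the generic impl allows, assuming the From<i32> and From<bool> impls for Value from before (repeated here so the snippet stands alone). Expr is trimmed to a single variant to keep things short, and the conversions function is only illustrative:

```rust
#[derive(Debug, PartialEq, Eq)]
enum Value {
    Num(i32),
    Bool(bool),
}

// Trimmed to one variant for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum Expr {
    Value(Value),
}

impl From<i32> for Value {
    fn from(i: i32) -> Value {
        Value::Num(i)
    }
}

impl From<bool> for Value {
    fn from(b: bool) -> Value {
        Value::Bool(b)
    }
}

impl<V: Into<Value>> From<V> for Expr {
    fn from(v: V) -> Expr {
        Expr::Value(v.into())
    }
}

fn conversions() -> (Expr, Expr, Expr) {
    let from_int: Expr = 3.into(); // i32 -> Value -> Expr
    let from_bool = Expr::from(true); // bool -> Value -> Expr
    // A Value is trivially Into<Value>, so the generic impl covers it too.
    let from_value = Expr::from(Value::Num(3));
    (from_int, from_bool, from_value)
}
```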

Now that we’ve implemented these conversion traits, we can use them to write a handy generic function that will let us easily create expressions. Rust has some equivalent ways to express trait bounds on generics, so I’ll write the same function three ways.

fn add<T1: Into<Expr>, T2: Into<Expr>>(l: T1, r: T2) -> Expr {
    Expr::Add(Box::new(l.into()), Box::new(r.into()))
}

fn add<T1, T2>(l: T1, r: T2) -> Expr
where
    T1: Into<Expr>,
    T2: Into<Expr>,
{
    Expr::Add(Box::new(l.into()), Box::new(r.into()))
}

fn add(l: impl Into<Expr>, r: impl Into<Expr>) -> Expr {
    Expr::Add(Box::new(l.into()), Box::new(r.into()))
}

The different syntaxes fit better in situations where the generics are more or less complex. Choose your favorite for now. Finally, we can use this to make Exprs:

let e1 = Expr::Add(
    Box::new(Expr::Value(Value::Num(1))),
    Box::new(Expr::Add(
        Box::new(Expr::Value(Value::Num(2))),
        Box::new(Expr::Value(Value::Num(3))),
    )),
);
let e2 = add(1, add(2, 3));
assert_eq!(e1, e2);

That’s much better, right?
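To tie the pieces together, here is a sketch of the whole thing in one place. The less_than helper is my own addition, mirroring add, and only to_num is included since nothing here needs to_bool:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Value {
    Num(i32),
    Bool(bool),
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    Value(Value),
    Add(Box<Expr>, Box<Expr>),
    LessThan(Box<Expr>, Box<Expr>),
}

impl From<i32> for Value {
    fn from(i: i32) -> Value {
        Value::Num(i)
    }
}

impl From<bool> for Value {
    fn from(b: bool) -> Value {
        Value::Bool(b)
    }
}

impl<V: Into<Value>> From<V> for Expr {
    fn from(v: V) -> Expr {
        Expr::Value(v.into())
    }
}

impl Value {
    fn to_num(&self) -> Result<i32, String> {
        match self {
            Value::Num(i) => Ok(*i),
            Value::Bool(b) => Err(format!("Expected num, got bool {}", b)),
        }
    }
}

impl Expr {
    fn eval(&self) -> Result<Value, String> {
        match self {
            Expr::Value(v) => Ok(v.clone()),
            Expr::Add(l, r) => Ok(Value::Num(l.eval()?.to_num()? + r.eval()?.to_num()?)),
            Expr::LessThan(l, r) => Ok(Value::Bool(l.eval()?.to_num()? < r.eval()?.to_num()?)),
        }
    }
}

fn add(l: impl Into<Expr>, r: impl Into<Expr>) -> Expr {
    Expr::Add(Box::new(l.into()), Box::new(r.into()))
}

// Hypothetical helper, built the same way as `add`.
fn less_than(l: impl Into<Expr>, r: impl Into<Expr>) -> Expr {
    Expr::LessThan(Box::new(l.into()), Box::new(r.into()))
}
```

With these in place, less_than(add(1, 2), 4).eval() gives Ok(Value::Bool(true)), and add(1, true).eval() gives an Err, since true can’t be turned into a number.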

Conclusion

So that’s the quick intro. We’ve obviously only scratched the surface, but hopefully this will help you jump into a codebase that excites you and understand what’s going on.

From here I would recommend still looking at the book (it’s really good) or trying out Rust by Example. Have fun!