How Rust Makes Error Handling Part of the Language
In Spanish these are all “dedos,” while in English
we can distinguish between fingers and toes.
Learning a foreign language can be an incredible experience, and not only because you can talk to new people, visit new countries, and read new books. When you learn the words someone from a different culture uses, you start to see things from their perspective. You understand the way they think a bit more.
The same is true for programming languages. Learning the syntax, keywords and patterns of a new programming language enables you to think about problems from a different perspective. You learn to solve problems in a different way.
I’ve been studying Rust recently, a new programming language for me. As a Ruby developer, I was curious to learn how Rust developers approach solving problems. What do Rust programs look like? What new words would I learn?
Why Rust Was Difficult For Me
I knew it would be a challenge to learn Rust. I had heard horror stories about how difficult the Rust compiler can be to use, or about how confusing the ownership memory model and the borrow checker can be. And I was right: Rust is a very difficult language to learn. But not because of move semantics or memory management.
For me, the most challenging syntax in Rust had to do with simple error handling. Let’s take an example: opening and reading a text file. In Ruby, this is a one-liner and error handling is completely optional:
string = File.read("foo.txt")
In Ruby, File.read returns a simple string. Will this ever return an error? Who knows. Maybe Ruby will raise an exception, maybe not. I don’t have to worry about that at the call site when I’m writing the code. I can focus on the happy path, but I end up with a program that can’t handle errors.
Golang, at least, returns an error value explicitly when I try to read a file:
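    b, err := ioutil.ReadFile("foo.txt")
    if err != nil {
        fmt.Print(err)
    } else {
        // Convert the raw bytes to a string and use it;
        // Go rejects variables that are declared but never used.
        str := string(b)
        fmt.Print(str)
    }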
Here the Golang ioutil.ReadFile function returns two values: the file’s contents as a byte slice, which I convert to a string, and also an error value. The Go compiler forces me to think about errors that might occur, at least for a moment. But error handling is still optional. I can simply choose to ignore the err value entirely, for example by assigning it to _. Most C programs work in a similar fashion, returning an error code in some manner. And if I do choose to handle the error, I end up with verbose, messy code that checks for error codes over and over again.
In Rust, error handling is mandatory. Let’s try to rewrite the same example using Rust:
    let mut file = File::open("foo.txt");
    let mut contents = String::new();
    file.read_to_string(&mut contents);
Right away I run into trouble when I try to compile this:
error[E0599]: no method named `read_to_string` found for type `std::result::Result<std::fs::File, std::io::Error>` in the current scope
What? What is the Rust compiler talking about? I can see there’s a read_to_string method on the File struct right in the documentation! (Actually the method is on the Read trait which File implements.) The problem is the File::open function doesn’t return a file at all. It returns a value of type io::Result<File>:
pub fn open<P: AsRef<Path>>(path: P) -> io::Result<File>
How do I use this? What does io::Result<File> even mean? When I try to write Rust code the way I write Ruby or Go code, I get cryptic errors and it doesn’t work.
The problem is I’m trying to speak Rust the same way I speak in Ruby. Rust is a foreign language; I need to learn some vocabulary before I can try to talk to someone. This is why Rust is difficult to learn. It’s a foreign language that uses many words completely unfamiliar to most developers.
Types Are the Vocabulary of Programming Languages
My wife is Spanish, and lucky for me she’s had the patience and the endurance to teach me and our kids Spanish over the years. As a native English speaker, it always seemed curious and amusing to me that Spanish has only one word for fingers and toes, dedos. Don’t people in Spain or Latin America ever need to talk about only fingers and not toes? Or vice-versa? And in Spain I invariably end up saying silly things like dedos altos (“upper fingers”), or dedos bajos (“lower fingers”). I always worry about which digits I’m talking about. Somehow, though, the Spanish never have any trouble with this; where the dedos are located always seems obvious to them from the context.
But I wonder: Do Spanish speakers have trouble learning English when it comes to fingers vs. toes? Do they ever say finger when they mean toe? The problem is not just learning a new word. You have to learn the meaning behind the word. English has a concept, a distinction, that Spanish doesn’t.
Back to computer programming, the “words” we use in programming languages aren’t only syntax tokens like if, else, let, etc. They are the values that we pass around in our programs. And those values have types, even for loosely, dynamically typed languages like Ruby.
Aside from whatever formal definition Computer Science has for types, I simply think of a value’s type as its meaning or purpose. To understand what role a value plays in your program, you need to understand the concept behind its type. Just like the words finger and toe represent certain anatomical concepts in English, types like Result<T, E> or Option<T> represent programming concepts in Rust, concepts that foreigners need to learn for the first time.
Language shapes the way we think, and determines what we can think about.
-- Benjamin Lee Whorf
In fact, some linguists take this to the extreme, claiming that a language’s words determine what people in that community are able to think and talk about, and what concepts they can understand. (However, most modern linguists, according to Wikipedia, don’t believe this is actually true.)
Because Rust includes the Result type, Rust programmers are empowered to talk about error handling in a very natural way. It’s part of their daily vocabulary. Of course, native Spanish speakers, I’m guessing, have no trouble understanding the distinction between fingers and toes. But I certainly have trouble understanding the concept behind Result in Rust.
If Rust is Spanish, then Haskell is Latin
So what does Result<T, E> mean? What is a value of type Result<T, E>?
Just as human languages borrow words from other languages (many Spanish words are taken from Latin or Arabic, while English borrowed many words from French and German), programming languages borrow words and concepts from other, older programming languages.
Rust borrowed the concept behind the Result<T, E> type from Haskell, a strongly typed functional programming language. Haskell includes a type called Either:
data Either a b = Left a | Right b
This syntax seems bizarre at first glance but in fact it’s simple. Haskell makes it easy to create new types by combining other types together. This line of code means the Either type is a combination of two other types: a and b. Drawing that type equation, this is how I visualize Haskell Either values:
A single Either value can only encapsulate either a value of type a or a value of type b:
- If the Either value is Left, then it contains an inner value of type a. This is written: Left a
- If the Either value is Right, then it contains an inner value of type b. This is written: Right b
The Either type is also a “monad,” because Haskell provides certain functions that create and operate on Either values. I won’t cover this concept here today, but when I have time I’ll discuss monads and how they can be applied to error handling in a future post.
In Haskell, the Either type is completely general, and you can use it to represent any programming concept you would like. Rust uses the concept behind Either for a specific purpose: to implement error handling. If Haskell is Latin, then Rust is Spanish, a younger language that borrows some of the older language’s vocabulary and grammar.
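To make the borrowing concrete, here is a minimal sketch of what Haskell’s Either might look like if it were written as a Rust enum. This is an illustration only: the Rust standard library does not ship an Either type, and the enum and the describe function below are hypothetical names I made up for this example.

```rust
// A hand-rolled Rust analogue of Haskell's `data Either a b = Left a | Right b`.
// Hypothetical: the standard library has no Either type.
#[derive(Debug)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

// Report which side of the Either is populated, and with what value.
fn describe(value: &Either<String, i32>) -> String {
    match value {
        Either::Left(text) => format!("Left holds a String: {}", text),
        Either::Right(number) => format!("Right holds an i32: {}", number),
    }
}

fn main() {
    let left: Either<String, i32> = Either::Left(String::from("oops"));
    let right: Either<String, i32> = Either::Right(42);
    println!("{}", describe(&left));
    println!("{}", describe(&right));
}
```

A single value is always exactly one of the two variants, and match forces the code to say what happens in each case.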
Result<T, E> in Rust
In Rust, the Result type encapsulates two other types like Either. A single Result value has either one of those types or the other:
Instead of Left a and Right b like in Haskell, Rust uses the words Ok(T) and Err(E):
- If the Result value is Ok, then it contains an inner value of type T. This is written: Ok(T). Ok(T) means some operation was successful, and the result of the operation is a value of type T.
- If the Result value is Err, then it contains an inner value of type E. This is written: Err(E). Similarly, this means the operation was a failure, and the result of the operation is an error of type E.
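Before going back to files, here is a small sketch of a Result value in isolation. It uses str::parse from the standard library, which returns a Result; the parse_age and describe functions are hypothetical helpers invented for this example, with T as u32 and E as ParseIntError.

```rust
use std::num::ParseIntError;

// The standard library defines Result roughly like this:
//
//     enum Result<T, E> {
//         Ok(T),
//         Err(E),
//     }

// Hypothetical helper: parsing may fail, so the return type says so.
// Here T = u32 and E = ParseIntError.
fn parse_age(input: &str) -> Result<u32, ParseIntError> {
    input.trim().parse::<u32>()
}

// Handle both variants and turn either one into a message.
fn describe(input: &str) -> String {
    match parse_age(input) {
        Ok(age) => format!("parsed {}", age),
        Err(e) => format!("could not parse {:?}: {}", input, e),
    }
}

fn main() {
    println!("{}", describe("42"));
    println!("{}", describe("forty-two"));
}
```

The caller cannot get at the u32 without first deciding what to do about the Err case.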
Back to my open file example, the proper way to open a file and read it using Rust is to check the Result values returned by the Rust standard library functions:
    use std::fs::File;

    fn main() {
        // File::open returns a Result, so check both cases.
        let file = File::open("foo.txt");
        match file {
            Ok(file) => println!("I have a file: {:?}", file),
            Err(e) => println!("There was an error: {}", e),
        }
    }
And if I want to actually read in the contents of that file, I would check that return value also:
    use std::fs::File;
    use std::io::Read; // brings read_to_string into scope

    fn main() {
        let file = File::open("foo.txt");
        match file {
            Ok(mut file) => {
                let mut contents = String::new();
                // read_to_string also returns a Result, so match again.
                match file.read_to_string(&mut contents) {
                    Ok(_) => println!("The file's contents are: {}", contents),
                    Err(e) => println!("There was an error: {}", e),
                }
            }
            Err(e) => println!("There was an error: {}", e),
        }
    }
The ? Operator In Rust
That last code snippet is quite a mouthful - error checking with Rust is even more tedious and verbose than it is using Go!
Fortunately, Rust includes an operator that allows Rust programmers to abbreviate all of this logic. When I append the ? character to a call that returns a Result<T, E> value, Rust automatically generates code that checks the Result<T, E> value and evaluates to the underlying T value if the result is Ok(T):
    use std::fs::File;
    use std::io::Read;

    fn main() {
        let mut file = File::open("foo.txt")?;
        let mut contents = String::new();
        file.read_to_string(&mut contents)?;
    }
Here, the use of ? after File::open("foo.txt") tells the Rust compiler to check the return value of File::open for me automatically:
If the return value of File::open is Ok(T), then Rust assigns the inner T value to file. If File::open returns Err(E), then Rust jumps to the end of the main function immediately and returns.
The program above is much more concise and easy to understand. The only problem is that it doesn’t work! When I try to compile this, I get:
    error[E0277]: the `?` operator can only be used in a function that returns `Result` or `Option` (or another type that implements `std::ops::Try`)
     --> src/main.rs:5:20
      |
    5 |     let mut file = File::open("foo.txt")?;
      |                    ^^^^^^^^^^^^^^^^^^^^^^ cannot use the `?` operator in a function that returns `()`
      |
      = help: the trait `std::ops::Try` is not implemented for `()`
      = note: required by `std::ops::Try::from_error`
Rust Programs Revolve Around Error Handling
As the error message says, the ? operator generates code that returns early from the enclosing function, here main, with the Err(E) value, where E is of type std::io::Error. But I haven’t declared a return type for main, so it implicitly returns (). Therefore the Rust compiler gives me an error:
the `?` operator can only be used in a function that returns `Result` or `Option` (or another type that implements `std::ops::Try`)
The function containing the ? operator has to return a value of type Result<T, E> with a compatible E type for this to make sense. One way to get there is to extract my File calls into a separate function, like this:
    use std::fs::File;
    use std::io::Read;

    // Returns the file's contents, or the first I/O error encountered.
    fn read() -> Result<String, std::io::Error> {
        let mut file = File::open("foo.txt")?;
        let mut contents = String::new();
        file.read_to_string(&mut contents)?;
        Ok(contents)
    }

    fn main() {
        match read() {
            Ok(contents) => println!("{}", contents),
            Err(e) => println!("{:?}", e),
        }
    }
Note that the new read() function above returns a value of type Result<String, std::io::Error>. This allows the use of the ? operator to compile properly. For the happy path, if my code is able to find the “foo.txt” file and read it, then read() returns Ok(contents). However, if there’s an error, read() will return Err(e), where e is a value of type std::io::Error. Note that File::open returns the same error type that read() does: io::Result<File> is shorthand for Result<File, std::io::Error>.
This is where Rust shines. It allows for concise and readable error handling that is also thorough and correct. The Rust compiler checks for error handling completeness at compile time, before I ever run my program.
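To appreciate how much the ? operator saves, here is a rough hand-written equivalent of a function that opens and reads a file. Treat it as an approximation rather than the compiler’s exact expansion. The function name read_by_hand and its path parameter are hypothetical, added so the sketch can be tried with any file.

```rust
use std::fs::File;
use std::io::Read;

// Roughly what two uses of `?` look like when written out by hand.
// Hypothetical name and parameter, for illustration only.
fn read_by_hand(path: &str) -> Result<String, std::io::Error> {
    let mut file = match File::open(path) {
        Ok(file) => file,
        // On failure, return early from the function. `?` also passes the
        // error through From::from so it can be converted into the
        // function's declared error type.
        Err(e) => return Err(From::from(e)),
    };
    let mut contents = String::new();
    match file.read_to_string(&mut contents) {
        Ok(_) => {}
        Err(e) => return Err(From::from(e)),
    }
    Ok(contents)
}

fn main() {
    match read_by_hand("foo.txt") {
        Ok(contents) => println!("{}", contents),
        Err(e) => println!("{:?}", e),
    }
}
```

Each match arm that returns early is what a single ? character stands in for.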
Now that I’ve learned some vocabulary words, now that I can understand how native Rust speakers use the word Result<T, E>, I can have a Rust conversation about error handling. I can begin to think like Rust developers think. I can start to see things from their perspective.
And I begin to realize that Rust programs tend to be designed with error handling in mind. Notice above how I had to extract a separate function that returned a value of type Result<T, E>, just because of the ? operator. The overall structure of my program is determined by error handling just as much as it’s determined by the nature of the task I’m trying to accomplish. Rust programmers think about errors and what might go wrong from the very beginning, from when they start writing code. To be honest, I've often thought about errors and what might go wrong as an afterthought, after I've written and deployed my code.
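One footnote on extracting a separate function: as far as I know, since Rust 1.26 the main function itself is allowed to return a Result, which lets ? appear directly inside main. The sketch below assumes that behavior. So that it runs anywhere, it first writes a small file into the system temp directory; that path and the file’s contents are made up for this example.

```rust
use std::fs::File;
use std::io::Read;

// main declares a Result return type, so `?` is allowed inside it.
// If any step fails, main returns the Err and the process exits
// with a non-zero status.
fn main() -> Result<(), std::io::Error> {
    // Hypothetical setup: create a file to read so the example is self-contained.
    let path = std::env::temp_dir().join("foo.txt");
    std::fs::write(&path, "hello from foo.txt\n")?;

    let mut file = File::open(&path)?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    print!("{}", contents);
    Ok(())
}
```

Even in this form, error handling shapes the program: the signature of main has to change before the body will compile.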