Corner Cutting vs. Productivity
I recently got into a discussion with another very knowledgeable Rustacean, who (I paraphrase) claimed that Rust is about adding just enough roadblocks to keep you from cutting corners. This is a nice metaphor because it explains a lot: Rust may feel more cumbersome, because it won’t let you cut corners. On the other hand, once it compiles, many classes of errors will already have been taken care of, so your code will usually work as expected (or if you’re new to Rust, unexpectedly well).
This theme is pervasive throughout. Not only is the compiler strict, it is also often surprisingly helpful (and you can help make it even better, see issues marked with A-diagnostics!), the docs are so good (and with doctests also useful as tests) that they make you try to emulate their style (confession: I’ve never documented any code as well as my Rust code) and the Code of Conduct keeps you from cutting corners in personal interactions (sure it would be easier to be an ███████, but it will hinder more than help in the long run).
But I think this view implies that Rust is all roadblocks, which isn’t true. So my counter-argument here is that Rust is an experiment to see how productive it can make you without letting you cut any corners. I used to say “Writing Rust may be a little harder, but writing incorrect Rust is a lot harder”. But I (carelessly?) omitted the reference point. Harder than what? Today my conclusion is “harder than cutting corners in other languages”. But even that may underestimate Rust’s design.
Consider for example the ? operator, which together with the Result type creates an easy way to
correctly handle all errors. Now let’s see the alternatives:
- Go will simply return errors, and together with multiple return values this forms a poor man’s monadic error handling, unless for the fact that you can (and may out of lazyness or forgetfullness) cut corners and omit the error handling, with results ranging from benign to disastrous. The worst thing about this error handling style is that the failure mode (what happens if you forget to cross the t or dot the i) is that the code runs as if the error didn’t happen, possibly running into follow-up errors later that will be harder to debug because the original problem has been silently ignored
- Java has unchecked exceptions, which are mostly as bad as Go’s returns but for the slightly
better failure mode of an early exit until the next catch block. While unchecked exceptions were
usually reserved for programmer error and should not be caught at all, this is often done
regardless (and sometimes for good reason), so the failure mode very much depends on the catch
blocks on the call stack. In the worst case the error will be silently discarded (though this is
deemed very bad style), otherwise it will hopefully at least be logged. Java also has checked
exceptions, which must be caught somewhere, and should be used for errors stemming from system or
input (e.g.
IOException). To some this is the worst of both worlds, requiring more ceremony with little benefit. On the other hand, it at least compels the programmer to somehow deal with the errors that may happen. The downside compared to Rust is that it is unclear where in thetry-catchblock the exception may happen - Python’s exceptions are similar to Java’s unchecked exceptions. In current versions you can only
raisethings deriving fromBaseException, earlier versions even allowedraise "Ouch"– Ouch indeed - Lua allows you to call a function with an error handler, which is like a function-wide catch block
- On the non-GC language side, C++ has exception, but they are notoriously hard to work with while avoiding memory leaks and almost-constructed objects. Many programmers will try to avoid them for this exact reason, and I always have a smidgen of doubt if my C++ code is really exception-safe.
- C uses a mixture of return codes (which have the same downsides as with Go) and an archaic
mechanism called
setjmp/longjmp, which allows you to setup a handler and jump into it from arbitrary code. This can be used for basic error handling (and is for example used as the basis for Lua error handling)
In summary, the Rust way is both easier, naturally leads the programmer to do the right thing and conveys more information to boot.
A related example is Rust’s use of Option<_> to represent nullable values. Thus the type system
keeps you from forgetting to deal with the None case while simultaneously giving you combinator
functions (for example the powerful map_or) that make working with those types easy.
- C, C++ and Java have
null, lua hasnil(which is the same thing under a different name). It has the same bad failure mode as ignoring an error. At least in Java, you will get aNullPointerExceptionat the point of dereferencing the null (which may not be the same point as obtaining it). In C or C++ you’ll get a segmentation fault if you’re lucky, and nasal demons otherwise. - Java as of version 8 has an
Optional<T>object, that unfortunately still can benull. There’s a “Project Valhalla” to allow value objects in Java which would fix this. Here’s to hoping we get it soon. On the other hand, a very good static nullness check for Java has been introduced with the FindBugs (now SpotBugs) static analyzer - Go, like lua has
nil, but only for references. Since Go has value objects, those cannot benil
Verdict: null (or its cousin nil) has the effect of making every reference value an Option.
Using references for all objects (as Java does) exacerbates this problem considerably. By making
the Option type explicit, Rust (and other modern languages) greatly reduces the cognitive load it
entails.
Let’s take another example, splitting strings. In Rust, you get a Split object, which is an
Iterator over borrowed immutable string slices. This both avoids allocation for the parts and
lets you deal with the values one by one without needing a container. One can map str::to_owned
to get Strings of the subslices. Easy and performant. The downside is that we need two string
types, &str for borrowed string slices and String for owned strings. So what do other languages
do?
- Java’s
String.split(..)returns an array, which necessitates an allocation and requires either a count of the substrings before building or some kind of reallocation. Worse, until Java 7u6, the substrings reused the internal char array of the original String, which is similar to what Rust does. Unfortunately, Java has no way of limiting the substring lifetime to the lifetime of the original string, so it had to expand the lifetime of the original string instead. Since 7u6, the substrings are instead copied from the original string, requiring a bit less than double the memory (some memory is also needed for allocating the returned array), but letting the GC reclaim the original string earlier even if the substrings are still alive - Python’s
str.splitmethod returns a list of substrings. So they need to allocate both the list and copies of the substrings - Lua’s minuscule standard library has no functions to split strings, but
string.gmatchmight be used to search for the inverse of the split pattern - C has a wildly unsafe, aptly named
strtokfunction (“string tokenizing, remember this is a language from a time when long names were expensive). Apart from being a security nightmare it appears to work - C++ users, redditor CornedBee informs me have the option
of either using C’s
strtok, creating and reading “lines” from anistringstream(if the separator is a single character), writing their own or using some library, e.g. Boost.
Rust gives you a simple way to work with split strings without requiring allocation unless you
really need it. The borrow checker ensures it is all safe, where in other languages you’d have to
make copies left and right to be sure. Using the iterator’s methods is not much harder than working
with the lists or arrays of substring, and lends itself to further optimizations. I’d like to note
that other languages make it basically impossible to safely match Rust’s performance here. In this
case, the information retained in the Split iterator allows the compiler to eke out more
performance while keeping a familiar interface. Simply looping over the subslices will look mostly
the same as in other languages, and you only need a modicum of additional complexity to get
Strings (.map(str::to_owned)) or a collection (.collect(), though you may need to specify the
type of collection via “turbofish”: .collect::<Vec<_>>()).
Speaking of loops, Rust has loop, while, while let and for loops. While loop may seem
unnecessary, in fact all other loops are desugared into some loop + match combination. for
loops work with the IntoIterator and Iterator traits to compile down into well-performing
code.
- C and C++ have a
for (<initializer>;<test>;<increment>)… loop and awhile (<test>)… loop (also a version with a test after the loop body,do {…} while(<test>);). Theforloop is infamously sometimes misused for strange iteration patterns which may make the code hard to understand. The block brackets{…}are optional for all but the do-while loop, however this can lead people to misread code. C++11 also has afor(<binding> : <iterator>)loop similar to Rust’s - Java has both a C-style
forandwhileloops, as well as afor(<binding> : <iterator>)loop (spoken for-each loop) introduced in 1.5 that works with arrays andIterables. The latter is now the preferred form to iterate over collections of objects. Since Java 8, there are alsoStreams which almost work similar to Rust’sIterators, except of course they don’t allow as much control, miss some functions and need to allocate to be safe - Python has a
forloop that works with its own iterator protocol which is similar to Rust’s, however it uses aStopIterationexception to break the loop instead of anOptiontype. Python also haswhileloops. Since blocks are marked by indentation, it has less risk of misreading the code, however it makes refactoring more cumbersome, and you run the risk of losing track in larger swathes of code. - Lua also has a
forloop working with an internal iteration protocol similar to Python’s, but returning a sentinelnilvalue to end the iteration.
Rust loops are as simple as high-level Python’s, but offer far better performance. In addition,
loop { .. } arguably captures the intent of an unbounded loop better than while (true) { .. }.
My last example is destructuring pattern matching. Though it adds a lot of complexity to the
language, the resulting simplicity in the code is well worth it. Take a simple enum with a match
and you will both get the correct internals for each variant and be assured by the compiler that
you handled every possible variant.
- C and Java only have a
switchstatement that works on plain numeric values orenums (which may not contain additional data. You could use a tag enum + union idiom in C and would likely use a polymorphic type in Java if you need to store data. The former is crude and won’t give you any assurance that you used the right variant. The latter can get quite unwieldy and won’t give you the same assurances either. As for theswitchstatement, both the fact that it is a statement instead of an expression and that it allows fallthrough (you need tobreakfrom eachcase) exacerbates the ergonomics. - C++ has learned from this and now offers
std::varianttypes, which are still not used widely and I hear they have their own downsides (they don’t have destructuring, for one) - Python and Lua simply don’t have such statements. You need a wall of ifs or a dict/table to do something like matching, and it still won’t give you safe access to the data. In Python, you could also use Java-like class polymorphism, but that doesn’t get you better ergonomics either.
Short story short, Rust’s enums + destructuring matches offer great productivity while doing away with footguns other languages may allow or even prescribe. Here the language pays with a bunch of complexity to make your code easier to both write and read (at least to the initiated).
I hope those examples have convinced you that disallowing cut corners doesn’t need to harm productivity even in the short run. In the long run, being diligent obviously wins.