Corner Cutting vs. Productivity
I recently got into a discussion with another very knowledgeable Rustacean, who (I paraphrase) claimed that Rust is about adding just enough roadblocks to keep you from cutting corners. This is a nice metaphor because it explains a lot: Rust may feel more cumbersome, because it won’t let you cut corners. On the other hand, once it compiles, many classes of errors will already have been taken care of, so your code will usually work as expected (or if you’re new to Rust, unexpectedly well).
This theme is pervasive throughout. Not only is the compiler strict, it is also often surprisingly helpful (and you can help make it even better, see issues marked with A-diagnostics!), the docs are so good (and with doctests also useful as tests) that they make you try to emulate their style (confession: I’ve never documented any code as well as my Rust code) and the Code of Conduct keeps you from cutting corners in personal interactions (sure it would be easier to be an ███████, but it will hinder more than help in the long run).
But I think this view implies that Rust is all roadblocks, which isn’t true. So my counter-argument here is that Rust is an experiment to see how productive it can make you without letting you cut any corners. I used to say “Writing Rust may be a little harder, but writing incorrect Rust is a lot harder”. But I (carelessly?) omitted the reference point. Harder than what? Today my conclusion is “harder than cutting corners in other languages”. But even that may underestimate Rust’s design.
Consider for example the ?
operator, which together with the Result
type creates an easy way to
correctly handle all errors. Now let’s see the alternatives:
- Go will simply return errors, and together with multiple return values this forms a poor man’s monadic error handling, unless for the fact that you can (and may out of lazyness or forgetfullness) cut corners and omit the error handling, with results ranging from benign to disastrous. The worst thing about this error handling style is that the failure mode (what happens if you forget to cross the t or dot the i) is that the code runs as if the error didn’t happen, possibly running into follow-up errors later that will be harder to debug because the original problem has been silently ignored
- Java has unchecked exceptions, which are mostly as bad as Go’s returns but for the slightly
better failure mode of an early exit until the next catch block. While unchecked exceptions were
usually reserved for programmer error and should not be caught at all, this is often done
regardless (and sometimes for good reason), so the failure mode very much depends on the catch
blocks on the call stack. In the worst case the error will be silently discarded (though this is
deemed very bad style), otherwise it will hopefully at least be logged. Java also has checked
exceptions, which must be caught somewhere, and should be used for errors stemming from system or
input (e.g.
IOException
). To some this is the worst of both worlds, requiring more ceremony with little benefit. On the other hand, it at least compels the programmer to somehow deal with the errors that may happen. The downside compared to Rust is that it is unclear where in thetry
-catch
block the exception may happen - Python’s exceptions are similar to Java’s unchecked exceptions. In current versions you can only
raise
things deriving fromBaseException
, earlier versions even allowedraise "Ouch"
– Ouch indeed - Lua allows you to call a function with an error handler, which is like a function-wide catch block
- On the non-GC language side, C++ has exception, but they are notoriously hard to work with while avoiding memory leaks and almost-constructed objects. Many programmers will try to avoid them for this exact reason, and I always have a smidgen of doubt if my C++ code is really exception-safe.
- C uses a mixture of return codes (which have the same downsides as with Go) and an archaic
mechanism called
setjmp
/longjmp
, which allows you to setup a handler and jump into it from arbitrary code. This can be used for basic error handling (and is for example used as the basis for Lua error handling)
In summary, the Rust way is both easier, naturally leads the programmer to do the right thing and conveys more information to boot.
A related example is Rust’s use of Option<_>
to represent nullable values. Thus the type system
keeps you from forgetting to deal with the None
case while simultaneously giving you combinator
functions (for example the powerful map_or
) that make working with those types easy.
- C, C++ and Java have
null
, lua hasnil
(which is the same thing under a different name). It has the same bad failure mode as ignoring an error. At least in Java, you will get aNullPointerException
at the point of dereferencing the null (which may not be the same point as obtaining it). In C or C++ you’ll get a segmentation fault if you’re lucky, and nasal demons otherwise. - Java as of version 8 has an
Optional<T>
object, that unfortunately still can benull
. There’s a “Project Valhalla” to allow value objects in Java which would fix this. Here’s to hoping we get it soon. On the other hand, a very good static nullness check for Java has been introduced with the FindBugs (now SpotBugs) static analyzer - Go, like lua has
nil
, but only for references. Since Go has value objects, those cannot benil
Verdict: null
(or its cousin nil
) has the effect of making every reference value an Option
.
Using references for all objects (as Java does) exacerbates this problem considerably. By making
the Option
type explicit, Rust (and other modern languages) greatly reduces the cognitive load it
entails.
Let’s take another example, splitting strings. In Rust, you get a Split
object, which is an
Iterator
over borrowed immutable string slices. This both avoids allocation for the parts and
lets you deal with the values one by one without needing a container. One can map str::to_owned
to get String
s of the subslices. Easy and performant. The downside is that we need two string
types, &str
for borrowed string slices and String
for owned strings. So what do other languages
do?
- Java’s
String.split(..)
returns an array, which necessitates an allocation and requires either a count of the substrings before building or some kind of reallocation. Worse, until Java 7u6, the substrings reused the internal char array of the original String, which is similar to what Rust does. Unfortunately, Java has no way of limiting the substring lifetime to the lifetime of the original string, so it had to expand the lifetime of the original string instead. Since 7u6, the substrings are instead copied from the original string, requiring a bit less than double the memory (some memory is also needed for allocating the returned array), but letting the GC reclaim the original string earlier even if the substrings are still alive - Python’s
str.split
method returns a list of substrings. So they need to allocate both the list and copies of the substrings - Lua’s minuscule standard library has no functions to split strings, but
string.gmatch
might be used to search for the inverse of the split pattern - C has a wildly unsafe, aptly named
strtok
function (“string tokenizing, remember this is a language from a time when long names were expensive). Apart from being a security nightmare it appears to work - C++ users, redditor CornedBee informs me have the option
of either using C’s
strtok
, creating and reading “lines” from anistringstream
(if the separator is a single character), writing their own or using some library, e.g. Boost.
Rust gives you a simple way to work with split strings without requiring allocation unless you
really need it. The borrow checker ensures it is all safe, where in other languages you’d have to
make copies left and right to be sure. Using the iterator’s methods is not much harder than working
with the lists or arrays of substring, and lends itself to further optimizations. I’d like to note
that other languages make it basically impossible to safely match Rust’s performance here. In this
case, the information retained in the Split
iterator allows the compiler to eke out more
performance while keeping a familiar interface. Simply looping over the subslices will look mostly
the same as in other languages, and you only need a modicum of additional complexity to get
String
s (.
map
(
str::to_owned
)
) or a collection (.collect()
, though you may need to specify the
type of collection via “turbofish”: .collect::<Vec<_>>()
).
Speaking of loops, Rust has loop
, while
, while let
and for
loops. While loop
may seem
unnecessary, in fact all other loops are desugared into some loop
+ match
combination. for
loops work with the IntoIterator
and Iterator
traits to compile down into well-performing
code.
- C and C++ have a
for (<initializer>;<test>;<increment>)
… loop and awhile (<test>)
… loop (also a version with a test after the loop body,do {
…} while(<test>);
). Thefor
loop is infamously sometimes misused for strange iteration patterns which may make the code hard to understand. The block brackets{
…}
are optional for all but the do-while loop, however this can lead people to misread code. C++11 also has afor(<binding> : <iterator>)
loop similar to Rust’s - Java has both a C-style
for
andwhile
loops, as well as afor(<binding> : <iterator>)
loop (spoken for-each loop) introduced in 1.5 that works with arrays andIterable
s. The latter is now the preferred form to iterate over collections of objects. Since Java 8, there are alsoStream
s which almost work similar to Rust’sIterator
s, except of course they don’t allow as much control, miss some functions and need to allocate to be safe - Python has a
for
loop that works with its own iterator protocol which is similar to Rust’s, however it uses aStopIteration
exception to break the loop instead of anOption
type. Python also haswhile
loops. Since blocks are marked by indentation, it has less risk of misreading the code, however it makes refactoring more cumbersome, and you run the risk of losing track in larger swathes of code. - Lua also has a
for
loop working with an internal iteration protocol similar to Python’s, but returning a sentinelnil
value to end the iteration.
Rust loops are as simple as high-level Python’s, but offer far better performance. In addition,
loop { .. }
arguably captures the intent of an unbounded loop better than while (true) { .. }
.
My last example is destructuring pattern matching. Though it adds a lot of complexity to the
language, the resulting simplicity in the code is well worth it. Take a simple enum with a match
and you will both get the correct internals for each variant and be assured by the compiler that
you handled every possible variant.
- C and Java only have a
switch
statement that works on plain numeric values orenum
s (which may not contain additional data. You could use a tag enum + union idiom in C and would likely use a polymorphic type in Java if you need to store data. The former is crude and won’t give you any assurance that you used the right variant. The latter can get quite unwieldy and won’t give you the same assurances either. As for theswitch
statement, both the fact that it is a statement instead of an expression and that it allows fallthrough (you need tobreak
from eachcase
) exacerbates the ergonomics. - C++ has learned from this and now offers
std::variant
types, which are still not used widely and I hear they have their own downsides (they don’t have destructuring, for one) - Python and Lua simply don’t have such statements. You need a wall of ifs or a dict/table to do something like matching, and it still won’t give you safe access to the data. In Python, you could also use Java-like class polymorphism, but that doesn’t get you better ergonomics either.
Short story short, Rust’s enums + destructuring matches offer great productivity while doing away with footguns other languages may allow or even prescribe. Here the language pays with a bunch of complexity to make your code easier to both write and read (at least to the initiated).
I hope those examples have convinced you that disallowing cut corners doesn’t need to harm productivity even in the short run. In the long run, being diligent obviously wins.