All About Arrays

20 December 2016

Recently, I wrote up yet another RFC containing a seemingly simple extension to allow for dynamically-sized arrays. This RFC got a lot of discussion (I believe we’ll crack 100 comments soon), some of which was bringing up some great ideas. Now I have enough stuff in my head for three RFCs, but I don’t have the time to write them up right now, so to sort my thoughts, I’m going to try to explain the ideas here.

First up, the RFC suggested adding an alloca![..] macro (or function) that does much the same as the alloca(..) builtin function in C. As I said, easy idea, big implications. First the array has to be created (LLVM can do this for us), the lifetime has to be rigged to be that of the containing function (we can do this by hacking the HIR) and we’re on our merry way. During the discussion, we also gained a way of calling the macro with an iterator to store up what it yields.

However, it was rightfully pointed out that this is a rather hackish solution to the problem. So we all took a step back to tease out the underlying ideas:

Dynamically-Sized Arrays
Lifetime Ascription
Array Generators

So without further ado, let’s look what that means:

Dynamically-Sized Arrays

This is a very simple idea: Arrays in Rust are currently constant-sized. However, it’s feasible to reserve a dynamically calculated number of elements on the stack (not in current Rust, but in C). For Rust’s type system, the resulting values are !Sized (which is to say, unsized) – their size isn’t known at compile time. One needs a way to describe them in code, in HIR and in MIR (LLVM IR already has the alloca operation). I don’t care too much if the syntax on top is [foo; n], alloca(foo, n) or alloca![foo, n], although the first one seems the most elegant.

Lifetime Ascription

Lifetime ascription is the unification of lifetimes and labels. Interestingly, those already share a common syntax, both are denoted by an apostrophe followed by a lowercase name, e.g. 'foo: { .. }. Currently, those labels are ignored when looking up lifetimes, but this could be changed to allow to ascribe the lifetime of a certain block to any value.

This is an awesome idea because a) it make two things that are currently different the same, b) it makes for an awesome teaching device as it allows us to show the lifetimes explicitly and c) it extends the expressive power of our type system, as we can now set up values with lifetimes that were unavailable before. IDEs could even supply refactorings to make lifetimes explicit/implicit where applicable. The only question is why we didn’t do this in the first place?

Implementing this would require us to allow labels on any block, not only on loops. There is already a lint that warns on labels shadowing lifetimes. That lint would need to be cranked up to deny for one version, but as it exists for some time, the fallout should be minimal.

Array Generators

Python programmers will know this as “List comprehensions”, but we all know this is a misnomer, as lists don’t comprehend anything. However, the base idea is to have some syntactic sugar for collecting the results of an iteration into an array or Vec. The latter has the FromIterator::collect() method, but it’s cumbersome to write. Case in point: What would you rather write?

(0usize..10).into_iter().collect::<Vec<_>>()
// or
vec![for 0usize..10]

My money is on the latter.

Update: I’ve been informed by multiple sources that the canonical way is in fact Vec::from_iter(0usize..10), which, while being just a smidgen longer than the proposed macro form, is a really elegant solution. One wonders why there is no From impl for Vec from IntoIterator, but I suspect that this would require higher-kinded types, because the From trait has no way of fixing the item type otherwise (note that this works for all FromIterator implementors).

This should also work for arrays: [for [0usize..10]] in the best case, but if that’s too complex to build, we could add an array![..] macro that does the trick. The benefit is that it would work with any IntoIterator type and make safe array/vec initialization really ergonomic.

For a simple stop-gap implementation, we can use:

macro_rules vek {
    [for $e:expr] => {
        ($e).into_iter().collect::<Vec<_>>()
    };
    //.. the rest just forwards to vec![..]
}

Rolling this into the standard vec! macro is easy. The implementation for arrays (be they sized or unsized) would probably be a bit more complex, but I suspect it’s a barrier we can surmount.

Another Update: This is of course only a very crude approximation and much less capable than python’s comprehensions, which allow filtering (via if) as well as combinatorics (via multiple fors). I still find them rather neat, but they may not fit into Rust at the moment. We’ll see.

One More Update: If we had an unsized array type, we could create a FromExactSizeIterator trait that works much like FromIterator, but requires the iterator know its size in advance.

Those are, in laymen’s terms, the features I’d like to see in Rust in the not-too-long term. What do you think? Weigh in on the discussion at /r/rust or rust-users!