Map of a Lifetime
Today let’s look on a small thing. Let’s say (for the sake of simplicity, the actual algorithm comes up more often than you’d think) you want to do some contest, so you need to select any combination of two teams out of your list. To make it even more simple, let’s ignore teams and just focus on indices. You might write:
const TEAMS : usize = 6;
fn main() {
for (a, b) in (0..TEAMS).flat_map(|a| (0..a).map(|b| (a, b))) {
compete(a, b)
}
}
However, Rust does not like this:
error: `a` does not live long enough --> combine.rs:4:59 | 6 | for (a, b) in (0..TEAMS).flat_map(|a| (0..a).map(|b| (a, b))) { | --- ^ - borrowed value only lives until here | | | | | does not live long enough | capture occurs here 7 | compete(a, b) 8 | } | - borrowed value needs to live until here error: aborting due to previous error
So, what does this tell us? As you can imagine, I was a bit stumped. Isn’t
usize
Copy
? Why is it borrowing at all? Taking a step back, the capture
(the first span in the error message) appears to constitute a borrow, which the
borrow checker cannot work out.
So what to do? We don’t need to borrow, and I just remembered that closures
can move. So we change the map
closure to move |b| (a, b)
and borrowck is
happy.
However, is this really a good way to do this? Let’s benchmark against a naive double loop (I’m going to use nightly’s test feature so I don’t need to setup a crate):
#![feature(test)]
extern crate test;
use test::{Bencher, black_box};
const TEAMS : usize = 6;
#[bench]
fn bench_flat_map(b: &mut Bencher) {
b.iter(||
for (a, b) in (0..TEAMS).flat_map(|a| (0..a).map(move |b| (a, b))) {
black_box((a, b));
}
);
}
#[bench]
fn bench_nested_loops(b: &mut Bencher) {
b.iter(||
for a in 1..TEAMS {
for b in 0..a {
black_box((a, b));
}
}
);
}
Let’s compile and run it:
$ rustc --test -C opt_level=3 -C lto -C target-cpu=core2 compete.rs $ ./compete --bench running 2 tests test bench_flat_map ... bench: 64 ns/iter (+/- 0) test bench_nested_loops ... bench: 6 ns/iter (+/- 0) test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
Ouch. More than 10 times the execution time of a naive nested loop. Well, we
all know that it pays to keep code simple. Even with a larger number of TEAMS
we see the flat_map
version lagging behind (though it’s catching up
eventually). Looking into the assembly shows that the nested loops were fully
unrolled whereas the flat_map
version was left as an exercise to a MIR
optimizer. Perhaps this should be seen as an LLVM bug.
On the upside, the flat_map
version is easier to abstract over than the
nested loops (although using a higher-order function would make perfect sense
here).