Attention! Span

28 June 2016

As I continue to learn different parts of Rust, I begin to gain more appreciation for the tricky questions that come up when you combine certain features. Often, things that seem trivially easy from one side turn out to be quite complex when looked at from a different angle. One of those things is the unassuming Span.

Those are used throughout the higher-level representations (that includes parsing, syntax desugaring, macro expansion, lowering to HIR, linting, type checking and error reporting) to locate the responsible piece of code, e.g. for errors, but also for suggestions.

Spans have a start position, an end position and an expansion ID, which can be used to look up an expansion information (ExpnInfo) object that records syntax desugarings and macro expansions, including a parent Span from which it was expanded. Or so the theory goes.

In practice this hinges on the question if macros (yes, that includes procedural macros) do the right thing when expanding. Unfortunately, this is not as easy as it sounds. For example, serde at one place failed to set correct spans, causing clippy to report warnings for things people didn’t even write.

When we try to fetch code by span, we usually want to have the original code as written by the developer, though the code the compiler sees may be something completely different. So this state of affairs fails lint writers (and in turn people who want to use the output of such lints, e.g. IDEs). It also fails procedural macro writers, because there is zero documentation on how to deal with spans. A recent StackOverflow question of mine was unanswered until I posted my findings which I detail here.

As far as I can see, the MacroExpander (this is the type responsible for taking you macro_rules! definitions and applying them unto your code) does the right thing™, but it does so in a very subtle manner. It uses a Marker type that runs through the AST and sets the expansion IDs to the most recent expansion.

This is however not done for procedural macros, which are free to set their own spans everywhere. As I wrote above, there is little information on what the span for things should be. Let’s take an example: A macro that changes a + b to a.checked_add(b). The a and b should retain their original spans, as they haven’t been changed. We still need a new span for the checked_add path expression and one for the method call.

I would assume the right thing would be for the method name path to be declared an expansion of the + binary operator (which neatly has a span) and the outer method call expression to be an expansion of the binary operation expression. The expansion info should note the attribute that asked for the expansion.

So if cx is the ExtCtxt, expr is the original expression and op is the + operator, the outer span should be Span { expn_id: cx.backtrace(), ..expr.span } and the path span should be Span { expn_id: cx.backtrace(), ..op.span }. If that’s it, the only remaining questions are if ExtCtxt should have a method to handle those spans and if we want to call anyone writing procedural macros like this a spanhandler…

Did I get this right? If not, please correct me on /r/rust or rust-users. Otherwise, feel free to go there and discuss your learnings around spans.