Directly unit-testing Parse impls to avoid unneeded UI tests

sam0x17 · November 17, 2023, 5:28am

This is my last week at Parity, but I would like to take some time to evangelize a new paradigm I set up as part of my #[pallet::tasks] work in Tasks: general system for recognizing and executing service work by sam0x17 · Pull Request #1343 · paritytech/polkadot-sdk · GitHub for testing pallets (or really, testing anything in the syn ecosystem that implements Parse).

Parsing is Fallible, Expansion should be Infallible

Syn actually follows a very strict structure where the vast majority of compile errors are supposed to happen exclusively at parsing time, which is why the Parse trait’s parse method returns a syn::Result, but the quote::ToTokens trait’s to_tokens method, which is supposed to be responsible for the expansion of any parseable type, is infallible.

This is not by mistake, David Tolnay’s intention here is that compile errors, except for those that arise from post-expansion code disagreeing with other user code in some way (which should be emitted exclusively by rustc itself), should all be detected at parsing time and handled there.

This means that all custom errors, i.e. those that are not automatically thrown by the type system or rustc itself but are instead manually thrown using syn::Error should happen exclusively at parsing time.

If you take a stroll through substrate/frame/support/procedural, you will see that we violate this tenet frequently. Look in the expand directory for TryFrom impls and Result-returning functions with names like expand_ prepended to their name, and you will see what I am talking about. In pretty much every one of these scenarios, we are detecting some parsing error late in the process and issuing it during expansion time.

This by the way is the main reason we get those huge cascades of 500 errors resulting from a single parse error sometimes when building FRAME, because we are emitting a compile error during expansion that should have been emitted during parsing, before any new tokens have be emitted that could lead to a cascade of additional compile-time errors.

Now obviously given our reliance on the outer macro pattern and in particular aggregated enums and some very fancy macro_magic-esque “spooky token teleporting at a distance” functionality, sometimes there are legitimate post-parsing scenarios where we do have to issue an error or two, but these should be the extremely rare hacky exception rather than the rule, and often there are workarounds that can hoist these errors forward to the parsing stage, as I found when I was implementing #[pallet::tasks]. It takes a little bit of extra effort sometimes, but the pay-off is enormous.

In a perfect world, we’d have one folder called parse, filled with syn::Parse impls, and another folder called expand, filled with ToTokens impls, and that would the full story for frame support procedural.

Now at this point, you’re probably wondering why I think this distinction is so important. Let me explain.

Expansions are hard to test

The output of expansion is a TokenStream2, and without actually compiling that token stream, it is either extremely difficult or actually impossible to know whether any errors were encountered during the expansion process (again, expansion should be infallible, so there should be exactly zero errors in producing the token stream).

Legitimate post-expansion errors, again, include things where user-provided code elsewhere in the file conflicts with something in our macro expansion. TryBuild (what we use for all of our UI tests) lets us simulate building a mini rust project with arbitrary code and asserting that a particular compiler error is thrown when we try to build it. It is extremely slow and rudimentary and is intended for precisely these scenarios where we need to make assertions about rustc’s errors that are emitted when user code conflicts with our expanded code in some way that is only evident post-expansion (remember, this is also a dtolnay crate).

TryBuild wastes a ton of time compiling an isolated project, so this is a lot of extra work to if all we get to show for it is an assertion about whether some Parse impl threw some particular error or not. It turns out these types of assertions are entirely in the purview of regular unit testing.

Unit testing of Parse errors

TryBuild is not the proper tool for dealing with mere parsing errors, which are 100% detectable and testable at the point where parse returns an Err(..).

One of the reasons people often reach for TryBuild to test things like this is you can directly test the exact error message (and the Span, but more on this later), but this is easy to do in a unit test if you take the effort to build a few helpers to make it easy:

#[macro_export]
macro_rules! assert_error_matches {
	($expr:expr, $reg:literal) => {
		match $expr {
			Ok(_) => panic!("Expected an `Error(..)`, but got Ok(..)"),
			Err(e) => {
				let error_message = e.to_string();
				let re = regex::Regex::new($reg).expect("Invalid regex pattern");
				assert!(
					re.is_match(&error_message),
					"Error message \"{}\" does not match the pattern \"{}\"",
					error_message,
					$reg
				);
			},
		}
	};
}

Now instead of being limited to just asserting things like:

assert!(parse2::<CustomThing>(quote!(3 + 7)).is_err());`

we get the full fine-grained ability to assert both that the expression results in an Err(..) variant and that this Err(..) variant’s string representation matches our specified regular expression:

assert_error_matches!(parse2::<CustomThing>(quote!(3 + 7)), "expected ident");

Consider the following CustomThing from above that implements Parse (we are using the derive-syn-parse crate for a quick and easy Parse impl):

#[derive(Parse)]
struct CustomThing {
    ident1: Ident,
    comma: Token![,],
    ident2: Ident,
}

We can now write a quick and easy unit test fully covering the parsing edge cases for CustomThing:

#[test]
fn test_parse_custom_thing() {
    assert!(parse2::<CustomThing>(quote!(some, thing)).is_ok());
    assert!(parse2:<:CustomThing>(quote!(some_thing, some_thing)).is_ok());
    assert_error_matches!(parse2::<CustomThing>(quote!(3, thing)), "expected ident");
    assert_error_matches!(parse2::<CustomThing>(quote!(something)), "expected `,`");
    assert_error_matches!(parse2::<CustomThing>(quote!(some, "hey")), "expected ident");
    assert_error_matches!(parse2::<CustomThing>(quote!(one, two, three)), "unexpected token");
}

Tests like this take only microseconds to run compared to the 10-15 seconds it takes to compile a TryBuild test and do not require an isolated compilation step.

This is also not limited to one-off parseables. If you look in my #[pallet::tasks] PR (linked above), additional syntax has been added to parse entire pallets and assert things about the resulting compile error (if any). I have omitted the macros because they are lengthy but here is what the usage looks like:

#[test]
fn test_parse_minimal_pallet() {
	assert_pallet_parses! {
		#[manifest_dir("../../examples/basic")]
		#[frame_support::pallet]
		pub mod pallet {
			#[pallet::config]
			pub trait Config: frame_system::Config {}

			#[pallet::pallet]
			pub struct Pallet<T>(_);
		}
	};
}

#[test]
fn test_parse_pallet_missing_pallet() {
	assert_pallet_parse_error! {
		#[manifest_dir("../../examples/basic")]
		#[error_regex("Missing `\\#\\[pallet::pallet\\]`")]
		#[frame_support::pallet]
		pub mod pallet {
			#[pallet::config]
			pub trait Config: frame_system::Config {}
		}
	}
}

These two tests, which test that an entire pallet parses or fails to parse with a particular error, will run in microseconds compared to the 10-15 seconds taken to compile a similar TryBuild test covering the same thing.

What about Spans?

So yes, one thing we technically lose with this approach is the ability to make fine-grained assertions about the locations of error spans.

From a technical perspective, it isn’t too hard to extend syn so that spans can be compared, especially the custom proc_macro2::Span type, and I actually do this quite effectively in my work-in-progress ground-up re-write of the syn/proc-macro2 ecosystem called “sin”: https://github.com/sam0x17/sin/blob/b95ea4a758e72dcdbae582ceff20743d316f193c/sin_types/src/span.rs.

The real difficulty here, though, is figuring out some sort of syntax for specifying where in your quoted tokens the error span should be. If your desired span is in Expr or Item position, this is quite easy as you can use something like an #[error_span_here] attribute to mark where the desired error span should point, and then we could use a visitor pattern built into the macro to remove the attribute and extract the span location.

This approach will not work, however, in more exotic span location scenarios like if we want an error span to be on an fn keyword, an impl keyword, something embedded within an expression, etc. There is simply no obvious syntax for representing “this span here”. In theory I could make this whole thing into a crate and figure out something with an unused rust token like @ to designate the start and end of the error span, and maybe I will do that someday, but for now this is left as an exercise for the reader.

This is fine…

The reason I think this is all still perfectly OK is if you follow the above approach from the ground up, all your parseables will have unit tests directly covering their specific parse errors, localizing the spans to the point where you can be pretty confident they are in the correct locations. So if you have some large compound object like a pallet, there will be many smaller unit tests covering individual parse errors for the items within it, and this compositional structure more or less ensures that your error spans are where you expect them to be.

In fact, I would argue really the only time it becomes important to test that a span is in a particular place is when this methodology is not closely followed and the span could appear anywhere within your pallet or outer macro pattern or what have you. If we don’t have to worry about this because every component has its own parsing tests that manage their own errors, this is much less of a concern, so I think our particular paranoia about span locations with polkadot-sdk is more a symptom of just how many custom errors we emit at expansion time instead of during parsing.

Conclusions

All custom syn::Errors should be thrown during parsing and never during expansion unless there is absolutely no other way. Expansion should be infallible with the exception of errors raised by rustc itself when our expansion conflicts with user code in some way.
Expansion should always be defined in terms of a quote::ToTokens impl, and never using custom TryFroms or methods, unless there is literally no other way.
Everything that you emit or quote in proc macro output should be something implementing both syn::Parse and quote::ToTokens. This ensures that we can fully test the round-trip of parsing and expanding and helps avoid anti-patterns.
100% of parsing can be tested via unit tests, which means we should never have to lean on TryBuild to test something parsing/custom-error related
TryBuild is needed to test post-expansion errors that happen when our expansion code conflicts in some way with user code. This really the only time you should use TryBuild.
Probably a very large percentage of the current UI tests in polkadot-sdk could be replaced with pure parsing unit tests, which could dramatically improve CI times and the feedback loop when developing locally.
I am definitely going to eventually add a whole workflow for doing this to my proc_utils crate and eventually my sin crate, so stay tuned, but the functionality I describe above will all be added to polkadot-sdk once #[pallet::tasks] is merged

Topic		Replies	Views
Testing Complex FRAME Pallets: Discussion + Tools frame	9	1224	November 25, 2022
RFC: trustless in-code auditing annotations Tech Talk frame	5	716	December 5, 2022
Seeking Feedback on `derive_impl` for Default Configurations in Pallet Templates Ecosystem	2	98	May 1, 2024
Parachain Technical Summit - Next Steps Tech Talk roadmap , frame , polkadot-summit	24	4149	November 1, 2022
Where contracts fail, and runtimes/chains are needed Ecosystem frame	11	1143	November 20, 2023