I noticed the other day that the Rust AST structs used by syn and quote, including ItemMod and others, all #[derive(Hash)]. In fact, I ran a quick test in the base FRAME #[pallet] macro, and found that we can easily get a hash code for the current AST of a pallet or really any piece of Rust code.
This gave me a very interesting idea – what if we were to add auditing annotations directly to pallets, and other things in substrate/polkadot, based on hashes such as these?
The basic principle is simple, and I illustrate it below for the FRAME use case, but similar approaches could work elsewhere in the ecosystem:
For FRAME, let’s say we define some attribute macro, maybe one that looks like the one below, to be applied to pallets that have been audited. The attribute would specify a hash code of the pallet AST, preferably minus any superfluous doc-related AST nodes, that represents the version of the pallet that has been audited. Then if the hash code changes, we simply issue a compiler error saying the pallet needs to be re-audited (the hash code would only change if a non-doc change is made to the AST):
#[pallet]
#[audited(E0C5E21EB7AD91AA)]
pub mod pallet {
...
}
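To make the mechanics concrete, here is a minimal sketch of the check in plain Rust. It does not use syn: the Node enum is a toy stand-in for the real AST types, and audit_hash and check_audited are hypothetical names I made up for illustration. It also uses std's DefaultHasher, whose algorithm is not guaranteed to stay the same across Rust releases, so a real version would presumably want a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for an AST node; a real macro would work on `syn` types.
#[derive(Clone, Hash)]
enum Node {
    /// A doc comment, which should not affect the audit hash.
    Doc(String),
    /// A function with a name and a body given as source text.
    Fn { name: String, body: String },
    /// A module containing nested nodes.
    Mod { name: String, items: Vec<Node> },
}

/// Returns a copy of the tree with every `Doc` node removed.
fn strip_docs(node: &Node) -> Option<Node> {
    match node {
        Node::Doc(_) => None,
        Node::Mod { name, items } => Some(Node::Mod {
            name: name.clone(),
            items: items.iter().filter_map(strip_docs).collect(),
        }),
        other => Some(other.clone()),
    }
}

/// Hashes the doc-free tree (hypothetical helper, not a FRAME API).
fn audit_hash(root: &Node) -> u64 {
    let mut hasher = DefaultHasher::new();
    strip_docs(root).hash(&mut hasher);
    hasher.finish()
}

/// Mimics the comparison a hypothetical `#[audited(..)]` attribute would do.
/// A real attribute macro would emit a compile error instead of an `Err`.
fn check_audited(root: &Node, expected: u64) -> Result<(), String> {
    let actual = audit_hash(root);
    if actual == expected {
        Ok(())
    } else {
        Err(format!(
            "pallet changed since audit (expected {expected:016X}, found {actual:016X}); re-audit required"
        ))
    }
}
```

With this, adding or editing a Doc node leaves the hash untouched, while changing a function body makes check_audited return an error.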
We could also optionally pair this with some sort of signature from the auditing team, but I don’t think that is even necessary: we could simply allow only the auditing team to change #[audited()] lines or something like that.
Regarding the flexibility of this approach, we could also apply whatever transformations we want to the AST before hashing, so in principle we could even systematically ignore certain portions of pallets before generating a hash code, if that was desirable.
One of the deliverables from a pallet audit would then become a member of the auditing team adding a commit where an #[audited()] attribute is added or updated. This would make things nice and traceable right from the code and would allow us to associate audits with a commit and a line in the repo.
If a change is ever made to one of those pallets, we’ll know right away that it needs to be re-audited, because the hash code will change and we will get a compile error, triggering the need for a new audit.
What could we do with this? Lots of things. For one, when proc macro warnings are stabilized, we could issue a warning in certain scenarios when a non-audited pallet is included in a runtime. We could also introduce some opt-in analogue of construct_runtime! like construct_audited_runtime! which would require that all pallets used in the runtime are audited and have passing hash codes. Since an overhaul of construct_runtime! is also on the horizon, this could be worked into that as an additional feature.
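As a rough sketch of the rule construct_audited_runtime! would enforce, the check boils down to something like the following. Everything here (PalletAudit, AuditFailure, check_audited_runtime) is a hypothetical name for illustration and not an existing FRAME API; a real macro would do this at expansion time and emit compile errors rather than return a Result.

```rust
/// Hypothetical per-pallet audit record; none of these names exist in FRAME.
struct PalletAudit {
    name: &'static str,
    /// Hash recorded in `#[audited(..)]`, if the pallet carries the attribute.
    audited_hash: Option<u64>,
    /// Hash of the pallet's current AST.
    current_hash: u64,
}

/// Why a pallet fails the audited-runtime check.
#[derive(Debug, PartialEq)]
enum AuditFailure {
    /// The pallet has no `#[audited(..)]` attribute at all.
    NotAudited(&'static str),
    /// The pallet was audited, but its AST has changed since then.
    Stale(&'static str),
}

/// Accepts the runtime only if every pallet is audited and its hash matches.
fn check_audited_runtime(pallets: &[PalletAudit]) -> Result<(), Vec<AuditFailure>> {
    let failures: Vec<AuditFailure> = pallets
        .iter()
        .filter_map(|p| match p.audited_hash {
            None => Some(AuditFailure::NotAudited(p.name)),
            Some(h) if h != p.current_hash => Some(AuditFailure::Stale(p.name)),
            Some(_) => None,
        })
        .collect();
    if failures.is_empty() {
        Ok(())
    } else {
        Err(failures)
    }
}
```

Collecting all failures instead of stopping at the first one would let a runtime author see every pallet that needs attention in a single build.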
In general, this would provide very strong safety guarantees for runtime authors, as they will know that if they use construct_audited_runtime! and their runtime compiles, every pallet they are using has passed an audit for the exact version being used in their code. This would be much better, in my opinion, than the current situation, where we don’t track this in the code in a consistent and trustless way and have to go around asking or checking old PRs to see what was audited when.
In the wider ecosystem, for example on parachains and situations where people outside of Parity are making and using their own pallets, third party devs would be of course free to conduct their own audits and use this machinery on their own runtimes/pallets, so I think this would be a win for them as well. Since it is an opt-in system, they would be free to use it or not.
There is also no reason we couldn’t apply this sort of auditing process elsewhere outside of FRAME. In principle, any block of Rust code could be audited and annotated in this manner.
This approach should also be immune to issues related to feature gating, as the #[cfg(...)] directives at this stage of compilation are still just AST nodes, so these won’t result in different hash codes depending on what features we compile with. You should get the same hash code every time regardless of what features you enable.
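A toy model of that argument, assuming the macro really does see the cfg attributes unevaluated: if the cfg condition is just data on the node, the hash never looks at the enabled features, whereas hashing after cfg evaluation would depend on them. Item, hash_items and apply_cfg are made-up names for this sketch, not real compiler or syn APIs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy item: the cfg condition is stored as plain data on the node.
#[derive(Clone, Hash)]
struct Item {
    /// Feature named in a `#[cfg(feature = "...")]` attribute, if any.
    cfg_feature: Option<&'static str>,
    /// The item's source text.
    source: &'static str,
}

/// Hashes the items exactly as given; enabled features are never consulted.
fn hash_items(items: &[Item]) -> u64 {
    let mut hasher = DefaultHasher::new();
    items.hash(&mut hasher);
    hasher.finish()
}

/// Simulates cfg evaluation: drops items whose feature is not enabled.
fn apply_cfg(items: &[Item], enabled: &[&str]) -> Vec<Item> {
    items
        .iter()
        .filter(|item| item.cfg_feature.map_or(true, |f| enabled.contains(&f)))
        .cloned()
        .collect()
}
```

Hashing the output of apply_cfg gives different results for different feature sets, while hashing the original items gives one value no matter what is enabled, which is the property the audit hash relies on.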
Anyway, I just wanted to set this up as a starting point for a discussion. I can confirm that generating hash codes of entire pallets is doable and that I’ve done it locally.
I’ll be creating an issue in FRAME suggesting some version of this #[audited()] / construct_audited_runtime! syntax, but would also love to discuss here any other places in the wider substrate and/or polkadot ecosystem that could benefit from this sort of auditing hash code syntax.
What do you think?