TDLR: Last Wednesday, September 21, 2022 we received a security vulnerability report by SRLabs about adding inherents containing fake data to a block. For example, potentially adding a timestamp that is multiple years in the future, which would lead to stalling the chain and require intervention to get the chain running. This type of vulnerability could only be exploited by block producers. After informing the Parachain teams about this vulnerability, the team behind Moonbeam brought to our attention that an attacker could even execute unauthorized XCM messages on a Parachain. At the time of the report, this vulnerability could be exploited on Kusama and Polkadot resulting in the chain halting.
Vulnerability
The vulnerability was a result of two bugs:
- Insufficient validation of inherents, allowing fake data to be included in a block.
- Cumulus Parachain block validation enabling signed inherents.
Insufficient validation of inherents, allowing fake data to be included in a block
An inherent is an extrinsic that is used by block producers to add data to a block that cannot be verified on chain. A good example of this is a timestamp when a block producer creates a block. Considering the on chain logic cannot verify this timestamp (as it doesn’t know the current time) the other nodes in the network need to verify the inherents on the block import. This is done using the check_inherents
runtime function. The function is basically doing the following:
fn check_inherents(block: Block) {
for ext in block.extrinsics() {
if ext.is_inherent() {
ext.check_inherent()
}
}
}
This is_inherent
call in FRAME
implemented in the following way:
impl RuntimeCall {
fn is_inherent(&self) {
match self {
Self::PalletX(call) if PalletX::is_inherent(call) => return true,
Self::PalletY(call) if PalletY::is_inherent(call) => true,
// All other pallets in the runtime
..,
_ => false,
}
}
}
RuntimeCall
being the FRAME
type that represents all pallets and their transactions that can be executed in a runtime.
The problem with the above code is when you have some pallet that supports wrapping calls like pallet-utility
in batch
etc, it stops working. This is_inherent
function is not being able to look into the batch
call to recursively check all wrapped calls if they are an inherent and need to be checked.
There exists a wide variety of pallets that supports wrapping calls in different ways, so how was the pallet-utility
special? The batch
function was looking like this:
fn batch(origin: Origin, calls: Vec<Call>) {
let is_root = ensure_root(origin).is_ok();
// Dispatch the calls
}
The problem is that we only check if the origin is root
and then forward this origin directly to the calls
at dispatch. Every other pallet is always requiring that the origin is at least signed or some other special origin, but never that the origin is allowed to be None
.
So, a block producer could build a block with batch(Origin::None, [timestamp(timestamp_of_2050)])
and every node would import this block as check_inherents
and not be able to find this wrapped inherent to flag it as invalid.
Cumulus Parachain block validation enabling signed inherents
Every Cumulus based Parachain uses the set_validation_data
inherent. This inherent contains information like the XCM messages from the relay chain and some values from the state of the relay chain. Parachain blocks are not only validated by all of the parachains nodes, but also by the relay chain to ensure that the state transition is valid. This validation by the relay chain also includes checking the inherents, especially of the set_validation_data
inherent. The block validation code from Cumulus checks the validity of the inherent against the input to the validate_block
function from the relay chain. This validity check also includes an existence check. This existence check is similar to the is_inherent
check not being able to recursively look into calls to find the set_validation_data
call. The good thing is that the block validation would panic when it cannot find the set_validation_data
call as it is required. The bad thing is that we didn’t check that the origin of the call is None
.FRAME
supports that every pallet can declare its own origins, but outside the runtime it can only be passed in two origins directly: Signed
and None
. This is basically controlled by putting a Signature
into the extrinsic or not (note: this is abstracted for simplicity of this explanation).
Finding the set_validation_data
was implemented as:
let validation_data = block
.extrinsics()
.iter()
.find_map(|ext| if ext.call() == "set_validation_data" {
Some(ext.call())
} else {
None
})
.expect("Couldn't find `set_validation_data`");
The code of set_validation_data
is implemented like this:
fn set_validation_data(origin, data) {
ensure_none(origin)?;
// Do all the processing
}
So, a block producer could have included an extrinsic like set_validation_data(Origin::Signed(some_account), correct_data)
which would make the set_validation_data
findable, but the call return an error on applying it. The block validation would also use the correct_data
to validate it against the input to validate_block
and wouldn’t find any issue,
Crafting the block
Combining both of the present vulnerabilities is quick and easy. A block producer would have needed to build the following block:
[
set_validation_data(Origin::Signed(some_account), correct_data),
batch(Origin::None, [set_validation_data(Origin::None, fake_data)])
]
All checks would pass for this block and the block producer could have included any XCM messages into the fake_data
payload that would have been executed by the Parachain runtime.
NOTE: Any Parachain that used the CheckWeight
signed extension would have been protected against the signed inherent as well. As the signed inherent would fail the ensure_none
check, CheckWeight
would return an error as it ensures that no inherent is returning an error.
Recap
The report was published on Wednesday 21.09.2022. The same day we included two patches into Substrate and Cumulus to fix the two bugs. At the time of writing, Wednesday 28.09.2022 all Parachains affected by this vulnerability have upgraded their chain, including all the needed fixes, thanks to the forkless upgrades system from Substrate/Polkadot.
Polkadot and Kusama were only affected by the flaw in check_inherents
. So, it would have been possible by a validator to stall the block production, because other nodes would have failed at checking that the timestamp increased by some minimum when building a new block. Both chains were updated on the 28.09.2022 to a new runtime. Polkadot applied a runtime upgrade which only containing the required fixes, based on the latest release. Kusama, that had a runtime upgrade scheduled for the 28.09.2022, was upgraded to the planed runtime upgrade with the required fixes included.
Thank you to everyone who helped identify and fix this vulnerability. Also, Thank you to everyone that helped with a timely deployment to multiple chains.