XCMP(Cross-Chain Message Passing) Design Discussion
Introduction
After much too long we get to have what I think is quite an exciting topic for discussion with regards to what XCMP will look like in the near future. Before proceeding on the implementation of XCMP, we would like to discuss how the design of XCMP suits the overall goals of Polkadot cross chain messaging. This post will focus really on the CORE protocol and proofs to achieve XCMP. There are several implementation details separate which will be addressed in an official RFC. That being said, let’s jump in…
Background
Over the last few months, the research team at Web3 Foundation has come up with a design and prototype for XCMP.
Overall Design Considerations
Independent of the specific protocol we want to consider a few items while assessing the design. There are trade-offs to every scheme and it is important to keep the following in mind.
- Ordered or Un-ordered Delivery of messages
- Guaranteeing delivery of messages or not
- If messages are not received and are dropped what are the consequences for the sender or receiver
- Speed of the protocol/size of proofs (This also includes how much or how little we do on chain…)
- Maintainability and flexibility of the protocol over time in future iterations(Polkadot does change after all…)
- Security assumptions
What does XCMP Aim to Achieve
The vision of Polkadot is a fast scalable and heterogeneous sharded blockchain. An important aspect of achieving this is cross-shard messaging, or cross-chain messaging if one wants to view this in a more blockchainy context. The motivation for XCMP specifically in the context of Polkadot is to optimize cross-shard messaging. This means making things fast and efficient especially in the demand of resources from Polkadot. Currently there is an implementation of cross-chain messaging called HRMP(Horizontal Relay-routed Message Passing) which unfortunately has the downfall of forcing the Polkadot Relaychain to do quite a significant amount of work. XCMP aims to be the replacement of the HRMP system and to move all of the computation and previous on-chain resource consumption to be fully moved off-chain.
Merkle Matryoshka Dolls
Overview
The following design is tailor-made to optimize for XCMP’s goals.
The MMD design avoids the use of State Proofs and instead leverages nesting slimmer bespoke Binary Merkle Structures together. Hence why we call it Merkle Matryoshka Dolls.
Namely we utilize Binary Merkle Structures tied together which are the following:
Merkle Mountain Ranges
BEEFYMMR
- which lives on the Relaychain(BEEFY enables everything!)XcmpMessageMMR
- a dedicated MMR for messages (A way for the source chain to commit to messages on for a specific channel)
Binary Merkle Trees
XcmpChannelTree
- a dedicated Binary Merkle Tree whose leafs consist ofXcmpMessageMMR
roots for each open channel on the source chainParaHeaderTree
- A dedicated Binary Merkle Tree whose leafs consist of all of the active Parachain Headers on Polkadot
Two of these structures are already native on Polkadot namely the BEEFYMMR
and the ParaHeaderTree
. In the next sections we will dive in depth on each of these trees and how they can be leveraged in XCMP.
BEEFY
Regarding XCMP it is important to know that by parachains(Or anyone) tracking BEEFY roots from the Relaychain that it is possible to generate proofs about what any parachain does. This includes messaging or anything else for that matter. This is true because blockchain headers contain state roots and can also include other clever merkle roots which we will discuss in detail in the next sections.
MMDs to facilitate XCMP messages in Polkadot
We will get into the meat and potatoes so to speak so lets discuss how we can use nested Merkle Trees to prove XCMP messages were sent from a source chain. It is helpful to keep the Matryoshka Dolls in mind when thinking about these nestings.
First it is important to understand the overall flow of what needs to happen to safely prove XCMP messages from a source to a destination.
Source chain Commitment
First lets discuss what steps the source chain needs to take to kick off this protocol:
-
Source chain commits to some XCMP messages
a.) How this commitment is done can influence the proof size tremendously, a naive way being just be a hash list. We for now decided to use an MMR(Merkle Mountain Range). This structure allows for a commitment size of O(1) and O(log(n)) membership proofs. -
There are several destinations that a source chain might want to send messages to and we label each of these destinations via a
channel id
. Therefore a chain will contain many open channels. This principle is no different than HRMP(Horizontal Relay-routed Message Passing) practices today. -
Since we have many messages to send to many different destinations we organize each message into a dedicated MMR, one MMR per channel. This was labeled in the overview as a
XCMPMessageMMR
-
So the question now becomes we have many
XCMPMessageMMR
’s living on the source chain how can we simultaneously commit to all of them on chain concisely? Well this is where the fun begins we can use another Binary Merkle Tree! However this time we can make it simpler and not use aMMR
because channels don’t have a strict add only dependency.MMR
’s typically are used as an append only data structure and are optimized as such. There may be instances where we wish to update channels from this tree so a standardBinaryMerkleTree
works well here. There is a caveat that the currentBinaryMerkleTree
implementation in Substrate is not updateable at the moment. -
So from the overview we introduced a second tree: a
BinaryMerkleTree
called theXcmpChannelTree
. ThisBinaryMerkleTree
’s leaves contain theXCMPMessageMMR
roots from above. This makes possible our first nested layer of a MMD proof.
With this nesting we now have a single hash which contains a commitment to all of the messages a source chain wishes to send to any other destination chain. Pretty neat! If this is hard to visualize right now worry not. I will illustrate it for you later. For now it is important just to know that a single hash can represent all of the messages to all destinations.
The question now becomes ok so what do we do with this?
Well, we will utilize a couple more Binary Merkle Structures, so bear with me. The quick and short answer is we can utilize the Polkadot protocol here. We can place this hash inside of the block header for the source chain. Similar to how we place the state root inside each header within blockchain networks.
Parachain Headers and BEEFY
Here it is useful to give a short blurb about BEEFY. BEEFY really is the source of truth we are using to prove consensus over the XCMP messages. Let me explain first briefly what the structure of the BEEFY MMR is and how we can utilize it effectively.
BEEFY in short - Simply put BEEFY is a mini consensus finality protocol(It is in the name even “Bridge Efficiency Enabling Finality Yielder”) built on-top of the Polkadot finality protocol called GRANDPA. The BEEFYMMR
s leaves contain Relaychain State information which has been agreed upon by the BEEFY consensus participants.
Naturally one might ponder, ok what Relay chain consensus information do we even care about here for our XCMP stuff?
There I would focus on one crucial piece of information which is stored inside a BEEFY Leaf and that is… drumroll… Another BinaryMerkleRoot
!
Oh sigh even to explain this at the present makes me tiresome for all of the readers out there. Nesting and recursion really are a funky things to think about.
So continuing on… Lets take a break and look at a picture to save some brain energy.
This BinaryMerkleRoot
inside of the BEEFYMMR
is yet another nesting of our Matryoshka Dolls. From the overview we labeled it the ParaHeaderTree
so this particular root is the ParaHeaderRoot
. So now maybe we can start to see how these dolls are starting to come together. The ParaHeaderTree
as stated in the overview is a tree who’s leaves are parachain block headers for a particular snapshot in time on the Polkadot Relay chain.
Putting The Dolls together
So now lets put all of these dolls together and make sense of what we have.
- A XCMP message goes inside a
XCMPMessageMMR
Smallest Doll - All of the
XCMPMessageMMR
roots goes inside theXCMPChannelTree
Second Doll - The
XCMPChannelTree
Root goes into theParachainHeader
- The
ParachainHeader
goes into theParaHeaderTree
Third Doll - The
ParaHeaderTree
root goes into theBEEFYMMR
Fourth Doll the Big Doll
Phew okay now that that is done what are we left with and what do we have?
Now we have a single Doll or Root hash(That being the BEEFY Root) which contains all of the messages from all parachains sent to all other parachains!
How does the Destination chain accept messages?
Now that we have a way of constructing a proof that a source chain has sent a message to a particular destination how do we receive it and verify it? Well simply put all roads lead back to BEEFY. All the receiver needs is the BEEFYMMR
root in which this message was captured in. Simply by having the BEEFY root hash we can unpack each layer of our Matroyshka Dolls.
A logical question is how does a destination chain receive an updated BEEFYMMR Root? BEEFY in fact runs on the Relaychain right? Here we can take a small detour into how a Polkadot parachain is able to verify a particular Relaychain State value via a State Read Proof. Is this something new? No, in fact a parachain relies on this in order to build a block.
In short every parachain runs a Relaychain Client. This means that a parachain is able to trustlessly follow the Relaychain’s consensus protocol. One important aspect of building a parachain block is to anchor ones block to a relay chain block. That is done by requesting a State Read Proof from the Relaychain Client which is then verified and acted upon inside the building of a Parachain block. This again is fundamental to Polkadot’s protocol. What does this all mean? Well we just add another item to this read proof, a single item in fact, just the current BEEFYMMR
root hash on the Relay chain.
So now that we know how to transfer the current BEEFYMMR
root from the Relaychain to the destination chains how can we unwrap our Matroyshka dolls? Well intuitively it is the same as before but we go in the opposite direction. Starting with one big Matryoshka Doll we unpack each piece verifying each step along the way. In other words taking the merkle co-paths and verifying them in order starting first with the BEEFYMMR
root hash. Let’s just quickly list out the steps and visualize this with a diagram.
- Verify a
BEEFYMMR
proof(Merkle co-path) against theBEEFYMMR
root hash and extract theParaHeaderTree
root from theBEEFYMMR
leaf. - Verify a
ParaHeaderTree
proof(Merkle co-path) against theParaHeaderTree
root hash obtained from step 1 and extractXcmpChannelTree
root fromParaHeaderTree
leaf. - Verify a
XcmpChannelTree
proof(Merkle co-path) against theXcmpChannelTree
root hash obtained from step 2 and extract theXcmpMessageMmr
root from theXcmpChannelTree
leaf. - Verify a
XcmpMessageMmr
proof(Merkle co-path) against theXcmpMessageMmr
root hash obtained from step 3 and extract theXcmpMessage
.
Now we have a verified message!
Who produces the proofs or builds the Matroyshka Dolls?
We discussed earlier how messages make their way into a BEEFY root which we can then verify. And although we discussed the structure of building our proof aka Matryoshka Dolls we need to make note of who builds these proofs. We will utilize Relayers
like in most bridging protocols because after all XCMP is a bridge. It is important to note however that collators can equally play the role of a Relayer
, allowing for easier pruning of messages from storage. What makes this protocol more secure than your typical bridge is that it leverages using the Polkadot Relaychain to allow us to prove consensus on messages sent from other parachains very easily.
The Relayers
themselves will need to store various pieces of information of both the Source
chain and the Relaychain
in order to build MMD proofs. This allows the Source
chain to continue its business as usual with very limited storage overhead in order to send messages. Relayers
which can also be thought of as off-chain workers need the ability to post the MMD
proofs to the destination chain.
Regarding availability of messages the collators of the source chain are storing them in an off-chain database and can keep them around in this manner. Some intelligent method of pruning can be implemented in this regard for the collators.
At this point the protocol does not guarantee delivery of messages so we aim to add incentives for the Relayers to construct the MMD
proofs and post to the Destination
chain. If the Source
chain is very interested and concerned about messages not being received or potentially even withheld by Relayers
then they have the ability to construct the MMD
proofs themselves to ensure that the messages are delivered.
At the moment ordered or un-ordered delivery is an option for XCMP. With the most secure option being un-ordered as it is more resilient in terms of liveness guarantees of the protocol. However both can be achieved.
Visualizing the flow of XCMP MMD Approach
Integration
A big concern for a new protocol is the migration path and ease of implementation. Not to worry this has been thought of! So currently if one wishes to send a XCM
they are routed through the HRMP
protocol. XCMP
will simply hook into XCM
the same way HRMP
does. Essentially a drop in replacement. There will be a a simple pallet to integrate which will be nearly the same for all parachains and does the job of pushing XCMP
messages into the correct XCMPMessageMmr
as well as having the ability to verify MMD
proofs in order to process XCM
s upstream.
In other words all other existing XCM
and parachain modules wont notice what is happening they just continue to operate as usual.
XCMP Design for On Demand Coretime
The question is will this design be future proof for on demand core-time and corejam? Yes we have thought this through so now that we understand the protocol lets talk about what issue will arise and what will be the fix for that issue.
First things first what would any issue even be?
The idea is relatively simple. So let me pose the following scenario:
Say we are some destination chain who would like to receive messages sent to us. We are able to verify those messages using the BEEFYMMR
root as we discussed earlier. However the BEEFYMMR
root hashes that we can use to unpack and verify our proof(Also known has disassembling our Matryoshka Dolls) is updated every time we go to produce a block. So if we haven’t produced a block in a while there have been several potential Matroyshka Dolls sent to us in-between the last time we produced a block and now. Those are valid proofs, just because we weren’t doing anything during that time doesn’t mean some source chain is wrong for attempting to send us messages during that period. Here in lies the problem, someone has legitimately sent us messages to process and we have no valid BEEFY
root to verify them with. Let me illustrate this with a visual.
So as we can see from the image the Relayer is attempting to send a proof which pertains to BEEFY
root 3 which the receiver has never seen before as it only has seen root 1 and 6. Now that the problem is identified what is the solution. I will try to be concise here as to reduce the complexity, but simply put we can add another Matryoshka Doll into our proof. Before explaining what this Doll is and how it functions I can state how adding in this new layer will enable us to fix the problem. By adding in a new layer the collators of the destination chain can first see that this proof is going to fail and then enable a small fix for the original proof. The full explanation for how this method would work requires a fairly deep dive into how Merkle Mountain Ranges work. Particularly how MMR peaks are utilized in computing an MMR root hash. For this particular write up I will opt to explain the finite details and save it for a later write up. I will leave you with this though, adding in the “extra” layer to the proof or Doll is a fairly small step for the Relayer
. The fixing of the proof requires little to no extra storage from the destination chain. So even though this feature of On Demand is not present yet in Polkadot the XCMP design has a way to solve for some of the additional complications On Demand Coretime brings.
Conclusion
We wanted to let everyone see what the plan for the future is ahead. We have completed a Proof of Concept of the MMD
proofs working in a local testnet and will move to implementation. XCMP is one step further towards scaling Polkadot’s cross-shard messaging which furthers the interoperability capabilities of Polkadot. An important goal from the original mission statement. Any and all feedback/questions is welcome here on this forum. We would like the dialogue to be open and transparent in order to best serve the needs of the community at large! If anyone feels that I misspoke or got any particular detail wrong please speak up, I am very open to being proven wrong
Notable Mentions
Thank you to the following people for helping construct this write up and XCMP.
Ordian (Core contributor and implementation lead)
Lederstrumpf (Providing consultation on the protocol)
Alistair Stewart (core MMD protocol design)