XCMP Design Discussion

XCMP (Cross-Chain Message Passing) Design Discussion

Introduction

After much too long, we finally get to what I think is quite an exciting topic for discussion: what XCMP will look like in the near future. Before proceeding with the implementation of XCMP, we would like to discuss how its design serves the overall goals of Polkadot cross-chain messaging. This post focuses on the CORE protocol and the proofs needed to achieve XCMP. Several implementation details are left out and will be addressed in an official RFC. That being said, let’s jump in…

Background

Over the last few months, the research team at Web3 Foundation has come up with a design and prototype for XCMP.

Overall Design Considerations

Independent of the specific protocol, we want to consider a few items while assessing the design. Every scheme involves trade-offs, and it is important to keep the following in mind.

  1. Ordered or unordered delivery of messages
  2. Whether delivery of messages is guaranteed
  3. If messages are dropped and not received, what are the consequences for the sender or receiver?
  4. Speed of the protocol and size of proofs (this also includes how much or how little we do on-chain…)
  5. Maintainability and flexibility of the protocol over time in future iterations (Polkadot does change, after all…)
  6. Security assumptions

What does XCMP Aim to Achieve

The vision of Polkadot is a fast, scalable, heterogeneous sharded blockchain. An important aspect of achieving this is cross-shard messaging, or cross-chain messaging if one wants to view it in a more blockchain-flavored context. The motivation for XCMP, specifically in the context of Polkadot, is to optimize cross-shard messaging: making things fast and efficient, especially in terms of the resources demanded from Polkadot. Currently there is an implementation of cross-chain messaging called HRMP (Horizontal Relay-routed Message Passing), which unfortunately forces the Polkadot Relaychain to do quite a significant amount of work. XCMP aims to replace the HRMP system and to move all of the computation and previous on-chain resource consumption fully off-chain.

Merkle Matryoshka Dolls

Overview

The following design is tailor-made to optimize for XCMP’s goals.

The MMD design avoids the use of State Proofs and instead nests slimmer, bespoke Binary Merkle Structures inside one another. Hence the name Merkle Matryoshka Dolls.

Namely, we tie together the following Binary Merkle Structures:

Merkle Mountain Ranges

  • BEEFYMMR - which lives on the Relaychain (BEEFY enables everything!)
  • XcmpMessageMMR - a dedicated MMR for messages (a way for the source chain to commit to the messages for a specific channel)

Binary Merkle Trees

  • XcmpChannelTree - a dedicated Binary Merkle Tree whose leaves consist of XcmpMessageMMR roots for each open channel on the source chain
  • ParaHeaderTree - a dedicated Binary Merkle Tree whose leaves consist of all of the active Parachain Headers on Polkadot

Two of these structures are already native to Polkadot, namely the BEEFYMMR and the ParaHeaderTree. In the next sections we dive into each of these trees and how they can be leveraged in XCMP.

BEEFY

Regarding XCMP, it is important to know that by tracking BEEFY roots from the Relaychain, parachains (or anyone) can verify proofs about what any parachain does. This includes messaging or anything else for that matter, because blockchain headers contain state roots and can also include other clever Merkle roots, which we discuss in detail in the next sections.

MMDs to facilitate XCMP messages in Polkadot

Now we get into the meat and potatoes, so to speak: let’s discuss how we can use nested Merkle Trees to prove that XCMP messages were sent from a source chain. It is helpful to keep the Matryoshka Dolls in mind when thinking about these nestings.

First it is important to understand the overall flow of what needs to happen to safely prove XCMP messages from a source to a destination.

Source chain Commitment

First let’s discuss what steps the source chain needs to take to kick off this protocol:

  1. The source chain commits to some XCMP messages.
    a.) How this commitment is done can influence the proof size tremendously, a naive way being just a hash list. For now we decided to use an MMR (Merkle Mountain Range). This structure allows for a commitment size of O(1) and O(log(n)) membership proofs.

  2. There are several destinations that a source chain might want to send messages to, and we label each of these destinations with a channel id. Therefore a chain will contain many open channels. This principle is no different from HRMP (Horizontal Relay-routed Message Passing) practice today.

  3. Since we have many messages to send to many different destinations, we organize the messages into dedicated MMRs, one MMR per channel. This was labeled in the overview as an XcmpMessageMMR.

  4. So the question now becomes: we have many XcmpMessageMMRs living on the source chain, so how can we commit to all of them on-chain, simultaneously and concisely? Well, this is where the fun begins: we can use another Binary Merkle Tree! This time, however, we can make it simpler and not use an MMR, because channels don’t have a strict append-only requirement. MMRs are typically used as append-only data structures and are optimized as such, whereas there may be instances where we wish to update channels in this tree, so a standard BinaryMerkleTree works well here. One caveat is that the current BinaryMerkleTree implementation in Substrate is not updateable at the moment.

  5. From the overview we introduced a second tree for this: a BinaryMerkleTree called the XcmpChannelTree. This BinaryMerkleTree’s leaves contain the XcmpMessageMMR roots from above. This gives us the first nested layer of an MMD proof.

With this nesting we now have a single hash which commits to all of the messages a source chain wishes to send to any other destination chain. Pretty neat! If this is hard to visualize right now, worry not; I will illustrate it later. For now it is important just to know that a single hash can represent all of the messages to all destinations. (A small code sketch of this commitment flow follows below.)
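
To make this concrete, here is a minimal Rust sketch of the source-chain side of the commitment. It is only a sketch under assumptions: the names (MessageMmr, channel_tree_root) are hypothetical, and a toy 8-byte hash via std’s DefaultHasher stands in for a real 32-byte cryptographic hash (e.g. Blake2/Keccak) and the Substrate MMR machinery. It only illustrates the O(1)-sized peak set per channel and the single XcmpChannelTree root that ends up in the parachain header.

// Minimal sketch of the source-chain commitment flow described above.
// Assumptions: hypothetical names; toy 8-byte hash instead of a real
// 32-byte cryptographic hash (e.g. Blake2/Keccak).
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

type H = u64; // stand-in for a 32-byte hash
type ChannelId = u32;

fn hash_bytes(data: &[u8]) -> H {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

fn hash_pair(left: H, right: H) -> H {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// One XcmpMessageMMR per channel: O(1)-sized peak set, appended like a binary counter.
#[derive(Default)]
struct MessageMmr {
    peaks: Vec<(u32, H)>, // (height, hash)
}

impl MessageMmr {
    fn append(&mut self, msg: &[u8]) {
        self.peaks.push((0, hash_bytes(msg)));
        // Merge peaks of equal height, exactly like carries in a binary counter.
        while self.peaks.len() >= 2 {
            let (h_right, right) = self.peaks[self.peaks.len() - 1];
            let (h_left, left) = self.peaks[self.peaks.len() - 2];
            if h_right != h_left {
                break;
            }
            self.peaks.truncate(self.peaks.len() - 2);
            self.peaks.push((h_right + 1, hash_pair(left, right)));
        }
    }

    /// "Bag" the peaks right-to-left into a single XcmpMessageMMR root.
    fn root(&self) -> H {
        self.peaks
            .iter()
            .rev()
            .map(|&(_, hash)| hash)
            .reduce(|bagged_right, next_left| hash_pair(next_left, bagged_right))
            .unwrap_or(0)
    }
}

/// Simplified binary Merkle root over the per-channel MMR roots (the XcmpChannelTree).
fn channel_tree_root(channels: &BTreeMap<ChannelId, MessageMmr>) -> H {
    let mut layer: Vec<H> = channels.values().map(|mmr| mmr.root()).collect();
    if layer.is_empty() {
        return 0;
    }
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|pair| if pair.len() == 2 { hash_pair(pair[0], pair[1]) } else { pair[0] })
            .collect();
    }
    layer[0]
}

fn main() {
    let mut channels: BTreeMap<ChannelId, MessageMmr> = BTreeMap::new();
    channels.entry(2001).or_default().append(b"xcm: transfer 10 DOT");
    channels.entry(2001).or_default().append(b"xcm: remark hello");
    channels.entry(2004).or_default().append(b"xcm: transact");
    // This single hash is what would be placed inside the parachain header.
    println!("XcmpChannelTree root: {:#x}", channel_tree_root(&channels));
}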

The question now becomes: okay, so what do we do with this?

Well, we will utilize a couple more Binary Merkle Structures, so bear with me. The quick and short answer is that we can utilize the Polkadot protocol here: we place this hash inside the block header of the source chain, similar to how the state root is placed inside each header in blockchain networks.

Parachain Headers and BEEFY

Here it is useful to give a short blurb about BEEFY. BEEFY is the source of truth we use to prove consensus over the XCMP messages. Let me first briefly explain the structure of the BEEFYMMR and how we can utilize it effectively.

BEEFY in short - Simply put, BEEFY is a mini consensus finality protocol (it is even in the name: “Bridge Efficiency Enabling Finality Yielder”) built on top of Polkadot’s finality protocol, GRANDPA. The BEEFYMMR’s leaves contain Relaychain state information which has been agreed upon by the BEEFY consensus participants.

Naturally one might ponder: okay, what Relaychain consensus information do we even care about here for our XCMP purposes?

I would focus on one crucial piece of information stored inside a BEEFY leaf, and that is… drumroll… another BinaryMerkleRoot!

Oh sigh, even explaining this is tiring for all of the readers out there. Nesting and recursion really are funky things to think about.

So continuing on… Let’s take a break and look at a picture to save some brain energy.

This BinaryMerkleRoot inside of the BEEFYMMR is yet another nesting of our Matryoshka Dolls. In the overview we labeled it the ParaHeaderTree, so this particular root is the ParaHeaderRoot. Now maybe we can start to see how these Dolls come together. The ParaHeaderTree, as stated in the overview, is a tree whose leaves are the parachain block headers for a particular snapshot in time on the Polkadot Relaychain.

Putting The Dolls together

So now let’s put all of these Dolls together and make sense of what we have.

  1. An XCMP message goes inside an XcmpMessageMMR (the smallest Doll)
  2. All of the XcmpMessageMMR roots go inside the XcmpChannelTree (the second Doll)
  3. The XcmpChannelTree root goes into the Parachain Header
  4. The Parachain Header goes into the ParaHeaderTree (the third Doll)
  5. The ParaHeaderTree root goes into the BEEFYMMR (the fourth Doll, the big Doll)

Phew, okay, now that that is done, what are we left with?
We have a single Doll, or root hash (the BEEFY root), which contains all of the messages from all parachains sent to all other parachains!

How does the Destination chain accept messages?

Now that we have a way of constructing a proof that a source chain has sent a message to a particular destination, how do we receive it and verify it? Well, simply put, all roads lead back to BEEFY. All the receiver needs is the BEEFYMMR root under which this message was captured. Simply by having the BEEFY root hash we can unpack each layer of our Matryoshka Dolls.

A logical question is: how does a destination chain receive an updated BEEFYMMR root? BEEFY, after all, runs on the Relaychain, right? Here we can take a small detour into how a Polkadot parachain is able to verify a particular Relaychain state value via a State Read Proof. Is this something new? No; in fact, a parachain relies on this in order to build a block.

In short, every parachain runs a Relaychain Client. This means that a parachain is able to trustlessly follow the Relaychain’s consensus protocol. One important aspect of building a parachain block is anchoring one’s block to a relay chain block. That is done by requesting a State Read Proof from the Relaychain Client, which is then verified and acted upon while building the parachain block. This, again, is fundamental to Polkadot’s protocol. What does this all mean? Well, we just add another item to this read proof, a single item in fact: the current BEEFYMMR root hash on the Relaychain.
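
As a rough illustration, here is what the destination-chain bookkeeping around that single extra value could look like. This is only a sketch under assumptions: the names (BeefyRootTracker, on_new_relay_block) are hypothetical, the read-proof verification itself is the existing machinery parachains already use to anchor blocks, and keeping a small window of recent roots (rather than only the latest) is my own assumption so relayers can target slightly older roots.

// Minimal sketch of what happens after the (already verified) Relaychain read
// proof yields the current BEEFYMMR root. All names are hypothetical.
use std::collections::VecDeque;

type Hash = [u8; 32];

struct BeefyRootTracker {
    recent_roots: VecDeque<Hash>,
    capacity: usize,
}

impl BeefyRootTracker {
    fn new(capacity: usize) -> Self {
        Self { recent_roots: VecDeque::new(), capacity }
    }

    /// Called once per parachain block with the BEEFYMMR root obtained from
    /// the Relaychain state read proof.
    fn on_new_relay_block(&mut self, beefy_root: Hash) {
        if self.recent_roots.back() == Some(&beefy_root) {
            return; // the root has not advanced since our last block
        }
        if self.recent_roots.len() == self.capacity {
            self.recent_roots.pop_front();
        }
        self.recent_roots.push_back(beefy_root);
    }

    /// MMD proofs must target a root this chain has actually seen.
    fn is_known(&self, root: &Hash) -> bool {
        self.recent_roots.contains(root)
    }
}

fn main() {
    let mut tracker = BeefyRootTracker::new(8);
    tracker.on_new_relay_block([1u8; 32]);
    tracker.on_new_relay_block([2u8; 32]);
    assert!(tracker.is_known(&[1u8; 32]));
    assert!(!tracker.is_known(&[9u8; 32]));
}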

So now that we know how to transfer the current BEEFYMMR root from the Relaychain to the destination chains, how can we unwrap our Matryoshka Dolls? Intuitively it is the same as before, but in the opposite direction: starting with one big Matryoshka Doll, we unpack each piece, verifying each step along the way. In other words, we take the Merkle co-paths and verify them in order, starting with the BEEFYMMR root hash. Let’s quickly list out the steps and visualize this with a diagram.

  1. Verify a BEEFYMMR proof (Merkle co-path) against the BEEFYMMR root hash and extract the ParaHeaderTree root from the BEEFYMMR leaf.
  2. Verify a ParaHeaderTree proof (Merkle co-path) against the ParaHeaderTree root obtained in step 1 and extract the XcmpChannelTree root from the ParaHeaderTree leaf.
  3. Verify an XcmpChannelTree proof (Merkle co-path) against the XcmpChannelTree root obtained in step 2 and extract the XcmpMessageMmr root from the XcmpChannelTree leaf.
  4. Verify an XcmpMessageMmr proof (Merkle co-path) against the XcmpMessageMmr root obtained in step 3 and extract the XcmpMessage.

Now we have a verified message!
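
To make the four steps concrete, here is a minimal Rust sketch of the unwrapping. It is not the actual implementation: a toy 8-byte hash stands in for a real 32-byte hash, the MmdProof struct is hypothetical, and for simplicity each outer leaf is taken to be exactly the inner root (real BEEFY leaves and parachain headers carry more fields that would need to be decoded).

// Minimal sketch of unwrapping the four MMD layers on the destination chain.
// Each proof is a Merkle co-path of (sibling_hash, sibling_is_left) pairs.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type H = u64;

fn hash_pair(left: H, right: H) -> H {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Fold a Merkle co-path from the leaf upward and compare with the expected root.
fn verify_copath(leaf: H, copath: &[(H, bool)], expected_root: H) -> bool {
    let computed = copath.iter().fold(leaf, |acc, &(sibling, sibling_is_left)| {
        if sibling_is_left { hash_pair(sibling, acc) } else { hash_pair(acc, sibling) }
    });
    computed == expected_root
}

/// Hypothetical bundle a Relayer would post to the destination chain.
struct MmdProof {
    para_header_root: H,          // extracted from the BEEFYMMR leaf
    beefy_copath: Vec<(H, bool)>,
    xcmp_channel_root: H,         // extracted from the ParaHeaderTree leaf (parachain header)
    para_header_copath: Vec<(H, bool)>,
    message_mmr_root: H,          // extracted from the XcmpChannelTree leaf
    channel_copath: Vec<(H, bool)>,
    message_hash: H,              // hash of the XCMP message being proven
    message_copath: Vec<(H, bool)>,
}

/// Steps 1-4 above: each verified layer yields the root used by the next layer.
fn verify_mmd(beefy_root: H, p: &MmdProof) -> bool {
    verify_copath(p.para_header_root, &p.beefy_copath, beefy_root)
        && verify_copath(p.xcmp_channel_root, &p.para_header_copath, p.para_header_root)
        && verify_copath(p.message_mmr_root, &p.channel_copath, p.xcmp_channel_root)
        && verify_copath(p.message_hash, &p.message_copath, p.message_mmr_root)
}

fn main() {
    // Build a toy two-leaf tree at every layer so the proof verifies end to end.
    let message_hash = 0x1111;
    let message_mmr_root = hash_pair(message_hash, 0x2222);
    let xcmp_channel_root = hash_pair(message_mmr_root, 0x3333);
    let para_header_root = hash_pair(0x4444, xcmp_channel_root);
    let beefy_root = hash_pair(para_header_root, 0x5555);

    let proof = MmdProof {
        para_header_root,
        beefy_copath: vec![(0x5555, false)],
        xcmp_channel_root,
        para_header_copath: vec![(0x4444, true)],
        message_mmr_root,
        channel_copath: vec![(0x3333, false)],
        message_hash,
        message_copath: vec![(0x2222, false)],
    };
    assert!(verify_mmd(beefy_root, &proof));
    println!("MMD proof verified");
}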

Who produces the proofs or builds the Matryoshka Dolls?

We discussed earlier how messages make their way into a BEEFY root which we can then verify. Although we discussed the structure of the proof, aka the Matryoshka Dolls, we need to note who builds these proofs. We will utilize Relayers, like in most bridging protocols, because after all XCMP is a bridge. It is important to note, however, that collators can equally play the role of a Relayer, allowing for easier pruning of messages from storage. What makes this protocol more secure than your typical bridge is that it leverages the Polkadot Relaychain, allowing us to prove consensus on messages sent from other parachains very easily.

The Relayers themselves will need to store various pieces of information from both the Source chain and the Relaychain in order to build MMD proofs. This allows the Source chain to continue business as usual with very limited storage overhead for sending messages. Relayers, which can also be thought of as off-chain workers, need the ability to post the MMD proofs to the destination chain.

Regarding availability of messages, the collators of the source chain store them in an off-chain database and can keep them around in this manner. Some intelligent method of pruning can be implemented for the collators in this regard.

At this point the protocol does not guarantee delivery of messages, so we aim to add incentives for the Relayers to construct the MMD proofs and post them to the Destination chain. If the Source chain is concerned about messages not being received, or potentially even being withheld by Relayers, it has the ability to construct the MMD proofs itself to ensure that the messages are delivered.

At the moment, both ordered and unordered delivery are options for XCMP, with unordered being the more robust option as it is more resilient in terms of the protocol’s liveness guarantees. However, both can be achieved.

Visualizing the flow of XCMP MMD Approach

[Diagram: mmd_flow - the flow of an XCMP message through the MMD proof layers]

Integration

A big concern for a new protocol is the migration path and ease of implementation. Not to worry, this has been thought of! Currently, if one wishes to send an XCM, it is routed through the HRMP protocol. XCMP will simply hook into XCM the same way HRMP does; essentially a drop-in replacement. There will be a simple pallet to integrate which will be nearly the same for all parachains. It does the job of pushing XCMP messages into the correct XcmpMessageMMR, as well as verifying MMD proofs in order to process XCMs upstream.

In other words, all other existing XCM and parachain modules won’t notice what is happening; they just continue to operate as usual.

XCMP Design for On Demand Coretime

The question is: will this design be future-proof for on-demand coretime and CoreJam? Yes, we have thought this through, so now that we understand the protocol, let’s talk about what issue will arise and what the fix for that issue will be.

First things first: what would the issue even be?

The idea is relatively simple. So let me pose the following scenario:
Say we are some destination chain that would like to receive messages sent to us. We are able to verify those messages using the BEEFYMMR root, as we discussed earlier. However, the BEEFYMMR root hash that we can use to unpack and verify our proof (also known as disassembling our Matryoshka Dolls) is only updated every time we go to produce a block. So if we haven’t produced a block in a while, there may have been several Matryoshka Dolls sent to us between the last time we produced a block and now. Those are valid proofs; just because we weren’t doing anything during that time doesn’t mean some source chain is wrong for attempting to send us messages during that period. Herein lies the problem: someone has legitimately sent us messages to process and we have no valid BEEFY root to verify them with. Let me illustrate this with a visual.

As we can see from the image, the Relayer is attempting to send a proof which pertains to BEEFY root 3, which the receiver has never seen before, as it has only seen roots 1 and 6. Now that the problem is identified, what is the solution? I will try to be concise here to reduce the complexity, but simply put, we can add another Matryoshka Doll to our proof. Before explaining what this Doll is and how it functions, I can state how adding this new layer enables us to fix the problem: with the new layer, the collators of the destination chain can first see that this proof is going to fail and then enable a small fix for the original proof. The full explanation of how this method works requires a fairly deep dive into how Merkle Mountain Ranges work, particularly how MMR peaks are utilized in computing an MMR root hash. For this particular write-up I will opt not to explain the finer details and will save them for a later write-up. I will leave you with this, though: adding the “extra” layer to the proof, or Doll, is a fairly small step for the Relayer, and fixing the proof requires little to no extra storage from the destination chain. So even though this On Demand feature is not present yet in Polkadot, the XCMP design has a way to solve some of the additional complications On Demand Coretime brings.

Conclusion

We wanted to let everyone see the plan for the road ahead. We have completed a Proof of Concept of the MMD proofs working in a local testnet and will move on to implementation. XCMP is one step further towards scaling Polkadot’s cross-shard messaging, which furthers the interoperability capabilities of Polkadot, an important goal from the original mission statement. Any and all feedback/questions are welcome here on this forum. We would like the dialogue to be open and transparent in order to best serve the needs of the community at large! If anyone feels that I misspoke or got any particular detail wrong, please speak up; I am very open to being proven wrong :slight_smile:

Notable Mentions

Thank you to the following people for helping construct this write up and XCMP.

Ordian (Core contributor and implementation lead)
Lederstrumpf (Providing consultation on the protocol)
Alistair Stewart (core MMD protocol design)

Resources

MMD PoC

18 Likes

Would each parachain need to write their own off-chain worker pallets with their own incentive mechanism, or is part of the move to XCMP also providing a good default implementation for us to use?

1 Like

We will create the mechanism and incentive scheme; you would just need to run them. This is all assuming we don’t just have the collator do it by itself. Of course, if you find a need to do something else to ensure certain criteria or constraints which you feel aren’t covered (for some particular use cases), then that should be able to be facilitated.

1 Like

I’m not an expert on the subject, so I’m curious what the pros and cons are vs ISMP.

From what I understand, both rely on hashing the messages into different structures and having someone construct a state proof by showing all leaves and intermediary hashes that lead to the on-chain root hash. But you say this design for XCMP is not using “state proofs”. What did you mean there?

1 Like

I can make a separate write up on showing a comparison of the two in a different post.

The short answer is that the MMD approach isn’t a full State proof. The Relayer is not posting a Patricia Merkle Trie (the current State model of parachains) membership proof of the source chain which the receiver checks against a latest state root. Although technically, with this design, you could do full state proofs if you really wanted to, because the BEEFY root can verify any parachain header.

So while MMDs do use Merkle co-paths as proofs, they are not Merkle co-paths of the State trie of the source chain. I hope this helps.

I was wondering where XCMP was in all the latest updates.

No one was mentioning it anymore.

I’m glad to see it’s still there; I put the XCMP feature as one of the most important in the whole ecosystem.

1 Like

I have some feedback on the overall design of the protocol.

I don’t think ordered delivery of messages is useful to enshrine at the protocol layer. If an application needs to deliver one message at a time, this is trivial to implement at the application level. Otherwise enforcing ordered delivery might lead to a deadlock where the channel cannot be flushed, because one of the messages to an application cannot be processed. I think we need to consider a bit more carefully what the real world applications of ordered delivery of messages are.

It’s very hard to see how this is the case when we have to verify 5 different merkle proofs here.

Also what is the purpose of channels in this design since anyone with the merkle roots can verify and process incoming messages?

Even with all of this, the specification does not address:

  1. What happens when a parachain/service is down due to lack of coretime? How can messages be timed out/reverted?
  2. How can parachains read each other’s state? This is more powerful than just sending messages.
  3. If relayers are to be enshrined, what is their incentivization model?

It seems to me that fixing all of these issues would amount to simply re-inventing ISMP.

It’s also fair that the w3f may consider these issues out of scope for XCMP. In which case, I think it would be best for parachain teams to rely on a more fully-featured, ecosystem-focused protocol like ISMP and XCMP may be strictly confined to interacting with system parachains.

2 Likes

Thank you @CoaX1d for the nice write-up!

Couple of questions:

  1. When/how can we make the relayer incentivization protocol part of the whole thing? Without that we simply can’t replace HRMP with XCMP; XCMP needs to provide the same (or stricter) guarantees as HRMP, one of the most important ones being “delivery guarantee”.
    I’m fine with even making this part of the “core” collator protocol (collators have to relay XCMP messages as part of their core duties).
  2. are there delivery proofs built into the protocol? if not, have you considered them? with delivery proofs, pruning of sent messages becomes trivial and efficient, also allows simple(ish) incentivization schemes.

I am also interested in seeing a comparison between XCMP and ISMP, not at the protocol level or implementation level, but at the “value it brings” level.

I would like the Polkadot ecosystem to make more Polkadot-as-a-whole product-oriented decisions.

With this mindset, I believe the decision-driving metrics should be (not limited to):

  • improvement leap effect vs current status quo - in this case, something like “how much more bandwidth compared to current HRMP”
  • features/capabilities - what new use-cases does it enable
  • limitations - e.g. need archive node, need X offchain storage, etc (i.e. for building proofs)
  • development + integration estimated cost - how long before this is (or at least can be) deployed across Polkadot
  • system/protocol complexity - how difficult is it to implement (e.g. get a second community implementation), how easy/difficult to debug, how easy/difficult to operate, how little/much extra tooling should be built for it, etc
2 Likes

One thing to try and bear in mind when coding this up is: Would we be able to drop in/use a specialized token, “tuned” to the peculiarities of this use case/industry. E.g. consider collators as distinct from relayers, etc.

It won’t be easy to fight the pump-dot imperative, securities are popular because their incentive effects are so strong.

One rule of thumb you can use is to think in terms of 10x$1 vs 1x$10. If the effects are indifferent in making one of those scenarios more likely, you possibly are designing something that might support a consumer token. If the effects favor 10x$1 being the more likely outcome, you are more likely to have something that will support a consumer token. Obviously, if things lean toward 1x$10, you’re well into pump-dot territory; expect to land there and (unfortunately) you are dealing with a security.

Sure, yes, an argument for unordered delivery is that it exhibits better resilience and liveness properties. I think this is an especially nice property for blockchains.

However… saying there is no case to be made for ordered delivery is naive. Engineering decisions are always a matter of trade-offs. This argument has long been debated in messaging protocols.

I would disagree with this.
It may seem trivial from a conceptual standpoint, but there are many complexities that carry risk. If a mistake is made, it is catastrophic, whereas the application can be completely agnostic to it if the protocol handles it.

Also this consumes additional resources at the application level when it comes to tracking buffers and sorting messages and accounting for missing ones when managing the correct sequence.

Speed is relative. These don’t have to be full state proofs like in ISMP. Also, Binary Merkle trees, as opposed to bloated full-state Patricia Merkle Trie proofs, have their advantages.

Great question! The alternative is to just put every message in an MMR or some other form of vector commitment. However, the channel structure gives us a few advantages we can exploit:

  1. The structure allows one to easily batch-prove a set of messages, because they all descend from a common ancestor, the local MMR channel root.

  2. The main reason is that if you compare with what ISMP does (although ISMP allows this to be somewhat configurable with the overlay tree field, so this is flexible) of just throwing all the messages to every destination together, you still have the issue of tracking on the receiver side what has been received and what hasn’t. The channel structure allows you to just keep an index or counter instead of tracking hashes; you need some way to see what has been nullified. This method is much cheaper than an ever-growing list of hashes (which is expensive and inefficient) that would eventually need to be stored in a nullifier tree to be made more efficient. (See the sketch after this list.)

  3. Channels were already our model with HRMP, which lets us use a familiar model when it comes to implementation and migration.

  4. It nicely separates out what the Relayer needs for proving. The Relayer can focus solely on one sender and receiver, whereas if we mash everything into a single tree, we need to keep the hashes of messages going to other chains around in order to hash the leaves correctly when generating the proof.
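
To illustrate point 2, here is a small Rust sketch (hypothetical names, toy types) contrasting the two replay-protection strategies: a per-channel counter on the receiver versus an ever-growing set of message hashes.

// Sketch only: per-channel counter vs. hash-set nullifiers on the receiver.
use std::collections::{BTreeMap, HashSet};

type ChannelId = u32;
type MsgHash = [u8; 32];

/// Channel model: the receiver stores a single counter per open channel.
#[derive(Default)]
struct ChannelReceiver {
    next_index: BTreeMap<ChannelId, u64>,
}

impl ChannelReceiver {
    /// Accept a message only if it carries the next expected index for its channel.
    fn accept(&mut self, channel: ChannelId, msg_index: u64) -> bool {
        let next = self.next_index.entry(channel).or_insert(0);
        if msg_index == *next {
            *next += 1;
            true
        } else {
            false // already processed (or out of order)
        }
    }
}

/// "Everything in one tree" model: the receiver must remember every hash it
/// has ever processed in order to reject replays.
#[derive(Default)]
struct NullifierReceiver {
    seen: HashSet<MsgHash>, // grows without bound unless pruned/tree-ified
}

impl NullifierReceiver {
    fn accept(&mut self, msg: MsgHash) -> bool {
        self.seen.insert(msg) // false if the hash was already present
    }
}

fn main() {
    let mut by_channel = ChannelReceiver::default();
    assert!(by_channel.accept(2001, 0));
    assert!(by_channel.accept(2001, 1));
    assert!(!by_channel.accept(2001, 1)); // replay rejected using only a counter

    let mut by_hash = NullifierReceiver::default();
    assert!(by_hash.accept([7u8; 32]));
    assert!(!by_hash.accept([7u8; 32])); // replay rejected, but the hash is kept forever
}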

This has been explained in the section labeled XCMP Design for On Demand Coretime

In short, we prefer not to take a timeout approach for on-demand coretime, which comes with its own complexities on both the sending and the receiving side. We can instead operate under the assumption that if a receiver is offline for a period of time, the messages sent while it was offline can always be handled directly via the MMD proof once it comes back online.

This means we don’t have to handle reverting and flushing out messages, which is more complex (setting timeouts, deciding what to do on timeout, etc.).

I did not address it because I don’t think it needs to be a requirement of the message-passing protocol itself.

That being said, there is a BEEFY root on the receiver… BEEFY contains a commitment to all the verified parachain headers, so the Relayer can easily construct a lighter proof (two Dolls) which verifies a state root, thus allowing the receiver to do a state read proof verification of a source chain.

We can use BEEFY just like how we use it in the Polkadot Ethereum Bridge if we wish… we don’t need all these heavy consensus client and state machine client updates like ISMP does in order to achieve this property. BEEFY provides the path.

This depends on whether we choose to go down the unordered or the ordered path. This can’t be decided beforehand, because one incentivization model wouldn’t work for both.

If you have a scheme where you just pay people for the number of messages they deliver, that doesn’t work in an ordered scheme: you can deliver 99 messages out of 100 and they are all useless because you forgot the one message that comes before them. Yet in an unordered model, if you deliver 99 out of 100, we can simply pay you 99% of the reward.

This statement really bothers me. It makes you look dishonest and incredibly biased. The motivation for your bias is quite evident too. I would keep that in mind when discussing further.

Thank you for the questions and feedback. Happy to have more discussion on comparisons. I do appreciate it, because you have had to think about many similar things during Hyperbridge and ISMP development. I think ISMP being general across various state models is nice, but that generality always comes at a cost.

I think both of these can be addressed in the RFC and are both a great topic of discussion I appreciate you bringing this up :slight_smile:

All of these are quite in-depth discussions of their own, and some of them I would say I am less qualified to answer, especially as it pertains to new use cases, integration timelines, etc. As far as the improvement leap goes, it really lightens what the Relaychain has to do in any case, and operations move off-chain in both scenarios.

Would love to jump down the rabbit hole though and explore all of these!

What real-world application do you envision that would not be possible with unordered delivery?

We don’t use the full state trie; we use child tries.

I don’t want to come off as confrontational, but calling me dishonest actually bothers me. Biased, maybe. But dishonest is a stretch. Please point out the specific parts of my comment that were dishonest.

My problem with this protocol is simply that it doesn’t address real-world use-cases. It fails to present solutions to issues that parachains are currently facing, issues like state reads, or incentivization models for relayers which cannot be hand-waved.

In my view, ordered messaging isn’t useful in the context of blockchains, and neither are channels. Yes, that comes at the extra cost of storage for nullifiers, but it greatly simplifies the developer experience, which imo is more important in protocol design.

Like i said, the w3f may consider these issues out of scope and that is fine, the community will decide what they prefer at the end of the day.

This wasn’t the point I was making; I never stated this… I am saying there is a case to be made for both. Little do you know, I am actually in favor of unordered delivery for its resiliency when it comes to liveness, but that doesn’t mean we shouldn’t discuss the trade-offs. You said there is no case for it at all at the protocol layer, and my argument was that there is.

Great thank you for the correction.

You basically said “just do my thing, ISMP” and have argued that vehemently… Dishonest, okay, yes, maybe too harsh… But you are not a dumb guy, and when you say things like you can’t do state reads when there is a BEEFY root there, it makes me pause, because I know you know it can be done… Yes, I didn’t put it in the design spec, but that doesn’t mean it can’t be done.

If state reads are a strict requirement, they can be accounted for.
As far as incentivization goes, come on, this is silly; we aren’t going to hand-wave anything…

And I will end on this: all of this doesn’t mean that people can’t utilize ISMP for messaging and building applications if they so choose. In fact, I would encourage people to take the route which they deem best and easiest for their use cases, and I wouldn’t even argue against it; I have no reason to do so.

This is simply for XCM programs to be sent from one chain to another replacing HRMP.

1 Like

Going back to the question posed by @JuaniRios

I’m not an expert on the subject, so I’m curious what the pros and cons are vs ISMP.

and @acatangiu

I am also interested in seeing a comparison between XCMP and ISMP, not at the protocol level or implementation level, but at the “value it brings” level.

The only answer I could find is this comment from @CoaX1d, but I still hope to see a short and concise description of the trade-off between XCMP and ISMP coming out of this thread :slight_smile:

Here is my point of view, please correct any misunderstanding on my part and so on:

  1. The value for users of the message passing protocol derives from the degree of support for each protocol, the number of chains supporting it, and which operations are enabled. Within the Polkadot ecosystem, it seems clear that XCMP will have wider support. There is a discussion about read access, but in my opinion, it’s somewhat obscured since you can have non-interactive proofs of other chains and prove by the state root, using MMR or not (with MMR, you are not limited on historical proofs).

  2. The value for message relayers: here ISMP opens up a market for new entrants, and the XCMP incentives are not clear yet. On the other hand, the XCMP relaying could be enshrined in the existing rewards for securing the network for the greater benefit of the whole ecosystem itself.

  3. The value for consensus relayers: this is a delicate point because for XCMP, the cost is zero inside the Polkadot ecosystem, while for exterior bridging, it is accounted for in the particular bridge, such as in the case of Snowbridge. In the case of ISMP, someone should pay the bill for storing/updating the consensus state on each supported chain (Ethereum mainnet will be quite expensive).

Comparisons with ISMP

ISMP Overview

First, I will introduce ISMP concisely, so bear with me. The Interoperable State Machine Protocol can really be summed up compactly.

There are two main concepts to grasp:

The first is the idea of a ConsensusClient: this mechanism accepts consensus messages and verifies consensus proofs. It allows us to obtain blockchain headers which we can verify have been agreed upon by the source chain’s consensus protocol.

The second concept is a StateMachineClient: this interface allows a given ConsensusClient to verify State Proofs from a particular source chain.

With this, we can understand how ISMP generally proves that messages have been sent from a particular chain. First, a ConsensusClient updates the latest verified blockchain header of the source chain, from which it can grab the latest StateRoot.

How can we prove messages, you may ask? Well, the StateMachineClient has the state model of the source chain programmed in. Therefore, the recipient chain can verify State Proofs (which include messages and anything else you would like to prove) from the source chain. A simplified sketch of these two interfaces follows.
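
As a rough sketch of that division of responsibilities, the two concepts could be expressed as interfaces like the following. These are NOT the actual trait definitions from the ismp crate; the names and signatures are hypothetical and only meant to illustrate the split between consensus verification and state-proof verification.

// Illustrative only: NOT the ismp crate's trait definitions.
type StateRoot = [u8; 32];

/// Verifies consensus proofs of a source chain and yields finalized state roots.
trait ConsensusClient {
    type ConsensusProof;
    fn verify_consensus(&mut self, proof: Self::ConsensusProof) -> Result<StateRoot, &'static str>;
}

/// Knows the state model of a source chain and checks membership of key/value
/// pairs (e.g. message commitments) against a verified state root.
trait StateMachineClient {
    type StateProof;
    fn verify_membership(
        &self,
        root: StateRoot,
        keys: &[Vec<u8>],
        proof: Self::StateProof,
    ) -> Result<Vec<Vec<u8>>, &'static str>;
}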

Below is a diagram from Polytope labs of various StateMachineClients living on the recipient chain

Messaging

ISMP messages fit into a GET/POST scheme similar to HTTP requests, shown below. Once messages are verified as having been sent by a source chain, they fall into these buckets:

pub struct RequestMessage {
    /// Requests from source chain
    pub requests: Vec<Request>,
    /// Membership batch proof for these requests
    pub proof: Proof,
}

pub enum TimeoutMessage {
    Post {
        /// Request timeouts
        requests: Vec<Request>,
        /// Non membership batch proof for these requests
        timeout_proof: Proof,
    },
    /// There are no proofs for Get timeouts, we only need to
    /// ensure that the timeout timestamp has elapsed on the host
    Get {
        /// Requests that have timed out
        requests: Vec<Request>,
    },
}

pub enum ResponseMessage {
    Post {
        /// Responses from sink chain
        responses: Vec<Response>,
        /// Membership batch proof for these responses
        proof: Proof,
    },
    Get {
        /// Request batch
        requests: Vec<Request>,
        /// State proof
        proof: Proof,
    },
}

Lastly, not to dive further than necessary, ISMP supports a few additional features:

  1. Message Timeout
  2. Fraud proof submission for Consensus client updates (this being less important for the implementation of XCMP)

ISMP in Polkadot for XCMP

So now that we have a base understanding of ISMP, let’s look at what proofs would look like for parachains to communicate with each other.

First I will give a quick overview of the proofs themselves, list out the steps of a message from source chain to destination chain, and lastly give a quick diagram of the message flow.

Overall Flow:

  1. The source chain commits to a message in some way (a message hash inside a Merkle structure, a PCS, whatever…) and stores the commitment inside its state. ISMP allows for a generalized commitment to messages, optionally in what they call an “overlay tree”.

  2. The Parachain header gets committed to the Relaychain State, as do all headers… (standard Polkadot behavior…)

  3. On the receiver side, the ConsensusClient for each Parachain is updated (remember, part of ISMP’s design is to consistently update every StateMachine one wishes to support).

    • The ConsensusClient is updated by verifying a Relaychain State Proof (which every parachain does when creating a parablock).
    • The read proof includes all of the Parachain Headers of the chains one wishes to communicate with.
  4. Now, since all parachains’ State model is currently a Patricia Merkle Trie, one can submit a Patricia Merkle Proof from the source chain along with a set of messages and verify them against the current verified State root for that particular source chain.

One question might be: do we need to continuously update the various state machines’ ConsensusClients in order to achieve the same thing? We will discuss that a bit later in the second design.

Picture of the flow of a message from source to destination:

Comparing each side by side

ISMP is a general protocol being used for other things, such as Hyperbridge. Typically in designs there is a trade-off between making something more general-purpose or more specific. Take the CPU vs the GPU, for instance: if you design for generality, that comes at the cost of perhaps supporting features that you don’t care to support.

MMD | ISMP
No headers needed on the destination chain | Needs verified headers to grab state roots
Lazily evaluates proofs, only when necessary | Constantly updating on every para block
All proofs are lean Binary Merkle proofs (granted, there are 4 of them) | Has to verify expensive Merkle Patricia Trie proofs (the current state model of parachains)
Only an MMD PoC exists so far | Closer to a production-ready implementation
Scales to support more chains more easily | Assumes knowledge of each source chain’s state format (if the source chain’s state model changes, it must be updated immediately to maintain liveness and safety)
2 Likes

This is a great post, thank you!

If I may add a comment from product marketing side to the product team that will build XCMP: We had a big initiative to rename XCM because of the confusion it caused with XCMP. In the end, we decided not to rename it because XCM was already established as a brand, in the ecosystem, in the code - and it’s actually a good name. We said we will, when it becomes clear who will work on XCMP, nudge the team to come up with another name.

Whoever is brainstorming: in general a descriptive name is good; here, a branded name (such as “Polkadot xyz”) or a non-descriptive name (such as “StreetTalk”) would not be appropriate, because we are looking at a transport protocol very low down in the stack.

Basically something like XCMP but without “XCM” in it :+1: