We have had multiple XCM issues in the past few releases, which resulted in many users’ funds being frozen and required many referenda from parachains and relay chains to credit the failed transfers.
While most of the issues were caught on Kusama (except this one, which also shipped to Polkadot), showing the benefits of a canary network, it would still be great if we could catch the bugs at an earlier stage.
I would like to see what we can do collectively to improve the situation and reduce the number of XCM issues. I think we should begin by improving the process for handling XCM-related code.
XCM, by nature, requires collaboration between all the teams, and therefore we need to improve communication.
My suggestions:
1. For any XCM-related PR, the following questions need to be answered before it can be merged:
   1. Is this a change to the XCM protocol, e.g. adding a new instruction or modifying the behaviour of an existing instruction? If yes, there needs to be an approved XCM RFC first.
   2. If this change is not in the scope of the XCM protocol spec (e.g. instruction name, Rust API modification), does it impact parachain runtime devs? Does it impact dApp devs? If yes, the change has to be announced first and requires additional approval.
   3. Is this a non-trivial change? If yes, the change has to be announced first and requires additional approval. The author has to demonstrate that alternative implementations were considered and document why this particular solution was chosen over them.
2. Create an XCM-specific communication channel, most likely an Element (Matrix) channel. Or maybe we can reuse the public Fellowship channel.
3. Add a changelog to the xcm-format repo to indicate the changes between each version.
4. Create a document that is the single source of truth for all XCM-related content. That includes the changelog, FAQ, migration guides, etc.
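To make the three PR questions above a bit more concrete, here is a rough sketch of how they could be encoded as a pre-merge checklist. Everything in it is hypothetical and only for illustration: the `XcmPr` fields, the `merge_requirements` function and the requirement strings are assumptions, not an existing tool or an agreed process.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class XcmPr:
    """Answers a PR author would give to the checklist (hypothetical fields)."""
    changes_protocol: bool       # new instruction, or changed behaviour of an existing one
    impacts_runtime_devs: bool   # e.g. a Rust API modification
    impacts_dapp_devs: bool
    non_trivial: bool
    has_approved_rfc: bool = False
    announced: bool = False
    has_additional_approval: bool = False
    alternatives_documented: bool = False


def merge_requirements(pr: XcmPr) -> List[str]:
    """Return the unmet requirements; an empty list means nothing blocks the merge."""
    missing = []
    # Question 1: a protocol change needs an approved XCM RFC first.
    if pr.changes_protocol and not pr.has_approved_rfc:
        missing.append("approved XCM RFC")
    # Question 2 only applies to changes outside the protocol spec.
    impacts_devs = not pr.changes_protocol and (
        pr.impacts_runtime_devs or pr.impacts_dapp_devs
    )
    # Questions 2 and 3 share the same outcome: announce, then get extra approval.
    if impacts_devs or pr.non_trivial:
        if not pr.announced:
            missing.append("public announcement")
        if not pr.has_additional_approval:
            missing.append("additional approval")
    # Question 3 additionally asks for documented alternatives.
    if pr.non_trivial and not pr.alternatives_documented:
        missing.append("documented alternatives")
    return missing


if __name__ == "__main__":
    # A Rust API change that affects runtime devs but is not a protocol change.
    pr = XcmPr(changes_protocol=False, impacts_runtime_devs=True,
               impacts_dapp_devs=False, non_trivial=False)
    print(merge_requirements(pr))  # ['public announcement', 'additional approval']
```

What counts as "additional approval" is deliberately left as a single boolean here, since that is exactly the part that still needs to be defined.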
Let me know what you think. I simply cannot make this happen by myself so please help in whatever way you can.
I largely agree with your proposals, but I am concerned about points 2 and 3 on XCM-related PRs: specifically, what would “additional approval” mean, and how would we strike a balance between agility and breakage?
For example, for an XCM change that impacts dApp devs, we cannot realistically wait for explicit approvals from N dApp representatives.
I do see it working, however, with public announcements on dedicated, well-established channels and a decision/pending period during which objections and concerns can be raised.
We do, however, also have to accept the cost of the above in terms of development velocity.
I simply cannot make this happen by myself so please help in whatever way you can.
I will personally help/contribute here.
Let’s get some more feedback/ideas then devise a practical plan.
I don’t really think development velocity will be impacted much. It is better to spend a bit more time building the right tool in the first place than to revisit it later after finding out it contains critical issues, which has happened before.
Besides, many of the big XCM PRs take multiple weeks to complete anyway, so we can seek feedback in parallel without affecting the total time at all.
But yes, requiring approval from all representatives wouldn’t work. We just need to determine the right number of approvals needed.
The XCM specification is named as one of the technical domains of the fellowship, so I think moving the code to the fellowship with different merge rights is also a reasonable step forward.
I wonder if there are any thoughts about better integration testing as well? I am not very well versed in the topic myself, but I have heard from others that much more can be done with better testing.
I agree 100% that we should improve communications and make more use of RFCs for changes to XCM.
I’ll also help with this.
XCM RFCs should definitely be approved by members of a fellowship. It had occurred to me that it could be a different “ecosystem fellowship”, for example, but if it’s one of the technical domains of the fellowship then it makes sense that we move towards that.
I agree it makes most sense for the XCM spec to be governed by the ecosystem fellowship. But it has not been created yet, so I think the core fellowship should take ownership of it in the meantime.