Sorry for cross-posting: I already posted this on Subsquare, but I want to draw more attention and get more community members to join the review and discussion, so I am posting it here too.
We are the OpenSquare team. We are currently dedicated to building infrastructure in the dotsama ecosystem, and we have developed and maintain products including subsquare, dotreasury, statescan, etc. We are proposing a specification to decentralize dotsama off-chain governance discussion data.
In the current governance workflow, community users post proposal context and hold discussions on centralized platforms like Polkassembly, Subsquare, etc. Although these platforms usually have public APIs through which community members or other platforms can sync the data, there are still problems:
We cannot verify users' data. This is a common problem in web2 environments. Maybe these platforms have no motivation to tamper with the data, but we need more verification and less trust, so that we can be 100% sure the data belongs to its owner.
Users' data may be lost through a platform's operational mistake, or a platform may simply stop operating. We need a way to keep the data always accessible.
Data should be more auditable. Currently, platforms don't save all data modifications. Users can modify context they left earlier, and the website will just show that the context was updated, so other users may not be able to see the old data, which may include the proposal owner's commitments.
The more users' data a single centralized governance platform owns, the less likely it is that another platform can provide better solutions. We need a relatively decentralized data hosting solution that does not rely on a single platform or a dedicated group of people.
Differences between platforms' API data formats make it hard to sync and adapt data from other platforms.
We propose the SIMA spec to solve the above problems. SIMA defines a set of user actions and data standards to decentralize off-chain governance data for Substrate-based blockchains. In general, with this spec governance users sign their action data with their Polkadot keys and submit the signed data to spec implementers. Spec implementers are responsible for submitting the IPFS CID of the user action data to the blockchain with a system#remark extrinsic.
It's still in draft status, and we would greatly appreciate any suggestions. We will begin development when the community reaches some degree of consensus on its feasibility. Please check the full spec.
Overall I agree with the problems, and hope Substrate can lay the foundations on which to solve them, as I know it's a much larger problem than just Polkadot governance!
I wanted to make sure I understand the process correctly:
Is every user action "off-chain" actually mapped one-to-one with a remark? If so, what other than the data is "off-chain" about it? It seems a pallet to manage these interactions on chain could be made more expressive, performant, and overall more useful, IMHO.
All interactions and JSON objects are stand-alone, and thus nothing prevents publishing of possibly masking / misleading statements, like amending a post that does not previously exist, for example... correct? For example, referencing a CID that is junk (intentionally) or not available (perhaps even lost forever to IPFS nodes). I don't see protocols in this spec to address validity of order of operations and referencing IPFS data that may be junk.
What other work has been done in this area, in the Polkadot ecosystem as well as beyond, that you are considering or being inspired by?
Is every user action "off-chain" actually mapped one-to-one with a remark?
No. Every user action is represented by a JSON object, mapped one-to-one with a CID. The remark extrinsic carries a CID that maps to a JSON object containing an array of user action CIDs.
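The two-level mapping described in this answer can be sketched roughly as follows. This is only an illustration, not the spec's actual format: the field name `actions`, the helper names, and the sample action objects are hypothetical, and the CID is a simplified CIDv1 (raw codec, sha2-256) computed locally. A real implementer would use the CID returned by an IPFS node, which may encode or chunk the data differently.

```python
import base64
import hashlib
import json


def simple_cid(data):
    """Simplified CIDv1 (raw codec, sha2-256, base32) for a single block.

    Assumption: stands in for the CID an IPFS node would return.
    """
    digest = hashlib.sha256(data).digest()
    # 0x01 = CIDv1, 0x55 = raw codec, 0x12 = sha2-256, 0x20 = 32-byte digest
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # "b" is the multibase prefix for lowercase base32


def canonical_json(obj):
    """Serialize deterministically so the same object always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_batch(actions):
    """Map each action to its own CID, then wrap the CIDs in one batch object.

    Returns the batch object and the single CID that would go into the remark.
    """
    action_cids = [simple_cid(canonical_json(a)) for a in actions]
    batch = {"actions": action_cids}  # hypothetical field name
    return batch, simple_cid(canonical_json(batch))


# Hypothetical sample actions; real action objects would also carry a signature.
sample_actions = [
    {"action": "comment", "content": "Looks good", "timestamp": 1700000000000},
    {"action": "upvote", "timestamp": 1700000001000},
]
sample_batch, remark_cid = build_batch(sample_actions)
```

With this layout, one remark extrinsic commits to any number of user actions, because only `remark_cid` needs to go on chain.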
All interactions and JSON objects are stand-alone, and thus nothing prevents publishing of possibly masking / misleading statements, like amending a post that does not previously exist, for example... correct?
A discussion post cannot be amended, but the author can append extra content, which will be highlighted.
For proposal context, proposal authors can override the previously provided context with new actions.
Comments cannot be amended. Authors can just leave new comments to explain previous ones.
I think I missed the actions to cancel an upvote/downvote.
For example referencing a CID that is junk (intentionally) or not available (perhaps even lost forever to IPFS nodes).
For the very first implementation, I don't want to make it mandatory for spec implementers to upload user action data to Arweave, Crust, etc., but I believe a decentralized storage solution should be the way to solve data loss. Subsquare, as a potential spec implementer, will upload the data to multiple public IPFS hosting services.
I don't see protocols in this spec to address validity of order of operations and referencing IPFS data that may be junk.
Every user action contains a timestamp field, and spec implementers should not accept a user submission whose timestamp is far from the "current time". Yes, there may be malicious spec implementers; spec implementers can choose which other implementers they trust.
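A minimal sketch of such a timestamp check, assuming millisecond timestamps and a 5-minute tolerance. Both are assumptions for illustration; the thread does not state a unit or a threshold, and the function name is hypothetical.

```python
import time

# Hypothetical tolerance: the thread does not specify an acceptable distance.
MAX_SKEW_MS = 5 * 60 * 1000


def is_timestamp_acceptable(action_ts_ms, now_ms=None, max_skew_ms=MAX_SKEW_MS):
    """Return True if the action timestamp is close enough to the current time.

    Rejects both stale submissions and timestamps too far in the future.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return abs(now_ms - action_ts_ms) <= max_skew_ms
```

Checking the absolute difference rejects backdated actions as well as ones stamped ahead of the implementer's clock.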
What other work has been done in this area, in the Polkadot ecosystem as well as beyond, that you are considering or being inspired by?
I was inspired simply by the problems we encountered while developing subsquare, dotreasury, and off-chain voting.
It's impossible for us to find all related work in this area. We studied Uniswap's governance portal. I think they upload the full proposal text to the Ethereum blockchain, which I think is too expensive and not really necessary.
I think this is a step in the right direction, thanks for preparing this spec proposal.
My thoughts on the current proposal
One problem that the document tries to solve is providing authenticity of messages on governance platforms. Signing the action, context, change and timestamp with the key of the author already helps to provide authenticity that is verifiable in the future (i.e. some action really was performed by some user in a given context at a given time). This is already a great improvement compared to the current situation, as governance platforms could currently easily modify the contents of their page without anyone being able to verify it. Having a valid signature, the raw data and another external service to verify the data is already a huge benefit from my perspective.
Another big feature that the spec provides is the unification of all governance platforms: Any governance platform would be able to utilize the data from other governance platforms, thus creating a unified governance database. New governance platforms could easily catch up with all the data and even participate in adding new data. I think that utilizing identities can help here immensely to verify the authenticity of the provider. I think that using system.remark is a good generalized solution, as it is available on any Substrate chain as of now. In addition, data usually is not deleted, so there should be barely any storage benefit when providing a specialized pallet for that. I like the git-like approach of providing deltas, so the complete history of changes can be replicated and interpreted by different software products.
My improvement proposal
What is the purpose of the blockchain here?
Since the governance data is stored in IPFS and all the data can be verified, because it contains all the necessary metadata and a signature to prove authenticity and reconstruct the correct temporal order of changes, why do we need to store all the CIDs on the blockchain at all? Instead, we could create a public IPFS cluster that utilizes the blockchain ONLY to retrieve authorized keys that can push data into that cluster. That way, any governance platform could apply to be listed as a verified governance platform and have their keys added by the governance body into the "verified IPFS cluster" set on the blockchain. The IPFS cluster retrieves the keys authorized to add data from the blockchain. This would immensely reduce the storage requirement on the chain, as all the actual data lives in IPFS, whereas only the set of keys authorized to add content to the IPFS cluster lives on the blockchain.
Instead, we could create a public IPFS cluster that utilizes the blockchain ONLY to retrieve authorized keys that can push data into that cluster.
I'm not an expert in setting up an IPFS cluster, but I have the following worries:
We cannot assume the cluster is always stable, and I'd prefer redundant public IPFS gateways for stability and availability.
I'd prefer the whole process to be open and permissionless, but requiring an application to join the cluster seems unreasonable and may prevent a third party from becoming a spec implementer.
I think these cluster nodes should be maintained by spec implementers. Although only permissioned members can push data, it is still possible that one node pushes a huge amount of irrelevant data, which would make it hard for spec implementers to verify and recover data.
What is the purpose of the blockchain here?
Two kinds of blockchains are involved. One is Substrate chains like dotsama, and the other is storage chains like Arweave. Both of them provide guarantees about stability and availability.
system#remark extrinsics at different block heights coordinate the process of recovering data.
The gas (transaction fee) for remark extrinsics will be completely acceptable. I'll give the details below.
Gas for remark extrinsics will prevent attacks that submit huge amounts of irrelevant data.
Arweave/Crust chains will guarantee, with the highest possible probability, that data is stored forever.
This would immensely reduce the storage requirement on the chain, as all the actual data lives in IPFS, whereas only the set of keys authorized to add content to the IPFS cluster lives on the blockchain
In the current spec, the extrinsics and remark data that spec implementers submit are under control, and the maximum volume is acceptable.
Spec implementers won't submit every action CID; they submit only the CID of a group of actions. Check here for details.
Spec implementers can control how often they submit remark extrinsics. For example, if a spec implementer submits 1 extrinsic every 5 minutes, at most 288 remark extrinsics will be submitted in one day.
A spec implementer doesn't have to submit an extrinsic if there is no user action data. This spec is designed for a forum-like governance discussion solution, not an IM product, so user action frequency is usually not high, and I expect the number of remark extrinsics to be much lower than the calculated maximum.
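The upper bound in the 5-minute example is simple arithmetic, sketched here with a hypothetical helper so other submission intervals can be checked the same way. It assumes at most one remark per interval and per implementer.

```python
def max_remarks_per_day(interval_minutes):
    """Upper bound on remark extrinsics per day for one spec implementer,

    assuming at most one extrinsic is submitted per interval.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    minutes_per_day = 24 * 60  # 1440
    return minutes_per_day // interval_minutes


print(max_remarks_per_day(5))  # 1440 // 5 = 288
```

Intervals with no user actions produce no extrinsic, so the real count would sit below this bound.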