Data Availability as a Service

While the notion of Execution Cores in Polkadot is useful for coordinating scheduling and resources across both the data availability and settlement layers, it will also be useful for Polkadot to expose some kind of Data-Availability-as-a-Service. The main use-case is rollups living on top of parachains (L3s, including ZK Rollups as Parachains) as a further vertical scaling mechanism on top of the base-layer blockspace of Polkadot. These rollups should be able to post data to a shared DA layer provided by Polkadot, with settlement happening on top of a parachain. It also enables “sovereign rollups”, which I personally don’t think are super useful, but which might be more interesting to others.

Initially, the best approach may be to expose Data Cores alongside execution cores. These will act like execution cores, except that there is no state transition to verify - just a blob of data to make available to the broader network.

This implies that alongside blockspace we would have blobspace, which might be best allocated using the same mechanism as On-demand parachains. In this case, the claim on a core could list the node(s) that the data can be fetched from. That introduces a DoS risk for certain blobs, although without knowing preimages it’s hard for an attacker to know what to DoS and what not to. Alternatively, the claim could commit to the data in such a way that any node holding the data can prove it stores the full blob by answering random challenges from validators before sending the full thing.
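To make the second option a bit more concrete, here is a minimal sketch of one way such a commitment could work: the claim carries a Merkle root over fixed-size chunks of the blob plus the chunk count, and a validator asks for randomly chosen chunks with their Merkle paths. Nothing here is an existing Polkadot API. The chunk size, the hash function, and every function name are assumptions made purely for illustration.

```python
import hashlib
import secrets

CHUNK_SIZE = 32  # hypothetical chunk size in bytes, chosen only for the example


def _h(data):
    return hashlib.sha256(data).digest()


def split_chunks(blob, size=CHUNK_SIZE):
    """Split a blob into fixed-size chunks (the last one may be shorter)."""
    return [blob[i:i + size] for i in range(0, len(blob), size)] or [b""]


def build_tree(chunks):
    """Return every level of the Merkle tree, leaves first, root level last."""
    level = [_h(b"\x00" + c) for c in chunks]  # domain-separated leaf hashes
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]


def prove(levels, index):
    """Storing node: collect the sibling hashes from leaf `index` up to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify(root, index, chunk, path):
    """Validator: recompute the root from one chunk and its sibling path."""
    node = _h(b"\x00" + chunk)
    for sibling in path:
        if index % 2 == 0:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
        index //= 2
    return node == root


def random_challenge(num_chunks, samples):
    """Validator: pick chunk indices the storing node must answer for."""
    return [secrets.randbelow(num_chunks) for _ in range(samples)]


# Usage: the claim would carry (root, num_chunks); the node answers challenges.
blob = bytes(range(256)) * 3
chunks = split_chunks(blob)
levels = build_tree(chunks)
root = levels[-1][0]
for i in random_challenge(len(chunks), samples=8):
    assert verify(root, i, chunks[i], prove(levels, i))
```

Random sampling like this only gives a probabilistic guarantee: a node holding a fraction of the chunks passes each individual challenge with roughly that probability, so the number of samples would need tuning. A real design would also have to bind the chunk count into the commitment and decide who issues the challenges.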

Over time, with restaking, this DA layer could be spun out into a parachain itself secured by relay-chain stake.

Would you have this in addition to the current availability protocol or eventually replace it by splitting data cores from execution cores, where parachains use both and L3s just use data cores? If the former, would it be implemented the same way or using 2D Reed-Solomon?

Having a separate, more scalable, data availability layer and supporting sovereign rollups is useful from a marketing perspective – same as demonstrating transaction scaling beyond the capacity of one relay chain. Practically, I think you want data availability and execution coupled together because frequent cross-chain messaging eliminates any sublinear scaling benefit from separating the former (at least that’s my understanding from the LazyLedger paper).

I view this as in addition to the existing execution cores.

This could use the current availability protocol, with a 1D entire-validator-set Reed-Solomon encoding - 2D Reed-Solomon with light client sampling solves a different problem, which is an honest f+1 assumption on the validator set. I also alluded to the fact that we could outsource this to a dedicated system parachain over time, which might make use of a 2D sampling approach instead.
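For readers less familiar with the 1D approach, the idea can be sketched in a few lines: treat the k data symbols as points on a polynomial of degree below k, hand one evaluation to each of n validators, and rebuild the data from any k of them. The sketch assumes n = 3f + 1 validators and a recovery threshold of k = f + 1, and it uses a small prime field for readability. The field, the parameters, and the function names are illustrative assumptions, not the production encoder.

```python
P = 2**31 - 1  # prime modulus for the toy field (an assumption for readability)


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Systematic encoding: shares 0..k-1 are the data, shares k..n-1 are parity."""
    points = list(enumerate(data))
    return [interpolate_at(points, x) for x in range(n)]


def recover(known_shares, k):
    """Rebuild the k data symbols from any k shares given as {index: value}."""
    points = list(known_shares.items())[:k]
    return [interpolate_at(points, x) for x in range(k)]


# Usage: f = 3, so n = 10 validators and any k = 4 shares suffice.
data = [10, 20, 30, 40]
shares = encode(data, n=10)
surviving = {i: shares[i] for i in (5, 7, 8, 9)}  # only parity shares survive
assert recover(surviving, k=4) == data
```

The point of the example is the threshold property: as long as f + 1 validators return their pieces, the blob is recoverable, which is a different guarantee from what light-client sampling over a 2D encoding is trying to provide.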

Indeed, I think exposing the DA layer as a service carries a bunch of marketing benefits on top of technical ones. I typically make the argument that bundling execution + data availability for parachains leads to more efficient scheduling (as you don’t have to coordinate scheduling across both DA and Execution). But, we need something like this to enable L3s, to the extent that’s needed or desired.

“Practically, I think you want data availability and execution coupled together because frequent cross-chain messaging eliminates any sublinear scaling benefit from separating the former”

Interesting - can you expand on this?

The example is a DNS chain where domains are paid for with tokens from a separate currency chain, so the DNS chain has a dependency on the currency chain, and users of the DNS chain are topping up balances from the currency chain (I assume to limit cross-chain transfers to begin with). If you have two DNS chains and only one is doing a lot of top-up transactions from the currency chain, users of the other will still have large availability proofs.

ZKRollups can become efficient, but I think we are still missing an important piece of “technology” to achieve that, at least for generic zkRollups (like zkEVM). Maybe dedicated ZKRollups can serve a specific role in the Polkadot ecosystem, but I’m not sure which. Betting on the future is sometimes a good bet :slight_smile:

Concerning data availability, I feel that decoupling it would bring trade-offs that are too critical at the moment to justify having it. Things like the latency of block production (including finalization) would have to increase in order to guarantee both DA and Execution. What we have seen in multiple projects around Moonbeam is a trend toward low-latency requirements.

It might also widen the range of practical/network attacks on a chain, giving an attacker the possibility to block either the DA or the execution. That said, I’m not sure how much safer the current parachain model is anyway. :man_shrugging:

Right now, it is possible for a parachain to include a zkRollup and have its own DA mechanism. It would be safer to have DA guaranteed on the same chain that verifies the execution, but do projects need it at the moment?

(on a side note, from a Marketing point of view, it would help)

How long would this data be available? Would it be configurable? I ask because if settlement happens on top of a parathread, it might only produce a block once every large number of relay blocks. We know that right now data is guaranteed to be available for 24 hours; I’m not sure whether this will also be the case with this DA-as-a-service mechanism.

Also, is there any thought as to what the challenge-prove mechanism could look like?

Nice to see this area of research