Altering Polkadot's fork-choice to reduce DA load

This idea came up in conversation with @pepyakin recently. It’s not a full-fledged solution, but a starting point for new research.

Problem Statement

Parachains require much more data to be posted to the Data Availability (DA) layer per block than other approaches, like ORUs and ZKRollups. This increases load on DA and reduces scalability.

There is an architectural reason for this: parachain blocks are eagerly executed by validators so we can have fast finality on parachain blocks, but this requires that the full witness data needed to execute each block be available to the validators assigned to check it. This witness data makes up the bulk of the PoV.

Goal 1: Only require posting witness data to the DA layer when necessary
Goal 2: Never finalize a bad or unavailable parachain block

Key Observations

  1. The blocks themselves need to be made available in all cases, as parachain nodes could otherwise withhold blocks from each other to prevent other nodes from extending the chain.
  2. The witness data is only a requirement up to the point of approval-checking and finality. After the parachain block is finalized, there is no need to keep witness data available as it can be reconstructed by full nodes.

Solution Sketch: Defer posting witness data to DA until necessary

A sketch of a solution:

  1. Create a new PoV commitment format in the CandidateReceipt which allows specifying Type I data (data which must stay available for some time) and Type II data (data which is only necessary until finality); see the sketch after this list.
  2. Change the availability part of the backing pipeline so validators initially only distribute and commit to the Type I data.
  3. Approval checkers use networking fast paths to eagerly fetch the Type II data without erasure coding when checking: from backing validators, full nodes, and other approval checkers.
  4. Introduce a mechanism for forcing the Type II data to be erasure-coded and acknowledged: either some kind of manual dispute mechanism or an automatic one triggered when finality is lagging. After that point, the Type II data can be collected from erasure-coded chunks.
  5. Upgrade the fork-choice rule to revert chains where availability disputes have failed.
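
A very rough Rust sketch of what the commitment split in step 1 could look like; everything here (`PovCommitments`, `type_i_root`, and so on) is hypothetical and not part of today’s CandidateReceipt:

```rust
// Stand-in for the relay chain's hash type, just to keep the sketch self-contained.
type Hash = [u8; 32];

/// Hypothetical replacement for the single PoV hash commitment in the
/// candidate descriptor. Purely illustrative; all field names are made up.
struct PovCommitments {
    /// Commitment to Type I data (e.g. the parachain block itself), which
    /// must stay available for some time after inclusion.
    type_i_root: Hash,
    /// Commitment to Type II data (witness/storage proofs), which is only
    /// needed until approval checking and finality complete.
    type_ii_root: Hash,
    /// Declared sizes, so validators can budget bandwidth and decide what
    /// to erasure-code up front.
    type_i_size: u32,
    type_ii_size: u32,
}

/// How much of the PoV the availability subsystem has committed to so far:
/// initially only Type I; Type II is forced in by a dispute or lagging finality.
enum AvailabilityLevel {
    TypeIOnly,
    Full,
}
```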

(An alternative would involve completely deferring all availability distribution until either a) approval checking hits a roadblock or b) approval checking is completed, and then adding a finality voting rule that Type I data must be available before finalizing.)

Properties

With this approach, we still avoid finalizing anything which is possibly bad, while keeping the necessary Type I data available to parachain full nodes. This would put Polkadot’s expected DA load on par with solutions such as ORUs and ZKRollups. If DA is the eventual bottleneck of all blockchain scaling mechanisms, the better finality times vs. ORUs and developer experience vs. ZKRollups would make Polkadot quite competitive here.


We already have reversion for disputes, but we cannot revert after finality of course.

You’re asking if availability could be delayed until some approval checkers announce themselves. We cannot slash backers 100% for unavailable blocks, which may occur due to innocent power failures, so no, this protocol cannot really work.

Afaik availability is not really our bottleneck per se. At least for CPU usage, many optimizations exist there, nothing as staggering as the AFFT + formal derivatives trick but meaningful anyway. We could discuss doing only partial reconstructions maybe.

In theory, we spend more total bandwidth sending parablocks to the five backing validators, but if you cut backing groups to three then we’ll spend equal bandwidth on backers and availability. We spend most bandwidth on approval checkers, so we save the most by refining the analysis to require fewer approval checkers.
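
A back-of-the-envelope check of that claim (the numbers and the ~3x factor below are assumptions, not measurements): with n = 3f+1 validators, each availability chunk is roughly PoV/(f+1) bytes and there are n of them, so total availability traffic is about 3x the PoV.

```rust
// Rough per-candidate bandwidth comparison under the assumptions above.
fn main() {
    let pov_mib = 3.0_f64; // assumed PoV size in MiB

    // Backing: the full PoV is sent to each backer.
    let backing_5 = 5.0 * pov_mib;
    let backing_3 = 3.0 * pov_mib;

    // Availability: n chunks of ~PoV/(f+1) bytes with n = 3f+1,
    // so total traffic is roughly 3x the PoV regardless of n.
    let availability = 3.0 * pov_mib;

    println!("backing (5 backers): {backing_5} MiB");    // 15 MiB
    println!("backing (3 backers): {backing_3} MiB");    // 9 MiB
    println!("availability total:  {availability} MiB"); // 9 MiB
}
```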

We should already scramble the chunk sequence around the validator set, so the systematic chunks already provide a fast recovery path, and these already live on the 3 or 5 backing validators, one other validator, and any approval voters. It’s likely libp2p that’s our real problem here. We’ll have much faster transfers if we plug in libtorrent and define a “polite local tracker” based upon the above knowledge.

Assignments & attestations wind up pretty expensive. We’ll improve assignments by merging tranche zero, but other optimizations make sense. Robert K and I have discussed doing attestations without signatures, which makes approval checkers unslashable.

From what I’m hearing, DA (in particular recovery) is indeed becoming the bottleneck

Where can we learn more about Type 1 and Type 2 data?

Can you give an example of what Type 1 and Type 2 data might look like for:

  • a smart contract chain like Moonbeam?
  • a system parachain like the Collectives parachain?

I have had the impression that smart contract chains are likely the biggest consumers of DA, since they need to include the full smart contracts in the PoV.

It seems to me that this issue may also be a big step toward solving scalability issues with DA:

Basically, allowing chains to place often-reused data, like commonly used smart contracts, directly on the relay chain.

How might you prioritize the development of these different solutions? Is there data we can extract which tells us which DA limits chains are running into most often?

This at least means something very different for Polkadot than for rollups, where DA might eventually be the only resource-intensive thing the consensus layer is doing.

One thought: if we’re thinking on a timescale when zk-rollups win out, Polkadot may be better positioned to transition to them because the duration in which Type 1 data is available is not designed around optimistic challenge periods.

It’s always possible inferior tech wins for market reasons, but imho we need not worry about zk rollups winning on technical merits.

From what I’m hearing, DA (in particular recovery) is indeed becoming the bottleneck

As an organization we do major optimizations well, but historically we do not do much mid-range optimization. We move our talent onto the next major issue before that work happens.

We’ve done exactly one large optimization for availability, which improved performance maybe 50x, but we have not tried multiple extension field towers, no SIMD, etc., and supposedly the C flavor remains much faster.

We do not balance our storage trees, which wastes resources, but if we did so then they’d be dense. We stupidly still use radix-16 hashing after years of knowing binary hashing would shrink PoVs by a factor of 4 in dense trees.
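
Roughly where that factor of 4 comes from, under simplifying assumptions (a dense trie, 32-byte hashes, ~15 sibling hashes per radix-16 node versus 1 per binary node); the key count below is made up:

```rust
// Approximate per-key proof size in a dense trie (illustrative assumptions only).
fn main() {
    let keys = 16_000_000_f64; // assumed number of keys in storage
    let hash_len = 32.0;       // bytes per node hash

    // Radix-16: log16(keys) levels, up to 15 sibling hashes per level.
    let radix16 = keys.log(16.0) * 15.0 * hash_len;

    // Binary: 1 sibling hash per level, but log2(keys) = 4 * log16(keys) levels.
    let binary = keys.log2() * hash_len;

    println!("radix-16: ~{radix16:.0} bytes per key");
    println!("binary:   ~{binary:.0} bytes per key");
    println!("ratio:    ~{:.2}x", radix16 / binary); // ~3.75x, i.e. roughly 4x
}
```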

I’ll conjecture we’ve another 8x from storage and erasure coding optimization. I may be wrong of course, but imho that’s unlikely.


I agree with Jeff.

If the bottleneck is in recovery computation (CPU), we can work on micro-optimizations of the Reed-Solomon impl or simply embrace its systematic code property instead of fighting against it.
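
A minimal sketch of what embracing the systematic property could mean, with hypothetical types and no real erasure-coding API: if the first f+1 (systematic) chunks are all present, the original data is just their concatenation, so no decoding is needed at all.

```rust
/// Reconstruct the original blob from chunks, taking the systematic fast path
/// when possible. `chunks[i]` is validator i's chunk, if we happen to have it.
/// Hypothetical helper; real recovery code is more involved.
fn reconstruct(
    chunks: &[Option<Vec<u8>>],
    n_systematic: usize, // f + 1
    data_len: usize,     // original blob length, to strip padding
) -> Option<Vec<u8>> {
    // Fast path: all systematic chunks present, so just concatenate them.
    if chunks.iter().take(n_systematic).all(|c| c.is_some()) {
        let mut data: Vec<u8> = chunks
            .iter()
            .take(n_systematic)
            .flat_map(|c| c.as_ref().unwrap().iter().copied())
            .collect();
        data.truncate(data_len);
        return Some(data);
    }
    // Slow path: fall back to full Reed-Solomon decoding (not shown here).
    None
}
```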

If it’s the Storage Proof part (Fraud Proofs in ORU terminology), we should have someone working on trie optimizations, considering binary tries vs Verkle ones. Also optimizations for the networking stack, reducing the per-connection overhead.

The good thing about our current DA layer is that what is stored there is completely opaque to validators; the current structure is only enforced by Cumulus. For ZKRollup-based parachains, the PoV will simply be much more efficient/smaller without any changes to the relay chain protocol.

Polkadot’s DA is already far ahead of Danksharding/Celestia and the like in terms of available bandwidth. Once these lower hanging fruits are addressed, it will be very close in terms of efficient use of space for ORUs and is already as efficient for ZKRollups.


Let’s not argue about the tech merits of different kinds of rollups here (I would propose moving it to another thread, since it’s a rather complex topic), and let’s assume they are good competition just for the sake of this conversation.

We can provide a platform for rollups to be deployed on Polkadot (and that’s in fact how this topic first popped up). However you spin it, the data availability requirements for parachains will always be higher than for rollups:

  • Parachains need the block and the witness (storage proofs) required to execute it to be placed in DA.
  • ZKRs need to place batches (the equivalent of block bodies) and proofs into DA. Assuming one proof per batch, the proofs would make up at most several hundred KiB.
  • ORUs need to place only batches. We can assume fraud proofs ≈never happen.

Once again, if we assume future constructions solve their problems (proving costs and composability issues), that would diminish the value prop of parachains compared to them.

That’s why I think the fact that Cumulus utilizes the DA space more efficiently is irrelevant. The other optimizations mentioned here are important but tangential, because they would still be usable by rollups and can still stack on top of Rob’s proposal. However, unlike those optimizations, the proposed solution won’t be usable by rollups, because they would need to go through the strict DA process, at least in the model we had in mind (I might write up more about rollups on Polkadot when the time comes).

Polkadot’s DA is already far ahead of Danksharding/Celestia and the like in terms of available bandwidth.

This is an interesting claim @andronik. I broke it out into the following thread:


@burdges are your high-level viewpoints on ZK Rollups written down somewhere?

ZKRs need no DA layer, except for liveness. Each ZKR node needs hardware costing something like $25k, so you won’t have too many such nodes. You therefore might want the DA layer for liveness, but maybe only if the DA layer comes ready-made, because you can deploy many ZKR nodes for the cost of building that DA layer.

A fraud proof would be more likely in an OR than a dispute in Polkadot. It’s a much softer target, so it’s way more likely someone tries. Also, if they do check disputed batches, then either they need the Merkle proofs for the storage those batches touch, or else they must replay a bunch of their OR chain.

We could likely merge some PoV data across multiple parachain blocks, but doing so sounds harder than other stuff I mentioned above.

Anyways…

As I understand it, you guys asked about optimistic availability for parachains, meaning disputes never result in 100% slashes; instead, approval checkers who’d dispute simply no-show because they never receive the PoV. It’s probably better than ORs, but…

We cannot prove exactly this optimistic availability secure in the sense we envision for Polkadot, because attackers can simply keep retrying their attacks, i.e. there is no gambler’s ruin.

It’d depend upon the threat models of course, but your optimistic availability scheme might demand more approval checkers too, and thus more bandwidth than our current aggressive availability. In Versi or other clouds, we have artificially cheap & plentiful bandwidth but artificially expensive CPU, right?

I’d vote: optimize our erasure coding and storage first, and ideally soon-ish. We’ll hopefully know more about the CPU vs bandwidth trade-off after we have experience running lots of Gluttons on Kusama. Also, we might find more appetite for analyzing something like this after we’ve done the machine elves paper.

Is this true? If attackers never reveal the block, then we’d escalate from the no-shows with some respectable probability. We could simply declare the parablock invalid when 2/3 of the validators no-show, and then slash the backers 100%.

It’s kinda horrific this could occur from simply a network outage, which governance cannot necessarily reproduce. It’s maybe secure though, albeit with no great margins, and again it likely demands more approval checkers, so I’m not sure it’s cheaper.

I’m not sure I understand the proposal correctly, but wouldn’t this allow griefing attacks where backers withhold witness data until there are no-shows, without being slashed for it?

I also agree with Jeff.

Versi load testing and Gluttons deployment have revealed that we have a lot to improve in our network stack: Networking tasks CPU usage is too high · Issue #6828 · paritytech/polkadot · GitHub. That said, this high CPU usage is observed at only 20% of the recommended bandwidth for validators (500Mbps), so by the specs we still have more room to scale up.

Improving Reed-Solomon (SIMD, for example) might become a more urgent priority when we increase the number of validators and the number of validations each node does per relay chain block. The Glutton testing we did was artificial in the sense that we configured the network to do 3-6x more approval work than needed, with all parachains crafting large PoVs (3MB), which in practice is very unlikely to be a thing. Maybe with async backing it becomes a higher priority, but batching PoVs into a single availability bit in some flavor, as per the CoreGroups discussion, would also shave a lot of overhead.
