This idea came up in conversation with @pepyakin recently. It’s not a full-fledged solution, but a starting point for new research.
Problem Statement
Parachains require much more data to be posted to the Data Availability (DA) layer per block than other approaches, such as optimistic rollups (ORUs) and ZKRollups. This increases load on DA and reduces scalability.
There is an architectural reason for this: validators eagerly execute parachain blocks so that those blocks can finalize quickly, but this means the full witness data needed to execute each block must be available to the validators assigned to check it. This witness data makes up the bulk of the PoV.
Goal 1: Only require posting witness data to the DA layer when necessary
Goal 2: Never finalize a bad or unavailable parachain block
Key Observations
- The blocks themselves need to be made available in all cases, since parachain nodes could otherwise withhold blocks from each other and so prevent other nodes from extending the chain.
- The witness data is required only up to the point of approval-checking and finality. After the parachain block is finalized, there is no need to keep witness data available, as full nodes can reconstruct it.
Solution Sketch: Defer posting witness data to DA until necessary
A sketch of a solution:
- Create a new PoV commitment format in the CandidateReceipt which allows specifying Type I data (data which must stay available for some time) and Type II data (data which is only necessary until finality).
- Change the availability part of the backing pipeline so validators initially only distribute and commit to the Type I data.
- When checking, approval checkers use networking fast paths to eagerly fetch the Type II data, without erasure coding, from backing validators, full nodes, and other approval checkers.
- Introduce a mechanism that forces the Type II data to be erasure-coded and acknowledged: either some kind of manual dispute, or an automatic trigger if finality is lagging. After that point, the Type II data can be collected from erasure-coded chunks.
- Upgrade the fork-choice rule to revert chains where availability disputes have failed.
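To make the first and third steps more concrete, here is a minimal Rust sketch of what a split commitment and a fast-path fetch could look like. Everything named here (`PovCommitment`, `commit`, `fetch_type_ii`) is hypothetical and not part of any existing Polkadot API. The standard library's `DefaultHasher` stands in for a real cryptographic hash only to keep the sketch dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a cryptographic hash. `DefaultHasher` is NOT collision
/// resistant; it is used only so the sketch needs no external crates.
fn commit(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical split commitment carried in the candidate receipt.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PovCommitment {
    /// Type I: data that must stay available for some time.
    pub type_i: u64,
    /// Type II: data that is only necessary until finality.
    pub type_ii: u64,
}

impl PovCommitment {
    /// Commit to the two parts of the PoV separately.
    pub fn new(type_i_data: &[u8], type_ii_data: &[u8]) -> Self {
        Self {
            type_i: commit(type_i_data),
            type_ii: commit(type_ii_data),
        }
    }

    /// True if `data` is the Type II data this receipt committed to.
    pub fn type_ii_matches(&self, data: &[u8]) -> bool {
        commit(data) == self.type_ii
    }
}

/// Fast path: walk the peers' responses in order and return the first
/// payload matching the commitment. `None` means no peer served valid
/// data, which is the case where escalation would be needed.
pub fn fetch_type_ii(
    commitment: &PovCommitment,
    responses: &[Option<Vec<u8>>],
) -> Option<Vec<u8>> {
    responses
        .iter()
        .flatten()
        .find(|data| commitment.type_ii_matches(data))
        .cloned()
}
```

Because the Type II data is committed to separately, an approval checker can validate whatever a peer hands it without any erasure-coded chunks. A `None` result corresponds to the situation where the forcing mechanism from the fourth step would have to kick in.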
(An alternative would defer all availability distribution until either a) approval checking hits a roadblock or b) approval checking is completed, and would then add a finality voting rule that Type I data must be available before finalizing.)
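The finality and fork-choice conditions described above can be written as two small predicates. This is only a sketch under assumptions: `TypeIIStatus`, `CandidateStatus`, `may_finalize`, and `chain_is_viable` are hypothetical names, and a real rule would work on relay-chain blocks and votes rather than on a flat list of candidates.

```rust
/// Hypothetical availability status of a candidate's Type II data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TypeIIStatus {
    /// Only served over fast paths; never erasure-coded.
    Deferred,
    /// Erasure coding was forced and the chunks were acknowledged.
    ErasureCoded,
    /// Erasure coding was forced, but the availability dispute failed.
    DisputeFailed,
}

/// Hypothetical per-candidate view held by a validator.
#[derive(Debug, Clone, Copy)]
pub struct CandidateStatus {
    pub type_i_available: bool,
    pub approved: bool,
    pub type_ii: TypeIIStatus,
}

/// Finality voting rule: only vote for candidates that are approved,
/// whose Type I data is available, and that have no failed dispute.
pub fn may_finalize(c: &CandidateStatus) -> bool {
    c.approved && c.type_i_available && c.type_ii != TypeIIStatus::DisputeFailed
}

/// Fork-choice rule: a chain stays viable only if none of its candidates
/// has a failed availability dispute; otherwise it should be reverted.
pub fn chain_is_viable(chain: &[CandidateStatus]) -> bool {
    chain
        .iter()
        .all(|c| c.type_ii != TypeIIStatus::DisputeFailed)
}
```

Note that `Deferred` does not block finality in this sketch: in the common case the Type II data is never erasure-coded at all, which is where the DA savings would come from.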
Properties
With this approach, we still avoid finalizing anything which is possibly bad, while keeping the necessary Type I data available to parachain full nodes. This would put Polkadot’s expected DA load on par with solutions such as ORUs and ZKRollups. If DA is the eventual bottleneck of all blockchain scaling mechanisms, the better finality times vs. ORUs and developer experience vs. ZKRollups would make Polkadot quite competitive here.