Cumulus Consensus Modules

Most of the large features on our roadmap in the Parachains Core team at Parity require significant changes to how collators build and propose blocks:

  • With asynchronous backing, collators have greater flexibility in how many parachain blocks to build and which relay chain block to build off of.
  • With the on-demand fee model, collators will want to scrape the relay chain bid queue and determine how much they’ll pay for an execution core and when.
  • To make full use of elastic scaling, collators will want to be able to build multiple non-overlapping state transitions in parallel.

To enable these changes, @rphmeier has recently refactored how Cumulus handles parachain consensus by separating block proposal logic from the collator consensus mechanism – making it possible to write a variety of different collator consensus modules.
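The shape of that separation can be sketched as a pair of traits: one for building blocks, one for deciding when this collator should author. This is an illustrative Rust sketch under assumed names, not the actual Cumulus API (`Proposer`, `CollatorConsensus`, `RoundRobin`, and `maybe_author` are all hypothetical):

```rust
/// Builds a parachain block once consensus has decided we should author.
/// (Hypothetical trait for illustration; not the real Cumulus interface.)
trait Proposer {
    fn propose(&self, parent: u64) -> Block;
}

/// Decides *whether* this collator should author in a given slot.
/// Swapping implementations gives different collator consensus modules.
trait CollatorConsensus {
    fn claim_slot(&self, slot: u64) -> bool;
}

#[derive(Debug, PartialEq)]
struct Block {
    parent: u64,
    number: u64,
}

struct SimpleProposer;
impl Proposer for SimpleProposer {
    fn propose(&self, parent: u64) -> Block {
        Block { parent, number: parent + 1 }
    }
}

/// A toy Aura-style module: author when slot % authorities == our index.
struct RoundRobin {
    our_index: u64,
    authorities: u64,
}
impl CollatorConsensus for RoundRobin {
    fn claim_slot(&self, slot: u64) -> bool {
        slot % self.authorities == self.our_index
    }
}

/// Glue: the proposal logic never needs to know which consensus
/// module drives it, which is the point of the refactor.
fn maybe_author<C: CollatorConsensus, P: Proposer>(
    consensus: &C,
    proposer: &P,
    slot: u64,
    parent: u64,
) -> Option<Block> {
    consensus.claim_slot(slot).then(|| proposer.propose(parent))
}

fn main() {
    let consensus = RoundRobin { our_index: 2, authorities: 3 };
    let proposer = SimpleProposer;
    assert!(maybe_author(&consensus, &proposer, 5, 10).is_some()); // 5 % 3 == 2
    assert!(maybe_author(&consensus, &proposer, 6, 10).is_none()); // 6 % 3 != 2
    println!("authored: {:?}", maybe_author(&consensus, &proposer, 5, 10));
}
```

A quorum-based or PoS module would implement the same `CollatorConsensus` slot-claiming decision differently while reusing the proposer untouched.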

The design space for collators/sequencers is quite large and currently a major focal point of development work in other ecosystems. Polkadot was very early in launching decentralized collators in 2021 so it would be a shame to not stay on the cutting edge in this area.

In Parachains Core, our work in this area is focused on the collators necessary for supporting new parachain consensus features, which are quite a small subset of what’s possible. I want to draw attention to the recent Cumulus changes in hopes that they make it more straightforward for parachain development teams to implement special purpose collators for their own specific use cases.

A non-exhaustive list of ideas for collator consensus includes:

  • Proof of Stake using parachain tokens
  • Quorum-based (Tendermint-style)
  • Slot-based (Aura or Sassafras-style)
  • Responsive (HotStuff family)
  • Shared sequencing between multiple parachains
  • Proposer/builder separation

More information can be found in the Cumulus issues and I’m always happy to talk to anyone interested in building in this area.

6 Likes

I’m unsure if “consensus” is the optimal terminology here…

A typical parachain receives “true consensus” from the relay chain, meaning the collators build upon whichever block they like, but the relay chain backers and block producers choose which one progresses, with other relay chain protocols playing roles in situations like disputes, etc. We expect this remains true under asynchronous backing and all our methods for “selling blockspace”.

In particular, if a parachain runs aura, babe/praos, sassafras, or even proof-of-work, then polkadot does not give the parachain time to reach probabilistic finality, so these act like block production protocols, not exactly consensus protocols.

There are parachain roles which should run their own true consensus before collation, but doing so adds complexity and latency for them. We merely need them to export consensus proofs in their blocks for polkadot to verify. A bridge cannot export anything unfinalized by the other side, for example, but bridges merely “forward” consensus and do not necessarily establish their own. All those DAG designs have similar properties I think.

At what point do parachains really need their own true consensus? A threshold VRF chain like drand achieves consensus of some form I guess. In fact, a drand bridge would merely be a pallet usable anywhere, not a parachain per se, although maybe a SPREE makes more sense. You could easily imagine fancier threshold signing or decryption parachains though, which then require pre-collation true consensus.

Now, what new block production schemes make sense for parachains? At least two…

  • Sassafras for an SSLE with mempool-less operation or Tor integration.
  • Whisk for an SSLE which runs few nodes and requires a mempool. Also, Whisk supports more flexible blockspace patterns than Sassafras.

Also…

There is some tendency for parachain teams to favor pre-collation true consensus, which maybe helps them sound clever when raising money, but definitely costs them in complexity, devops, etc., even ignoring the developer time. It likely makes them value their parachain slot less too. It might even risk more of our ecosystem working over less secure bridge protocols like IBC.

I’d suggest we gently steer parachain teams away from pre-collation true consensus whenever sensible. Initially this means explaining the costs. In all cases, we should listen to why they desire it, identify whether they actually need it, and in most cases teach them patterns which address their needs more simply. This does not mean we should not assist teams building pre-collation true consensus schemes, but we should always ask if they have the right reasons for wanting it.

3 Likes

I was a bit unsure what terminology to use. In this context they’re just block production protocols, but there’s not any distinction in the protocols themselves.

We can think about this in two ways. First, at what point do collator protocols become complex enough they might evolve to not need Polkadot? This isn’t clear to me. Only having probabilistic finality isn’t enough since of course there are plenty of systems like that. And anyway there’s a very clear use case for collators to have deterministic finality in order to guarantee a block’s eventual inclusion.

I think the more useful way to think about this is: what properties do collators actually want in a block production protocol where there’s no concern for guaranteeing soundness? So then I think you land on protocols that were designed for L1s but might be better suited for this, like the HotStuff family and, as you note, leader selection mechanisms that allow you to bypass the mempool.

Ultimately, I think the ideal makeup of collator networks, and as a result their topology, is very different from the ideal for a network whose purpose is primarily consensus and data availability. That will both drive the types of protocols they use and prevent anyone from wanting to use them for consensus on their own.

I also want to add that, while I think ideal collator networks are probably much smaller and more coordinated than consensus layers, an advantage Polkadot has over ORUs is that from the perspective of the consensus layer the proposer set is open. I wonder if there’s interest in various in-protocol ways to update it. They don’t have to be proof of stake – they could be reputational, for example.

This might be some inspiration: we separated block building and block execution (not proposer/builder) at the parachain level.

On top of that, we require some slashable stake from collators to prevent certain MEV attack vectors, mostly denial attacks.

Some consensus on the parachain might be required for transaction ordering, since consensus systems usually reason only about transactions as atomic units and never about their relative order or inclusion/exclusion.

1 Like

I suppose “pre-collation consensus” or “early consensus” could mean anything beyond merely parachain block production. Our story for parachains teams becomes:

  1. Parachain block production comes in several flavors which deliver different useful properties. We’ll try to make this as simple as possible.
  2. Pre-collation consensus exists but incurs latency and maintenance costs, and slows their time to market.
  3. Parachains can typically avoid pre-collation consensus, and use only parachain block production, thanks to polkadot providing consensus.
  4. If needed, pre-collation consensus can often be much simpler than a full consensus protocol, and
  5. We’re happy to advise them on how to streamline and simplify their pre-collation consensus requirements, but this requires a clear vision for what they’re doing, and why they need pre-collation consensus.

Academics might answer the “why does your consensus protocol matter?” question better than the sort of parachain team that typically asks for pre-collation consensus. We could have an academic-ish grants call for “pre-collation consensus” in which they need to justify its utility. It might maximize exposure while minimizing near-term distractions for us?

2 Likes

This is cool! One question: your censorship solution requires leader selection that’s single but non-secret. Could this work with something like Sassafras or Whisk? @burdges might also be able to answer this.

1 Like

A priori, whisk discovers the next block producer when the previous block producer shuffles the masked validator keys, so you’ve only 6 seconds of advance knowledge, or 12 seconds if you control both the previous and next one. Whisk could produce maybe k < ¼ √n block producers for the cost of shuffling say 4 √n or something like that, which helps with buying additional blockspace.

Sassafras discovers the upcoming block producers maybe 8-12 hours in advance, so a sassafras block producer should make each sibling block for the same slot, or else run some more complex protocol for additional blockspace.

Sassafras tickets contain an ed25519 key called erased_public, with which it optionally signs the block and destroys the secret key to provide forward security. As erased_public is an ed25519 key, a sassafras chain could run Tor .onion services or provide an anonymize mixnet service using erased_public. Integration with tor or a mixnet sounds complex, but Tor integration enables almost arbitrary interactions between users and whatever upcoming block producer they select. Ideas include:

  • Users can play a game or run arbitrary MPCs with one another via the block producer, who then produces an MPC transcript proving the block. Very nice for AMMs.
  • Block producers can sign a promise to include a tx. An interactive protocol could control sequencing. etc. All this runs into issues with fair exchange of course, but many things become possible.

Sassafras tickets also contain a second ed25519 key called revealed_public whose secret key is the hash of a VRF output. A sassafras block producer must produce two VRF outputs when claiming their ticket, one of which reveals the secret key behind revealed_public.

At the cost of an O(n) sized encryption header, a user could encrypt a message to k of n upcoming revealed_publics. After the first k-1 make their blocks, then once any one of the remaining n-k+1 block producers makes their block, the message can be decrypted by anyone who sees it. We know stronger “encrypt to the future” protocols based upon IBE and Groth’s DKG, but this flavor costs us almost nothing to deploy, just an extra VRF in block headers.
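The share-and-recombine arithmetic behind such a k-of-n header is essentially Shamir secret sharing: the symmetric key protecting the message is split into n shares, one per upcoming revealed_public, and any k revealed secret keys suffice to recombine it. Here is a toy Rust sketch of only that arithmetic; the field, the deterministic “randomness”, and all names are illustrative, and a real deployment would actually encrypt each share to its revealed_public:

```rust
// Toy Shamir secret sharing over GF(p), p = 2^61 - 1 (a Mersenne prime).
// Illustration only: coefficients are derived deterministically here,
// which is insecure; real use needs proper randomness and encryption.
const P: u128 = 2_305_843_009_213_693_951;

fn mod_pow(mut b: u128, mut e: u128) -> u128 {
    let mut acc = 1u128;
    b %= P;
    while e > 0 {
        if e & 1 == 1 { acc = acc * b % P; }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

// Modular inverse via Fermat's little theorem.
fn inv(x: u128) -> u128 { mod_pow(x, P - 2) }

/// Split `secret` into n points on a degree k-1 polynomial,
/// evaluated at x = 1..=n. Any k points recover the secret.
fn split(secret: u128, k: usize, n: usize) -> Vec<(u128, u128)> {
    let coeffs: Vec<u128> =
        (1..k).map(|i| (secret + 7919 * i as u128) % P).collect();
    (1..=n as u128)
        .map(|x| {
            let mut y = secret;
            let mut xp = 1u128;
            for c in &coeffs {
                xp = xp * x % P;
                y = (y + c * xp) % P;
            }
            (x, y)
        })
        .collect()
}

/// Lagrange interpolation at x = 0 from any k distinct shares.
fn recombine(shares: &[(u128, u128)]) -> u128 {
    let mut secret = 0u128;
    for (i, &(xi, yi)) in shares.iter().enumerate() {
        let mut num = 1u128;
        let mut den = 1u128;
        for (j, &(xj, _)) in shares.iter().enumerate() {
            if i == j { continue; }
            num = num * (P - xj) % P;            // (0 - xj) mod P
            den = den * ((P + xi - xj) % P) % P; // (xi - xj) mod P
        }
        secret = (secret + yi % P * num % P * inv(den)) % P;
    }
    secret
}

fn main() {
    let key = 123_456_789u128;
    let shares = split(key, 3, 5); // any 3 of 5 block producers suffice
    assert_eq!(recombine(&shares[0..3]), key);
    assert_eq!(recombine(&[shares[4], shares[1], shares[3]]), key);
    println!("recovered key: {}", recombine(&shares[1..4]));
}
```

The O(n) header cost shows up as one encrypted share per upcoming slot; the "any one of the remaining n-k+1" property falls out of the threshold directly.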

Themis’ scheme sounds like you encrypt your tx to the revealed_public of slot j+1, send your encrypted tx to slot j over Tor, and then j+1 decrypts block j. You could encrypt to two out of slots j+1, j+2, and j+3 though, which costs maybe 80 bytes * 3 = 240 bytes.

3 Likes

Thanks for the ideas! We have Sassafras on the radar; it should help with our use cases once it’s out.

It’ll take us a while for Tor integration, even once Sassafras works. In particular, we should give Tor some treasury money to help finish up Arti and buy a support contract for a bit.

Anyways, the same ideas could work with Aura right now, using conventional networking. You need IP addresses and public keys associated with the collators in some way that user agents can identify.

1 Like

We should definitely open Cumulus up for more experiments in collation protocols. It’s a related but very different problem to standard blockchain consensus, and it remains to be seen what approaches we can take to get the most out of the core protocol, especially when we account for alternative execution models that are more parallelism-friendly.

It would be nice to see experiments implementing HotShot (Espresso Systems) or taking inspiration from Sui/Solana execution models for building many blocks in parallel to target multiple execution cores.

1 Like

In this grants issue, I discussed doing encrypt-to-future-slots in Aura using identity-based encryption with each collator being an IBE master key generator and each slot VRF being an identity based secret key.

Increasingly, I think encrypt-to-future-slots used properly could yield similar security properties to encrypt-to-future-via-small-DKG, by which I mean small DKGs have similar limitations.

2 Likes