Polkadot DA vs competition

If we assume that state machine replication, execution, or ZK proving is not the bottleneck, then throughput boils down to the amount of transactions (or, in some cases, other data) a blockchain can push through.

Theoretically, and as things stand now, Polkadot provides 0.66 MiB/s[1] per core, or 66–132 MiB/s for the whole system depending on the total number of cores. A number of core devs have expressed the opinion that this number could be increased further by removing bottlenecks in networking, improving the erasure coding, etc. To my knowledge, no testing has been performed to verify any of those claims.
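The arithmetic behind these figures is easy to sanity-check (a sketch assuming the 4 MiB PoV from the footnote and 6-second parachain block times; the core counts are illustrative):

```python
POV_MIB = 4        # MiB per PoV per relay-chain block (per the footnote)
BLOCK_TIME_S = 6   # assumed parachain block time with async backing

per_core = POV_MIB / BLOCK_TIME_S   # MiB/s contributed by one core
assert abs(per_core - 0.66) < 0.01

# System-wide throughput scales linearly with cores under this model:
for cores in (100, 200):                      # illustrative core counts
    print(cores, round(per_core * cores, 1))  # 66.7 and 133.3 MiB/s
```

Whether the linear scaling actually holds at a few hundred cores is of course the networking question, not something this arithmetic can answer.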

What is known about the values of other DA layers?

  • Ethereum danksharding and proto-danksharding,
  • EigenDA,
  • Celestia,
  • Avail,
  • (any others I missed?)

I guess it’s worth noting that it’s important to look at both current values and future values (along with their plausibility). E.g. Celestia claimed that the capacity of the network grows with the number of light nodes, but I think it’s more complex than that.

  1. 4 MiB per PoV every relay-chain block. ↩︎


Very small nitpick: the PoV size is actually 5 MiB, which brings us up to 0.83 MiB/s. This also assumes that we have async backing and 6-second block times for parachains.

That’s a good point! DA shouldn’t be the bottleneck for a robust blockchain system. Celestia provides a 2D RS encoding with fraud proofs and Avail provides a 1.5D RS encoding with KZG commitments, both of which are sound approaches.
Also, note that our grants team just approved a proposal for building a DA attesting bridge in DOTSAMA; that’s awesome!

@bkchr @pepyakin Is your (0.83 MiB/s × # cores) directly comparable to the “1.3 MB/sec” claim here?

If not, why not?

If so, how can we accurately measure the # of cores right now? (There is a big spread between 66–132 MiB/s.)

How does this KAGOME C++ Erasure Coding change performance?


Presumably this component can be connected to the Rust implementation – if not, why not?

More generally, can you advise how we can run unbiased engineering benchmarks between Polkadot, Kusama, Ethereum pre+post Dencun, Celestia and others?


I think they mean 0.83 MiB/s (5 MiB / 6 s block times with async backing).

With systematic chunk recovery #1644, the reconstruction overhead will be reduced almost to zero, so in theory that may allow us to bump the max PoV size limit and/or the number of cores. But ultimately, we’re limited by the network bandwidth requirements for validators.
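For intuition on why systematic recovery nearly eliminates reconstruction cost: in a systematic code the first k chunks are the payload itself, so if those particular chunks are fetched, “reconstruction” is mere concatenation. A toy sketch (hypothetical helper, not the actual erasure-coding crate API):

```python
def reconstruct_systematic(chunks, k):
    """chunks: byte-strings (or None if missing) in chunk-index order.
    If all k systematic chunks survived, no field arithmetic is needed."""
    if all(chunks[i] is not None for i in range(k)):
        return b"".join(chunks[:k])
    return None  # would fall back to full Reed-Solomon decoding (not shown)

# Payload recovered without ever touching the parity chunks:
assert reconstruct_systematic([b"he", b"llo", None, b"!!"], k=2) == b"hello"
```

The decoder is only needed on the fallback path, which is why the average-case overhead drops to almost nothing.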

We employ a log-exp extension-field multiplication trick, while Kagome seemingly uses full multiplication tables. It’s possible Kagome’s C code runs faster in benchmarks but slower in production, due to using much more CPU cache. We won’t know without glutton benchmarks.
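For readers unfamiliar with the trick: multiplication in GF(2^m) can be done with two small log/exp tables instead of one full product table, trading a couple of dependent lookups for a far smaller cache footprint. A minimal GF(2^8) sketch (the production code works over GF(2^16); the primitive polynomial 0x11d is just a common Reed–Solomon choice):

```python
# Build log/exp tables for GF(2^8) with primitive polynomial
# x^8 + x^4 + x^3 + x^2 + 1 (0x11d); the generator is 2.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:        # reduce modulo the primitive polynomial
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]   # doubled exp table avoids a "mod 255" step

def gf_mul(a, b):
    """Two table reads (under 1 KiB total) vs a 64 KiB full product table."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

assert gf_mul(2, 3) == 6        # x * (x + 1) = x^2 + x
assert gf_mul(0x80, 2) == 0x1D  # wraps around the polynomial
```

The full-table flavor saves the log/exp indirections but has to keep a much larger table hot, which is exactly the benchmark-vs-production tension described above.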

We’ve many untried optimization ideas for erasure coding. Bernard is doing some SIMD work now. We never explored Intel’s carry-less multiplication well enough. GF(2^12) may be faster than GF(2^16), using even smaller log-exp tables.
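Rough cache-footprint arithmetic for these options (illustrative upper bounds on the naive layouts; real implementations often split tables further):

```python
def logexp_bytes(m, entry_bytes=2):
    # one exp table + one log table, one entry per field element
    return 2 * (1 << m) * entry_bytes

def full_table_bytes(m, entry_bytes):
    # a complete a*b product table over all element pairs
    return (1 << m) ** 2 * entry_bytes

assert logexp_bytes(16) == 256 * 1024          # GF(2^16): L2-cache sized
assert logexp_bytes(12) == 16 * 1024           # GF(2^12): fits in L1 cache
assert full_table_bytes(8, 1) == 64 * 1024     # GF(2^8) full product table
assert full_table_bytes(16, 2) == 8 * 1024**3  # GF(2^16) full table: 8 GiB
```

This is why smaller fields like GF(2^12) are tempting: the hot tables can stay resident in L1.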

We’d ideally do four very different implementations of the field arithmetic here, all fully optimized with SIMD, then plug each of them into testnets running gluttons to observe real-world performance. We’ve higher priorities right now, but…

Yeah, we could already ask someone to see if Kagome’s C, or a fixed version of catid/leopard, runs faster or slower under load. We cannot rule out Intel’s carry-less multiplication this way, but maybe it’d tell us not to bother with GF(2^12); it’s more likely to rule out the big tables.

All things being equal, I’d choose the fastest flavor that stresses the CPU cache less, so honestly I’d never even have tried the Kagome flavor myself.

Approval checkers must still recompute the whole chunk tree, so I’d expect systematic chunk recovery to only save them maybe 40%. Anybody else who utilizes availability later could avoid this, but the savings for Polkadot itself should only look like 40%.
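Concretely, the work systematic recovery does not remove: the checker must re-encode and re-hash all n chunks to verify the committed chunk root. A toy root computation (hypothetical sketch using SHA-256 with duplicate-last padding; Polkadot’s real chunk proofs use a Blake2-based trie):

```python
import hashlib

def chunk_root(chunks):
    """Binary Merkle root over chunk hashes, duplicating the last node
    at odd-sized layers. Every chunk gets hashed, recovered or not."""
    layer = [hashlib.sha256(c).digest() for c in chunks]
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

root = chunk_root([b"chunk0", b"chunk1", b"chunk2"])
assert len(root) == 32
assert chunk_root([b"a", b"b"]) != chunk_root([b"b", b"a"])  # order matters
```

Hashing all the chunks and tree layers is the part that survives even when the decoding itself becomes free.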

From “0.66 MiB/s per core or 66–132 MiB/s for the whole system, depending on the total number of cores”, @pepyakin is definitely implying that there are 100–200 cores in Polkadot at present, and that in the future this # of cores is supposed to grow. (Correct me if I’m wrong, thank you!) I believe Polkadot and Kusama have different numbers of cores right now, and that a new relay chain could have more or fewer.

If Polkadot DA throughput is NOT actually linear in the number of cores, and throughput is instead bounded by network bandwidth well before a few hundred cores (!) then we should be able to generate a graph of throughput as a function of the number of cores in realistic network conditions and show the presence of this non-linear behavior.

How would you generate this graph?

I would like to see multiple graphs showing throughput as a function of the number of cores under (a) 0 parachains [control], (b) something approximating what Polkadot is doing today, and (c) something approximating what Polkadot is going to be under “minimal relay chain” conditions but maxed out in terms of what it can do for parachains.

If Polkadot DA tech can be unbundled from Polkadot because it is ludicrously better in benchmarks, we should unbundle it and productize it outside Polkadot. This is like Tesla opening its supercharger network to non-Tesla cars, but first we have to improve our benchmarking game and get the first few graphs. What do we do next?

Availability cannot be unbundled. Approval checkers enforce correct encoding, so an availability-only block saves execution CPU time, but not bandwidth or reconstruction CPU time. We do this to have a better undecodable ratio of 2/3, vs the 2d RS undecodable ratio of 25%.

Your other comments make little sense: we’ve no reason to accept bottlenecks in Polkadot yet. We just fix them when we find them, ideally bigger ones first.

If you are committed to Polkadot’s approval checking as the only way to run an L1/L2, you’re right. But it’s 2023, not 2017, and there are lots of competing architectures.

Competing architectures unbundle DA in a modular architecture

and at least optimistic rollups + ZK rollups can unbundle the mere recording of blobs just to record state on L1 in extremely mundane ways like this:

This article does a decent job of explaining things

but doesn’t try to do the throughput testing behind “the amount of transactions (or other data in some cases) a blockchain can push through” being some X MB/s level of throughput. If Polkadot DA is a function of “cores”, engineers should be able to measure it. If it’s not a priority for Polkadot people, but the tech is ludicrously better, it deserves to be unbundled and not serve just the “approval checking” architecture alone.

As I said, we do this to have a better undecodable ratio of 2/3 vs the 2d RS undecodable ratio of 25%. Although Celestia has incomparable assumptions, you need way more samples to prove availability with a worse undecodable ratio, so they’ll always need way more total bandwidth for comparable confidence.
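The sampling claim can be made concrete with a simplified model: if an adversary must withhold a fraction f of chunks to prevent reconstruction, each uniformly random sample hits the withheld set with probability at least f, so after k clean samples the chance of undetected withholding is at most (1 − f)^k. (This ignores the 2d grid structure and correlated sampling, so it is only directionally right.)

```python
import math

def samples_needed(withheld_fraction, failure_prob=1e-9):
    """Smallest k with (1 - f)^k <= failure_prob."""
    return math.ceil(math.log(failure_prob) / math.log(1 - withheld_fraction))

assert samples_needed(2 / 3) == 19  # undecodable ratio 2/3
assert samples_needed(1 / 4) == 73  # undecodable ratio 25% (2d RS)
```

Roughly 4× the samples per node for the same confidence, before accounting for each sample’s own proof overhead.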

I doubt “availability” describes what starknet does. You need either a sampling based protocol, or else a concentration argument from byzantine assumptions, to believe someone would give you the data.

It certainly seems like there’s demand for it to be unbundled so it’s something we should explore. We also haven’t seen how far we can push max_pov_size since parachains are currently bottlenecked by substrate execution time, not witness size (excepting perhaps in migrations), so it’s likely we could offer significantly higher data bandwidth if it had a use. I believe @andronik is currently testing this.

For example, if we even just double the PoV size we’d equal Celestia’s maximum unsharded data bandwidth on their roadmap with 1000 validators.

Isn’t it enough for a DA-only user that the bitfields are signed? Unless I’m misunderstanding something, I don’t see why approval checkers would have to reconstruct the whole blobs.

The bigger issue I see is that we’d need to offer to retain the data for much longer to attract ORUs.


Not really. We need the hashes in the chunk Merkle tree to match the reconstruction, otherwise honest availability voters might not represent available chunks.

If nobody reconstructs then an adversary could encode multiple blocks, and much garbage, into the same erasure code, which gives them some control over what gets reconstructed later. It’d be like using celestia for availability but then not doing any sampling.

An adversary cannot have more than one block match their declared block hash, of course, but this is a red herring since only the systematic chunks vote on that data.

All this said, an optimistic roll up on polkadot could be cheaper than a full parachain by using either of two methods:

  • An availability-only parachain core cannot send messages quickly, I guess, so even though they’d need approval checkers, they could choose needed_approvals lower than Polkadot demands.
  • As discussed previously, we do know other protocols who avoid availability entirely, but they look more expensive if aiming for similar security levels, again as required by messaging.

We’d need to figure out exactly what messaging rules apply in these cases.


For the last few months, I thought high-throughput Polkadot DA servicing ORUs+ZKRUs was a great revenue-generating + user-growth plan. But after observing so many Polkadot fellows cite something like The Mantra:

“Polkadot is better than EVM Chains/SmartContracts/ORUs/ZKRUs/…!”

I’ve concluded that trying to attract EVM Chains/ORUs/ZKRUs should NOT be the goal for Polkadot engineering at all, even if the market is large and growing, and even if “Polkadot DA” is so ludicrously good it could serve all of it with enough cores. Instead, I believe it’s essential that Polkadot engineers answer “The Big Question” @gavofyork asks out loud here in October’s fellowship call:

If the answer to The Big Question is:


then to simplify/focus/prioritize engineering, we should have TWO production chain development paths:

  1. Polkadot relay chain pursuing parachains alone
  2. CoreJam/Coreplay/DA system chain pursuing DA+CoreJam Work packages/Services alone

Substrate will make such a split easy, and the basic “um, so how much throughput does the Polkadot/Kusama/CoreJam chain have to do what” question should be measurable.

This way, everyone in the ecosystem can EXPAND the mantra of

  1. “Polkadot 1.0 is better than EVM/SmartContracts/ORUs/ZKRUs!”
  2. “CoreJam/DA/Coreplay in 2.0 is better than EVM/SmartContracts/ORUs/ZKRUs!”

The DA solution in 2 can be put back into 1, but only after it’s suitably ludicrously awesome; doing DA improvements for 1 when the future is about 2 is … unnecessary? A sub-goal would be to demonstrate CoreJam’s superior tech + its high-throughput DA in 2. Here is a TON TPS stunt you could use as a model of how to present DA throughput.

All the (hopefully unbiased and third-party) DA throughput testing should be done in the service primarily of 2.0 CoreJam’s Availability Protocol, only secondarily for parachains+messaging, and not for ORUs and ZKRUs. This is because the engineers working on it would find it too depressing to serve the ORUs and ZKRUs: it is a betrayal of Polkadot’s uncompromising scalability-first culture and everything they stood for, and wish to stand for.

The blobs of data (and DAS/proofs/…) don’t have any idea what they are about, but the engineering culture definitely does. Converting that culture to be Ethereum-centric with ORUs+ZKRUs etc. is a demoralizing conversion therapy that no one should recommend: Why be something that you are not? Instead, everyone should maintain the pride in the uncompromising scalability-first culture.

Instead of swallowing Polkadot pride and going back to ORUs/ZKRUs enshrined as the only path for Ethereum scaling, this pride needs only to shift from being solely about the Polkadot relay chain (with parachains, ink!, WASM) to also being about CoreJam/Coreplay+Availability protocols, deployed wherever they can be (as L1s, L2s), to have maximum ubiquity.

How hard would it be to alter our DA to replace Merkle hashes with KZG commitments à la danksharding? What would be the trade-offs? Then it could be decoupled.

I’d say CoreJam sounds like an almost unrelated topic to availability.

You’d prove the degree of the polynomial represented by the evaluations, which tells you that any reconstruction yields the same data, but nothing about the data being correct.

At present, validators only check a Merkle proof when they receive their availability chunk. Checking that chunk with KZG instead requires pairings, so everyone does maybe around 10,000 times as much work merely to know they downloaded their piece correctly. That kills it if you’ve 1000 nodes.

I suppose Celestia works with fewer availability server nodes, which creates its own problems. I’ve no idea how ETH handles these per-node costs, but if we run with fewer than 500 validators per relay chain then we lose multiple relay chains.

It’ll be worse to do the MSM that creates the KZG commitment, and worse again to prove the degree bound. At least only backers or collators do this work though. All existing benchmarks here employ multi-threading, so we cannot cite anyone really doing this under a diverse workload, only dedicated provers. You could compare vs bandwidth costs here though, which brings more complexity.

It’s tangentially related because we’d need to bundle PVFs to take full advantage of Polkadot’s data bandwidth given that it’s likely much larger than what we can ever expect to execute with one state transition in Substrate.

You can use sampling to prove the data is correct. That’s what would allow it to be decoupled from approval checking.

In my understanding, the problem for us, at least with how Ethereum is proposing to do it, is that the loss factor is much worse: you need 75% of the chunks to reconstruct the data.

Yes, one could do KZG plus approval checkers, but I interpreted Andronik’s question as being about doing the proof in other ways.

It’s expensive for approval checkers to regenerate that KZG commitment too, and it cannot be dumped on the collator, although it’s still much cheaper than the low degree proof I guess.

If they’re doing 2d RS for small fraud proofs, like Celestia, then they only need the right 25% for reconstruction, but you can make reconstruction impossible by hiding only 25%. In Celestia’s case this requires way more sampling. I doubt it matters if you’ve already paid the costs of the low-degree proof though.

If they’re not doing 2d RS then yeah you could set a 75% reconstruction threshold, but imho this feels kinda hard to justify. It’s not as bad as a lot of the NFT tech I guess. :wink:

It’s 2D, but you don’t need fraud proofs so I’m not sure why.

What am I missing here? How can you reconstruct with 25%, but not if 25% is withheld?

I bet they do envision fraud proofs, but maybe only within roll ups?

Any k-of-n subset permits reconstruction in 1d RS, but a 2d RS has predictable “holes”, in that different pieces must be reconstructed from different chunks. It’s all connected in a grid and everything fills in eventually, but the undecodable ratio winds up worse.
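A toy iterative decoder over the extended 2k×2k grid makes both halves of this concrete (a sketch: each row and column is treated as a 2k-symbol RS codeword recoverable from any k of its cells; no actual field arithmetic, only tracking which cells are known):

```python
def decodable(withheld, k):
    """True if iterated row/column completion recovers the whole 2k x 2k grid."""
    n = 2 * k
    known = {(i, j) for i in range(n) for j in range(n)} - set(withheld)
    progress = True
    while progress:
        progress = False
        for i in range(n):
            for line in ([(i, j) for j in range(n)],    # row i
                         [(j, i) for j in range(n)]):   # column i
                have = sum(1 for cell in line if cell in known)
                if k <= have < n:   # enough to decode, not yet complete
                    known.update(line)
                    progress = True
    return len(known) == n * n

square = lambda s: {(i, j) for i in range(s) for j in range(s)}
everything = {(i, j) for i in range(8) for j in range(8)}

assert decodable(everything - square(4), k=4)  # the "right" 25% suffices
assert decodable(square(4), k=4)               # hiding a k x k block fails
assert not decodable(square(5), k=4)           # a (k+1) x (k+1) hole is fatal
```

Asymptotically that fatal (k+1)×(k+1) hole is only ~25% of the grid, which is the worse undecodable ratio.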

A fountain code is even stranger. They’ve even worse undecodable ratios, but finding these holes is an NP-complete problem. I think approximation algorithms help, so you cannot be cryptographically secure in knowing that an adversary cannot create whatever holes they like. There are academics who pushed doing the fraud-proofs trick with fountain codes, which gives some cute theory results, but it’s really expensive.

The section on “A proposal for 25% reconstruction” in this article talks through a reconstruction process with just 25%, under a very special assignment process:

This proposal depends on bivariate polynomial interpolation algorithm research coming together, which might or might not happen for “danksharding”. The article gives a decent overview of the basic 75% reconstruction process first and links to a lot of other resources for both novices and experts.

Here is a podcast edition for novices only.
