Push Kusama Limits with PoV / Weight Limit System Parachains

We have yet to encounter the limit on the number of parachains that a Relay Chain can handle. Further, not every parachain exhausts its PoV limit, meaning that the erasure-coded chunks sent over the network are not at their maximum size. So we don’t know how the Relay Chain will handle heavy usage across all the existing parachains.

On a call today, an idea came up: create a parachain runtime that has no transactional API, but has an on_initialize hook that scrambles some trie values and runs some computation such that the PoV size and execution time are near or at the limits.

These would simulate parachains under maximal load. We can then register multiple ParaIds of the same runtime until Kusama starts to have problems. This would give us insight into the current reasonable maximum number of parachains and indicate where the next wave of optimizations should be focused.
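
For concreteness, the hook of such a runtime could look roughly like the sketch below. This is only an illustration: the Config constants StorageItemsPerBlock and HashRoundsPerBlock are made up, and this is not an existing pallet.

// Sketch only; assumes the usual #[frame_support::pallet] prelude imports.
impl<T: Config> Hooks<BlockNumberFor<T>> for Pallet<T> {
    fn on_initialize(_n: BlockNumberFor<T>) -> Weight {
        // Touch many distinct trie keys so the storage proof (the PoV) grows.
        for i in 0..T::StorageItemsPerBlock::get() {
            let key = (b"filler", i).encode();
            // Overwriting a value pulls its whole trie path into the proof.
            sp_io::storage::set(&key, &i.encode());
        }
        // Burn execution time with throwaway hashing.
        let mut seed = [0u8; 32];
        for _ in 0..T::HashRoundsPerBlock::get() {
            seed = sp_io::hashing::blake2_256(&seed);
        }
        // Claim (nearly) the whole block so nothing else fits in it.
        <T as frame_system::Config>::BlockWeights::get().max_block
    }
}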

Some other ways we can experiment:

  • Increase the number of validators assigned to each parachain (e.g. from 5 to 10)
  • Have these limit parachains send messages over XCMP(-lite) to each other

I opened this thread to see what other people think of the idea and whether it’s worth pursuing. I think it’s quite aligned with Kusama’s role as a canary network.

I like the idea of leveraging Kusama more, doing some experiments, and pushing some limits. Expect chaos!

Let’s just make sure it is not too chaotic, as we do have many live parachains running with real tokens worth something.

It probably wouldn’t cost an attacker too much to spam the current Kusama parachains and XCM channels, so we might as well do it ourselves in a controlled manner.

So we could do something like:

  • A testing common-good parachain
  • Free transactions for whitelisted accounts only (see the sketch after this list)
  • Bots to spam the parachain in a controlled manner (like any load-testing tool does)
  • Monitor the whole Kusama network at the same time
  • Shut down the bots if other live parachains’ performance gets impacted too much
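
On the free-transactions point above: one common FRAME pattern is for the dispatchable to return Pays::No when the caller is whitelisted, so the fee is waived. A rough sketch, where the spam call, the Whitelist storage map, and the NotWhitelisted error are all hypothetical:

#[pallet::call]
impl<T: Config> Pallet<T> {
    // Hypothetical load-generating extrinsic, free for whitelisted accounts.
    #[pallet::call_index(0)]
    #[pallet::weight(Weight::from_parts(10_000, 0))]
    pub fn spam(origin: OriginFor<T>, iterations: u32) -> DispatchResultWithPostInfo {
        let who = ensure_signed(origin)?;
        // Only pre-approved load-testing accounts may call this.
        ensure!(Whitelist::<T>::contains_key(&who), Error::<T>::NotWhitelisted);
        // ... perform `iterations` rounds of load-generating work here ...
        // Returning Pays::No waives the transaction fee for the caller.
        Ok(Pays::No.into())
    }
}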

Yeah, I like this idea as well. Right now we all assume (a) that one parachain’s activity has no effect on other parachains, and (b) that Kusama/Polkadot can support at least 45 parachains. But without them fully using their blockspace, we really don’t know that either of these is true.

The tough thing with transaction bots is that it will probably be easy to fill one dimension of weight (time or PoV size), but not both at the same time.

But in reality, using a variety of testing strategies will probably yield the best results. So I think a mix of both our ideas (plus others if they’re out there) will be a good approach.

We can totally have an extrinsic with two input parameters: compute & storage.

fn fill_block(origin, compute: u32, storage: u32) {
  // Burn execution time with pure computation.
  for _ in 0..compute {
    do_some_math()
  }
  // Inflate the PoV by touching `storage` distinct trie keys.
  for i in 0..storage {
    read_write_storage(i)
  }
}

That way we have good control over the compute usage and proof-size usage of each block and can try whatever scenarios we want.

Yes, but the read_write_storage function takes time as well, so the compute target needs to deduct the time spent on the storage ops.
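
To make that concrete, the call could budget both loops in ref time and deduct the storage cost first. An illustrative calculation, where target_ref_time is the block’s time budget and REF_TIME_PER_MATH_OP is a made-up benchmark constant:

// Per-iteration cost of one storage read + write, from the runtime's DB weights.
let db_cost = T::DbWeight::get().reads_writes(1, 1).ref_time();
let storage_time = db_cost.saturating_mul(storage as u64);
// Whatever ref time remains after the storage ops is available for pure compute.
let compute_budget = target_ref_time.saturating_sub(storage_time);
let math_iterations = compute_budget / REF_TIME_PER_MATH_OP;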

I am very confident that if you describe this in a well-defined issue in Substrate and list yourself as mentor, our contributors will build the pallet/runtime :slight_smile:

Why does this need to happen in on_initialize?
Could we not have a permissioned origin like LoadTester that can submit PoV-heavy transactions? Using on_initialize for that seems dangerous to me.

I think we need to test constant load, because we would want to add several of these parachains until Kusama breaks in order to find the next limiting factor. If you have 10 of these, it would be impossible to ensure that they are all getting hit at the same time if you rely on a LoadTester to submit calls.

Not saying an implementation must use on_initialize, but it was my first thought. We could order the pallets such that the DMP queue gets processed first and could have a LoadTester origin from the Relay Chain that says “fill blocks to (s%, t%)” (space, time). So if the network got congested then Kusama could send a message to back off.

We could order the pallets such that the DMP queue gets processed first and could have a LoadTester origin from the Relay Chain that says “fill blocks to (s%, t%)” (space, time).

Or we limit the number of blocks that it goes on for. So like your “fill blocks to (space%, time%)” but for at most x blocks, and then stop. Otherwise we nuke the network with no way of stopping it.
But yeah, having a way to stop it before x blocks could also be good.
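
Putting both controls together, the pallet could store the targets plus a block countdown that only a privileged origin can set. All names below (TargetPov, TargetRefTime, BlocksRemaining, AdminOrigin) are illustrative:

// `Percent` from sp_runtime; usual pallet prelude assumed.
#[pallet::storage]
pub type TargetPov<T> = StorageValue<_, Percent, ValueQuery>;

#[pallet::storage]
pub type TargetRefTime<T> = StorageValue<_, Percent, ValueQuery>;

#[pallet::storage]
pub type BlocksRemaining<T> = StorageValue<_, u32, ValueQuery>;

#[pallet::call]
impl<T: Config> Pallet<T> {
    #[pallet::call_index(0)]
    #[pallet::weight(Weight::from_parts(10_000, 0))]
    pub fn set_load(
        origin: OriginFor<T>,
        pov: Percent,
        ref_time: Percent,
        max_blocks: u32,
    ) -> DispatchResult {
        // E.g. root locally, or a LoadTester origin carried on a relay message.
        T::AdminOrigin::ensure_origin(origin)?;
        TargetPov::<T>::put(pov);
        TargetRefTime::<T>::put(ref_time);
        // Hard cap: the filler stops by itself after `max_blocks` blocks.
        BlocksRemaining::<T>::put(max_blocks);
        Ok(())
    }
}

The filler hook would then decrement BlocksRemaining each block and do nothing once it reaches zero.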

Just thinking out loud: wouldn’t on_idle be the perfect fit?

Yeah, that is a good idea, probably better indeed.
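
For reference, on_idle already receives the weight left over in the block, which makes the back-off behaviour nearly automatic. A sketch reusing the illustrative BlocksRemaining from above, with waste as a hypothetical helper that burns up to the given weight:

impl<T: Config> Hooks<BlockNumberFor<T>> for Pallet<T> {
    // Runs after all extrinsics with whatever weight the block has left, so
    // real traffic is always served first and the filler yields under load.
    fn on_idle(_n: BlockNumberFor<T>, remaining_weight: Weight) -> Weight {
        if BlocksRemaining::<T>::get() == 0 {
            return Weight::zero();
        }
        BlocksRemaining::<T>::mutate(|b| *b = b.saturating_sub(1));
        // Hypothetical helper: burn the configured share of the remaining budget.
        Self::waste(remaining_weight)
    }
}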

By setting (space%, time%) and configuring the number of paras, we should be able to do things like:

  • Set all to (50, 50) and see how many paras we can add until things break
  • Set some to (100, 0) and others to (0, 100) and see which have problems (I’d expect the network to be a bottleneck on large PoVs but not on compute, though maybe running secondary checks is the bottleneck on compute).