Equivocation within parachains

We’ve had a few cases of equivocation on Moonbeam, which led to some issues.
After investigating with the collators, we learned that they are running multiple nodes with the same signing keys.
When asked why, they mentioned that some parachains suggest this strategy as a backup in case the primary node goes down.

While doing so can be harmless to the parachain itself in some cases, it can easily and significantly degrade the overall network of the relay chain.
We already know that the relay network is quite unstable, so I would suggest forbidding equivocation for parachains too.

I don’t think I follow. How can this affect the network of the relay chain?

We can not really forbid equivocations for Parachains. I mean we can implement it for Parachains, but that doesn’t mean that everybody needs to follow this “recommendation”. They could also just change the code and allow them.

Here is the issue about equivocations on Parachains: Check for equivocations · Issue #492 · paritytech/cumulus · GitHub
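To make it concrete what such a check would look for, here is a minimal sketch of my own, not the code from that issue. It assumes an equivocation means the same author producing two different blocks for the same slot; the `Header` type and its field names are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Header(NamedTuple):
    slot: int
    author: str      # hypothetical: the collator's public key
    block_hash: str


def find_equivocations(headers):
    """Return {(author, slot): hashes} for every author/slot pair
    that has more than one distinct block."""
    seen = defaultdict(set)
    for header in headers:
        seen[(header.author, header.slot)].add(header.block_hash)
    return {key: hashes for key, hashes in seen.items() if len(hashes) > 1}


headers = [
    Header(10, "alice", "0xaa"),
    Header(10, "alice", "0xab"),  # a second node running with the same key
    Header(11, "bob", "0xbb"),
    Header(11, "bob", "0xbb"),    # same block seen twice: not an equivocation
]
equivocations = find_equivocations(headers)
```

Detecting this is the easy part. As said above, nothing forces a parachain to act on it.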

But yeah, not sure who recommends running multiple collators with the same keys. This is very strange :person_shrugging:

I’m curious. Would it not be a better strategy to run an archive node that shadows the collator, with the possibility of restarting it with the relevant collator key if the collator went down or otherwise broke? That way you won’t get equivocations, but you do have a fallback.

Yes @bkchr, I don’t want to prevent equivocation (technically) but suggest that parachains should not allow or encourage equivocations.

@chris I suppose some parachains have their collators do that as a backup solution (to ensure that as many blocks as possible are produced).

In Moonbeam we provide a different backup system, where the backup needs to have a different key and the owner can rotate the keys on-chain instantly, avoiding equivocation.

That’s an interesting topic.
As feedback: on Astar, which uses Aura consensus, we have tested double signing and never had any issue with it.
Some collators use it to avoid missing blocks or getting slashed for inactivity. We don’t especially encourage it, but we see no problem with double signing on a parachain, since finalization is managed by the relay chain. The validator just selects the block it will include in the relay chain and rejects the other one.

Yes, depending on your parachain, there might be no impact.
But I still see 3 issues:

  • Sending multiple blocks to the same validators decreases their chances of actually processing the block (more bandwidth used, more CPU used). Having 2 blocks might not be a big deal, but what prevents those collators from running 5 nodes with the same keys?
  • With asynchronous backing (@rphmeier correct me if I’m wrong), the collators will build directly on top of the previous parachain block without always waiting for the relay chain inclusion. Equivocation will generate a lot of forks in that case.
  • Some pallets are not well designed to handle equivocation. This is the case for Frontier, which can produce the same Ethereum block during an equivocation, leading to a bug where an Ethereum block can be associated with the wrong Substrate block (unfortunately, Frontier cannot use all of the Substrate data when producing the Ethereum block to make it unique).
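To illustrate the third point, here is a toy model of my own. It is not Frontier’s actual hashing or storage; the hash inputs, the `eth_to_substrate` mapping and all the names are assumptions made only to show how two distinct Substrate blocks can end up carrying the same Ethereum block.

```python
import hashlib


def h(*parts):
    """Toy hash: SHA-256 over the joined parts (NOT the real block hashing)."""
    return hashlib.sha256("|".join(str(p) for p in parts).encode()).hexdigest()


def eth_block_hash(number, parent_eth_hash, eth_txs):
    # Assumption: the Ethereum block only depends on Ethereum-visible data.
    return h("eth", number, parent_eth_hash, *eth_txs)


def substrate_block_hash(parent_hash, slot, eth_txs, other_extrinsics):
    # The Substrate block also commits to data the Ethereum block never sees.
    return h("sub", parent_hash, slot, *eth_txs, "#", *other_extrinsics)


eth_txs = ["tx1", "tx2"]

# Two nodes with the same key author the same slot on the same parent,
# with the same Ethereum transactions but different non-Ethereum content.
sub_a = substrate_block_hash("0xparent", 42, eth_txs, ["remark:a"])
sub_b = substrate_block_hash("0xparent", 42, eth_txs, ["remark:b"])
eth_a = eth_block_hash(100, "0xethparent", eth_txs)
eth_b = eth_block_hash(100, "0xethparent", eth_txs)

# Hypothetical index from Ethereum block hash to Substrate block hash:
# the second import silently overwrites the first.
eth_to_substrate = {}
for eth_hash, sub_hash in [(eth_a, sub_a), (eth_b, sub_b)]:
    eth_to_substrate[eth_hash] = sub_hash
```

In this model the index ends up with a single entry pointing at whichever Substrate block was imported last, which may not be the one that gets included.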

Got it. Yes, async backing may bring a different situation when it arrives; it definitely has to be tested.

On the relay side, this may happen to a few validators with poor performance, but that would naturally lead to low rewards and then a decrease in delegation. Strong validators are able to process these blocks without any trouble.
Having 5 nodes signing with the same key would make no sense for anyone, so that case doesn’t seem relevant.

As for the Frontier troubles, I’d be pretty interested to see what happens. There was never any trouble on the Astar/Shiden side, even though blocks are still mostly filled with Frontier data.

Collators don’t directly send their blocks to the validators. What happens is that they first send a “hey I have a candidate with the following hash for the parachain X”. If the validator is assigned to this Parachain and isn’t already validating any candidate, it will request the candidate from the collator. These duplicated collators only occupy slots of the validator and may generate a little bit more traffic.
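As a rough sketch of that flow (a simplified model of what I described, not the actual collator protocol code; the `Validator` class, its fields and the para id are hypothetical):

```python
from dataclasses import dataclass, field


@dataclass
class Validator:
    assigned_para: int
    busy: bool = False                      # already validating a candidate?
    requested: list = field(default_factory=list)

    def on_advertisement(self, para_id, candidate_hash, collator):
        """Handle "I have a candidate with this hash for parachain X".

        The candidate is only requested if this validator is assigned to
        the parachain and is not already validating another candidate.
        """
        if para_id != self.assigned_para or self.busy:
            return False
        self.busy = True
        self.requested.append((collator, candidate_hash))
        return True


validator = Validator(assigned_para=1000)
# Two nodes sharing one key advertise different candidates for the same slot.
first = validator.on_advertisement(1000, "0xaa", "node-1")
second = validator.on_advertisement(1000, "0xab", "node-2")
other_para = validator.on_advertisement(2000, "0xcc", "node-3")
```

Only the first advertisement leads to a request here. The duplicate costs an advertisement message, not a full block transfer.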

However, no one should be running two nodes with the same key.

That sounds very worrisome. Slashing based on missed blocks is hard. You would need to run some other protocol that lets collators that just didn’t make it (there are multiple reasons why a collator might fail to get a block included) prove that they were online. Just slashing because there wasn’t a block in a session doesn’t sound good.

@bkchr thanks for the feedback. Slashing is not applied for a single missed block, but only if all blocks are missed during a session, and the size of the collator set is capped to make sure each collator has at least 4 slots per session to produce blocks.
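To spell the rule out, here is a minimal sketch. It is not our actual pallet code; the function names and the 12-slot session are made up for the example, and the author of a slot is assumed to be picked round-robin as in Aura.

```python
def aura_author(slot, collators):
    """Round-robin author selection: slot number modulo the set size."""
    return collators[slot % len(collators)]


def max_collators(slots_per_session, min_slots=4):
    """Largest set size that still gives every collator `min_slots` slots."""
    return slots_per_session // min_slots


def collators_to_slash(collators, session_slots, produced_slots):
    """Slash only collators that missed ALL of their slots in the session."""
    expected = {c: 0 for c in collators}
    produced = {c: 0 for c in collators}
    for slot in session_slots:
        author = aura_author(slot, collators)
        expected[author] += 1
        if slot in produced_slots:
            produced[author] += 1
    return [c for c in collators if expected[c] > 0 and produced[c] == 0]


collators = ["a", "b", "c"]
session_slots = range(12)           # hypothetical 12-slot session
# "a" produced all 4 of its blocks, "b" missed one (slot 7), "c" missed all.
produced_slots = {0, 3, 6, 9, 1, 4, 10}
slashed = collators_to_slash(collators, session_slots, produced_slots)
```

In this example only "c" is slashed. "b" missed a block but still produced others during the session, so nothing happens to it.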

A collator that does not produce blocks is harmful for the network under Aura’s round-robin selection (especially on Astar, where the pending queue is always full). A keep-alive signal is not enough: poorly performing collators may send it but never be able to produce blocks. With these rules, very few slashes actually happen. The slash amount is equivalent to a few days of rewards for a collator, which is pretty limited.
By the way, the missed-block situation has improved a lot in the last few weeks, with a block time close to 12 seconds, which never happened before.

I agree with what Basti said.

Also, the situation has improved in the last year, but Kusama used to have bad relay validators (not powerful enough) that prevented blocks from being included, through no fault of the collators.

Additionally, if you slash collators not producing blocks, it is easy to DoS a collator to get it slashed regularly.

This is accurate. However, the relay-chain code handles this as gracefully as it can.

@crystalin actually there can be many reasons for a missed block. Collator geolocation was one of them, and double signing proved to significantly improve the block rate.

But I wouldn’t like to make this thread about our specific case. Rather than limiting parachains, we should focus on what caused the equivocations you had and how to fix them. I would be pretty interested in the details of the incident you had, to see if we could help.