The following is a short update on the ~334 validators slashed in era 4543 on Kusama:
- Ref 16 will be enacted at block #15722953 (a little over 5 hours from now). This (1) cancels the financial consequences of the slashes in era 4543 and (2) submits a new election solution so that a new era can begin afterwards.
- Once a new era starts, the reward for the current era will be larger than usual, about 3 days' worth of rewards in a single era (inflation is determined on a millisecond basis, so a longer era means more total rewards for that era). Unfortunately, validators that were slashed will miss out on the majority of these rewards.
- A runtime upgrade will be ready to propose in the next day or two. If passed and enacted, it changes slashing slightly so that nominators are not removed when a slash is reversed. Only once this runtime change is enacted will nominators no longer have to re-nominate slashed validators. Its enactment will restore nominations to their state before the slash of era 4543.
- A brief description of the issue many people had:
  - On 9.12.2022 a large number of disputes flooded the network. On some nodes this caused a deadlock between parachain subsystems. The deadlock prevented the node from fully shutting down and killing the process, so parts of the node stopped (the parts responsible for GRANDPA voting, producing blocks, sending ImOnline messages, etc.) while the process itself kept running. Unless operators manually restarted the node, it no longer signalled liveness, so many validators were slashed for being offline in the same sessions.
- The issue has been reproduced on Versi, and a client fix for it will be out shortly. This client release may be separate from a runtime-only release with the changes that restore nominators.
- A proper post-mortem will follow once a few more details are known.
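To illustrate the point above that a longer era means a larger era reward, here is a minimal sketch. It assumes a fixed yearly payout budget split purely by elapsed time, a nominal 6-hour era, and made-up numbers; `era_payout` and `annual_payout` are hypothetical names, and the real runtime's inflation curve is more involved than this.

```python
MS_PER_HOUR = 60 * 60 * 1000
MS_PER_YEAR = 36525 * 24 * MS_PER_HOUR // 100  # Julian year in milliseconds


def era_payout(annual_payout: float, era_duration_ms: int) -> float:
    """Share of a yearly payout budget attributable to an era of the given length.

    Simplified model: the payout is linear in the era duration, nothing else.
    """
    return annual_payout * era_duration_ms / MS_PER_YEAR


# Assumptions for illustration only.
annual_payout = 1_000_000.0            # hypothetical yearly budget, arbitrary units
normal_era_ms = 6 * MS_PER_HOUR        # assumed nominal era length
stalled_era_ms = 3 * 24 * MS_PER_HOUR  # an era that lasted about 3 days

normal = era_payout(annual_payout, normal_era_ms)
stalled = era_payout(annual_payout, stalled_era_ms)

# Under this linear model the payout ratio equals the duration ratio.
ratio = stalled / normal
print(f"normal era: {normal:.2f}, stalled era: {stalled:.2f}, ratio: {ratio:.1f}x")
```

Because the model is linear in time, the budget value cancels out of the ratio; only the two durations matter.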
I want to shed more light on the PR “Remove implicit approval chilling upon slash” by kianenigma (Pull Request #12420, paritytech/substrate).
This PR has been open for a while and is part of our efforts to redesign the slashing system. For the sake of transparency: we wanted to merge it a long time ago, but decided to wait until our proposed design for the “next” slashing system was ready before moving forward.
The Kusama incident was an extra catalyst to merge this ASAP and propose it to Kusama (and later on, Polkadot).
In short, this PR removes any kind of “automatic chilling” for nominators. Validators are still disabled until the next era when they get slashed.
The main function of this mechanism was to protect nominators. Now that we have nomination pools, entities that want to be idle and expect to “be protected” should become pool members. Nominator slots are scarce and reserved for active entities who monitor the system and respond to events like slashes quickly (and in an economically rational way).
The main flaw of this mechanism was that, as originally implemented, the chilling is not undone even if the slash is cancelled. This has historically been a major pain point. See: “Full Slash Reversals” (Issue #6835, paritytech/substrate).
Weighing the benefit of protecting nominators, and the drawback of damaging validators, we argue that this “automatic chilling” is not worthwhile.
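The flaw described above can be sketched with a toy model. This is a deliberately simplified illustration, not the staking pallet's storage or API: `ToyStaking`, `chill_on_slash`, and every method name here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToyStaking:
    """Hypothetical, heavily simplified model of nominations and slashes."""

    chill_on_slash: bool
    nominations: Dict[str, Set[str]] = field(default_factory=dict)  # nominator -> validators
    pending_slashes: Set[str] = field(default_factory=set)

    def nominate(self, nominator: str, validators: List[str]) -> None:
        self.nominations[nominator] = set(validators)

    def slash(self, validator: str) -> None:
        self.pending_slashes.add(validator)
        if self.chill_on_slash:
            # Old behaviour: nominators implicitly stop backing the slashed validator.
            for targets in self.nominations.values():
                targets.discard(validator)

    def cancel_slash(self, validator: str) -> None:
        # Only the pending penalty is dropped; removed nominations are not restored.
        self.pending_slashes.discard(validator)

    def backers(self, validator: str) -> List[str]:
        return sorted(n for n, t in self.nominations.items() if validator in t)


def run(chill_on_slash: bool) -> List[str]:
    staking = ToyStaking(chill_on_slash=chill_on_slash)
    staking.nominate("alice", ["v1", "v2"])
    staking.slash("v1")
    staking.cancel_slash("v1")
    return staking.backers("v1")


old_backers = run(chill_on_slash=True)   # slash cancelled, but alice no longer backs v1
new_backers = run(chill_on_slash=False)  # slash cancelled, alice still backs v1
print(old_backers, new_backers)
```

In the model with automatic chilling, cancelling the slash leaves the validator without its nominator, which is the situation that forces a manual re-nomination.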
As a short update to this, Ref 16 was executed successfully:
- The financial consequences of all the slashes were cancelled; however, nominators still need to re-nominate for now
- A new era was able to start and new elections are occurring
- The era reward, once the era was able to change, was much larger than normal (~13.6k KSM compared to ~548 KSM for a normal era)
- A runtime upgrade that reverses the nominator chilling is just about ready to be proposed, and upon enactment will restore the nominators that were chilled
- Client changes that address the issues with disputes will follow shortly
The issue has been fixed: “Fix wrong rate limit + add a few logs” by eskimor (Pull Request #6440, paritytech/polkadot).
The fix has been verified on Versi. More than 30 disputes per second are raised on average, and the network is still chugging along nicely. (Slower of course, but no other issues.)
The issue is described in the PR referenced above; in a nutshell, it was a detrimental feedback loop. With the fix, dispute-coordinator handles that large number of disputes with just 20% CPU usage, and dispute-distribution with only 2%. The system now bottlenecks on candidate validation, which is how it is meant to be: everything just gets slower with more load, and nothing else happens.
Another update here is that Referendum 24 was enacted today, which has restored the chilled nominations to slashed validators. The state of the network is now more or less the same as it was before the slash (there are still a handful of validators that have yet to re-submit the validate intention).
The additions @eskimor mentioned above will be included in the next release (0.9.36), which is currently being prepared and should be out shortly. With this release, the issues that contributed to nodes going offline shouldn’t happen again.
A big thanks to everyone who worked on this (especially the community assisting with the governance aspects of it).