The polkadot binary is a multithreaded program that benefits from having more hardware cores available, and all configurations are tuned with the worst-case scenario on the minimum recommended hardware configuration in mind. Currently, the minimum hardware requirements recommend at least 4 hardware cores with SMT disabled.
This thread has been started to gather feedback on raising the minimum requirements for a Polkadot validator to at least 8 hardware CPU cores, so if you have any concerns of any kind about this, please let us know.
Why would the network benefit from raising the minimum hardware requirements?
With async backing support enabled, parachains can now produce bigger blocks which take a lot more time to execute: collator authoring time can be increased from 500ms up to 2000ms. That means the PVF execution done by the backing and approval processes might need up to 4 times more CPU resources to execute those blocks and confirm they are valid.
Currently, the PVF execution pool is capped at a maximum of 50% of the 4 hardware cores. That means for each relay chain block we have at most 12s of CPU time (6s relay chain block time * 2 cores) for parachain block execution. On average a validator needs to execute around 7 parachain blocks (1 backing + 6 tranche0 approvals), so if we divide the maximum CPU time by the number of parablocks, the average parachain block execution time cannot be bigger than about 1.7s (12s / 7).
By increasing the minimum required core count we can then safely increase the hard capacity of the PVF execution pool. For example, if we raised the capacity to 50% of 8 hardware cores, the theoretical average budget for parachain block execution would increase from 1.7s to 3.4s (24s / 7), significantly above the 2s backing time recommended by async backing.
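To make the budget arithmetic above easier to check, here is a minimal sketch (plain Rust, not part of the polkadot codebase) that recomputes the per-parablock CPU budget under the assumptions stated in this post: a 6s relay chain block time, the PVF execution pool capped at 50% of the hardware cores, and around 7 parablocks (1 backing + 6 tranche0 approvals) per relay chain block.

```rust
// Back-of-the-envelope CPU-time budget for PVF execution per relay chain block.
// The constants mirror the figures used in this post; this is only an
// illustrative sketch, not how the node actually schedules PVF work.

fn pvf_budget_per_parablock(hw_cores: u32) -> f64 {
    const RELAY_BLOCK_TIME_S: f64 = 6.0;    // relay chain block time
    const PVF_POOL_SHARE: f64 = 0.5;        // PVF execution pool capped at 50% of cores
    const PARABLOCKS_PER_BLOCK: f64 = 7.0;  // 1 backing + 6 tranche0 approvals on average

    let pvf_cores = hw_cores as f64 * PVF_POOL_SHARE;
    let cpu_time_per_block = RELAY_BLOCK_TIME_S * pvf_cores; // total CPU seconds available
    cpu_time_per_block / PARABLOCKS_PER_BLOCK                // budget per parachain block
}

fn main() {
    println!("4 cores: {:.1}s per parablock", pvf_budget_per_parablock(4)); // ~1.7s
    println!("8 cores: {:.1}s per parablock", pvf_budget_per_parablock(8)); // ~3.4s
}
```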
Why don't we just raise the PVF execution hard capacity?
Currently, the PVF execution hard capacity is 50% of 4 hardware cores. Increasing it would give us more execution time for parachain blocks; however, this analysis concluded that it would not be safe on validators running with the minimum requirement of 4 cores, because it would steal valuable CPU time from other subsystems doing critical work as well.
Other optimisation avenues
In parallel with this, other optimisations that would reduce the amount of work a validator needs to do are also in progress.
However, since those optimisations stack on top of raising the minimum required hardware, if we want to be able to support 10x more usage on the network we would actually want to do both.
Current state of affairs
Looking at Polkadot telemetry it seems that the majority of validators already use more than 4 cores.
Note! This data is not fully reliable, both because it includes collator nodes and because on some orchestration systems the binary may see a certain number of cores while being rate-limited via other mechanisms.
Kusama
Polkadot
This raises the question: if the majority of validators are already running on more powerful hardware than the minimum required, then there is already spare/wasted capacity that we could use to execute more parachain blocks. That would lead to better utilisation of the provisioned hardware and would allow the network to support a significantly larger number of Polkadot cores (not to be confused with hardware cores).
Alternatively, if validators are running at the bare minimum requirements, then increasing the core count from 4 to 8 might increase the cost of running a validator non-negligibly for some people. I avoided doing cost estimations here because they will vary significantly from setup to setup, but we look forward to hearing from validators in the community whether they think they will be impacted and how.
Preparing for the future
With all that in mind, we think that raising the minimum core count for validators from 4 to 8 is a change that greatly helps the Polkadot network prepare for a future where usage increases 10x or more, and it is a change we should start proactively rather than waiting until the maximum theoretical throughput is reached.
Final note
While we think this is the path we should head down in the future, we are also aware that it might have an impact on validator costs, so we look forward to hearing from the validator community about what we can do to mitigate that impact as much as possible.