Dynamic Backing Groups: Preparing Cores for Agile Scheduling

Here’s a more concrete sketch of what this might look like:

// The (maximal) rate of coretime consumption of this chain.
//
// This is a number treated as a rational over 57600 (from RFC-5), where
// 57600/57600 means one core. This value may be greater than 57600 to
// signify that the chain has more than one core assigned.
CoretimeAssignments: ParaId -> CoretimeRate,

// The backing group this chain is currently assigned to.
BackingGroupAssignments: ParaId -> BackingGroupIndex,

// Per group: the total coretime rate of all parachains assigned to that backing group.
BackingGroupUtilization: Vec<CoretimeRate>, // len: number of backing cores

// Possibly useful: a partition of group indices around a pivot point.
//
// This is a pivot point plus a vector of all group indices (len: number of
// backing cores), with the property that all entries in this vector before
// the pivot point are full, and all entries from the pivot point onward are
// not full.
BackingGroupPartition: (u16, Vec<BackingGroupIndex>),
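
If we do keep this partition, it can be maintained in O(1) per change with the usual swap-at-the-pivot trick. A minimal sketch, assuming a hypothetical positions reverse index (group index -> slot in the vector); none of these names are part of the proposal:

type BackingGroupIndex = u16;

struct GroupPartition {
    pivot: u16,
    groups: Vec<BackingGroupIndex>, // [0, pivot) are full; [pivot, len) are not
    positions: Vec<u16>,            // positions[g] = slot of group g in `groups`
}

impl GroupPartition {
    fn swap_slots(&mut self, a: usize, b: usize) {
        self.groups.swap(a, b);
        self.positions[self.groups[a] as usize] = a as u16;
        self.positions[self.groups[b] as usize] = b as u16;
    }

    // Call when group `g` crosses from not-full to full: swap it into the
    // slot at the pivot and advance the pivot past it.
    fn mark_full(&mut self, g: BackingGroupIndex) {
        let slot = self.positions[g as usize] as usize;
        if slot >= self.pivot as usize {
            let pivot = self.pivot as usize;
            self.swap_slots(slot, pivot);
            self.pivot += 1;
        }
    }

    // Call when group `g` crosses from full to not-full: retreat the pivot
    // and swap `g` into the freed slot.
    fn mark_not_full(&mut self, g: BackingGroupIndex) {
        let slot = self.positions[g as usize] as usize;
        if slot < self.pivot as usize {
            self.pivot -= 1;
            self.swap_slots(slot, self.pivot as usize);
        }
    }

    // First non-full group, if any: useful when placing a new parachain.
    fn first_non_full(&self) -> Option<BackingGroupIndex> {
        self.groups.get(self.pivot as usize).copied()
    }
}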

The rough idea is this:

  • Partition validators into backing groups (as we do now)
  • We track the total coretime rate (a single u32) across all parachains assigned to a backing group. The target consumption rate of a backing group is 1 core's worth of coretime, but this target is soft. Validators rotate across backing groups, same as today.
  • Each parachain has a mapping to its total coretime consumption rate (a single u32).
  • Each parachain is assigned to a backing group. Many parachains may be assigned to a single backing group.
  • Increasing or decreasing the coretime consumption rate for a parachain is O(1) - just update the coretime mapping and the total rate of its assigned backing group (see the sketch after this list). Note that backing groups at this point may be poorly balanced or overfull.
  • When a parachain does not have a bucket yet but has coretime added to it, a group is chosen arbitrarily (or we could use the BackingGroupPartition to choose the first non-full backing group).
  • Validators post rebalance_backing(Vec<(ParaId, BackingGroupIndex)>) transactions to the chain (using the new Substrate Tasks API). Applying one runs in O(n) time in the length of the vector (touching two groups per mentioned parachain: the old group and the new one), followed by an O(n) call to evaluate_improvement, which requires that the new balancing be strictly better than the old one; otherwise the transaction is invalid.
  • We may also consume weight in on_idle to eagerly rebalance groups, though this requires a reverse index (from backing group to assigned parachains).
  • The relay chain runtime, when verifying backing signatures, just needs to check the BackingGroupAssignments mapping for the parachain. We will keep around a small amount of historical group assignments for parachains to make asynchronous backing work (i.e. we must know which group a parachain was assigned to at any recent relay-parent).
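
To make the O(1)/O(n) claims above concrete, here is a minimal in-memory sketch of the two operations. It uses plain maps rather than actual Substrate storage, and Scheduler, set_coretime_rate, apply_rebalance, and least_loaded are illustrative names, not proposed APIs:

use std::collections::HashMap;

type ParaId = u32;
type BackingGroupIndex = u16;
type CoretimeRate = u32; // rational over 57600, per RFC-5

struct Scheduler {
    coretime: HashMap<ParaId, CoretimeRate>,      // CoretimeAssignments
    group_of: HashMap<ParaId, BackingGroupIndex>, // BackingGroupAssignments
    utilization: Vec<CoretimeRate>,               // BackingGroupUtilization
}

impl Scheduler {
    // O(1): update the para's rate and the total of its assigned group.
    fn set_coretime_rate(&mut self, para: ParaId, new_rate: CoretimeRate) {
        let group = match self.group_of.get(&para) {
            Some(&g) => g,
            None => {
                // No group yet: pick one. The text suggests "arbitrary, or
                // the first non-full group via BackingGroupPartition"; the
                // least-loaded group stands in for that here.
                let g = least_loaded(&self.utilization);
                self.group_of.insert(para, g);
                g
            }
        };
        let old_rate = self.coretime.insert(para, new_rate).unwrap_or(0);
        self.utilization[group as usize] =
            self.utilization[group as usize] - old_rate + new_rate;
        // The group may now be overfull; rebalancing fixes that later.
    }

    // O(n) in the number of moves: each move touches exactly two groups,
    // the old one and the new one.
    fn apply_rebalance(&mut self, moves: &[(ParaId, BackingGroupIndex)]) {
        for &(para, new_group) in moves {
            let rate = self.coretime.get(&para).copied().unwrap_or(0);
            if let Some(old_group) = self.group_of.insert(para, new_group) {
                self.utilization[old_group as usize] -= rate;
            }
            self.utilization[new_group as usize] += rate;
        }
        // A real runtime would now run evaluate_improvement (see below) and
        // reject the transaction unless the result is strictly better.
    }
}

fn least_loaded(utilization: &[CoretimeRate]) -> BackingGroupIndex {
    utilization
        .iter()
        .enumerate()
        .min_by_key(|&(_, rate)| *rate)
        .map(|(i, _)| i as BackingGroupIndex)
        .unwrap_or(0)
}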

This approach is inspired by the Phragmén election code in sp-staking and by the staking-miner.
The only missing piece is the evaluation algorithm, which seems easily achievable. The rough interface is below:

// Evaluate the improvement of a rebalancing transaction based on the changes
// in backing groups and the number of steps.
//
// This could either take _all_ backing groups or just the ones touched.
// Each entry is `(group_index, state_before, state_after)`.
fn evaluate_improvement(
  buckets: &[(BackingGroupIndex, Bucket, Bucket)],
  steps: u32,
) -> u32;

For the moment, the main thing we want to avoid is overfull groups. Therefore, a simple algorithm (sketched after the list below) might evaluate the improvement as:

  • 0 if there are any new overfull groups
  • Some positive number when there is a reduction in overfull groups
  • or 0 if the steps taken exceed some constant maximum (say, 10 or 20), to avoid wasting time. (Some stricter relationship may be important here, to avoid degenerate situations where there are improvements to be made but they cannot be made in a short transaction.)
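
Here is a minimal sketch of that evaluation. It assumes Bucket reduces to the group's total coretime rate; FULL, MAX_STEPS, and the exact scoring are illustrative choices, not settled parts of the proposal:

type BackingGroupIndex = u16;
type Bucket = u32; // a group's total coretime rate, rational over 57600

const FULL: u32 = 57600;   // one core's worth of coretime (RFC-5)
const MAX_STEPS: u32 = 16; // arbitrary cap on transaction length

// Returns 0 if the rebalancing is unacceptable, or a positive score equal to
// the net reduction in overfull groups. Each entry is
// (group_index, state_before, state_after).
fn evaluate_improvement(
    buckets: &[(BackingGroupIndex, Bucket, Bucket)],
    steps: u32,
) -> u32 {
    if steps > MAX_STEPS {
        return 0; // too many steps: don't waste time evaluating further
    }
    let mut newly_overfull = 0u32;
    let mut newly_ok = 0u32;
    for &(_, before, after) in buckets {
        match (before > FULL, after > FULL) {
            (false, true) => newly_overfull += 1,
            (true, false) => newly_ok += 1,
            _ => {}
        }
    }
    if newly_overfull > 0 {
        0 // any new overfull group makes the rebalancing invalid
    } else {
        newly_ok // positive iff the number of overfull groups went down
    }
}

With 0 meaning "not strictly better", the validity check from the list above falls out directly: a rebalance_backing transaction is accepted only when evaluate_improvement returns a positive number.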

Though I am curious to hear if there are strong arguments for doing better balancing (the main motivation would be reducing backing-reward variance, as far as I can see) and what proposals there are for better algorithms. It might be useful to "pack" earlier groups and leave later ones empty, for instance. We have to make a tradeoff between tolerating overfull buckets and minimizing churn in the allocation of chains to groups.