Pre-RFC Discussion: Candidate Receipt Format v2

This is a Pre-RFC discussion on changing the CandidateReceipt collection of data structures used within Polkadot’s core technology: CandidateDescriptor, CandidateReceipt, CommittedCandidateReceipt, and PersistedValidationData.

Changing these structs is very painful, as they are deeply integrated with runtime storage, inherents, network protocols, and databases. As such, it’s better to make many changes to this format at once rather than one at a time. The outcome of this discussion will be an RFC to update the format, with a justification for each change.

I will edit this top-line post over time to add changes as discussed in the thread below.

Context: Current Data Types


/// A unique descriptor of the candidate receipt.
pub struct CandidateDescriptor {
	/// The ID of the para this is a candidate for.
	pub para_id: Id,
	/// The hash of the relay-chain block this is executed in the context of.
	pub relay_parent: Hash,
	/// The collator's sr25519 public key.
	pub collator: CollatorId,
	/// The blake2-256 hash of the persisted validation data. This is extra data derived from
	/// relay-chain state which may vary based on bitfields included before the candidate.
	/// Thus it cannot be derived entirely from the relay-parent.
	pub persisted_validation_data_hash: Hash,
	/// The blake2-256 hash of the PoV.
	pub pov_hash: Hash,
	/// The root of a block's erasure encoding Merkle tree.
	pub erasure_root: Hash,
	/// Signature on blake2-256 of components of this receipt:
	/// The parachain index, the relay parent, the validation data hash, and the `pov_hash`.
	pub signature: CollatorSignature,
	/// Hash of the para header that is being generated by this candidate.
	pub para_head: Hash,
	/// The blake2-256 hash of the validation code bytes.
	pub validation_code_hash: ValidationCodeHash,
}
/// Commitments made in a `CandidateReceipt`. Many of these are outputs of validation.
pub struct CandidateCommitments {
	/// Messages destined to be interpreted by the Relay chain itself.
	pub upward_messages: UpwardMessages,
	/// Horizontal messages sent by the parachain.
	pub horizontal_messages: HorizontalMessages,
	/// New validation code.
	pub new_validation_code: Option<ValidationCode>,
	/// The head-data produced as a result of execution.
	pub head_data: HeadData,
	/// The number of messages processed from the DMQ.
	pub processed_downward_messages: u32,
	/// The mark which specifies the block number up to which all inbound HRMP messages are
	/// processed.
	pub hrmp_watermark: BlockNumber,
}
/// A candidate-receipt.
pub struct CandidateReceipt {
	/// The descriptor of the candidate.
	pub descriptor: CandidateDescriptor,
	/// The hash of the encoded commitments made as a result of candidate execution.
	pub commitments_hash: Hash,
}

/// A candidate-receipt with commitments directly included.
pub struct CommittedCandidateReceipt {
	/// The descriptor of the candidate.
	pub descriptor: CandidateDescriptor,
	/// The commitments of the candidate receipt.
	pub commitments: CandidateCommitments,
}
/// The validation data provides information about how to create the inputs for validation of a
/// candidate. This information is derived from the chain state and will vary from para to para,
/// although some fields may be the same for every para.
///
/// Since this data is used to form inputs to the validation function, it needs to be persisted by
/// the availability system to avoid dependence on availability of the relay-chain state.
pub struct PersistedValidationData {
	/// The parent head-data.
	pub parent_head: HeadData,
	/// The relay-chain block number this is in the context of.
	pub relay_parent_number: BlockNumber,
	/// The relay-chain block storage root this is in the context of.
	pub relay_parent_storage_root: Hash,
	/// The maximum legal size of a POV block, in bytes.
	pub max_pov_size: u32,
}

Commit to PoV Size

Motivation: provide the relay chain and validators with more information regarding the amount of data bandwidth candidates use. This allows for more optimal scheduling and networking.

Candidate descriptors currently only commit to the PoV hash, not the PoV’s size itself.
The pov_hash field in the descriptor should be replaced by a pov_commitment:

pov_commitment: (Hash, u32)

Validators in backing, approval-checking, and disputes should reject candidates when the PoV’s size differs from the commitment. The runtime should reject backed candidates where the committed PoV size is greater than the maximum allowed.
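
As a rough sketch (the field shape follows the tuple above; the helper name and check are illustrative, not a concrete proposal), the change and the corresponding runtime check might look like:

pub struct CandidateDescriptor {
    // ... other fields unchanged ...
    /// The blake2-256 hash of the PoV, together with its size in bytes.
    pub pov_commitment: (Hash, u32),
}

// Runtime-side check on backed candidates: reject any candidate whose
// committed PoV size exceeds the configured maximum.
fn pov_size_ok(committed_pov_size: u32, max_pov_size: u32) -> bool {
    committed_pov_size <= max_pov_size
}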

Commit to execution time taken (needs discussion)

Motivation: provide the relay chain with information about how much time candidates take to execute. This allows for more fine-grained resource utilization and bundling of candidates into approval-checking workloads.

The exact measurement of execution time is difficult - I suspect this should be expressed either as an instruction count (once metering lands) or as a proportion of the maximum execution time allotted to a single candidate. Since the semantic meaning of this field may change over time, it would be best to use a future-proof format.
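
A hedged sketch of what a future-proof shape could look like - the variant names and units below are purely illustrative, not part of the proposal:

enum ExecutionCostCommitment {
    /// Fraction of the maximum per-candidate execution time, in parts per million.
    TimeFractionPpm(u32),
    /// Metered instruction count (only meaningful once metering lands).
    InstructionCount(u64),
}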

Synchronous Composability Commitments

Motivation: allow parachains to be synchronously interoperable with each other by building parachain candidates which together perform an atomic operation.

This is done by allowing candidates to specify additional constraints on their ability to be accepted to the relay chain in the form of “This candidate is only able to be included when a candidate from another parachain produces this commitment”.

I suggest additional fields on the CandidateCommitments of the form:

enum SyncCommitmentSource {
    Pvf, // The commitment comes from the bare PVF of the parachain.
    Accord(AccordId), // The commitment comes from an accord which the parachain participates in.
}
// 128-bit unique value - a "fingerprint" of the synchronous operation performed.
type SyncCommitment = (SyncCommitmentSource, [u8; 16]);
struct CandidateCommitments {
    sync_commitments: Vec<SyncCommitment>,
    required_sync_commitments: Vec<(ParaId, Vec<SyncCommitment>)>,
    // ...
}

The sync_commitments are essentially statements of the following:

  1. This candidate has produced an extra commitment from the given source and with the given data.

The required_sync_commitments are statements of the following:

  1. This candidate may only be included when another candidate from each of the listed parachains is included, and each candidate produces the listed commitments.
  2. If any of the required candidates is reverted, this candidate must also be reverted.

Note that this format only ensures safety in synchronous composability; it is the responsibility of the participating parachains to maintain liveness. Liveness is a concern here, as one parachain may gate itself on a commitment from other parachains which is never issued. This likely implies sharing collator infrastructure. Best practices for using synchronous composability need to be studied higher up the stack.

The relay chain runtime must limit the total number of sync commitments a single parachain candidate may issue, as well as the number of dependencies per commitment and in total, in order to keep candidate receipts bounded in size.
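
For illustration only, a minimal sketch of how the runtime might evaluate required_sync_commitments at inclusion time; commitments_of is a hypothetical lookup over the set of candidates being included in the same relay-chain block:

// Sketch only: `commitments_of` returns the sync commitments produced by the
// candidate of the given para in this relay-chain block, if any.
fn sync_requirements_met(
    candidate: &CandidateCommitments,
    commitments_of: impl Fn(ParaId) -> Option<Vec<SyncCommitment>>,
) -> bool {
    candidate.required_sync_commitments.iter().all(|(para, required)| {
        match commitments_of(*para) {
            // Every required commitment must actually be produced by the other candidate.
            Some(produced) => required.iter().all(|r| produced.contains(r)),
            // No candidate from the required parachain in this block: cannot include.
            None => false,
        }
    })
}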

Sequence Numbers

Motivation: more efficient garbage collection and spam protection rules.

Parachains aren’t necessarily required to be chains. This leads to the possibility of parachain state transitions being cyclical, e.g. A → B → C → A → … .

Spam protection must take this into account, which is a source of high complexity in the asynchronous backing implementation.

Adding a sequence number to candidate descriptors and having this be an input to the PVF would simplify spam protection enormously, especially for upcoming features such as bundling and elastic scaling.

This could either go into the candidate descriptor itself, or into the PersistedValidationData. The former is simpler, but the latter is more efficient in terms of on-chain space used per candidate.
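
A minimal sketch of the two placements, assuming a simple monotonically increasing per-para counter (the field name is illustrative):

// Option A: in the descriptor.
pub struct CandidateDescriptor {
    // ... existing fields ...
    /// Per-para sequence number, incremented by one for each new candidate.
    pub sequence_number: u64,
}

// Option B: in the PersistedValidationData, so it is also an input to the PVF.
pub struct PersistedValidationData {
    // ... existing fields ...
    pub sequence_number: u64,
}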

Relay Chain State Read Commitments (needs discussion)

Motivation: Allow candidates to draw on data which is stored in the relay chain state.

This is useful, for example, for smart contracts or other large blobs to sidestep being placed in the PoV. Candidates may commit to keys whose values should be read from the relay chain state and passed into the PVF as a parameter.

The difficulty here is in ensuring that the data is still available at all points where a candidate may still be executed, as relay chain full nodes will prune data. While nodes don’t prune data prior to finality, the relay parent of a candidate may be a finalized block even though the relay chain block where the candidate is included will not be finalized during approvals or disputes. Needs further discussion. Packaging these state keys into the AvailableData is little better than having them appear in the PoV.
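
For concreteness, a hedged sketch of what such a commitment could look like (field name and shape illustrative only):

pub struct CandidateCommitments {
    // ... existing fields ...
    /// Relay-chain storage keys whose values are read and passed into the PVF.
    pub relay_state_reads: Vec<Vec<u8>>,
}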

6 Likes

We might also look into existing fields and whether they are still relevant or could be replaced with something better:

E.g. the CollatorId: I am still not sure what value it brings for the CandidateReceipt to commit to a CollatorId that is completely opaque to the relay chain. The same goes for the collator signature.

I am also not sure we need the erasure root there. I mean, we need validators to commit to it, but why would collators need to bother? Various ways of bundling candidates have been discussed, and I am not yet able to fully rule out that we would indeed want to bundle multiple PoVs into a single erasure root. If we ended up doing that, then the erasure root in the candidate receipt would be entirely redundant.

Hence, if possible, I think leaving any erasure-root calculations to validators would be good for future-proofing. This would also make it easier to change/evolve the availability layer, as changes would then no longer break the collator protocol. (Availability recovery by collators would still be affected, though.)

1 Like

From the candidate validation side, it would be a great improvement to include SessionIndex in the CandidateReceipt. It would significantly simplify the logic of the dispute coordinator, approval voting, and candidate validation, which currently rely on caching session indexes for every activated leaf. When executor parameters were introduced, we discussed including the session index in the candidate receipt, but that change turned out to be too breaking for the scope of that task, so I implemented a two-step parameter request, which has since proven to be not 100% reliable.
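
Roughly, the addition could look like the following (whether it lives in the descriptor or elsewhere in the receipt is an open question; the field name is illustrative):

pub struct CandidateDescriptor {
    // ... existing fields ...
    /// The session index of the relay-parent this candidate is built on.
    pub session_index: SessionIndex,
}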

CC @pepyakin I remember you had some thoughts on that, too

2 Likes

Adding the session index to the candidate receipt makes remote disputes harder to evaluate.

For candidates which actually appear in a relay-chain block, we can verify the session when they are posted to the relay chain. But for candidates which haven’t appeared in any relay chain block, validators would need to check that the claimed session index is accurate for the relay parent (which requires the same lookup paths that this field attempts to optimize around).

I think this might be fine:

  1. In backing we are able to verify the SessionIndex on chain. It should therefore be possible to rule out any candidate getting backed where the SessionIndex is not correct.
  2. With 1 in place, approval checkers should be able to simply not approve (not dispute) if they ever saw such a candidate, as it is guaranteed not to exist - therefore backers cannot trigger a finality stall.
  3. With 1 and 2 there will never exist honest backing or approval votes for a candidate where the SessionIndex does not match.
  4. All dispute votes will use the same consistent SessionIndex as specified by the candidate. If the SessionIndex were wrong, then the candidate would not exist on any chain. Only dispute votes can exist, and they all use the same SessionIndex. Therefore results would be consistent, which seems to be the only thing that really matters for a candidate that was never included on any chain.

With 3 there will never be an inconsistency between honest validators, which could otherwise lead to them getting slashed.
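
A minimal sketch of the check in point 1, assuming the session_index field sketched earlier in the thread and a hypothetical session_index_for lookup from relay-parent hash to session index:

fn session_index_matches(
    descriptor: &CandidateDescriptor,
    session_index_for: impl Fn(&Hash) -> Option<SessionIndex>,
) -> bool {
    match session_index_for(&descriptor.relay_parent) {
        Some(expected) => descriptor.session_index == expected,
        // Unknown relay parent: the candidate cannot be backed on this chain anyway.
        None => false,
    }
}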

1 Like

In practice it might often happen that connectivity issues of a parachain collator will stall all other parachains participating in an accord. I am thinking that composability probably requires some heavy collator coordination, not necessarily sharing the infrastructure (yes, that would be more efficient), but it should work on a best-effort basis to avoid liveness issues.

Off the top of my head, one possible solution could be for the collators to craft a normal block and provide a scheduling_affinity: Vec<ParaId> in CandidateCommitments. This includes the set of parachains the candidate wants to exchange information with if scheduled together.

The backers execute the PVF twice, with different parameters on the second run to describe channels to other candidates of interest (scheduling_affinity) that are assigned to the same backer. We’ll need to add a ComposabilityCommitments structure for the new commitments, which are first produced by backers. It must include a new head_data for an additional block created by executing the calls to other parachains. So, due to this, composability checkers must build a child block based on the initial candidate for each parachain that changed state during this process.
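
To illustrate the shape of this (all names here are tentative sketches, not final), the new commitments could look roughly like:

pub struct CandidateCommitments {
    // ... existing fields ...
    /// Parachains this candidate wants to exchange information with if scheduled together.
    pub scheduling_affinity: Vec<ParaId>,
}

/// Commitments first produced by backers on the second PVF execution.
pub struct ComposabilityCommitments {
    /// Head data of the additional child block created by executing calls to other parachains.
    pub new_head_data: HeadData,
}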

The major issue with this is that collators will need to import the new block and drop any block they have built already. Importing it might be time consuming and requires replicating the execution environment of the backers.

I’m not sure about the general direction of this. Much of it (execution time, PoV size, synchronous composability) seems to be based on the (unstated) assumption that it is helpful for the Relay-chain itself to become a “smart” scheduler of PoVs and thus be based on a dynamic model of PoVs. I’ll unpack the difference between the “smart” and dumb models.

The dumb model is where PoVs have homogeneous resource requirements for evaluation. They use the same I/O bandwidth and they require the same computation. Validators, we know, provide a fixed and well-known resource in terms of compute time and bandwidth. Thus, any validator group can validate any PoV just as well as any other. As such, the scheduler (read: Relay-chain) must “only” ensure that each validator group is assigned to an economically valuable PoV each block. While this is not necessarily a trivial task, it is much easier than the alternative.

The dynamic model, which implies a smart scheduler, would drop the assumption that PoVs have homogeneous resource requirements needed for their evaluation. I/O, execution time and synchronicity all become negotiable factors in a PoV. The onus is placed on the Relay-chain apparatus to ensure that PoVs are assigned, possibly in a many-to-one fashion, among validator groups in such a way as to maximise the overall economic utility of those groups. Groups themselves may possibly even have different resource provisions. The scope of the problem to be solved by the Relay-chain becomes a dynamic and decentralised constraint optimisation problem.

Overall, it seems to me that this may be placing a little too much responsibility on the Relay-chain and goes against the maxim of “do one job but do it well”. While I’d agree that optimal resource usage is an important goal, could we not find an alternative model whereby PoVs remained static in their resource and synchronicity requirements, but tasks could still be combined in some pre-Relay-chain phase, one level higher? This would mean a single PoV could imply the progression of multiple tasks, but PoVs would come as atomic parcels which fit well according to the resources which a validator group can be assumed to have.

Beyond optimal resource usage, we have two problems to solve, which actually overlap little in requirements:

  1. We want work packages which land on the relay chain to consume some bounded amount of resources. Those work packages are composed of one or more state transitions bundled together whose data are served by roughly the same validators primarily for the purposes of fast paths in the networking layer. The staked tokens attesting to each state transition’s validity need not overlap, so long as they are sufficient.
  2. We want state transitions to be possibly synchronously composed. There is no hard requirement for the same validators to check composed transitions at any stage in the pipeline or for their data to be served by any of the same validators, so long as they are guaranteed to go in together or not at all. Therefore, synchronous composability is independent from bundling state transitions in work packages.

While we may unify (1) and (2) for implementation simplicity or PoV optimization, this comes with tradeoffs which will bear some further discussion, and a solution which separates those concerns while maintaining simplicity seems possible.

The devil is in the details here. That is, the on-chain scheduling problem is vastly simplified by limiting its specificity as to which tasks are co-scheduled, to what extent and at which times, and which validators are intended to work on which tasks, at least to the extent that these details can be negotiated on the p2p layer and handled within a more powerful and information-rich environment.

Though for the scope of this RFC, we just need to decide on the features which a state transition receipt must have.

This definitely seems sensible, yes. The question in my mind is whether these bounds should not just be some system-wide value, the same for all PoVs. Perhaps it’s fine to make it explicit if for no other reason than forward compatibility. I would certainly agree that we need a clear way of stating this, the question is whether it should be stated inside the PoV, and if so, for what reason?

I would say that this is questionable. Not because we don’t want tasks to be synchronously composable: I firmly believe that we do. But as I alluded to earlier, because it’s not clear to me that we want the Relay-chain to be solving the rather difficult problem of combinatoric optimisation as well as all of its other duties. Specifically, we have (at least) two routes to synchronous composability of tasks:

  1. 1-PoV-per-Task, Relay-chain-Combines: Here we systemically have PoVs utilize fewer resources than a validator-set can validate in one Relay-chain block. Each PoV guarantees the state-transition of a single task (as now). The Relay-chain somehow figures out how to schedule potentially 1000s of PoVs across, say, 500 validator groups, all of which have not just differing resource requirements but also different synchronicity (co-scheduling) requirements.

  2. 1-PoV-per-Group, PoV-Authors-Combine: Here we “pre-bake” PoVs to have no synchronicity (co-scheduling) requirements and (presumably) to utilize the same amount of resources guaranteed by a single validator group. To allow the possibility of synchronicity, PoVs may reference (and thus transition) more than one task. Canonicality is determined at a per-PoV level (all transitions in a PoV are either canon or not). It is left as a problem for tasks to have their PoV-authors solve the economic-combinatoric puzzle to determine which tasks should be co-scheduled and when.

Spree is an interesting one; the assumption was that it would be a special case of (2), but in principle it could fit into the model of (1).

In my mind, the hardest part of all of this is solving the economic-combinatoric puzzle of co-scheduling tasks in a manner resilient to the usual attacks (eclipsing, censorship). I’m skeptical of bundling this responsibility up with the Relay-chain validators, rather than leaving it as a problem to be solved at a higher level, probably as multiple competing opt-in Task-PoV-Author networks.

We want state transitions to be possibly synchronously composed.

I would say that this is questionable. Not because we don’t want tasks to be synchronously composable: I firmly believe that we do.

I meant this to be a general formulation that encapsulated the end goal as a point of agreement. I want to be clear I’m neither arguing for (1) or (2) and have no favorite solution here as long as the goals are met. I’ll refer to (1) as the ‘bundling’ approach and (2) as the ‘merging’ approach.

1-PoV-per-Group, PoV-Authors-Combine: Here we “pre-bake” PoVs to have no synchronicity (co-scheduling) requirements and (presumably) to utilize the same amount of resources guaranteed by a single validator group

I agree that co-scheduling is a main source of complexity. As long as coretime is scheduled on a per-task basis there must be some co-scheduling requirement for the relay chain to accept simultaneous state transitions of any kind. When building, backing, or posting (to the relay chain) state transitions for multiple tasks, all must have current available coretime and the relay chain runtime must be able to evaluate that a task’s resource consumption is paid for and debit them appropriately.

It’s likely intractable for the relay chain runtime to have to anticipate the combinations of possible state transitions of different sizes and designate the validators which will work on those combinations. This is the basis for my suggestion that we must punt this up to higher off-chain layers, regardless of what those might be. However, even in (2) it does not seem possible to punt this up to only the PoV-builder level as the builder must know which validators to send to and those validators (and others) must be able to recognize this designated responsibility. Duplicated responsibilities correspond to higher load in evaluating prospective state transitions (reduced spam resistance).

As far as I can see, this means validators must play some off-chain role in solving the economic-combinatoric puzzle regardless of which approach is taken.

We might sidestep this scheduling problem by scheduling builders (with custom authentication which can execute code) rather than tasks, though I haven’t thought about this much.

I’m also sensing that we aren’t using the term PoV in quite the same way, so I’ll clarify my view a bit - I usually just use PoV to refer to the specific opaque bytes passed into the Wasm validation function, as it’s the Proof of Validity for something, and that something has certain inputs and outputs (or axioms and theorems, to mirror the ‘proof’ language). This something is a proposition carrying the necessary context to know what’s being proved, and that’s the level where all tasks and their resource consumption would be specified in both (1) and (2), with only some minor differences. This something needs a name; we have previously called these “parablock candidates”, or candidates for short.

On the subject of definitions, here are the things to which I’d like to refer:

  • State-transition function, stored on-chain in some deterministic machine-executable language held as comprehensible by the Polkadot protocol (currently Wasm). Since it is deterministic, it strictly defines the set of valid (state, next_state[, input, output, relay_state]) tuples. This doesn’t really have a name, though it might be called a Task’s STF. When the Task is the validation of a parachain (as it always is at present), this might be called the PVF.
  • A combination of (something which identifies a specific) Task STF and a Task State. At present I’m largely identifying this as an instance of a Task, or just a Task. When the Task is specifically for Validating a parachain, it has been called a “Parachain/thread instance”. I’d use the term “Taskspace” to indicate the composition of these states and STFs.
  • A blob of data which the assembled Polkadot Validators (starting with a limited grouping of them) may judge represents a valid transition within Polkadot’s Taskspace. I’ve lately been referring to this as a PoV (i.e. validity in accordance with the STFs represented on Polkadot according to any machine execution languages enshrined within the Polkadot protocol).

Thus far PoVs have been limited to proving the correctness of only one single Task (specifically a Parachain/thread) and only in WebAssembly. However, AFAICT there’s a clear way forward to loosening this constraint and making Polkadot Validators a more general resource for ensuring the validity of state transitions across multiple state machines defined even in multiple machine-executable languages (RISCV, I’m looking at you).

Imo it shouldn’t be the responsibility of Polkadot to schedule different-sized workloads, but rather for higher level providers to do the job.

At this point, it is helpful to consider how Web 2 cloud resources are provisioned. You typically rent VMs with clearly defined resource allocations (e.g. 1 vCPU, 4 GB RAM). Then, depending on the workload you want to execute, differently sized packages are available, like high RAM, high CPU… This is a helpful analogy to the per-group model, where the consumer rents resources and then uses them or lets them run empty.

An analogy to the per-Task model is serverless, which came much later in the development of Web2, giving us an indication that it is much harder to implement (and is, in a way, just a wrapper on top of a per-group model).

To me, this points in the direction of a per-group model. If you want to refer to the provisioned machine type in the RFC in a forward-compatible way, it might be sufficient to reserve space for a simple identifier for the machine type (similar to how Google has N1, N2, N2D, T2D, T2A, E2, C2… machines) and just refer to the current resource package as the first generation.
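
As a hypothetical sketch of reserving such space (not a concrete proposal; field name and width are illustrative):

pub struct CandidateDescriptor {
    // ... existing fields ...
    /// Identifier of the resource package ("machine type") this candidate targets.
    /// The current, uniform resource limits would simply be generation 1.
    pub resource_class: u8,
}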

A bit tangential, but when considering the resource limits of PoVs, we should assume that no matter the size, over time they will be efficiently maxed out. We can be mindful of current developments like orderflow auctions (SUAVE), shared sequencing, and builder cartelization, which all point towards off-chain infrastructure working diligently to bundle as much as possible into the type of resource that is given to them (to optimize revenue). For example, Jehan Tremback from Informal Systems just released an article on Atomic IBC, which is akin to shared sequencing, but packages the state transitions of multiple runtimes into a single block, called “megablocks”. So we already see a push to max out on the resources that are given.

1 Like