What do we want from storage proofs?

- Any classic Merkle tree should be binary, not a monstrosity like ETH’s radix 16. We can likely optimize further how blake2 gets used here too, as SPHINCS+ does a binary hash in one 512-bit go.
- SNARK-friendly hash functions bring strange arities like 4 or 9. These bring strange proof types too: the SNARKs themselves of course, but likely also odd non-SNARK proofs that make up for their bad arity outside SNARKs. Some uses still need classical Merkle proofs even with bad arities.
- We need Merkle proofs into our own chain’s past state. Sometimes these are only shallow, but some chains need deeper proofs, implemented via a skip list or Merkle mountain range, or occasionally the MMB when you optimize for recent state but still expose deep state. These are also sometimes SNARK-friendly, à la Zcash Sapling.
- We need Merkle proofs into other chains’ recent state, presumably by passing through a recent relay parent hash to a relay chain state, and then passing on to the other chain’s state. We’ll sometimes want another SNARK-friendly access path across a family of chains though.
- Almost all storage proofs demand aggregation of some form. Merkle proofs of course unify as you go up the tree, but KZG proofs are only really helpful once you batch them. It follows that proofs should be aggregated across the whole block, and thus live outside the core block in some PoV-like construct. Anything strange is still part of the block from Polkadot’s perspective though, not part of the PoV data.
- It’s clear Merkle proofs do not aggregate across chains, so the real PoV data does not merge with this in-block PoV-like data for Merkle proofs. There exist proof types that aggregate across chains, but they’re likely only relevant for swarms of zk parachains, so the block vs PoV distinction survives I think.
- Individual tx that employ strange storage proofs should not sign over the storage proof body, because these storage proof bodies wind up aggregated away: they either move into another part of the block for Merkle copaths, or disappear almost completely in KZG.
- Individual tx do however need to authenticate the strange proofs they use, which means they should sign enough of the proofs’ leaves and roots. If not, we risk strange replay attacks changing votes’ roots or whatever.
- I’d expect tx must supply these proofs too, which means whatever generates them should run RPC calls to multiple chains to acquire the proofs. Or smoldot?
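
As a rough illustration of the binary-tree and aggregation points above, here is a minimal sketch of a binary blake2 Merkle tree with a shared copath for several leaves. Everything in it is an assumption for illustration: the function names, the `leaf`/`node` personalization tags, the power-of-two leaf count, and the use of blake2b with 32-byte digests are not taken from any existing runtime or trie format.

```python
import hashlib

def leaf_hash(value: bytes) -> bytes:
    # Domain-separated leaf hash (the "leaf" tag is an illustrative choice).
    return hashlib.blake2b(value, digest_size=32, person=b"leaf").digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # Two 32-byte children form a single 64-byte (512-bit) input.
    return hashlib.blake2b(left + right, digest_size=32, person=b"node").digest()

def build_levels(values):
    # Assumes a non-empty, power-of-two number of leaves to keep the sketch short.
    assert len(values) > 0 and len(values) & (len(values) - 1) == 0
    level = [leaf_hash(v) for v in values]
    levels = [level]
    while len(level) > 1:
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove_many(levels, indices):
    # Emit only the siblings the verifier cannot recompute from the proven leaves.
    known = set(indices)
    copath = []
    for level in levels[:-1]:
        for i in sorted(known):
            if i ^ 1 not in known:
                copath.append(level[i ^ 1])
        known = {i >> 1 for i in known}
    return copath

def verify_many(root, depth, items, copath):
    # items maps leaf index -> leaf value; copath is consumed in prover order.
    known = {i: leaf_hash(v) for i, v in items.items()}
    rest = iter(copath)
    try:
        for _ in range(depth):
            parents = {}
            for i in sorted(known):
                if (i >> 1) in parents:
                    continue  # this pair was already combined via its left child
                sibling = known.get(i ^ 1)
                if sibling is None:
                    sibling = next(rest)
                left, right = (known[i], sibling) if i % 2 == 0 else (sibling, known[i])
                parents[i >> 1] = node_hash(left, right)
            known = parents
    except StopIteration:
        return False  # copath too short
    return known == {0: root} and next(rest, None) is None

values = [f"key{i}=value{i}".encode() for i in range(8)]
levels = build_levels(values)
root, depth = levels[-1][0], len(levels) - 1
shared = prove_many(levels, [2, 3, 6])
separate = sum(len(prove_many(levels, [i])) for i in (2, 3, 6))
```

Proving leaves 2, 3 and 6 separately costs one sibling per level each, while the shared copath drops every sibling that another proven leaf already supplies. That is the sense in which Merkle proofs unify as you go up the tree, and why a per-block copath store beats per-tx proof bodies.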

I’ve made this sound scary and complex, especially by bringing up KZG, etc., but we should develop a feel for the larger design space, even though we won’t do this all at once. Counting the points above in order, I think the tension between 5 and 7 (proofs get aggregated across the block, so tx cannot sign over proof bodies) and 8 (tx must still authenticate the proofs they use) feels fundamental to doing storage proofs well. The block vs PoV distinction still kinda survives, like 6 says; it’s just that the block itself needs some internal proofs data structure.
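
One way to picture the sign-the-claims-not-the-proof idea is the sketch below, where a tx signs a digest of its call, the roots, and the leaves it relies on, and never touches the copath. It is purely illustrative: `signing_payload`, the length-prefixed encoding, and the example call bytes are hypothetical, and HMAC stands in for a real signature scheme only so the example runs on the standard library.

```python
import hashlib
import hmac

def digest(*parts: bytes) -> bytes:
    # Length-prefix each part so concatenation is unambiguous.
    hasher = hashlib.blake2b(digest_size=32)
    for part in parts:
        hasher.update(len(part).to_bytes(4, "little"))
        hasher.update(part)
    return hasher.digest()

def signing_payload(call: bytes, root: bytes, leaves) -> bytes:
    # Commits to the call, the state root, and each (index, value) leaf claim.
    # The copath is deliberately absent, so the block author may aggregate it.
    leaf_parts = [index.to_bytes(8, "little") + value for index, value in leaves]
    return digest(call, root, *leaf_parts)

def sign(key: bytes, payload: bytes) -> bytes:
    # HMAC is a stand-in for a real signature scheme (assumption for the sketch).
    return hmac.new(key, payload, hashlib.sha256).digest()

def check(key: bytes, payload: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(key, payload), signature)

key = b"hypothetical signing key"
root_a = digest(b"state root at block A")
root_b = digest(b"state root at block B")
claims = [(3, b"balance=10")]

signature = sign(key, signing_payload(b"vote(aye)", root_a, claims))
```

Because the root sits inside the signed payload, replaying the same signature against `root_b` or against a different leaf value fails, while swapping a per-tx copath for a block-wide aggregated one changes nothing the signature covers.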

We’ve had several use cases for this stuff in Polkadot itself, async backing and XCMP, and also some places we wisely chose to skip. It comes up in Multichain friendly account abstraction and Interchain Proof Oracle Network too, as well as other lazy messaging discussions brought up by parachain teams. What else?