Expose builder-facing bridging components on Asset Hub

I am posting here the contents of a HackMD doc exploring options for System Bridges, in the spirit of “The Plaza concept”, for wider reach and public feedback.

Bridging components location: Bridge Hub or Asset Hub?

Bridges on-chain components (Polkadot side)

Very simplified:

1. Consensus layer

This is the protocol layer responsible for following and exposing the other chain’s consensus protocol, with the principal component being an on-chain light client for the other chain.

The “input” of this layer is consensus proofs generated by the bridged chain’s validators.
The on-chain light-client can understand and process these proofs.
The “output” of this layer is a stream of finalized headers belonging to the other chain.

2. Parachain finality layer

This on-chain component verifies proofs that certain parachain headers have been included in a finalized Relay Chain header imported and proven by the previous layer.
It builds on top of the previous layer to prove the finality of the other side’s parachains.
The “output” of this layer is a stream of finalized headers for each “followed” parachain on the other side of the bridge.

3. Messages layer

This layer is responsible for sending and receiving messages across the bridge.
Sending is relatively easy: enqueue the message on-chain in an outbound-queue and wait for a relayer to send it to the other side and submit back a delivery proof.
Receiving (messages or delivery proofs) is done by verifying storage proofs that message M is part of the other side’s outbound-queue. If the proof is correct, the message is dispatched to the recipient location within the Polkadot Ecosystem (aka delivered).
Verifying the claimed storage proofs relies on knowing the other side’s storage root, which is provided by the lower layers already discussed (1&2 above).
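
To make the "verify a storage proof against a known root" step concrete, here is a minimal sketch. It is a simplification: it uses a plain binary Merkle tree with SHA-256, whereas real Substrate storage proofs are trie proofs with different hashing. All names (`merkle_root_and_proof`, `verify_inclusion`) are hypothetical and only illustrate the idea that the receiver needs nothing but the message, the proof, and a root it already trusts from layers 1&2.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is a stand-in, not the hash used on-chain)."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Sender side: build a binary Merkle tree over the outbound queue and
    return (root, proof) for leaves[index]."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1
        # Record the sibling hash and whether it sits to the left of our node.
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(message, proof, trusted_root):
    """Receiver side: recompute the root from the message and the proof,
    then compare it with the root provided by the lower layers."""
    node = h(message)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == trusted_root


outbound_queue = [b"msg-0", b"msg-1", b"msg-2"]
root, proof = merkle_root_and_proof(outbound_queue, 2)
print(verify_inclusion(b"msg-2", proof, root))   # genuine message
print(verify_inclusion(b"forged", proof, root))  # message not in the queue
```

The receiving chain never sees the whole queue; it only checks that the relayer-supplied proof leads to the root it already considers finalized.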

4. XCM layer

Executors, Interpreters, Routers, Adapters, etc to go from “vanilla” XCM to blobs of data over bridge (messages).

Users/Developers/dApps interaction points

Without going into details, it’s important to mention that users don’t directly interact with the layers 1 to 3, the bridge being mostly abstracted away by XCM.

“Users” of the bridge (individuals/wallets/smart-contracts/dApps) interact only with the XCM layer (4), and thus indirectly interact with the Messages layer (3). The consensus layers (1&2) are bridge-specific machinery not at all exposed to its users.

Let’s look at how XCM (mostly) abstracts away the bridge:

Whether generated locally (through an extrinsic or by a smart contract), or coming from some other parachain, by executing XCM0 below on Asset Hub, one can transfer any asset known to Asset Hub to some beneficiary on Ethereum:


Figure 1

To support decentralized relayers that can turn a profit while keeping the bridge as cheap as possible, the relayer price market needs to be dynamic and adjust to fluctuating congestion and gas prices. Therefore, we’re building offchain tools (libraries) that can check different params on both sides of the bridge and provide this “cheap but still incentivizing” value (eRwd). This value must be included in the fees carried by the XCM program.
In the example above, the user/builder starts with their T0 but also needs to know X0, D0, eRwd values in order to build a working XCM that pays the right fees along the way. They get these using offchain APIs:

  • X0 and D0 - get it from XCM Runtime APIs on AH,
  • eRwd - get it from a Snowbridge API on Ethereum.
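
As a rough illustration of what a builder-side helper could look like, here is a minimal sketch. Everything in it is an assumption: the pricing rule (gas cost plus a margin in basis points) is a hypothetical stand-in for whatever the real offchain tools compute, and the two `query_*` callables stand in for the XCM Runtime APIs on AH and the Snowbridge API on Ethereum, whose actual interfaces are not shown here. The three values are kept separate because they may be denominated in different assets.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class FeeInputs:
    x0: int     # obtained from the XCM Runtime APIs on AH
    d0: int     # obtained from the XCM Runtime APIs on AH
    e_rwd: int  # relayer reward, obtained from a Snowbridge API on Ethereum


def estimate_relayer_reward(gas_price_wei: int, delivery_gas: int, margin_bps: int) -> int:
    """Hypothetical "cheap but still incentivizing" rule:
    the relayer's expected gas cost plus a small margin."""
    cost = gas_price_wei * delivery_gas
    return cost + cost * margin_bps // 10_000


def collect_fee_inputs(
    query_ah_fees: Callable[[], Tuple[int, int]],
    query_relayer_reward: Callable[[], int],
) -> FeeInputs:
    """Gather everything the XCM builder needs before assembling the program."""
    x0, d0 = query_ah_fees()
    return FeeInputs(x0=x0, d0=d0, e_rwd=query_relayer_reward())


# Usage with stubbed APIs (all numbers are made up for illustration).
inputs = collect_fee_inputs(
    query_ah_fees=lambda: (1_000, 2_000),
    query_relayer_reward=lambda: estimate_relayer_reward(20_000_000_000, 200_000, 1_000),
)
print(inputs)
```

Injecting the two queries as callables keeps the sketch independent of any particular RPC client, and makes it easy to swap in real API calls later.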

So, while the bridge is designed to be abstracted away and not visible to the transported/exported XCM, some things bleed through the abstractions. The XCM builder (UI, wallet, dApp) ultimately does need to understand that their XCM goes over a bridge and do some bridge-specific steps. If all these steps are provided by us in libraries we can still claim we are abstracting away the details, but the larger point remains that the builder cannot be completely oblivious.

Please note that the above still stands even without Bridge Hub (if the bridge were directly on Asset Hub). UIs would still have to query, offchain, the on-chain execution/gas fees on both Polkadot and Ethereum to provide the lowest economically viable fee (viable: some relayer will actually relay it).

Bridges on Asset Hub instead of Bridge Hub

Effects on UX: does it make any difference to end users?

For the vast majority of end-users it doesn’t make any practical difference. It would be slightly cheaper and latency slightly smaller because of eliminating the AH->BH hop. But the gains are practically irrelevant at the time of writing this:

  • 0.15 USDT reduction out of 26 USDT,
  • ~12 seconds reduction out of ~30 mins.
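
To put the two figures above in relative terms, a quick calculation (using only the numbers quoted in this post):

```python
# Relative savings from removing the AH->BH hop, using the figures above.
fee_saving_pct = 0.15 / 26 * 100            # USDT saved / total USDT fee
latency_saving_pct = 12 / (30 * 60) * 100   # seconds saved / total seconds

print(f"fee saving:     {fee_saving_pct:.2f}%")
print(f"latency saving: {latency_saving_pct:.2f}%")
```

Both come out well under 1%, which is why they are treated as practically irrelevant here.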

The same applies to smart contracts running on Asset Hub: they would have to build and send the same XCM to Ethereum (or Kusama) regardless of the bridge’s location (AH or BH), so there is no effect on smart contract UX.

The main UX benefits are:

  • Better error handling in case of XCM errors. If XCM execution fails on Bridge Hub, custom tooling is required to claim the trapped assets, which BH doesn’t even understand (BH only understands DOT). If the same XCM fails on Asset Hub, the trapped assets can be easily recovered.
  • Easier tracking of XCM messages (fewer hops).

Effects on DevX: does it make any difference to builders?

Again, the vast majority of builders should use the existing AssetHub <> Ethereum (or PolkadotAssetHub <> KusamaAssetHub) lane. The vast majority of usecases can be built to go through Asset Hub before going to Ethereum and everything works nicely. Transferring any AH-assets (DOT, WETH, USDT, USDC, bridged-ERC20s, etc) requires going through Asset Hub anyway, so might as well do everything in one go.

For some advanced usecases however, Parachains might want to use dedicated bridge lanes. This allows them to do custom handling of bridged assets, gives them better control over traffic flow, allows them to have dedicated relayers with custom properties around QoS, pricing, latency, etc.

Figure 2 (dedicated BH lanes)

BH dedicated lanes: Bad DevX and bad UX:

  • [UX & DX] cannot use the dedicated lane for transferring most assets (that have reserve on AH),
  • [UX & DX] cannot use WETH for fees since WETH has reserve on AH
    (message has to go through AH),
  • [UX & DX] have to use DOT for fees,
  • [UX & DX] DOT has to be prefunded on BH, it cannot be included in the exported message (same as WETH, DOT reserve is on AH),
  • [DX] have to build and maintain custom relayers that take DOT rewards,
  • [DX] have to build and maintain separate fees estimation APIs
    for dedicated lanes.


Figure 3

AH dedicated lanes: Good/better DevX and UX:
To use dedicated lanes, builders still need to build custom XCMs, but UX is much better:

  • can use WETH for fees,
  • no need to prefund anything,
  • can use the existing relayers that take WETH rewards,
  • can use existing fees estimation APIs even with dedicated lanes,
  • could use the dedicated lane for transferring most assets (if we build some “smart” things within AH bridging pallets - doesn’t work out of the box).

Effects on maintainers and operators of existing bridges

Clear win on lower code maintenance: one less runtime to maintain (more than one actually: Polkadot BH, Kusama BH, Paseo BH, Westend BH) - this applies to Parity but also ecosystem codebases like tools, wallets, etc.

Clear win on operational overhead: fewer system chains to run nodes (collator, RPCs, etc) for.

Clear win for ecosystem tooling: fewer block explorers/indexers required, message tracking is simpler with fewer hops.

Also a win for the bridges maintainers as it brings down complexity.

Equivalent example of Figure 1 but without Bridge Hub:


Figure 4

If you compare Figure 1 and Figure 4, you can see that the flow is simplified with fewer XCMs flying around and less logic to abstract them away. We won’t have to work so hard on “making it seem” the message goes straight from AH to Ethereum, and we can get rid of a big bunch of code.

Another win is that we (Parity/BH-maintainers) don’t have to implement bridge-hub congestion mechanisms, which is something we will have to do once the bridges start getting more traffic.


Figure 5

versus

Figure 6 (AH, no congestion)

In Figure 5 we can see that the current architecture is vulnerable to one bridge being able to DoS the other(s). This is not a security concern yet, but it could be later.
With the Bridge Hub architecture we’d need to implement HRMP logical subchannels capable of independent backpressure. That is yet more complexity added to a generic component (HRMP), and it is only relevant to the bridges usecase.

Without Bridge Hub, as can be seen in Figure 6, the problem simply goes away naturally. There are no shared queues or channels between pairs of chains.
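
The difference between a shared channel and independent queues can be shown with a toy model. This is only a sketch under simplifying assumptions: queues are plain bounded FIFOs that are never drained, the capacity of 4 is arbitrary, and `BoundedQueue` and `send_all` are hypothetical names, not actual HRMP or pallet APIs.

```python
from collections import deque


class BoundedQueue:
    """Toy bounded FIFO standing in for a channel with limited capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def try_push(self, msg):
        if len(self.items) >= self.capacity:
            return False  # backpressure: the channel is full
        self.items.append(msg)
        return True


def send_all(queue_for, traffic):
    """Push each (bridge, msg) pair into the queue chosen by queue_for
    and count how many messages each bridge managed to enqueue."""
    accepted = {}
    for bridge, msg in traffic:
        ok = queue_for(bridge).try_push(msg)
        accepted[bridge] = accepted.get(bridge, 0) + int(ok)
    return accepted


# One bridge floods the system, then the other sends a single message.
traffic = [("ethereum", i) for i in range(10)] + [("kusama", 0)]

# Shared channel: both bridges compete for the same capacity.
shared = BoundedQueue(4)
shared_result = send_all(lambda bridge: shared, traffic)

# Independent queues: each bridge has its own capacity.
dedicated = {"ethereum": BoundedQueue(4), "kusama": BoundedQueue(4)}
dedicated_result = send_all(lambda bridge: dedicated[bridge], traffic)

print(shared_result)     # {'ethereum': 4, 'kusama': 0}
print(dedicated_result)  # {'ethereum': 4, 'kusama': 1}
```

In the shared case the flooding bridge exhausts the capacity and the other bridge's message is rejected; with independent queues the congestion stays local to the bridge that caused it.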

Effects on the network/ecosystem

Currently BH blocks are mostly empty, so we are mostly wasting a Polkadot core.

  • We could make BH an on-demand parachain and use less than a core on average, but that’s more work/more complexity.

From that point of view, moving bridges to Asset Hub and removing Bridge Hub would improve System Chains blockspace efficiency.

The counter-argument is that we’d be increasing the load on Asset Hub, but as of now this increased load is negligible, and even with high future bridge traffic and increased load, Asset Hub should be able to handle it through elastic scaling or agile coretime.

Migrating bridges to Asset Hub: cost and timeline

Starting work on this would be at the earliest sometime in Q1 2025, after we launch the currently in-progress features and enhancements:

  • Lower fees for P<>E (through a new fees and rewards model that uses only WETH),
  • Unordered messages for P<>E (improves legal compliance as relayers can choose to ignore known bad actors or poisoned funds),
  • Polkadot-Native-Assets on Ethereum,
  • Transact over P<>E,
  • Transact over P<>K,
  • Polkadot (BridgeHub) <> Bulletin chain launch (required for Proof of Personhood launch),
  • Kusama controlled over bridge by Polkadot Fellowship, retire Kusama Fellowship,
  • ?? am I missing anything that is in flux on BH ??

There are too many moving pieces now to attempt migrating bridges to AH before stabilizing the above.

In terms of cost, a very rough back-of-the-envelope estimate is one dev for 3 months.

Also note that such a migration would not be “atomic” because of the async nature of the bridges and the async nature of runtime upgrades. There will be a period during which we are running asymmetric bridges (one side on AH, one side on BH), and maybe even both AH and BH bridges active (after switching to AH, but until we purge the queue of the old BH bridge). Even so, I believe we can do the migration in the background without disruptions - even while operating asymmetrically.

Personal recommendation

With the new Polkadot capabilities allowing the Asset Hub chain to scale up as needed, there is no real need for a separate Bridge Hub system chain. Furthermore, long-term maintenance costs and system complexity would both be reduced if the bridges were hosted directly on Asset Hub.

In conclusion, I believe we should move the bridges to Asset Hub, but there are no significant short-term gains for doing it, so it is not high priority.

Hybrid setup during transition

However, I do believe we should start working in that direction by deploying new bridging features and enhancements directly on Asset Hub where possible.

I suggest we run in a hybrid setup where we keep layers 1&2 on Bridge Hub, but launch any new versions of layers 3&4 on Asset Hub. This way, we get all the UX and DX benefits for new bridging functionality without having to invest a lot of resources now to move everything over to Asset Hub. For example, support for parachain dedicated bridge lanes should go directly on Asset Hub.

Over time, we can slowly move the machinery for both bridges to Asset Hub, in the background without impacting the Ecosystem or the existing roadmap/timeline for delivering new bridge features.
E.g. Snowbridge roadmap is pretty tightly packed until late 2025, so moving Snowbridge L3&4 to AH will happen much later than Polkadot<>Kusama bridge.
In the end we might even decide to keep Bridge Hub hosting layers 1&2 in perpetuity; it’s the builder-facing components (layers 3&4) whose access, UX and DX we really care to improve.


This sounds dangerous & pointless. Cool. :slight_smile:

At a high level, polkadot only uses on-chain messaging, because polkadot has badly designed storage, no storage abstractions, and everyone was rushed. I’d hope HRMP etc winds up entirely deprecated in a few years, replaced by cross chain Merkle proofs ala off-chain messaging. BridgeHub sending assets via messages is a symptom here. Afaik BridgeHub should not send or receive any messages.

What is BridgeHub really then?

BridgeHub communicates state roots. BridgeHub being the only real ETH bridge says nothing about direct communication between ETH and parachains. Instead of “sending tokens” via messages, we should directly validate Merkle proofs into state roots, so that arbitrary ETH state could be proven to any parachain, and arbitrary parachain state could be proven to any ETH contract, assuming the Merkle tree uses ETH friendly hash functions.

BridgeHub tracks ETH validators’ BLS keys, which requires checking many BLS signatures. ETH aggregates these, but not tightly enough. At a guess, aggregated they remain 6-10 x more expensive than unbatched Ed25519, and ETH has lots of validators. This must run or else the bridge fails, but if the bridge failed only in this direction then governance could restart it.

BridgeHub must run “enough” per epoch to update ETH’s view of the polkadot beefy keys. If this direction fails, then all bridge funds could become irrecoverable. As a rule, smart contract chains always admit bizarre griefing attacks, so they cannot be trusted to prioritize specific functionality. BridgeHub should be kept simple and audited.

We do have DoS vulnerabilities in Polkadot, like all distributed systems, including ones that dramatically reduce throughput, both slashed and unslashable. We’ll hopefully prioritize “critical” system parachains over others when under attack, where critical means validator elections, validator DKGs, bridge key transfers, and emergency governance, like maybe collectives. “On-demand” describes these poorly, we run them whenever they require running, even if that means ignoring paying work, like OS kernel tasks. Ideally, smart contracts should never be supported by “critical” system parachains.


@burdges your description is in line with the proposal above. Everything you mentioned covers only the consensus/state layers, referenced as “layers 1&2” in the original post.

The protocol for these layers 1&2 will stay on Bridge Hub; the proposal is to move the rest of the stack to Asset Hub and thus, as you say, keep Bridge Hub simple and audited.
The “rest of the stack” meaning messages/XCM transport and messages/XCM execution logic.

Once Bridge Hub imports consensus and state information about the other chain (say Ethereum), offchain message relayers can transport/deliver XCM messages from Ethereum straight to Asset Hub, alongside Bridge Hub storage proofs that the message is actually part of Ethereum’s state, thus achieving exactly what you suggest:

BridgeHub sending assets via messages is a symptom here. Afaik BridgeHub should not send or receive any (XCM) messages.


Ahh cool, I misread then, sorry & thanks. :slight_smile:

Plaza has maybe created some confusion more broadly. We’ll do some consensus work on parachains, including some governance, and validator elections, which read staking information etc elsewhere. We should keep this separated from whatever users do, especially using contracts, but overall this need not be inconvenient for users.


Using this argument, it would not have made sense at all to build the Polkadot architecture. For sure, putting everything on one chain will make it easier, but it will hurt you in the end if you need to separate the logic again when the traffic has increased. Also, elastic scaling is not going to magically solve all performance issues.

What is also left out of this proposal is that there will always be some kind of use case that is not solely token related, like use cases where a parachain wants to call an ETH smart contract or vice versa. Would they also need to go through AH?

I’m personally not a fan of trying to force-push everything onto AH and then assuming it will solve all the issues. If the changes applied to AH attract the expected builders etc, we will very quickly run into the problem of having one central point of failure. I mean, for sure, this is a luxury problem to have :slight_smile:
