Co-authored by Aidan Musnitsky
A recent article, "Altair has no Light Client", highlighted flaws in the Altair Light Client (ALC) protocol and questioned its suitability for on-chain light clients.
The team at Snowbridge broadly agrees with the article. This document digs into the issues further and evaluates their implications for Snowbridge.
Traditionally, light clients are expected to preserve the full security and trust assumptions of the underlying chain's consensus. With the ALC protocol, that is clearly not the case: it does not fully preserve the trust of Layer 1 and results in a slightly weaker trust model.
At the heart of the problem is the Sync Committee, a group of beacon chain validators selected to perform extra signing duties:
- The committee consists of 512 members, randomly chosen roughly every day (27 hours)
- A quorum is reached when two-thirds of the committee sign an attestation
- Members are not slashed for signing fraudulent attestations
- Members are penalised an inconsequential amount for inactivity
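As a rough illustration of the committee parameters listed above, the sketch below derives the ~27 hour rotation period and the two-thirds quorum. The slot and epoch constants are Ethereum mainnet values; the function names are our own and are not taken from any client implementation.

```python
from typing import Sequence

# Mainnet parameters; the helper names below are illustrative only.
SYNC_COMMITTEE_SIZE = 512
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
EPOCHS_PER_SYNC_COMMITTEE_PERIOD = 256


def committee_period_hours() -> float:
    """Length of one sync committee period in hours."""
    seconds = SECONDS_PER_SLOT * SLOTS_PER_EPOCH * EPOCHS_PER_SYNC_COMMITTEE_PERIOD
    return seconds / 3600


def has_quorum(participation_bits: Sequence[bool]) -> bool:
    """True when at least two-thirds of the committee signed."""
    if len(participation_bits) != SYNC_COMMITTEE_SIZE:
        raise ValueError("expected one participation bit per committee member")
    # Integer arithmetic avoids rounding issues around the 2/3 boundary.
    return 3 * sum(participation_bits) >= 2 * SYNC_COMMITTEE_SIZE
```

With 512 members, the smallest signer count that satisfies this check is 342, which is the threshold used in the probability analysis later in this document.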
We should have performed more due diligence when evaluating the ALC protocol for fitness for purpose, as we took for granted the assumption that the ALC protocol would preserve the same security as L1.
Having said that, upon digging deeper, we have found that this weakness in ALC is not something that will materially impact Snowbridge’s trust model and that a light client bridge is still radically more trustless and decentralised than any other solution.
Trust Assumptions
Ethereum
Let’s first consider some basic trust assumptions for Ethereum itself, and their implications:
- ~67% of validators are honest, ~33% or less are dishonest: Ethereum should be stable and secure.
- ~66% of validators are honest, ~34% are dishonest: A major attack on availability and on consensus becomes possible, including creating 2 separate forks: https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/attack-and-defense/#attackers-with-33-stake. This attack is very expensive though, and so only makes sense if there is a way to profit beyond that expense.
- 50% of validators are honest, 50% are dishonest: The above attack becomes easier, cheaper and more sustainable, but is still quite expensive. The attackers can likely force Ethereum to defer to off-chain governance for true recovery.
- < 50% of validators are honest: Sustainable censorship, short-term block re-ordering and major MEV become possible and cheap. The dishonest validators cannot be slashed on-chain, though off-chain governance would likely step in to fork and slash them off-chain.
- < 33% of validators are honest: Full censorship and double spends become easy. The dishonest validators cannot be slashed on-chain, though off-chain governance could step in to fork and slash them off-chain.
Next, let’s look at attacks on ALC, rather than on Ethereum itself, and evaluate the weaker trust model that emerges from them.
Possible ALC takeover attacks
For us to rely on an assumption of honesty, any light client bridge needs to take, at the very least, these same trust assumptions and accept these numbers. Even if these hold, we need to defend against several additional attack vectors that may only affect the sync committee validator set rather than the full Ethereum validator set.
Manipulation of RANDAO to gain control of the sync committee selection
A new sync committee is formed every sync committee period (256 epochs, roughly 27 hours) by randomly selecting active validators. RANDAO is used as the source of on-chain randomness for this selection.
RANDAO is not a perfect source of randomness: it is biasable and predictable to a certain degree. Assigned block proposers are able to influence RANDAO by either contributing or withholding their block candidate.
As shown in "RANDAO Takeover", ~50% of the stake is required to gain control over RANDAO.
In the event there is a RANDAO takeover, the Ethereum community would likely have bigger things to worry about than fraud in the sync committee, but even so, we need to accept that our trust level for complete bridge manipulation is now down to a 50% honesty assumption, rather than a 66% honesty assumption.
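The withholding lever described above can be illustrated with a toy model. In the beacon chain, a proposer's reveal is hashed and XORed into the running RANDAO mix. The sketch below is a simplification under our own assumptions: it uses SHA-256 over arbitrary bytes in place of a real BLS reveal, and the function names and the `score` callback are hypothetical.

```python
import hashlib
from typing import Callable, Tuple


def mix_in(randao_mix: bytes, reveal: bytes) -> bytes:
    """XOR the hash of a proposer's reveal into the current 32-byte mix."""
    digest = hashlib.sha256(reveal).digest()
    return bytes(a ^ b for a, b in zip(randao_mix, digest))


def proposer_options(randao_mix: bytes, reveal: bytes) -> Tuple[bytes, bytes]:
    """The two outcomes a proposer can choose between.

    The reveal itself is fixed, so the only choice is whether to publish
    the block (mix updated) or withhold it (mix unchanged).
    """
    return mix_in(randao_mix, reveal), randao_mix


def best_option(randao_mix: bytes, reveal: bytes,
                score: Callable[[bytes], int]) -> bytes:
    """Pick whichever outcome the (hypothetical) attacker scores higher."""
    return max(proposer_options(randao_mix, reveal), key=score)
```

Each proposer therefore contributes one bit of influence. An attacker who controls the last k proposers before the selection point can choose among up to 2^k candidate outcomes, which is why a large share of the stake is needed before this bias becomes meaningful control.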
Probability of a sync committee being dominated by a dishonest minority
On average, the membership of the sync committee should mirror that of the whole active validator set. However, we need to consider outlier days (sync committee periods) where the random selection results in the committee being dominated by a smaller minority stake.
This probability can be approximated using a binomial probability distribution:
- Total number of trials: 512 (size of the sync committee)
- Endpoint: ≥ 342 (the number of successful trials, i.e. dishonest validators selected for membership, needed to take over)
- Probability of success: The probability that any single selected member is dishonest, which is based on the percentage of dishonest validators on Ethereum as a whole. With 50% dishonest validators, this is 1/2; with 33% dishonest validators, this is 1/3.
Plugging these parameters into Wolfram Alpha, we get the probability that a single sync committee will be dominated by dishonest validators in various scenarios:
Dishonest Validators | Honest or Neutral Validators | Chance of sync committee takeover |
---|---|---|
33% | 66% | 8.285 × 10⁻⁵⁴ |
40% | 60% | 1.86 × 10⁻³⁴ |
45% | 55% | 2.4577 × 10⁻²³ |
50% | 50% | 1.1814 × 10⁻¹⁴ |
These are all exceedingly low probabilities. Even with 50% of the full Ethereum validator set being dishonest and a takeover attempted every day (every sync committee period) for 5 years, i.e. ~1825 attempts, the cumulative chance of a takeover is only about 2 × 10⁻¹¹.
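For readers who want to reproduce these figures without Wolfram Alpha, the following is a minimal sketch using exact rational arithmetic from the Python standard library. It makes the same simplifying assumption as the analysis above, namely that each of the 512 seats is filled independently with a fixed probability of being dishonest. The function names are our own.

```python
from fractions import Fraction
from math import comb, expm1, log1p

COMMITTEE_SIZE = 512
QUORUM = 342  # smallest integer that is at least two-thirds of 512


def takeover_probability(dishonest: Fraction) -> float:
    """Binomial tail P(X >= QUORUM) for X ~ Binomial(512, dishonest)."""
    honest = 1 - dishonest
    tail = sum(
        comb(COMMITTEE_SIZE, k) * dishonest**k * honest**(COMMITTEE_SIZE - k)
        for k in range(QUORUM, COMMITTEE_SIZE + 1)
    )
    # Exact rational sum, converted to float only at the very end.
    return float(tail)


def probability_over_attempts(p_single: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries.

    Uses log1p/expm1 because 1 - (1 - p)**n loses all precision for tiny p.
    """
    return -expm1(attempts * log1p(-p_single))


if __name__ == "__main__":
    for share in (Fraction(1, 3), Fraction(2, 5), Fraction(9, 20), Fraction(1, 2)):
        print(f"{float(share):.2%} dishonest: {takeover_probability(share):.4e}")
    p_half = takeover_probability(Fraction(1, 2))
    print(f"5 years of daily attempts: {probability_over_attempts(p_half, 1825):.3e}")
```

Working in `Fraction` keeps every term exact, which matters here because the individual terms are far smaller than ordinary floating-point rounding error relative to 1.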
Disclaimers:
- These calculations are a basic analysis and are approximations. There is more nuance in these systems. For example, an attack similar to the RANDAO Takeover described above could bias the RANDAO and slightly change the numbers in our calculation. However, we are confident that these approximations are practical and that, with ~50%+ honest validators, such nuances are unlikely to be relevant.
- Our team is not mathematics- or statistics-heavy. This is a basic analysis and an approximation, and although we are confident in the above, it is certainly possible that there is a major flaw in our argument or calculations. We invite others who are more familiar with this area to audit and critique our analysis and to provide feedback or suggestions if something major is off.
Part 2 - Practical Security
It’s important to understand how we define trust and honesty in this argument.
For the Ethereum trust assumptions, these are assumptions that are part of the protocol design and the game theory of the protocol. They are based on a model where protocol participants are rational economic actors looking to play the game to maximize their economic return. Trust in this model refers to a game-theoretic idea that one needs to trust that validators will act in their own self-interest and do whatever makes them the most profit. In Ethereum’s case, playing by the rules leads to profits, since if validators break the rules they will lose money, so we trust Ethereum validators based on their own selfish interest.
For the ALC trust assumptions, there are no meaningful slashing conditions, so sync committee members do not stand to lose their stake by committing fraud. In the game-theoretic model, a rational, selfish committee member can commit fraud whenever an opportunity arises without consequence. This means that we cannot rely on game-theoretic trust for ALC, and so our assumption of honesty for ALC is based on real-world assumptions of honesty rather than game-theoretic ones. This is a weaker trust model in terms of game theory, but in practice we don’t believe that it introduces additional meaningful risk. ALC remains secure so long as there is not collusion across over 50% of the total Ethereum validator set.
This section discusses practical considerations that further mitigate the risk of an Ethereum validator-based attack.
Validator statistics
Broadly, there are three kinds of beacon chain validators:
- Solo stakers
- Anonymous validators (including whales)
- Staking pools
Historically and at the time of writing, over two-thirds of active validators are controlled by a relatively small set of staking pools. The top five according to dune.com:
- Staking companies appointed by Lido: 31.36%
- Coinbase: 12.69%
- Kraken: 6.96%
- Binance: 5.79%
- Stakefish: 3.27%
The validator set and sync committee have additional skin in the game
Although committee members are not penalized on-chain for misbehavior, participating in fraud carries second-order, off-chain economic consequences.
According to the statistics presented in the previous section, subverting the sync committee would require major staking pools and companies to become compromised or engage in fraud, which could have significant consequences for their business concerns. Even if such fraud does not directly affect Ethereum, it could make staking customers think twice about the security of their deposits. Coincidentally, many of these staking pools also stake on Polkadot, where the effects of their actions would be even greater.
A large portion of the validator set is vulnerable to off-chain slashing, i.e. reputation loss that could affect their corporate treasuries or assets. Any practical analysis of an attack needs to consider these costs and risks in addition to the game-theoretic analysis and the on-chain system.
We recognise that this honesty assumption relies largely on Ethereum’s relatively centralised validating power. If most validators were controlled by solo stakers or anonymous whales, there would be no second-order economic consequences for fraud, as those groups would face no repercussions.
Pseudonymous validators
Validators are identified by their BLS public key, which is pseudonymous. In case the sync-committee is subverted, there must be a way to identify the organisations behind the attacks. Otherwise, these organisations would not face any second-order economic consequences.
In practice, over two-thirds of the validator set have been de-anonymized through several methods:
- Linking depositor addresses to known organisations using blockchain analysis.
- Lido, which controls 31% of the Ethereum stake, explicitly links addresses of third-party validators to identities through an on-chain DAO.
- Validators selected to act as block proposers can include arbitrary data, known as graffiti, in their blocks. This feature is often used by staking pools to identify the blocks they have authored. Staking pools have external incentives to truthfully and consistently identify the blocks they have authored.
Suppose conspiring validators (including staking pools) decide to commit fraud in the sync committee. They want to hide their fraud by anonymizing validator addresses that may already have been unmasked through the above techniques. This could be done as follows:
1. Withdraw their stake and stop validating
2. Use Tornado Cash or a similar approach to anonymize their ETH
3. Register as a new validator and deposit their ETH
Steps (1) and (2) will be very visible to the Ethereum community. Users of staking pools will wonder why their staking rewards have stopped coming in and why these pools are no longer performing validator duties. In practice, this makes anonymization difficult to pull off successfully, especially for larger staking pools.
Co-ordinating an attack
The above attacks all assume that a dishonest validator set can effectively coordinate enough stake to make an attack, both before sync committee selection, and dynamically on the fly after sync committee selection.
This would not be an easy feat to achieve in practice. We can certainly imagine a set of smart contracts or Dark-DAO style constructions that could facilitate this kind of on-the-fly co-ordination. However, these are likely to be quite complicated to build and deploy, and it would be hard to get buy-in from participants to interact with them on the fly after each committee selection occurs, especially within a group of anonymous validators.
Such coordination will likely leave detectable fingerprints. The benefit of bridging to Polkadot is that its on-chain governance can react quickly to block brewing attacks.
Part 3 - Mitigations
We believe that the risk of more than 50% of Ethereum’s validators colluding to break the bridge is very low and does not undermine the trustlessness of our bridge. However, some in the community may disagree. Even in such a case, our Altair light client could implement various band-aids or augmentations to restore its security.
Avoid ALC by ZK proofs of Casper-FFG consensus
There has been some discussion on developing a ZK light client that follows Casper-FFG, Ethereum’s PoS consensus protocol.
Nevertheless, this is still an area of research, and is unlikely to result in a production solution this year.
Shield against dishonest sync committee attacks
The unconstrained behavior of sync committee members can result in several classes of attacks:
- Equivocation: The sync committee signs a fraudulent header at the same slot number as a valid header
- Data withholding: The sync committee is inactive and refuses to sign valid headers
- Data withholding in conjunction with fraud: The sync committee withholds signatures on valid headers and instead signs fraudulent headers
There are two ways to shield against these attacks:
Fishermen
These are permissionless agents which watch the updates being posted to the light client. If they detect equivocation, they can post a fraud proof to the light client.
Polytope Labs is researching solutions for this issue, as documented in https://research.polytope.technology/consensus-proofs. However, while this mechanism is permissionless, it has certain limitations.
- The light client will need to introduce a challenge window to give fishermen time to submit fraud proofs. This will delay settlement of bridging activity for users.
- Fishermen and relayers will need to put up collateral. This could significantly increase the cost of capital needed for the bridge. It would also cap the bridge's economic security at the staked amount, and with it the maximum value of assets that can flow through the bridge securely.
- Fishermen would need to use data withholding challenges to detect data withholding, because this cannot be detected directly on-chain. Challenges introduce many new kinds of griefing attacks into the system and radically increase the complexity of any game-theoretic analysis of its stability. The security model becomes much more dynamic, as it now depends on stake value, bridge asset value and slashing costs, all of which may change on the fly, so security becomes much harder to reason about and verify.
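The equivocation check that a fisherman would perform can be sketched roughly as follows. This is only an illustration under our own assumptions: the `SignedHeader` type and `find_equivocation` function are hypothetical, signature verification is reduced to a signer count, and a real fisherman would also compare headers against a full beacon node and package the result as a fraud proof.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

QUORUM = 342  # two-thirds of the 512-member sync committee, rounded up


@dataclass(frozen=True)
class SignedHeader:
    """Hypothetical, simplified view of a light client update."""
    slot: int
    block_root: bytes
    signer_count: int  # committee members whose signatures were verified


def find_equivocation(
    updates: List[SignedHeader],
) -> Optional[Tuple[SignedHeader, SignedHeader]]:
    """Return two quorum-signed headers that conflict at the same slot."""
    seen: Dict[int, SignedHeader] = {}
    for header in updates:
        if header.signer_count < QUORUM:
            continue  # not accepted by the light client, so not evidence
        earlier = seen.get(header.slot)
        if earlier is not None and earlier.block_root != header.block_root:
            return earlier, header
        seen.setdefault(header.slot, header)
    return None
```

Note that this only covers the equivocation case. Data withholding leaves no conflicting pair to find, which is why it needs the separate challenge mechanism discussed above.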
Approval Committee
The bridge can be augmented with a trusted additional approval committee (essentially a multi-sig) that must approve any updates posted to the on-chain light client. The responsibility of the committee is to verify these updates against a full Beacon chain node. The committee is only able to approve data, it cannot introduce data on its own.
This mechanism can safely mitigate sync committee attacks. However, it does introduce a censorship risk, and it means that committee participants need to be selected and trusted.
Having said that, it is important to understand the nuance of this approach as opposed to a traditional multisig bridge:
- In a pure multisig bridge (like Wormhole or Axelar), the multisig and any economics/token/chain related to it need to be trusted to provide availability and protect against censorship, fraud and theft of funds.
- In this light client + approval committee design, the multisig only needs to be trusted to provide availability and protect against censorship. The multisig cannot commit fraud or steal funds unless both the multisig and the beacon chain sync committee are taken over together, at the same time, by a co-ordinated group.
This distinction is significant: the risks for an end user are significantly lower in the latter design, as the multisig cannot steal funds and the bridge can always fall back to Polkadot governance in the event of availability or censorship issues.
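The difference between the two designs comes down to which conditions must hold before an update is accepted. A minimal sketch, assuming a hypothetical `Update` record and an illustrative approval threshold (the actual committee size and threshold are not specified in this document):

```python
from dataclasses import dataclass

SYNC_QUORUM = 342        # two-thirds of the 512-member sync committee
APPROVAL_THRESHOLD = 5   # hypothetical, e.g. a 5-of-7 approval committee


@dataclass(frozen=True)
class Update:
    """Hypothetical summary of a light client update and its approvals."""
    sync_committee_signers: int  # verified sync committee signatures
    approvals: int               # verified approval committee signatures


def accept_update(update: Update) -> bool:
    """Accept only when both the sync committee and the approvers agree."""
    sync_ok = update.sync_committee_signers >= SYNC_QUORUM
    approved = update.approvals >= APPROVAL_THRESHOLD
    # The approval committee can only veto; it cannot replace the sync
    # committee's signatures, so it cannot introduce data on its own.
    return sync_ok and approved
```

Because the two conditions are combined with a logical AND, a fraudulent update needs both groups to be compromised at once, while either group alone can only stall the bridge.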
Although we don’t think layering either additional shield mechanism on top of the base ALC protocol is needed, we do think that the ALC + Approval Committee design is the most trustless and secure fallback option for end users in the event that our earlier assumptions are invalidated.
Conclusion and Summary
The trust assumptions for ALC and Snowbridge, despite the weaknesses in ALC that have been uncovered, remain very similar to the practical trust assumptions for Ethereum itself. A trust assumption of 50%+ honest validators in practice is similar to the game-theoretic 50%+ honest validator assumption that the bridge had before these ALC weaknesses were discovered. In a world where this kind of collusion can occur, it is far more likely that we would see attacks on Ethereum itself than on the bridge.
We conclude that the ALC weaknesses are likely immaterial for our light client bridge, as well as other light clients. Some other teams developing beacon light clients are in agreement on this matter.
- Succinct Labs: https://blog.succinct.xyz/blog/sync-committee
- t3rn: https://polkadot.polkassembly.io/treasury/261#nCI7KgirR6AB749E9kxa
However, if the community disagrees and sees the risk that over 50% of Ethereum’s validators could collude as significant, we would be willing to consider the above mitigations. We believe that the approval committee solution is the most realistic addition that could be shipped reliably and within a reasonable time frame. This would be a smaller piece to add on to the project, although it has various pros and cons, including:
- Adds security, as it also brings an extra safety net against bugs in the light client
- Removes ALC collusion risk
- Could be added as an extra piece post-launch, as a response if TVL scales up very high
- Will delay the launch date slightly
- Adds risk of temporary censorship to the bridge
- May be complicated to set up, co-ordinate and automate the committee’s responsibilities
Our current plan is to continue with the launch of the bridge as planned, without any additional committee.