Security Audit & Tech Debt Remediation: BEEFY, GRANDPA, BABE, Snowbridge, XCM, staking-async + Compensation Request

prodigalwon · May 3, 2026, 1:40am

Reviewed fork: paritytech/polkadot-sdk at stable2603
Audit window: May 2026 (post-Hyperbridge)
Author: Independent researcher / nominator

TL;DR: I fixed the bugs behind the Hyperbridge hack, found additional bugs, and fixed those too.

Preface

I forked the Polkadot SDK and conducted a full independent security audit and remediation pass, with primary focus on consensus, bridges, and cross-chain messaging, informed by the April 13, 2026 Hyperbridge incident and the bug classes it exposed.

This post is a disclosure of what I found and what I fixed. It is also a compensation request. I’ll address both plainly.

What I Found

The pattern that should concern you most

Three of the four major consensus systems audited:

- BEEFY
- GRANDPA
- BABE

had the same category of bug, all at the same time.

Not similar bugs. The same bug, copied across three systems, because when it was fixed in one place nobody made sure the fix was applied to the others.

Findings Summary

41 distinct findings remediated.

Severity	Count
CRITICAL	3
HIGH	18
MEDIUM	10
LOW	2
INFO / hardening	8

~1,094 tests passing across the audit surface after remediation.

CRITICAL Findings

CRIT-1 Validators can escape being penalized for misbehavior

substrate/frame/staking-async/[redacted]

When a validator misbehaves, the network is supposed to slash (penalize) their stake. There are two ways a slash gets applied: automatically by the runtime, or permissionlessly by anyone calling a public function.

The automatic path does the math correctly. The public path has an off-by-one era error that shifts the slashable window far enough into the future that the validator’s unbonding funds fall outside it entirely, and are released clean.

In plain terms: a validator who knows about this bug can misbehave, start unbonding their stake in the same era, and deliberately back up the automatic slash queue. Once they trigger the public path instead, the math is wrong and their funds escape without being touched. With Polkadot-sized self-stakes, this is a multi-million DOT escape hatch.

CRIT-2 The network was keeping track of only 1,024 parachains at a time, and lying about it

polkadot/runtime/parachains/[redacted]

BEEFY produces a cryptographically signed summary of all parachain states: an official snapshot that bridges and light clients rely on to know what’s happening across the network.

That snapshot was being silently cut off at 1,024 parachains. Anything past that limit was just dropped with no error and no warning. The network was producing a signed, official-looking summary that was quietly incomplete.

The code’s own comment described this as an “intermediate non-fix.” It had been sitting there unaddressed. Any bridge or light client trusting this summary to make decisions about parachain inclusion could be manipulated using the gap between what the summary claimed and what was actually true.
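The difference between a silent cap and an explicit one can be sketched as follows. This is a minimal illustration, not the actual runtime code: the constant, the function names, and the `(para_id, head_data)` representation are all assumptions made for the example.

```rust
/// Hypothetical cap, mirroring the 1,024 figure quoted above.
const MAX_PARA_HEADS: usize = 1024;

#[derive(Debug, PartialEq)]
pub enum SnapshotError {
    TooManyParachains { found: usize, max: usize },
}

/// Silent variant: sorts by para id and keeps only the first
/// `MAX_PARA_HEADS` entries. The caller cannot tell anything was dropped.
pub fn snapshot_truncating(mut heads: Vec<(u32, Vec<u8>)>) -> Vec<(u32, Vec<u8>)> {
    heads.sort_by_key(|(id, _)| *id);
    heads.truncate(MAX_PARA_HEADS);
    heads
}

/// Explicit variant: refuses to build a snapshot that would be incomplete,
/// so the condition surfaces as an error instead of a quiet omission.
pub fn snapshot_checked(
    mut heads: Vec<(u32, Vec<u8>)>,
) -> Result<Vec<(u32, Vec<u8>)>, SnapshotError> {
    if heads.len() > MAX_PARA_HEADS {
        return Err(SnapshotError::TooManyParachains {
            found: heads.len(),
            max: MAX_PARA_HEADS,
        });
    }
    heads.sort_by_key(|(id, _)| *id);
    Ok(heads)
}
```

Whether an error, a logged event, or a documented cap is the right behavior is a design decision; the sketch only shows how the two variants differ from the consumer's point of view.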

CRIT-3 A security feature was added to the bridge but nobody hooked it up

bridges/modules/beefy/[redacted]

The BEEFY bridge was updated with a new field specifically designed to prevent a class of attack where someone pairs legitimate-looking old data with a fresh malicious request… exactly what happened in the Hyperbridge hack.

The field was added. The check that actually uses it was never written. So the protection existed on paper but did nothing in practice.

An attacker could take a real, valid piece of historical bridge data from an old validator set and pair it with a fresh commitment from the current one. The bridge would accept the combination and update its state as if everything was legitimate. This is the same attack shape that drained Hyperbridge in April.
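The kind of check described here, where a field exists but nothing compares it, can be sketched roughly like this. All types and names below are hypothetical and do not reflect the real bridge pallet's API; the point is only that the proof's validator set must be compared against the commitment's before state is updated.

```rust
#[derive(Debug, PartialEq)]
pub enum SetBindingError {
    /// The commitment was not signed by the set the bridge currently tracks.
    NotCurrentSet { current: u64, got: u64 },
    /// The accompanying proof belongs to a different validator set.
    ProofFromOtherSet { commitment_set: u64, proof_set: u64 },
}

/// Hypothetical signed commitment carrying the id of the set that signed it.
pub struct SignedCommitment {
    pub validator_set_id: u64,
}

/// Hypothetical historical proof carrying the id of the set it was built under.
pub struct HistoricalProof {
    pub validator_set_id: u64,
}

/// Rejects any pairing where the proof and the commitment come from
/// different validator sets, or where the commitment is not from the
/// set the bridge currently expects.
pub fn check_set_binding(
    current_set_id: u64,
    commitment: &SignedCommitment,
    proof: &HistoricalProof,
) -> Result<(), SetBindingError> {
    if commitment.validator_set_id != current_set_id {
        return Err(SetBindingError::NotCurrentSet {
            current: current_set_id,
            got: commitment.validator_set_id,
        });
    }
    if proof.validator_set_id != commitment.validator_set_id {
        return Err(SetBindingError::ProofFromOtherSet {
            commitment_set: commitment.validator_set_id,
            proof_set: proof.validator_set_id,
        });
    }
    Ok(())
}
```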

HIGH Findings (summary)

HIGH-1 A single malformed message could crash a node

polkadot/xcm/[redacted]

A counting bug in the XCM message decoder means that crafting a message with a specific length causes the safety counter to wrap around to zero, bypassing the instruction limit entirely. An attacker could send a single message that causes the node to try to allocate roughly 128 gigabytes of memory. This is reachable from every XCM path on the network.
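A wrap-around of this general shape can be illustrated with a deliberately small counter. This is a sketch under stated assumptions, not the XCM decoder: the `u8` counter width, the limit of 100, and both function names are invented for the example.

```rust
/// Hypothetical instruction limit for the sketch.
const MAX_INSTRUCTIONS: u8 = 100;

/// Buggy pattern: the declared length is narrowed to the counter's width
/// and added with wrapping arithmetic, so a large enough length wraps the
/// total back under the limit.
pub fn within_limit_wrapping(counter: u8, declared_len: u32) -> bool {
    let total = counter.wrapping_add(declared_len as u8);
    total <= MAX_INSTRUCTIONS
}

/// Safer pattern: widen before adding and use checked arithmetic, so an
/// overflow is treated as "over the limit" rather than wrapping.
pub fn within_limit_checked(counter: u8, declared_len: u32) -> bool {
    u32::from(counter)
        .checked_add(declared_len)
        .map_or(false, |total| total <= u32::from(MAX_INSTRUCTIONS))
}
```

With these definitions, a declared length of 256 passes the wrapping check (256 narrows to 0) but fails the checked one.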

HIGH-2 Cross-chain asset transfers could silently destroy funds

polkadot/xcm/[redacted]

When assets are moved between chains, there’s a step where the asset’s address gets translated for the destination. If that translation fails for any asset in the batch, the failed asset is quietly dropped, but the sending chain already debited the full amount. The difference gets locked with no recovery path. In the teleport variant, the asset is burned on the sending side and simply never appears on the receiving side, permanently reducing total supply.
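The "drop the failures, keep going" pattern versus an all-or-nothing translation can be sketched as below. The `Asset` type, the `reanchor` rule, and the id ranges are hypothetical stand-ins chosen for the example, not the XCM executor's real types.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Asset {
    pub id: u32,
    pub amount: u128,
}

#[derive(Debug, PartialEq)]
pub struct ReanchorError {
    pub asset_id: u32,
}

/// Hypothetical translation rule: ids below 1000 map to a destination id,
/// anything else has no representation on the destination.
fn reanchor(asset: &Asset) -> Result<Asset, ReanchorError> {
    if asset.id < 1000 {
        Ok(Asset { id: asset.id + 10_000, amount: asset.amount })
    } else {
        Err(ReanchorError { asset_id: asset.id })
    }
}

/// Lossy pattern: failed translations are filtered out, so the output
/// batch can be smaller than what was debited on the sending side.
pub fn reanchor_lossy(batch: &[Asset]) -> Vec<Asset> {
    batch.iter().filter_map(|a| reanchor(a).ok()).collect()
}

/// All-or-nothing pattern: the first failure aborts the whole batch,
/// so the caller can roll back before anything is debited or burned.
pub fn reanchor_all_or_nothing(batch: &[Asset]) -> Result<Vec<Asset>, ReanchorError> {
    batch.iter().map(reanchor).collect()
}
```

Collecting into `Result<Vec<_>, _>` short-circuits on the first error, which is what makes the second variant atomic from the caller's perspective.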

HIGH-3 / HIGH-4 A misconfiguration could silently disable validator slashing in GRANDPA and BABE

substrate/frame/grandpa/[redacted], substrate/frame/babe/[redacted]

Both GRANDPA and BABE use a pluggable component to look up session numbers when recording and verifying misbehavior reports. If that component is wired to anything other than the canonical session pallet (which is a valid configuration choice), the numbers it writes and reads will silently disagree. Legitimate slash reports get rejected. Validators who should be penalized walk away clean. No errors, no alarms.

HIGH-5 / HIGH-6 Anyone could consume block space for free by spamming GRANDPA and BABE

substrate/frame/grandpa/[redacted], substrate/frame/babe/[redacted]

Misbehavior reports are unsigned transactions. They don’t require an account or fees. Before processing, they’re supposed to be validated. The step that checks whether the actual misbehavior proof is legitimate was being skipped. An attacker could flood the network with reports that have a real-looking wrapper but fake signatures inside. Each one passes the initial check, consumes a full block weight allocation, then fails… with no fee collected. Effectively free denial of service.

HIGH-7 / HIGH-8 / HIGH-9 Misbehavior reports in BEEFY, GRANDPA, and BABE didn’t verify the accused was actually in the validator set

substrate/frame/beefy/[redacted], substrate/frame/grandpa/[redacted], substrate/frame/babe/[redacted]

When reporting a validator for double-signing, the code verified that the reported key existed somewhere in session history, but not that it was specifically a member of the consensus set for the session being reported. On networks where the consensus set is a strict subset of the full session set, this allows reporting keys that had no business being in that consensus round at all.

HIGH-10 Snowbridge’s proof verification ignored position entirely

bridges/snowbridge/primitives/[redacted]

A Merkle proof is supposed to prove that a specific piece of data is at a specific position in a tree. Snowbridge’s verifier was reconstructing the root correctly but ignoring the position. A valid proof for position A would also pass verification for positions B, C, or D: any position in the tree. This is the exact same pattern that the Hyperbridge Solidity port inherited from this codebase and was exploited through.
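What "binding a proof to a position" means can be shown with a small binary Merkle tree. This is a sketch only: it uses a toy 64-bit hash from the Rust standard library instead of a cryptographic hash, assumes a power-of-two number of leaves, and every name in it is hypothetical rather than taken from Snowbridge.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Toy node hash for illustration only; NOT cryptographically secure.
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_u64(left);
    h.write_u64(right);
    h.finish()
}

/// Root of a tree over `leaves`; the slice must be non-empty and its
/// length a power of two.
pub fn merkle_root(leaves: &[u64]) -> u64 {
    let mut level = leaves.to_vec();
    while level.len() > 1 {
        level = level.chunks(2).map(|p| hash_pair(p[0], p[1])).collect();
    }
    level[0]
}

/// Position-bound verification: the bits of `position` decide whether the
/// running node is hashed as the left or right child at each level, and
/// the position must address a leaf in a tree of this depth.
pub fn verify_at(root: u64, leaf: u64, position: usize, siblings: &[u64]) -> bool {
    let depth = siblings.len();
    if depth >= usize::BITS as usize || position >= (1usize << depth) {
        return false; // out-of-range position: reject before hashing
    }
    let mut node = leaf;
    let mut idx = position;
    for sibling in siblings {
        node = if idx & 1 == 0 {
            hash_pair(node, *sibling)
        } else {
            hash_pair(*sibling, node)
        };
        idx >>= 1;
    }
    node == root
}
```

Because the position drives the left/right ordering, a proof built for one leaf index does not reconstruct the root when presented for a different index, and indices beyond the tree's width are rejected outright.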

HIGH-11 Snowbridge delivery rewards could be stolen by anyone watching the mempool

bridges/snowbridge/pallets/[redacted]

When a message is successfully delivered through Snowbridge, the relayer who delivered it gets a reward. The pallet was storing the pending delivery nonce but not who should receive the reward. When the receipt arrived from Ethereum, the reward defaulted to whoever submitted the receipt transaction, meaning anyone watching the mempool could jump ahead of the legitimate relayer and steal their payout.

HIGH-12 / HIGH-13 / HIGH-14 The BEEFY bridge had three compounding problems

bridges/modules/beefy/[redacted], [redacted]

First, the MMR proof verifier was relying on a single safety check deep inside a third-party library rather than enforcing bounds itself; one missing check in a port of that library is exactly how Hyperbridge was drained. Second, there was no binding between a proof and the specific block it was supposed to represent. Someone could pair a real proof for block M with a commitment claiming to be for block N, and the bridge would write M’s data under N’s key. Third, the entire submission function was declared as having zero computational cost, making it free to hammer with as much work as an attacker wanted.
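The second problem, a proof for block M stored under block N, can be sketched with a plain map standing in for bridge storage. The types, field names, and the `BTreeMap` store are assumptions for illustration and are not the pallet's actual storage layout.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
pub enum ImportError {
    BlockMismatch { committed: u32, proven: u32 },
}

/// Hypothetical commitment: the signed claim about block `block_number`.
pub struct Commitment {
    pub block_number: u32,
}

/// Hypothetical proven leaf: the data the proof actually opens to.
#[derive(Debug, Clone, PartialEq)]
pub struct ProvenLeaf {
    pub block_number: u32,
    pub payload: Vec<u8>,
}

/// Unbound import: stores whatever the proof opened to under the
/// commitment's block number, without comparing the two.
pub fn import_unbound(store: &mut BTreeMap<u32, ProvenLeaf>, c: &Commitment, leaf: ProvenLeaf) {
    store.insert(c.block_number, leaf);
}

/// Bound import: the leaf must be for the same block the commitment
/// claims, otherwise nothing is written.
pub fn import_bound(
    store: &mut BTreeMap<u32, ProvenLeaf>,
    c: &Commitment,
    leaf: ProvenLeaf,
) -> Result<(), ImportError> {
    if leaf.block_number != c.block_number {
        return Err(ImportError::BlockMismatch {
            committed: c.block_number,
            proven: leaf.block_number,
        });
    }
    store.insert(c.block_number, leaf);
    Ok(())
}
```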

HIGH-15 Bridge light clients couldn’t verify which validator set signed what they were looking at

substrate/primitives/consensus/[redacted]

The data format used to communicate with bridge light clients didn’t include which validator set produced it. The assumption was that pallets executing in a fixed order was enough of a guarantee. It isn’t. Conventions that aren’t enforced cryptographically are exactly what bridge attacks exploit.

HIGH-16 / HIGH-17 / HIGH-18 Three separate timing bugs in slash application could let slashes be missed or escaped

substrate/frame/staking-async/[redacted], [redacted], [redacted]

The withdrawal gate only checked one era when deciding if funds were safe to release. The automatic slash processor only handled one slash per block, meaning a backlog could be deliberately created by spamming offence reports until slashes aged past their window. And the public slash function had no lower bound on how old a slash record it would accept, allowing stale records to be replayed with wrong math that zeroed out the slash.
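The first of these, a gate that looks at a single era instead of the whole slashable window, can be sketched as follows. The function names, the set of pending slash eras, and the `bonding_duration` parameter are hypothetical; the real pallet's storage and era arithmetic are not shown in this post.

```rust
use std::collections::BTreeSet;

/// Single-era gate: withdrawal is allowed unless a slash is pending for
/// exactly `active_era`. Pending slashes from earlier eras are not seen.
pub fn gate_single_era(pending_slash_eras: &BTreeSet<u32>, active_era: u32) -> bool {
    !pending_slash_eras.contains(&active_era)
}

/// Window gate: withdrawal is allowed only if no slash is pending for any
/// era in `[active_era - bonding_duration, active_era]`.
pub fn gate_window(
    pending_slash_eras: &BTreeSet<u32>,
    active_era: u32,
    bonding_duration: u32,
) -> bool {
    let start = active_era.saturating_sub(bonding_duration);
    pending_slash_eras.range(start..=active_era).next().is_none()
}
```

With a slash pending for era 95, an active era of 100, and a 28-era window, the single-era gate allows the withdrawal while the window gate blocks it.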

Connection to the Hyperbridge Hack

The April 2026 Hyperbridge incident combined three bug classes into an exploit that let someone mint a billion wrapped DOT and sell it for over $200k:

No bounds checking on proof positions
Stale historical data accepted alongside fresh malicious requests
Weak authorization on critical functions

bridges/modules/beefy and bridges/snowbridge both had structurally identical versions of this exact attack chain. CRIT-3, HIGH-10, HIGH-12, and HIGH-13 are all variations of the same bugs that made Hyperbridge exploitable. They didn’t come from the Hyperbridge port. Hyperbridge ported them from here.

The Hyperbridge hack already happened. The conditions that made it possible are still present in the in-tree bridge code. That’s what this audit found.

Tech Debt Remediation

Beyond the security findings, the codebase had accumulated years of structural mess that was making everything above harder to find and harder to fix.

Pallets were borrowing code directly from each other instead of communicating through clean, defined interfaces. Think of a building where the plumbing runs through the walls of the neighboring apartment instead of through the shared utility corridor. You can’t inspect or repair one unit without tearing into another. It inflated compile times, obscured what code was actually responsible for what, and made it nearly impossible to audit any single component in isolation because its behavior silently depended on the internal state of components it had no formal contract with.

The dependency declarations didn’t reflect reality. What was declared as a dependency and what was actually being relied on had drifted apart over years of patching. The result was that security scanners, build tooling, and human reviewers were all working from a map that didn’t match the territory.

Both have been cleaned up in my fork. The codebase compiles cleanly with proper boundaries, and the dependency graph now accurately reflects what the code actually does.

The Ask

I am willing to provide the remediated source code.

My ask is $20,000 USDC.

To put that in context: a single critical finding on a production blockchain typically commands $50,000–$500,000 on a standard bug bounty program. I found three criticals and eighteen highs.

The tech debt remediation (the pallet decoupling and dependency cleanup) I’m throwing in for free.

$20k is for the security work alone, and it’s a significant discount on what that work is worth.

To be clear: I am not disclosing the fixes or the specific lines affected. The remediated code is the product. Those are two different things and I am keeping them separate.

If governance is interested, reach out. Payment address available on request.

Why This Is Also a Litmus Test

I want to be measured here, not hostile. But I think it’s worth saying plainly.

This post exists. The fixed code exists. What happens next is the signal. It’s a simple yes or no, and either answer tells you something true about what’s actually driving decisions on this network.

Full file:line citations, regression test names, and CWE classifications available on request or as part of code delivery upon payment.

kukabi · May 3, 2026, 8:43am

@prodigalwon it’s interesting, thanks for the post. I’m not on top of the sections of Polkadot SDK code you referred to, but just out of curiosity, what’s stopping Parity engineers, or anyone else, from just going ahead and applying the fixes, possibly with the help of LLMs, rather than paying the fee through OpenGov or directly? Your detailed explanations with code locations would make it simpler. So maybe you should’ve contacted them privately? Or it’s another story if you have found more than you listed here.

alice_und_bob · May 3, 2026, 2:19pm

Crazy to think that everything beyond parachain 1024 gets signed off by Beefy

Thank you for taking a look ChatGPT!

Vincent · May 3, 2026, 2:23pm

This is false. The reward recipient is selected by the P->E relayer who specifies the final reward account here. That transaction is practically always submitted through Flashbots to prevent front-running. Any relayer can relay the delivery receipt back to BridgeHub, and the reward recipient cannot be altered because the relayer needs to supply a valid Ethereum receipt proof that is cryptographically verified onchain.

I can’t take the Snowbridge reports seriously unless you provide some actual evidence. Now that we’re in the age of AI, serious-sounding but invalid security reports are a dime a dozen.

Our bug bounty is hosted at https://hackenproof.com/programs/snowbridge-on-chain-code.

Vincent · May 3, 2026, 2:58pm

CRIT-2 The network was keeping track of only 1,024 parachains at a time, and lying about it

Invalid. The BEEFY protocol currently only provides verification services for the first 1024 parachains sorted by parachain id. This is a pragmatic choice so as to not degrade performance. Only two parachains (BridgeHub and HyperBridge) are using BEEFY at the moment.

It’s a hardcoded limit, like any other limit in the Polkadot codebase. If further scaling work is deemed necessary, then the limit can be altered or removed.

prodigalwon · May 3, 2026, 5:16pm

BEEFY just got exploited last month, and the same bug is in other critical crates. Meanwhile you have YouTube videos up on ways to solve problems that wouldn’t have worked. Keep laughing.

prodigalwon · May 3, 2026, 5:35pm

That’s exactly what I expect them to do. But here’s the thing: that too is signal. I nominated on Polkadot for years. After losing 60% of my RWAs that were diluted and spent on things that did not benefit the ecosystem, I rolled up my sleeves and built real things with the goal of propping the price up. When those products were met with silence, on top of DAP with no sybil resistance in sight and a MITM pool to separate era rewards from validators, it appeared the network had been captured and OpenGov’s 1 token = 1 vote mechanism was the vehicle. Plato predicted 2,500 years ago in ‘Republic’ that OpenGov would become a tyrannical oligarchy.

I expect them to ignore me. 20k is awfully low. I already forked the SDK, removed the tech debt, removed the attack surfaces to BEEFY, added defense in depth, and that cut compile times in half. Then I culled hundreds of GPL crates.

So you see, this post wasn’t about the money. It’s a spotlight.

I do appreciate your question though. I was wondering who was going to ask it.

I know you didn’t ask, but removing the tech debt cut the compile time in half and cut the number of GPL crates by 75%.

prodigalwon · May 3, 2026, 5:40pm

I’ve been on HackerOne for years and can say I’ve seen plenty of programs that want free security research. Not saying this is you, but I’m not interested in signing up for some new site that likely treats new members like cattle. That said, feel free to DM me if your interests are aligned with what’s best for the network.
