I have been in close discussion with the Moonbeam community recently regarding this whole issue. TL;DR: I believe we’ve been focusing on entirely the wrong problem. EVM performance has never really been the bottleneck; storage IO has. Off-chain JIT/AOT is probably the only viable way to speed up the EVM (though due to Polkadot’s architectural limitations, off-chain JIT/AOT is not as easy to implement as it is on Ethereum mainnet). In any case, interpreted EVM execution is probably more than sufficient for the foreseeable future. For compute-heavy smart contracts, it’s perfectly viable to treat them as “precompiles” by allowing “pure PolkaVM contracts” to be deployed on-chain.
It’s therefore our belief that Frontier has made the correct decisions regarding Ethereum and EVM compatibility all along. Recently we started adding optional PolkaVM support to Frontier to implement the “pure PolkaVM precompile” idea.
I also invite you to read:
The optimization guide of Frontier, where we discuss in more detail the benchmarks on Ethereum mainnet and why storage IO (not VM performance) is the bottleneck.
Pallet Revive clearly contains traces of code inspired by Frontier. Even the Revive compiler is not original work but a fork of zkSync’s era-compiler.
We’re just not as petty as you when it comes to code reuse. In fact, we welcome it. The Frontier code has been forked by many Polkadot parachains and adapted to their own needs. In open source, code reuse means power.
Frontier does not use a CLA. This means the code is owned by all the developers who wrote it: me, Moonbeam, Acala, many other parachains, and of course Parity as well. We always write “Frontier developers” as the attribution.
We also want to question your intention in bringing this up instead of a technical argument. You have been making personal attacks, accusing those who disagree with you on the revive project of writing “GPT”. Many of your comments fall short of anything technical. Now that Parity has admitted you were wrong about the revive project, I do hope you will apologize to those people, and to the whole community as well – we basically lost a whole year to this strategic mistake.
You can keep repeating yourself as much as you want. The fact that there is no such thing as 100% compatibility won’t go away. Frontier will always fall into your “almost compatibility” category.
The hard problems we have to solve, which frontier conveniently ignores, are:
Gas model incompatibilities
Currency denomination incompatibilities
AccountId32 compatibility
Storage deposit and ED incompatibilities
JIT compilation is hard (it needs either to be done on-chain or to have its correctness proven, both of which are, at the time of writing, infeasible for optimizing compilers)
Can the chain run all possible EVM bytecode unchanged? Frontier can; Revive can’t. That’s what differentiates the two. Of course there are sometimes bugs, but we treat specification incompatibility as a security issue and have already published several security advisories with fixes.
Some of those are solved problems in Frontier (like AccountId32). For the others, the problem is rather that you chose a bad strategy and were solving entirely the wrong problem. The constraints you chose are unnecessary and impractical.
As an example, if you had actually benchmarked real-world Ethereum blocks, you would know that 80% of the time is storage IO and only 20% is VM execution. With actual data rather than an “appeal to cool tech”, we would have known from the very beginning that the compiler project was probably unnecessary.
We could instead have further developed ink! or other, more important projects. Instead we spent more than a year learning a hard and expensive lesson. Premature optimization is the root of all evil; Donald Knuth already taught us this in the 1970s.
By the way, the JIT compilation limitation is purely a Polkadot architectural limitation. On Ethereum, a JIT-compiled EVM has already been developed (and tested against mainnet blocks) by the Reth team.
I am genuinely interested in how this was - or wasn’t - solved technically. I would appreciate your answer (you are welcome to keep your political and philosophical insights to yourself).
Moonbeam developers are the ones with the authority to answer this.
But from what I can see, this appears to be a spamming + state bloat issue, which can be elegantly fixed by customizing gas metering (of which we already have many options in Frontier).
If you’re asking about throughput in this case, that’s rather a Polkadot problem, and the limitation is PoV. The bottleneck is almost always storage IO, seldom VM execution. If the PoV is used up, then of course users will experience slowness.
See, on a critical system chain like Asset Hub we definitely don’t want to introduce state bloat vulnerabilities, but we do want the highest throughput and compatibility possible. I don’t think it’s as simple as it may seem at a glance.
The whole thing is of course not simple. I was involved in fighting state bloat in Ethereum back in the day, and I know first-hand how difficult it sometimes is to arrive at the correct solution. There are dozens of EIPs out there purely to fight state bloat.
In Ethereum, diversity helped a lot with problems like this. A diversity of clients allowed teams to experiment differently with state handling, so when spamming or bloating happened, usually not all clients were affected at the same time. This gave them breathing room.
Benchmarking is also key. We need to understand exactly where things are slow and what is taking up huge amounts of storage space, and optimize accordingly.
Polkadot unfortunately has neither at the moment. OpenGov straight up kills any alternative project (like Epico). And there’s no real-world benchmarking at all – otherwise the Revive team would have easily spotted their strategic mistakes and wouldn’t have wasted a year.
This is really valuable time lost. In the past year we have seen:
Storage optimization solutions like NOMT being applied first on Sovereign SDK, but not on Polkadot/Substrate.
Off-chain EVM-to-native JIT/AOT already developed and used in production on Ethereum L2s.
The competition on Ethereum compatibility is as intense as one can imagine, and Polkadot has practically zero advantage in it – not even a first-mover advantage (which goes to Moonbeam).
If I were Parity, I would seriously consider eating the sunk cost and cancelling revive. What is currently happening clearly shows that this team doesn’t understand Ethereum at all. It risks spending another year working on something that no one will use.
Rather, we need a new strategy where Polkadot does what it’s good at. We will not be the fastest Ethereum-compatible chain. Accept this fact.
Benchmarks are what led to revive. We identified multiple bottlenecks: storage (FYI, the number of storage ops is more problematic than the total bytes read/written), PoV, compilation times, and execution time. The whole point of revive and PVM contracts is to address those.
NOMT will be integrated into Substrate, and we have more in the pipeline to address PoV performance. It also has nothing to do with either revive or frontier.
revmc creates large contract sizes too, comparable to or even larger than resolc’s (and unlike with resolc, this isn’t easily fixable), without even helping much. It doesn’t work.
You bring up the diversity argument (?) but then call for revive to be cancelled so that only your stuff exists.
You really need to stop that passive-aggressive FUD tone. We’re here working for the Polkadot community, not to feed your ego now that you’ve decided revive doesn’t work.
You’ve been calling nearly all opposition to the Revive project FUD. I’ve been calling out the incompatibility issues (and the related security issues) as well as the storage issue for a long time. Last week I also started some more serious research into the benchmarks and PolkaVM’s PoV issue.
I’m not sure whether you referenced my work when making the decision to admit that revive’s current strategy will not work, or whether you reached your conclusions independently. Either way, if you had for one moment considered the possibility that you might be wrong, instead of dismissing everyone else as FUD, we would not be in this bad situation. Please show some professionalism.
Then the benchmarking was not done correctly or sufficiently. I know you have some benchmarks of toy programs, but those do not replace real-world testing. If revive is meant to address the storage problems you mentioned, it’s making things worse, not better.
In a recent discussion with the Nervos team, and after checking how Paradigm’s revmc actually works in production in Reth, I was introduced to a new design pattern that may completely change our view of how we optimize on-chain smart contract bytecode. The current view in Polkadot is that an on-chain JIT/AOT recompiler faces severe restrictions: it must be really fast, and it must avoid JIT bombs. This leaves us with a single-pass recompiler that cannot do much optimization, so the bytecode must be easy to recompile and must map as closely to native as possible. That’s why we spent so much effort searching for the “perfect” instruction set: from EVM to Wasm to eBPF and finally to PolkaVM.
What if we’re wrong?
The Polkadot view assumes that recompilation always happens immediately before a contract is run, and therefore that the compile+runtime benchmark is the one that matters. This does not have to be the case, because compilation only needs to happen once! One can therefore imagine the following design (a code sketch follows the list):
A main thread that handles normal block/transaction processing. It uses an already-compiled blob when available, or falls back to the interpreter.
An optimization thread in the background that gradually optimizes contracts in the state.
We can imagine that the optimization thread would first try a simple (inefficient) one-pass AOT, and when time permits, try an advanced LLVM AOT with aggressive optimization.
(…)
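To make this concrete, here is a minimal sketch of the pattern in Rust, using std threads. Every name in it (`ArtifactCache`, `one_pass_aot`, `llvm_aot`, and so on) is a hypothetical stand-in rather than an existing Polkadot or Frontier API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

type CodeHash = [u8; 32];

// Two tiers of compiled output: a cheap one-pass build first, replaced
// later by an aggressively optimized build when time permits.
#[derive(Clone)]
enum Artifact {
    OnePass(Vec<u8>),
    Optimized(Vec<u8>),
}

#[derive(Default)]
struct ArtifactCache {
    compiled: RwLock<HashMap<CodeHash, Artifact>>,
}

impl ArtifactCache {
    /// Main thread: use a compiled artifact when present, otherwise fall
    /// back to interpretation. Block processing never waits on the compiler.
    fn execute(&self, hash: CodeHash, bytecode: &[u8]) {
        match self.compiled.read().unwrap().get(&hash) {
            Some(artifact) => run_native(artifact),
            None => interpret(bytecode),
        }
    }
}

/// Background thread: compile each contract once, cheaply first, then
/// upgrade to an optimized build. Compilation time is off the critical path.
fn spawn_optimizer(cache: Arc<ArtifactCache>, queue: Vec<(CodeHash, Vec<u8>)>) {
    thread::spawn(move || {
        for (hash, code) in &queue {
            let quick = Artifact::OnePass(one_pass_aot(code));
            cache.compiled.write().unwrap().insert(*hash, quick);
        }
        for (hash, code) in &queue {
            let best = Artifact::Optimized(llvm_aot(code));
            cache.compiled.write().unwrap().insert(*hash, best);
        }
    });
}

// Stubs for the pieces this sketch assumes exist.
fn interpret(_bytecode: &[u8]) {}
fn run_native(_artifact: &Artifact) {}
fn one_pass_aot(code: &[u8]) -> Vec<u8> { code.to_vec() }
fn llvm_aot(code: &[u8]) -> Vec<u8> { code.to_vec() }
```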
If this design is possible, it unfortunately also means that the years we’ve spent searching for the “perfect” instruction set were in vain.
You have spent at least a full year on resolc, it hasn’t been fixed, and things have to be called off. It may be another year or two. Ethereum will not wait for us, and there are many more teams pursuing better Ethereum compatibility on other chains. If you had actually looked at their work, instead of developing in your own ivory tower, you would know how intense the competition is. We really have zero advantage, with or without PVM.
Paradigm’s revmc is mainly designed to be used off-chain. As an off-chain JIT/AOT, the code-size concern disappears, and it can work much better.
But in any case, as I’ve said many times already: VM execution is not the bottleneck at all; storage IO is. So at this stage, interpreted EVM execution is more than sufficient. We would gain more throughput by focusing on IO/storage problems first.
Regarding the XEN spamming issue, it is important to note that this has been a broader problem affecting multiple chains, not just Moonbeam.
Moonbeam came up with the MBIP-5 proposal, which added new gas heuristics specific to the storage consumption of Ethereum transactions. More recently, we have also increased the gas/storage limits and decreased the target block fullness, which activates the dynamic fee mechanism earlier.
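For illustration, here is a minimal sketch of what such a storage-aware gas heuristic can look like. The constants and the max-of-compute-and-storage rule below are assumptions made for the example, not MBIP-5’s actual parameters:

```rust
// Derive a gas price per stored byte from the block gas limit and a
// target amount of state growth per block (both figures assumed here).
const BLOCK_GAS_LIMIT: u64 = 15_000_000;
const BLOCK_STORAGE_TARGET: u64 = 40 * 1024; // bytes of state growth per block

/// Gas attributed to a transaction for the state it creates.
fn storage_gas(bytes_written: u64) -> u64 {
    let gas_per_byte = BLOCK_GAS_LIMIT / BLOCK_STORAGE_TARGET;
    bytes_written.saturating_mul(gas_per_byte)
}

/// Effective gas used: state-heavy transactions pay for the storage they
/// consume even when their compute footprint is small, so a block can be
/// "full" of storage growth just as it can be full of computation.
fn effective_gas(compute_gas: u64, bytes_written: u64) -> u64 {
    compute_gas.max(storage_gas(bytes_written))
}
```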
There is nothing wrong with disagreeing. But the arguments brought forward are just not logically consistent; that is what I am trying to point out. I think Cyrill really took his time to engage with you on a technical level. Of course, a lot of it is pointing out logical fallacies and wrong assumptions on your side, but what else are we supposed to do? We can’t engage “technically” when we can’t even agree on some basic facts. I understand that this doesn’t feel nice, but there is a difference between feeling attacked and an actual personal attack. I went through the threads and I can’t find an example of an ad hominem.
Yes, the tone is not always friendly. But I am also human and I get annoyed, especially when people contradict themselves. But in no way am I questioning your character or you as a person.
In this specific example I see the contradiction in these statements of yours. I am paraphrasing; correct me if that’s not your opinion:
Using PolkaVM is unsafe because it is “almost compatible”.
Adding a PolkaVM backend to frontier is fine. And yes, I understand that you always want to store both bytecodes. But I think this approach has its own set of problems and will also be subject to many of the problems you are criticizing pallet-revive for.
The decision not to use frontier as the basis for PolkaVM contracts is not a value judgement about frontier or you as a developer. Frontier is fine. It just wasn’t the most sensible starting point for what we are trying to do.
There’s no logical inconsistency: using Solidity->PolkaVM, as revive does, is unsafe because it’s only “almost compatible”; adding a PolkaVM backend to Frontier is fine as long as it’s done in a “fully compatible” way.
Based on what has happened with revive, I do fear that you have taken your tone too far, to the point that you have not listened to any opposing opinions for a long time. You could have simply asked politely at the beginning; instead you deliberately chose a passive-aggressive tone without even asking the question. Multiplied over time, I believe this has caused the whole team to build in a closed circle: without listening to external input, without examining the actual problems, and without understanding what Ethereum-compatibility teams on other blockchains are doing.
And by the way, the way we integrated PolkaVM into Frontier uses EIP-3541. It is therefore also guaranteed to be forward-compatible with future Ethereum upgrades.
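Roughly, the mechanism can be sketched as below. Since EIP-3541, the EVM rejects any newly deployed code whose first byte is 0xEF, so an 0xEF-prefixed blob can never collide with contract code deployed through ordinary CREATE/CREATE2, and that prefix can mark PolkaVM blobs unambiguously. The magic constant and names here are illustrative, not Frontier’s actual code:

```rust
// Assumed tag for this sketch: 0xEF followed by the ASCII bytes "PVM".
const POLKAVM_MAGIC: &[u8] = &[0xEF, 0x50, 0x56, 0x4D];

enum ExecutorKind {
    Evm,     // run through the EVM interpreter
    PolkaVm, // run through the PolkaVM engine
}

fn classify_code(code: &[u8]) -> ExecutorKind {
    if code.starts_with(POLKAVM_MAGIC) {
        ExecutorKind::PolkaVm
    } else {
        // EIP-3541 guarantees no EVM-deployed contract starts with 0xEF,
        // so everything else is plain EVM bytecode.
        ExecutorKind::Evm
    }
}
```

Because any future Ethereum fork that keeps EIP-3541 leaves the 0xEF namespace reserved, a dispatch like this stays valid across upgrades.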
This is provided as an optional feature for chains that may want it. But as we have stated in multiple places, we believe a premature deployment of PolkaVM will make a blockchain slower. This is based on our benchmarking results, and it applies to both Pallet Revive and Frontier. We therefore recommend that PolkaVM be used only in a limited setting (only for specific compute-heavy contracts).
revive is doing YUL → PolkaVM. We are not re-implementing Solidity. We are using the original solc compiler to compile Solidity.
Starting with the EVM bytecode (as you are suggesting) is only safe if:
You can guarantee that the original EVM bytecode and its PVM version do the same thing. The only way I can see to do that is by either compiling on chain or proving the compilation off chain. The former won’t fly because compilation time is unbounded. The latter seems far off, but interesting.
The execution environment for both EVM and PVM must be exactly the same, since they represent the same program. But this is a problem: if you charge the same gas for PVM and EVM instructions, you are not benefiting from the extra compute of PVM. But if you change the gas model, you might break contracts in subtle ways (for example, contracts that forward a hard-coded 2300-gas stipend). We realized this early on: you cannot convert automatically; there has to be a manual step involved. So we carefully decided which changes are necessary to create a platform that improves on the status quo. The gas model is one of them.
We have looked a lot into storage access lately. One finding is that the storage access Weights should actually reflect the cost of validating a block, not of building it. This means that storage size shouldn’t affect those Weights much, since the validator is stateless. Yes, the Merkle proof size increases with bigger states, but not to a degree where you would experience the kinds of problems you are referring to.
For AssetHub we will massively increase the trie cache and give our collators enough RAM to never hit the database. We also discovered and fixed a bug where the cache wasn’t working properly.
With all those improvements we get into a situation where:
The block builder’s storage access is faster than storage access on the validator, i.e. your collator will be able to fill blocks up to their weight limit. If not, give the collator more RAM.
Now that the Weights represent validator storage access times, we were able to lower them to the point where most time is spent in compute, not storage access. For example, a benchmark that does just ERC-20 transfers now spends only 8% of its time accessing storage (down from 35%). Given that this interaction isn’t exactly computation-heavy, I think it is fair to conclude that speeding up execution is beneficial.
It’s done off-chain, and it’s not far off: the Reth team has already done it on Ethereum mainnet, and it worked. We don’t need proving. Each node simply JIT/AOT-compiles the contract off-chain.
In Polkadot we have had a huge misunderstanding about on-chain smart contract bytecode. The current view is that an on-chain JIT/AOT recompiler faces severe restrictions: it must be really fast, and it must avoid JIT bombs. This leaves us with a single-pass recompiler that cannot do much optimization, so the bytecode must be easy to recompile and map as closely to native as possible. That’s why we thought PolkaVM was the silver bullet.
Several months ago I talked briefly with the Nervos team (who did RISC-V well before us) and also looked in detail at Reth’s work and their revmc. We are not necessarily right about PolkaVM, and there are alternative approaches: compilation only needs to happen once, and it can be done incrementally in a background thread.
It’s not that complicated. We simply use PolkaVM for pure precompiles, because based on our benchmarks that’s the only place where it may be beneficial. In all other situations we recommend deploying EVM bytecode.
Ok, bear with me. I am really trying to steel-man you here. I assume what you are trying to say is: validators should compile all contracts on startup and cache the result, similar to what we are doing with runtimes right now.
While possible, this is already a source of headaches for runtimes: compilation times are unbounded and subjective, and disputes need to be resolved at the protocol level. It is not something a parachain is simply free to do for contracts.
Another issue is that validators just don’t have the bytecode on startup. They are stateless and receive the code in the PoV, since it lives on the parachain. They simply can’t compile ahead of time. This fact is the whole reason for the invention of PolkaVM: having a fast O(n) compiler that produces code on par with an optimizing compiler.
Any links?
What do you mean by “recommend”? I think your whole idea is to transparently convert contracts to PVM off-chain somehow.
No, not compile on startup: compile in the background. This is what makes compilation time unimportant. If a compiled artifact is not available yet, interpret the bytecode.
I know of at least one Ethereum L2 that is already doing this and has launched a testnet.
Yes, that’s an architectural limitation of Polkadot. However, it does not stop other chains from doing this, and they’ll be faster than Revive.