Yeah, and one extra thing worth highlighting here: compared with wasmi as we currently use it, it's faster than the numbers in my original post may imply. I was running wasmi natively for my benchmarks, while we're currently running wasmi under wasmtime, and wasmi under wasmtime can be up to twice as slow as wasmi on bare metal (see here for some numbers: https://github.com/paritytech/substrate/pull/12173#issuecomment-1240466579).
Yes, but that’s code optimized for running on RISC-V hardware, not necessarily for being recompiled to something else. (Although you’re right that it might also be an issue of a less mature LLVM backend.)
It’s possible we could still improve the performance if we extended it with e.g. some higher-level meta instructions that would be just as easy to JIT (or even easier) but would make it possible to emit more optimal host code. (But again, I don’t know by how much; I’d have to investigate why the JITted RISC-V code runs slower than the native code. In theory it should be possible to get them closer performance-wise.)
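
To make the idea a bit more concrete, here’s a minimal Rust sketch (entirely hypothetical; this is not PolkaVM’s actual instruction set, JIT, or API): a tiny guest VM with a few RISC-V-like base instructions plus one higher-level `MemFill` meta instruction. The point is that a single meta instruction can be lowered to one optimized host operation, whereas the equivalent guest loop of base instructions would have to be translated and executed instruction by instruction.

```rust
// Hypothetical sketch only: RISC-V-like base ops plus one "meta" instruction.
#[derive(Clone, Copy)]
enum Inst {
    // Base instructions (operands are register indices into `regs`).
    LoadImm { dst: usize, value: u64 },
    Add { dst: usize, a: usize, b: usize },
    StoreU8 { addr_reg: usize, src_reg: usize },
    // Meta instruction: fill `len` bytes at the address in `addr_reg`.
    // A guest loop of LoadImm/Add/StoreU8 could express the same thing,
    // but this form can be lowered to a single optimized host operation.
    MemFill { addr_reg: usize, value: u8, len: usize },
}

struct Vm {
    regs: [u64; 16],
    memory: Vec<u8>,
}

impl Vm {
    fn new(memory_size: usize) -> Self {
        Vm { regs: [0; 16], memory: vec![0; memory_size] }
    }

    // Stand-in for the recompiler/JIT: the interesting part is that MemFill
    // maps onto one host-level call instead of a per-byte guest loop.
    fn run(&mut self, code: &[Inst]) {
        for inst in code {
            match *inst {
                Inst::LoadImm { dst, value } => self.regs[dst] = value,
                Inst::Add { dst, a, b } => {
                    self.regs[dst] = self.regs[a].wrapping_add(self.regs[b]);
                }
                Inst::StoreU8 { addr_reg, src_reg } => {
                    let addr = self.regs[addr_reg] as usize;
                    self.memory[addr] = self.regs[src_reg] as u8;
                }
                Inst::MemFill { addr_reg, value, len } => {
                    let addr = self.regs[addr_reg] as usize;
                    // One meta instruction -> one optimized host operation.
                    self.memory[addr..addr + len].fill(value);
                }
            }
        }
    }
}

fn main() {
    let mut vm = Vm::new(1024);
    let code = [
        // Base ops: compute the destination address 64 in r2, store one byte.
        Inst::LoadImm { dst: 0, value: 60 },
        Inst::LoadImm { dst: 1, value: 4 },
        Inst::Add { dst: 2, a: 0, b: 1 }, // r2 = 64
        Inst::LoadImm { dst: 3, value: 0xAA },
        Inst::StoreU8 { addr_reg: 2, src_reg: 3 },
        // Meta op: fill 256 bytes at r2 in a single host operation.
        Inst::MemFill { addr_reg: 2, value: 0xAA, len: 256 },
    ];
    vm.run(&code);
    assert!(vm.memory[64..320].iter().all(|&b| b == 0xAA));
    println!("filled 256 bytes");
}
```

The hoped-for win is the same in a real recompiler: for a meta instruction like this the backend could emit a single call to an optimized host `memset`/`memcpy` (or a vectorized loop), whereas the equivalent loop of base instructions would be translated store-by-store. Whether that actually closes much of the gap to native is exactly the part I’d still have to measure.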