Smoldot updates threads

tomaka · January 17, 2024, 9:31am

Additionally, I wanted to give a quick project update.
Here is a list of all the main areas of the code of smoldot, and what I think of them:

Trie: Everything regarding encoding/decoding trie nodes or trie proofs is more or less fleshed out. Two things remain to be done: splitting the proof decoding code into multiple steps in order to avoid freezing the browser if a big proof needs to be decoded, and implementing compact trie proofs (which is low priority as, if I’m not mistaken, they are a parachain implementation detail).
Wasm execution: Executing the runtime, including the changes overlay, is in pretty good shape. A lot of tests are unfortunately missing because it is often not clear what the expected behavior is. One change that remains to be done is to provide the storage changes made during execution in a streaming way rather than buffering them in memory, in order to avoid being limited by memory in case a block performs a lot of storage changes.
Syncing: “Syncing” is what we call the algorithm that synchronizes the local chain with the one of the other nodes we’re connected to (including warp syncing). The main difficulty is to avoid attacks of all kinds. While the code is relatively robust, it is in the middle of a refactoring that would allow a warp syncing to be done if it turns out that we are far behind the head of the chain, which is important for example in case of a netsplit. I am also unsatisfied with the API of the syncing code, as this API causes a few pieces of code to be O(n) (n being the number of peers) or even O(m*n) (number of peers and number of non-finalized blocks).
Low-level networking: This covers encryption, multiplexing substreams over a single connection, interacting with the operating system, etc. I personally strongly dislike the “idiomatic” Rust way of doing things through the AsyncRead and AsyncWrite traits. I have created a “read-write” API that requires way fewer copies. This system works well now, but I’m still a bit conflicted over some very small details of this API. Apart from this, smoldot doesn’t support clean connection shutdowns. Disconnecting from a peer is always abrupt (through a TCP RST for example). Given that Substrate doesn’t do clean disconnections either anyway, I consider this low priority.
High-level networking: This covers choosing which peers to connect to, a ban system for misbehaving nodes, sending requests and trying again if peers are unresponsive, and so on. After a lot of trial and error, I believe that the high-level networking is finally in a pretty good state. Given that it has recently been refactored, some debug assertions are unfortunately triggering every now and then, but they’re all slowly but surely getting fixed. Both the full node and light client should be able to properly recover if Internet connectivity is lost, which for a long time wasn’t the case. Some small features are missing, such as sending periodic identify requests to peers for debugging purposes. Note that none of the parachain-related networking is implemented. Kademlia is also not completely implemented, as we only use it for discovering other nodes, and implementing it fully will be necessary if the smoldot full node is to ever become production-ready.
Chain spec, light client checkpoint, etc.: The existing “checkpoint” and “database” system of the light client will be reworked so that you can instead ask smoldot to send back an updated version of the chain specification. This system is overall simpler, as the only concept remaining would be chain specs, and you’re just manipulating chain specs.
JSON-RPC server: I am overall pretty dissatisfied with the code quality of the JSON-RPC servers of both the full node and light client. This code has gone through several refactorings, and I’ve had trouble finding a code design that leads to simple-to-read code. My latest attempt at simplifying the code is to completely split the code that answers requests from the code that deals with potential attacks, so that the code that answers requests doesn’t have to deal with that. This change is in progress.
Light client transactions pool: The code that validates then tracks transactions is way more complicated than one might initially think. Some very rare corner cases are unfortunately not handled properly, but overall the code is pretty robust. Proper tests are missing.
Full node database: The full node uses an SQLite database. The code is missing two important features: proper blocks pruning (i.e. removing old blocks and storage that we no longer need in order to save space), and an in-memory cache. The lack of in-memory cache unfortunately makes the full node so slow that it is unusable right now.
Light client and full node in general: I’ve been working on unifying all the light client and full node code to use the same code paradigm, which I find easy to read. I’ve also been working on making the light client more robust to internal panics by restarting services that crash. This of course needs to be done carefully in order to not break any logic. I am for the moment not doing this for the full node, as I think that it makes more sense for the full node to simply crash given that it is connected to a single chain, as opposed to the light client which can be connected to many different chains at once. The light client is also waiting for a refactoring: at the moment, adding a chain performs a lot of CPU-heavy operations synchronously, while it would be better to perform this in a background task and make the Client object a simple wrapper that sends messages.
JavaScript code on top of the light client: The JavaScript code that actually provides the API to users of smoldot is overall in good shape, and there’s not much to say about it.
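
To illustrate the light client refactoring mentioned in the list above (moving the CPU-heavy work of adding a chain into a background task and turning the Client object into a wrapper that only sends messages), here is a minimal sketch. It is not smoldot's actual code: the `ToBackground` enum, the `add_chain` signature, the integer chain ids, and the use of a plain OS thread with `std::sync::mpsc` instead of an async task are all assumptions made for the example.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Messages sent from the `Client` handle to the background task.
/// Hypothetical names, chosen for this sketch only.
enum ToBackground {
    AddChain { chain_spec: String, reply: Sender<usize> },
    Shutdown,
}

/// Thin handle: its methods only send messages and never do heavy work.
pub struct Client {
    to_background: Sender<ToBackground>,
    worker: Option<thread::JoinHandle<()>>,
}

impl Client {
    pub fn new() -> Self {
        let (to_background, from_client) = mpsc::channel::<ToBackground>();
        let worker = thread::spawn(move || {
            // State owned exclusively by the background task.
            let mut chains: Vec<String> = Vec::new();
            for message in from_client {
                match message {
                    ToBackground::AddChain { chain_spec, reply } => {
                        // The CPU-heavy part (e.g. parsing the chain spec)
                        // would happen here, off the caller's thread.
                        chains.push(chain_spec);
                        // The caller may have dropped the receiver; ignore that.
                        let _ = reply.send(chains.len() - 1);
                    }
                    ToBackground::Shutdown => break,
                }
            }
        });
        Client { to_background, worker: Some(worker) }
    }

    /// Returns immediately; the chain id arrives later on the receiver.
    pub fn add_chain(&self, chain_spec: &str) -> Receiver<usize> {
        let (reply, result) = mpsc::channel();
        self.to_background
            .send(ToBackground::AddChain { chain_spec: chain_spec.to_owned(), reply })
            .expect("background task has stopped");
        result
    }
}

impl Drop for Client {
    fn drop(&mut self) {
        // Ask the background task to stop, then wait for it to finish.
        let _ = self.to_background.send(ToBackground::Shutdown);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

With this shape, calling `client.add_chain(spec)` never blocks on the heavy work; the caller waits on the returned receiver only when it actually needs the chain id.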

tomaka · March 18, 2024, 7:50am

After some reflection time and due to long-standing tensions, I’ve decided to reduce my involvement in Polkadot and focus my time more on a personal (non-blockchain-related) project.
This means, concretely, that I’m not going to continue the work on the smoldot full node, or very little, and not spend my energy on RFCs or similar.

As explained in the post just above, the smoldot light client is mostly feature-complete, and mostly needs bugfixes, performance improvements, and maintenance. Keeping up with the changes to the Polkadot protocol is fortunately (or unfortunately, depending on your point of view) relatively low effort.
One exception right now is BEEFY, which I haven’t started implementing at all, but most of the code paths should be relatively similar to the ones of GRANDPA.

Overall, my plan is to sketch a finish line and slowly but surely turn the smoldot light client into a completely finished, polished product.

bkchr · March 19, 2024, 1:40pm

Sad to hear!

Does this mean you will abandon smoldot completely after this point? Or will you still be around for maintenance etc? I’m asking this because we otherwise need to start looking for someone that could take on the work.

tomaka · March 19, 2024, 3:14pm

What I want to stop doing is push for changes in general, in other words opening RFCs, pushing for PRs in the Polkadot SDK, and so on, because the impression of constantly swimming against the current in an environment where nobody wants to take responsibility for anything is wearing me down too much.

Note that I’ve never liked doing this kind of design work anyway, I’ve always been doing it by necessity, because nobody else does it. If you find someone who is willing to do this design work of improving the light client protocol (and pushing for it to be implemented), I’m completely in favor of this and I’ve always been completely in favor of delegating this.

Similarly, writing a full node without a precise spec and without the guarantee that the protocol isn’t going to silently change (there are still protocol-breaking PRs periodically being merged without an RFC, not to mention all the development happening behind Parity’s closed doors) is too much cognitive effort for one person, which is why I’m abandoning it.

What I want to do is simply return to writing code without depending too much on other people, which is the passive tragedy-of-the-commons attitude that the overwhelming majority of core devs follow.
