Replica_IO - An open-source framework for building practical distributed replication mechanisms

Hey folks! :wave:

I want to make a breakthrough in designing and implementing distributed protocols. To that end, I’m bootstrapping Replica_IO, an open-source project to develop a well-supported and widely used state-of-the-art framework for building distributed replication mechanisms.

I believe this is something Substrate (and possibly other projects in the Polkadot ecosystem) would benefit from. WDYT?

I also thought you might be interested in my recent long-read post sharing observations and conclusions from exploring 14 code bases of some notable distributed protocol implementations: On Implementation of Distributed Protocols | Replica_IO

It’s always nice to have broader knowledge, but you’re kinda all over the place there, with none of the messy details which actually make things happen. Ironically, I noticed “Reality has a surprising amount of detail” today via Reality has a surprising amount of detail | MetaFilter

It’s maybe worth focusing on something narrower?

We do not particularly like libp2p, but we’ve worked through it, and now we have litep2p, which does less but does it better. It trashes all the pointless high-level protocols like pubsub.

QUIC is very cool, but the messy details really eat you. As I understand it, nobody outside Google & Apple has landed performant implementations or deployments. Parity dropped work on it after considerable investment. Tor similarly.

It’s interesting that anemo uses quinn, since I wouldn’t have expected performant code on quinn, despite the quinn authors being good. Interesting questions: How good is anemo, relative both to litep2p and to what Google & Apple claim for QUIC in browsers? If good, then how complex would porting substrate to anemo be?

I’ll hazard a wild guess that anemo has nothing like the performance Google & Apple achieve from QUIC, but it’d be impressive even if it’s merely not a pessimization relative to libp2p.

QUIC can only be fundamentally better than the TCP+Yamux stack that Substrate uses. The fact that it uses UDP rather than TCP unlocks some optimizations that are hard to use, but even more important is the fact that we can get rid of Yamux, the multiplexing protocol.
It is fundamentally not possible to write a correct multiplexing protocol on top of TCP, as demonstrated by HTTP/2’s failure. The multiplexing has to be baked into the lower-level protocol.
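
One commonly cited reason for this is head-of-line blocking, though the post may have other issues in mind as well. The toy model below is only a sketch of that one effect, not a real TCP, QUIC, or Yamux implementation; `Packet`, `deliver_single_ordered`, and `deliver_per_stream` are hypothetical names made up for the illustration. It compares when each packet can be handed to the application if ordering is enforced across the whole connection (TCP-like) versus only within each logical stream (QUIC-like).

```python
from typing import NamedTuple


class Packet(NamedTuple):
    seq: int      # position in the connection-wide send order
    stream: str   # logical stream the payload belongs to
    arrival: int  # tick at which the packet finally reaches the receiver


def deliver_single_ordered(packets):
    """TCP-like: one ordered byte stream, so a gap delays everything after it."""
    delivered = {}
    ready_at = 0
    for p in sorted(packets, key=lambda p: p.seq):
        # A packet cannot be delivered before every earlier packet has arrived.
        ready_at = max(ready_at, p.arrival)
        delivered[p.seq] = ready_at
    return delivered


def deliver_per_stream(packets):
    """QUIC-like: ordering is enforced only within each logical stream."""
    delivered = {}
    ready_at = {}
    for p in sorted(packets, key=lambda p: p.seq):
        # Only earlier packets of the same stream can hold this one back.
        ready_at[p.stream] = max(ready_at.get(p.stream, 0), p.arrival)
        delivered[p.seq] = ready_at[p.stream]
    return delivered


# Assumed scenario: packet 1 (stream A) is lost once and its
# retransmission only arrives at tick 10.
packets = [
    Packet(0, "A", 1),
    Packet(1, "A", 10),
    Packet(2, "B", 3),
    Packet(3, "B", 4),
    Packet(4, "A", 5),
]

for seq, tick in deliver_single_ordered(packets).items():
    print("single order:", seq, "delivered at tick", tick)
for seq, tick in deliver_per_stream(packets).items():
    print("per stream:  ", seq, "delivered at tick", tick)
```

In this scenario the stream B packets are held until tick 10 in the single-order model even though they arrived at ticks 3 and 4, while the per-stream model delivers them on arrival. A multiplexer layered on top of TCP sees only the already-ordered byte stream, so it has no way to recover that difference.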

I don’t know about Tor (I imagine that they have specific security-related challenges), but just because Parity dropped work on it doesn’t make it QUIC’s fault, and as far as I know most newly developed protocols are rightfully built on top of QUIC.

Litep2p is still the same thing as libp2p, just a different implementation.

I’m personally extremely skeptical of the claim that the performance improvements that litep2p brings couldn’t also have been put into libp2p. The fact that a different implementation has been written is more a political decision (we don’t control libp2p) than a technical decision.

The suggestion that Substrate could be ported to use Anemo, whatever Anemo is, is in my opinion extremely misguided. While the libp2p protocol has several flaws, it’s mostly fine.
Even if Anemo (or whatever other library) was five times faster than libp2p, there’s nothing in libp2p that prevents achieving the same speed.
The question is not whether Substrate should use Anemo or libp2p, but: do you want to spend manpower porting Substrate to Anemo, or do you want to use that same manpower to improve libp2p?
While the first option is probably very seductive in the eyes of an MBA person, the latter is significantly less effort and doesn’t bring any breaking change.

This sounds mostly fair, in that Tor works in a different model, and gnunet is weird.

Anyways, my point was: p2p people keep saying QUIC is too difficult for their budget. Yes, these sorts of things change once someone changes them. If anemo changed this, then that’s interesting. It’s maybe not even recognizable as p2p, like anemo & sui do not obviously have a DHT.

This maybe answers my “how complex” question above. Note it was a question, not a suggestion.

As an aside, I’ve always been interested in UDP primarily because it impacts how approval assignment announcements escape under nasty DoS attacks. TCP streams have the somewhat okay property that DoS attackers must silence the whole connection. I’m pretty sure QUIC could deliver better assurances though, in that “this spam-worthy non-stream packet arrives with high probability”, even under DoS. A naive switch of polkadot from TCP to UDP would probably harm soundness, in that the approval announcement would spin up a stream.

Not sure why this discussion ended up revolving around the networking layer (libp2p, TCP, QUIC, etc.). It’s not the point of Replica_IO to implement P2P communication; it’s about developing a framework for implementing distributed protocols like consensus. It would be rather agnostic to the underlying networking stack, while trying to take advantage of the useful properties provided there.

Doing less but better is actually my way to go. I believe the messy details that make things happen only make sense in larger contexts. That’s why I’m broadly exploring the state of the art before dwelling on those details. The approach this project is taking is iterative: starting small and growing strong. The first thing I want to do after the initial exploration stage is finished is come up with some structure and notation for implementing distributed protocols that would support composability in concurrency and communication mechanisms, as well as controlling nondeterminism.
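
To make “agnostic to the networking stack” and “controlling nondeterminism” a bit more concrete, here is a minimal sketch of one common way to structure protocol logic: a deterministic state machine that consumes events and returns actions, with no sockets or timers inside it. This is not Replica_IO’s API or design; `Ack`, `Commit`, `QuorumCollector`, and `run` are hypothetical names invented for the illustration, and the majority rule is an assumed toy protocol.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ack:
    """Input event (hypothetical): node `sender` acknowledges `value`."""
    sender: int
    value: str


@dataclass(frozen=True)
class Commit:
    """Output action (hypothetical): `value` reached a majority."""
    value: str


@dataclass
class QuorumCollector:
    """Toy protocol core: commit a value once a majority of n nodes acked it."""
    n: int
    acks: dict = field(default_factory=dict)  # value -> set of sender ids
    committed: bool = False

    def step(self, event):
        """Deterministic transition: consume one event, return a list of actions."""
        if self.committed:
            return []
        senders = self.acks.setdefault(event.value, set())
        senders.add(event.sender)  # duplicate acks from one sender count once
        if len(senders) > self.n // 2:
            self.committed = True
            return [Commit(event.value)]
        return []


def run(machine, events):
    """Feed events in an explicit order; the order is the only nondeterminism."""
    actions = []
    for event in events:
        actions.extend(machine.step(event))
    return actions


events = [Ack(0, "x"), Ack(0, "x"), Ack(1, "x"), Ack(2, "x")]
print(run(QuorumCollector(n=3), events))
```

Because the machine never touches the network, the same logic could sit behind libp2p, QUIC, or an in-memory test harness, and replaying a recorded event order reproduces the same actions.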