A little background: I’ve mostly done full-stack web app development in my career, but I took a detour in 2014-ish to work at a startup called FoundationDB, doing feature development on distributed database systems with ACID properties. (The technology still exists; the company was acquired by Apple in 2015 and the project was later re-open sourced, and can be found here.)
Recently, I’ve been working myself up to creating decentralized apps on Polkadot as fun side projects, and I started thinking about web app development generally vis-à-vis blockchain and what the constraints are. Usually when I start a side project, I begin with the data model and then do a little guesswork about what kind of DBMS I will want for it. Often the answer has fallen somewhere between a relational database and a KV-store, or some combination of those ideas. I’ve even seen companies use Elasticsearch as a primary database. For Substrate, it’s a mixed bag depending on whether you want to go with a smart contract-based app or build your own collator. I imagine if you built your own collator/parachain then you could model the data it manages any way you like, but there’s no sort of traditional relational database built on top of this technology yet (right?).
Recently, I’ve seen TPS measurements over 100k for Kusama during the Spammening with only a fraction of the “cores” in use. But finality still takes on the order of seconds, so effectively writes take as long as finality takes (?). If I remember it right, FoundationDB was able to do millions of TPS with a mixed load of reads and writes on “commodity hardware” with ACID properties. It did this using a distributed (but not decentralized) network of nodes with a variety of roles that changed over time depending on network conditions, which ultimately meant that outages could occur randomly and the system would still be able to handle ACID transactions until the bitter end. I remember being impressed by that, but this is different from operating in an environment where some of the nodes could be bad actors, and I recognize that this is the innovation of blockchain and what makes those systems decentralized rather than merely distributed. It’s “easy” to get millions of TPS when trust isn’t a concern. (edit: an overview of the FDB technology can be found here for anyone interested)
That said, one of the things that FoundationDB was trying to achieve was to separate the storage from the data model. In other words, they had independent server processes that acted as layers (SQL, Document, Tuple, etc.) and talked over TCP to the distributed KV-store that served as a foundation (hence the name). The Document layer, for example, was a server process that acted as a drop-in replacement for MongoDB: you could interact with it over Mongo’s wire protocol using the existing CLI or other client libraries, but ultimately those reads and writes were made to the KV-store with ACID transactions, something Mongo couldn’t do at the time (still doesn’t?).
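To make the "layer" idea a bit more concrete, here is a minimal sketch in Python. It is not FoundationDB's API or the real Document layer; `KVStore`, `DocumentLayer`, and the `(collection, doc_id, field)` key layout are all names I made up for illustration. The point is only that a document model can be flattened into ordered key-value pairs and written as one atomic batch.

```python
import json


class KVStore:
    """Toy in-memory stand-in for an ordered KV-store (hypothetical, not FDB's API)."""

    def __init__(self):
        self._data = {}

    def commit(self, writes):
        # Stage the whole batch on a copy, then swap it in, so the batch is
        # applied all at once. A real store would also detect conflicts and
        # make the batch durable.
        staged = dict(self._data)
        for key, value in writes.items():
            if value is None:
                staged.pop(key, None)  # None marks a delete
            else:
                staged[key] = value
        self._data = staged

    def get_range(self, prefix):
        # Return (key, value) pairs whose tuple key starts with `prefix`,
        # ordered by key.
        n = len(prefix)
        rows = [(k, v) for k, v in self._data.items() if k[:n] == prefix]
        return sorted(rows, key=lambda row: row[0])


class DocumentLayer:
    """Maps documents onto KV pairs: one pair per top-level field."""

    def __init__(self, store):
        self.store = store

    def insert(self, collection, doc_id, doc):
        # Key layout (an assumption for this sketch): (collection, doc_id, field)
        writes = {
            (collection, doc_id, field): json.dumps(value)
            for field, value in doc.items()
        }
        self.store.commit(writes)

    def find_one(self, collection, doc_id):
        rows = self.store.get_range((collection, doc_id))
        if not rows:
            return None
        return {key[2]: json.loads(value) for key, value in rows}


store = KVStore()
layer = DocumentLayer(store)
layer.insert("users", "u1", {"name": "Ada", "age": 36})
doc = layer.find_one("users", "u1")
```

The layer itself holds no state, so in principle you could swap the toy store for anything that offers ordered keys and atomic batches, which is roughly the property I'm wondering about for a chain.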
So I’ve been thinking about this “layers” idea and how it could relate to Polkadot/Substrate/JAM. What I’ve seen for data management in the blockchain world is mostly IPFS and its ilk, which is more or less a replacement for S3, but not a database. And I keep coming back to the same question: is blockchain a database? It’s data stored in a linked-list-like structure, and the writes are basically ACID, but what’s missing I guess is all the indexing that makes querying faster. If I imagine a situation where we simply treat the data on-chain as the single source of truth, and then create indexes to make that data more accessible/queryable, is that all that’s required for a traditional full-stack app experience? Would it be possible to create a layered architecture like FoundationDB on top of blockchain, where the blockchain is the KV-store? And then developers could just use existing tools (psql, mongo, etc.) to create apps on top of the blockchain?
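Here is a rough sketch of what I mean by "chain as source of truth, indexes on the side". Everything in it is hypothetical: `BLOCKS` is hand-written data standing in for finalized blocks, and `build_index` and `total_sent` are names I invented. It uses Python's built-in `sqlite3` only so that the derived index can be queried with plain SQL.

```python
import sqlite3

# Hypothetical, hand-written blocks standing in for finalized chain data.
BLOCKS = [
    {"number": 1, "transfers": [("alice", "bob", 5)]},
    {"number": 2, "transfers": [("bob", "carol", 2), ("alice", "carol", 1)]},
]


def build_index(blocks):
    """Replay blocks in order into an in-memory SQL table with a secondary index."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE transfers "
        "(block INTEGER, sender TEXT, recipient TEXT, amount INTEGER)"
    )
    db.execute("CREATE INDEX by_sender ON transfers (sender)")
    for block in blocks:
        db.executemany(
            "INSERT INTO transfers VALUES (?, ?, ?, ?)",
            [(block["number"], s, r, a) for s, r, a in block["transfers"]],
        )
    db.commit()
    return db


def total_sent(db, sender):
    """Sum of all amounts sent by `sender`, or 0 if they never sent anything."""
    row = db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM transfers WHERE sender = ?",
        (sender,),
    ).fetchone()
    return row[0]


db = build_index(BLOCKS)
alice_total = total_sent(db, "alice")
```

In this picture the SQL table is disposable: it can be dropped and rebuilt by replaying the chain, and writes would still have to go through the chain rather than through the index.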
If any of this sounds uninformed, it’s because it is. I’m still a beginner when it comes to the technical side of blockchain and Polkadot/Substrate. I keep meaning to dive deeper, but I have had lots of distractions. Also, my role at FoundationDB didn’t require me to know that much about distributed systems. While I was there, I was mainly involved with translating the Mongo wire protocol and with correctness testing of the Document layer.