I looked through all the pallets in FRAME and didn’t find anything similar, so I’m proposing to build a dynamic blob storage pallet that would allow low-level byte modifications.
Motivation:
- Allow more useful user-facing applications to be built using the blockchain as storage
- Offload the complex details of low-level storage management to third-party libraries (e.g. SQLite) and to app developers
- Bring in more developers who are familiar with traditional Web2 technologies to build on Polkadot
- Minimize the application developer’s interaction with the pallet: only two calls, one to change the data and one to delete a blob
Another proposal I’ve seen suggests storing binary blobs off-chain so as not to bloat the blockchain.
The approach presented here is simple and could be implemented quickly with what we already have, in a month or two.
Assuming there will not be many users at the beginning, developers could already start building useful applications, in effect stating “blockchain is not just for DeFi, money operations and memecoins”, while a more sophisticated storage solution is eventually built.
The pallet would have one storage map:
    #[pallet::storage]
    pub type Storage<T: Config> = StorageDoubleMap<
        _,
        Twox64Concat,
        String, // user account + blob name hash
        Twox64Concat,
        u64, // page id
        (Consideration, BoundedVec<u8, T::PageSize>),
    >;
- There could also be a storage map for permissions, listing other accounts that are permitted to modify the blob state.
- We would hold a Consideration for each page, so the parachain token has utility.
We could provide these pallet calls for the user:
    pub fn mutate_blob(
        origin: OriginFor<T>,
        blob_name: String,
        encoded_instructions: Vec<UpdateInstruction>,
    ) -> DispatchResult
The mutate_blob call would receive encoded instructions for updating the state, expressed with this update instruction enum:
    enum UpdateInstruction {
        WritePages {
            start_page: u64,
            pages: BoundedVec<
                BoundedVec<u8, T::PageSize>,
                T::MaxPageUpdates
            >,
        },
        DeletePages {
            start_page: u64,
            end_page: u64,
        },
    }
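To make the intended semantics concrete, here is a minimal in-memory sketch of applying such instructions to one blob. It is not pallet code: `Blob`, `apply`, and the fixed `PAGE_SIZE` are hypothetical stand-ins, plain `Vec`s replace the bounded types, and treating `end_page` as exclusive is an assumption the proposal leaves open.

```rust
use std::collections::BTreeMap;

/// Hypothetical fixed page size for this sketch (the pallet would use `T::PageSize`).
const PAGE_SIZE: usize = 4096;

/// Simplified stand-in for the proposed `UpdateInstruction` enum.
#[derive(Debug, Clone)]
enum UpdateInstruction {
    WritePages { start_page: u64, pages: Vec<Vec<u8>> },
    DeletePages { start_page: u64, end_page: u64 },
}

/// In-memory stand-in for one blob's pages (page id -> bytes).
type Blob = BTreeMap<u64, Vec<u8>>;

/// Applies instructions in order. Everything is validated first, so a bad
/// instruction cannot leave a half-applied update behind.
fn apply(blob: &mut Blob, instructions: &[UpdateInstruction]) -> Result<(), String> {
    for ins in instructions {
        match ins {
            UpdateInstruction::WritePages { start_page, pages } => {
                if pages.iter().any(|p| p.len() > PAGE_SIZE) {
                    return Err("page exceeds PAGE_SIZE".to_string());
                }
                start_page
                    .checked_add(pages.len() as u64)
                    .ok_or_else(|| "page id overflow".to_string())?;
            }
            UpdateInstruction::DeletePages { start_page, end_page } => {
                if start_page > end_page {
                    return Err("start_page is after end_page".to_string());
                }
            }
        }
    }
    for ins in instructions {
        match ins {
            UpdateInstruction::WritePages { start_page, pages } => {
                for (i, page) in pages.iter().enumerate() {
                    blob.insert(start_page + i as u64, page.clone());
                }
            }
            UpdateInstruction::DeletePages { start_page, end_page } => {
                // Assumption: `end_page` is exclusive.
                let ids: Vec<u64> =
                    blob.range(*start_page..*end_page).map(|(id, _)| *id).collect();
                for id in ids {
                    blob.remove(&id);
                }
            }
        }
    }
    Ok(())
}
```

In a real pallet the same checks would presumably be expressed through the bounded types and dispatch errors; the point here is only the order of operations.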
The other call would delete the data, allowing the user to get their consideration back for the storage used. If a blob becomes very large over time, this call may need to be made multiple times, with each call deleting up to some fixed number of pages.
    pub fn delete_blob(
        origin: OriginFor<T>,
        blob_name: String,
    ) -> DispatchResult
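The "called multiple times" behaviour could look roughly like the sketch below. It is an in-memory illustration only: `MAX_PAGES_PER_DELETE`, `DeleteOutcome`, and `delete_step` are hypothetical names, and a plain `u128` deposit stands in for the Consideration held per page.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-call limit; the pallet would take this from its `Config`.
const MAX_PAGES_PER_DELETE: usize = 2;

/// In-memory stand-in for one blob: page id -> (deposit held, page bytes).
type Blob = BTreeMap<u64, (u128, Vec<u8>)>;

/// What one `delete_blob`-style call achieved in this sketch.
#[derive(Debug, PartialEq)]
struct DeleteOutcome {
    /// Sum of the deposits released by this call.
    refunded: u128,
    /// True once no pages are left, i.e. no further call is needed.
    finished: bool,
}

/// Removes at most `MAX_PAGES_PER_DELETE` pages, lowest page ids first.
fn delete_step(blob: &mut Blob) -> DeleteOutcome {
    let ids: Vec<u64> = blob.keys().copied().take(MAX_PAGES_PER_DELETE).collect();
    let mut refunded = 0u128;
    for id in ids {
        if let Some((deposit, _)) = blob.remove(&id) {
            refunded += deposit;
        }
    }
    DeleteOutcome { refunded, finished: blob.is_empty() }
}
```

A client would keep submitting the call until `finished` is true; bounding the work per call keeps the weight of each extrinsic predictable.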
From a frontend developer’s perspective, we could write a library that fetches the current SQLite state from the blockchain. The frontend developer can then make arbitrary updates to the database and migrate SQLite schemas; all the library needs to do is detect dirty pages and send a transaction that applies the diff on the blockchain.
If a change set is too big to apply in one transaction, the user could submit partial updates with as many transactions as needed and rely on SQLite’s built-in safety guarantees not to corrupt the database. That is, as long as we write to our virtual file system in the same order SQLite does, a partially applied, uncommitted state should not corrupt the database: SQLite would discard the partial writes, and the user could try again.
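As a rough sketch of how a client library might split a large change set, the function below groups dirty pages into batches without reordering them. `DirtyPage`, `WriteBatch`, and `plan_batches` are hypothetical names, and `max_pages` stands in for a limit such as `T::MaxPageUpdates`; each batch corresponds to one WritePages instruction.

```rust
/// A dirty page captured on the client, in the order SQLite wrote it.
#[derive(Debug, Clone, PartialEq)]
struct DirtyPage {
    page_id: u64,
    data: Vec<u8>,
}

/// A run of consecutive pages, small enough for one `WritePages` instruction.
#[derive(Debug, PartialEq)]
struct WriteBatch {
    start_page: u64,
    pages: Vec<Vec<u8>>,
}

/// Groups writes into batches while preserving the original write order.
/// A page joins the current batch only if it directly follows the batch's
/// last page and the batch is not yet full; otherwise a new batch starts.
fn plan_batches(writes: &[DirtyPage], max_pages: usize) -> Vec<WriteBatch> {
    assert!(max_pages > 0, "max_pages must be positive");
    let mut batches: Vec<WriteBatch> = Vec::new();
    for w in writes {
        if let Some(last) = batches.last_mut() {
            let next_id = last.start_page + last.pages.len() as u64;
            if last.pages.len() < max_pages && next_id == w.page_id {
                last.pages.push(w.data.clone());
                continue;
            }
        }
        batches.push(WriteBatch { start_page: w.page_id, pages: vec![w.data.clone()] });
    }
    batches
}
```

Submitting the batches strictly in the returned order is what keeps the on-chain state consistent with the write order described above.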
To detect dirty pages, we could hook into a shim of the SQLite VFS to see which writes it makes to the database (see “The SQLite OS Interface or "VFS"” in the SQLite documentation).
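SQLite’s VFS write method is handed a buffer together with a byte amount and a file offset, so a shim only has to translate each byte range into page ids. The sketch below shows just that bookkeeping; `DirtyTracker` and its methods are hypothetical names and no actual VFS binding is involved.

```rust
use std::collections::BTreeSet;

/// Records which pages were touched by writes passing through a VFS shim.
struct DirtyTracker {
    page_size: u64,
    dirty: BTreeSet<u64>,
}

impl DirtyTracker {
    fn new(page_size: u64) -> Self {
        assert!(page_size > 0, "page_size must be positive");
        DirtyTracker { page_size, dirty: BTreeSet::new() }
    }

    /// Would be called from the shim's write hook with the byte range written.
    fn record_write(&mut self, offset: u64, len: u64) {
        if len == 0 {
            return;
        }
        let first = offset / self.page_size;
        let last = (offset + len - 1) / self.page_size;
        // A write that straddles a page boundary marks every page it touches.
        self.dirty.extend(first..=last);
    }

    /// Returns the dirty page ids in ascending order and resets the tracker.
    fn take_dirty(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.dirty).into_iter().collect()
    }
}
```

This version deduplicates and sorts page ids; a library that needs the exact write order for partial uploads would record an ordered log instead of a set.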
Also, because we go through the VFS interface, even if a database becomes quite large on chain, users running smoldot on the frontend can still retrieve data quickly, because SQLite would figure out the smallest set of pages it needs to read to satisfy each query.
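On the read side, the idea can be sketched as a reader that fetches only the pages a byte range touches and caches them. `PageSource`, `MemorySource`, and `PagedReader` are hypothetical; in a real library `fetch_page` would be a storage query through the light client rather than a map lookup.

```rust
use std::collections::HashMap;

/// Hypothetical source of pages, e.g. a light-client storage query.
trait PageSource {
    fn fetch_page(&mut self, page_id: u64) -> Option<Vec<u8>>;
}

/// In-memory source that counts how many pages were actually fetched.
struct MemorySource {
    pages: HashMap<u64, Vec<u8>>,
    fetches: usize,
}

impl PageSource for MemorySource {
    fn fetch_page(&mut self, page_id: u64) -> Option<Vec<u8>> {
        self.fetches += 1;
        self.pages.get(&page_id).cloned()
    }
}

/// Serves byte-range reads, fetching each needed page at most once.
struct PagedReader<S: PageSource> {
    source: S,
    page_size: usize,
    cache: HashMap<u64, Vec<u8>>,
}

impl<S: PageSource> PagedReader<S> {
    fn new(source: S, page_size: usize) -> Self {
        assert!(page_size > 0, "page_size must be positive");
        PagedReader { source, page_size, cache: HashMap::new() }
    }

    /// Reads `len` bytes starting at `offset`; missing bytes read as zero.
    fn read(&mut self, offset: u64, len: usize) -> Vec<u8> {
        let mut out = Vec::with_capacity(len);
        let end = offset + len as u64;
        let mut pos = offset;
        while pos < end {
            let page_id = pos / self.page_size as u64;
            let in_page = (pos % self.page_size as u64) as usize;
            let take = (self.page_size - in_page).min((end - pos) as usize);
            if !self.cache.contains_key(&page_id) {
                let page = self.source.fetch_page(page_id).unwrap_or_default();
                self.cache.insert(page_id, page);
            }
            let page = &self.cache[&page_id];
            for i in in_page..in_page + take {
                out.push(page.get(i).copied().unwrap_or(0));
            }
            pos += take as u64;
        }
        out
    }
}
```

Cache invalidation when the on-chain blob changes is left out here and would need its own design.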
The user would pay only for the delta in storage size, and SQLite would manage the complex logic of where to store and mutate the data, so we don’t need to provide dedicated storage items in the pallet for domain-specific tables (say, a storage item for user todos, another for user calendars, and so on). We also don’t need to worry about migrations of those items, as they would be offloaded to the dapp developer.
Of course, the library could have an option to encrypt stored data under a symmetric key that is itself encrypted with the user’s public key. The user could also choose whom to share the data with by sharing the symmetric key for the blob.
This would bring in developers who mainly know Web2 technologies and let them easily develop powerful user applications, mostly without touching the blockchain or making the user aware that the app is running on one.
With SQLite, an app developer could make drastic changes to the database state, while we don’t need to implement many different complex calls to alter user data or care about their migrations.
Eventually we could also build a deterministic SQLite layer implementing the VFS so that WebAssembly code could run queries against a database, but for now, even with SQLite running only on the frontend, this already makes many everyday apps easy to implement.
The only dependencies to change are:
- implement the dynamic storage pallet
- implement a library for using a blockchain-based SQLite database
We could have a common good parachain dedicated to this, with its own storage token.
With a storage token we could:
- Limit the entire state storage size: if we release n tokens and they are all used up for storage consideration, the blockchain state would never take more than, for example, 100GB
- Give the token utility directly related to storage. Anyone only reading the database wouldn’t even need the token.
- Gradually release more of the token if we see that users need more storage overall, expanding the total state size of the blockchain
- Grant a small amount of storage to each user once proof of personhood is implemented, so they could use apps for free within a reasonable data limit
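The cap in the first point is simple arithmetic once a deposit rule is fixed. The sketch below assumes a flat deposit per page and ignores key and metadata overhead; both functions and all numbers are illustrative assumptions, not parameters from this proposal.

```rust
/// Upper bound on state size in bytes if every issued token is locked as
/// page deposit. Assumes a flat `deposit_per_page` (hypothetical rule).
fn max_state_bytes(total_supply: u128, deposit_per_page: u128, page_size: u128) -> u128 {
    assert!(deposit_per_page > 0, "deposit_per_page must be positive");
    (total_supply / deposit_per_page) * page_size
}

/// Token supply needed so that the state can never exceed `target_bytes`
/// rounded up to whole pages.
fn supply_for_cap(target_bytes: u128, deposit_per_page: u128, page_size: u128) -> u128 {
    assert!(page_size > 0, "page_size must be positive");
    let pages = (target_bytes + page_size - 1) / page_size;
    pages * deposit_per_page
}
```

For instance, with 4096-byte pages and a deposit of one token unit per page, capping the state at 100GB (taken here as 100 × 2^30 bytes) would correspond to 26,214,400 units in circulation. Releasing more supply raises the cap proportionally.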
Examples of non-financial everyday applications, off the top of my head, that this would allow developers to build:
- Todo lists - database will be small
- Calendars - database will be small
- Weight loss, calorie counting apps - database will be small
- Search engines - SQLite has the FTS5 full-text search capability, and with appropriate indexes it would figure out the minimal pages to fetch
- SoundCloud equivalent - users could upload their own music library, readable by anyone, so someone running smoldot on their phone could fetch mp3 files stored in a SQLite database and listen to music in their car for free
Developers wouldn’t need to worry about running their own databases and servers to store data.
Other options and their drawbacks for user-facing apps:
- EVM smart contracts: they can only be deployed once, so if you build a todo list in a smart contract you can’t change it, or you need to juggle proxy contracts and so on. Plus, users must submit a transaction for every operation (add todo, remove todo) unless complex batching is implemented. Here we’d get batching for free, done by the frontend developer, who would just need to upload the database changes back to the blockchain.
- Parachains/pallets: it is not realistic to build a parachain or a pallet dedicated to a todo list. The learning curve is steep, and someone who could easily build the app with Web2 technologies would never go to such lengths as building and deploying a runtime for something as simple as a todo list. Plus, if the developer wants to change the schema of the todo list app, they’d need to know how to write FRAME migrations, which also have a steep learning curve.
- WebAssembly smart contracts: I imagine the drawbacks are similar to those of EVM smart contracts. I’m not familiar with these yet, so someone could elaborate more.