I’m looking for feedback on this proposal.
Since the last post, I’ve focused more on what governance insights are and what they’re for, and have tried to make it clearer what kinds of data the tool will ingest.
So, community, please let me know … is the proposal clear?
Can you understand from this what the tool does, why we need governance insights, and why the project and tool need to be agile and ongoing?
Squidsway Governance Report and Tool
Actionable governance insights, from a rich data chain indexer
GOVERNANCE FAILURES ARE A TREASURY ISSUE.
SQUIDSWAY WILL SOLVE THOSE FAILURES FASTER.
I want to improve Polkadot governance because I’m a cypherpunk and I think Polkadot can lead the world, not just in the governance of blockchains, but in blockchain-based governance of the offchain world.
Governance is a product on Polkadot, it’s a field we are leading in, and we should invest in growing the lead we have - make it something to showcase.
But you, dear tokenholder, should fund improving Polkadot governance because
GOVERNANCE FAILURES ARE A TREASURY ISSUE
We are iterating our processes based on assumptions, hunches and louder voices, instead of evidence.
That wastes time and costs money.
The alternative to iterating based on vibes is data.
Squidsway is a proposal to collect and compile specific bespoke data, targeted at objectively assessing how OpenGov users respond to everything we do in OpenGov - and to generate insights from these assessments, in order to inform how we continue to iterate OpenGov.
The proposal for Squidsway funds two things:
The Squidsway tool:
A chain indexer with rich data ingestion modules,
for testing and quickly iterating hypotheses and generating actionable insights about user behaviour.
The Squidsway project:
Publishing governance insight reports roughly every 3 months (and on shorter timescales case-by-case).
Continually adding modules to the tool, to support investigation through the tool, for the purpose of generating insights.
The tool will be open source, for any dev (e.g. ecosystem product teams) to use, and future work includes an LLM-based frontend for non-devs to query it.
The project will be funded by the community on an ongoing basis, so will be focused on live, open questions that the community is discussing at any given time. There will be a mechanism for the community to request data on issues of interest.
This proposal funds only the first three months. If the community likes what it sees, then subsequent proposals will fund ongoing work.
Deliverables
This first proposal is for $8k USDC, funding 80 (=40+40) hours over roughly 3 months:
the development of an MVP, followed by the first half of the validation phase.
At the end of the work funded by this proposal, the tool should consist of:
modules to:
- ingest relevant governance events from chain data
- ingest structured/quantitative offchain data (e.g. from Polkassembly)
- curate data (using queries to assign tags, e.g. “whale”, “shrimp”)
and
- an indexer capable of reindexing based on these types of data.
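To make the curation module concrete, here is a minimal sketch in Python. The thresholds, account names, and balances are invented for illustration; the real module would run queries against the indexer’s store rather than an in-memory dict.

```python
# Toy sketch of the tag-curation idea: a "query" that classifies
# accounts by balance. All thresholds and data here are illustrative,
# not part of the proposal.

WHALE_MIN = 1_000_000   # placeholder threshold, not a real cutoff
SHRIMP_MAX = 100        # placeholder threshold

def size_tag(balance: int) -> str:
    """Assign a coarse size tag from a balance snapshot."""
    if balance >= WHALE_MIN:
        return "whale"
    if balance <= SHRIMP_MAX:
        return "shrimp"
    return "mid"

def curate(balances: dict[str, int]) -> dict[str, str]:
    """Run the tagging query over every indexed account."""
    return {account: size_tag(bal) for account, bal in balances.items()}

tags = curate({"alice": 2_500_000, "bob": 40, "carol": 10_000})
# tags == {"alice": "whale", "bob": "shrimp", "carol": "mid"}
```

Because balances change over time, the real module would re-run such queries on reindex, which is why the indexer deliverable above is specified as “capable of reindexing”.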
The second proposal would fund the second half of the validation phase.
By the end of that work, I intend that the tool will be ingesting qualitative (natural language) data, and its outputs will begin to demonstrate what is possible with the tool. I should also have some basic benchmarking to flag up any feasibility questions and potential non-labour costs for the future.
Methodology
The methodology is intended to be very, very agile.
The idea of generating insights is to tell us something we didn’t know, rather than setting out to prove or disprove a pre-defined set of hypotheses.
Central to that is the ability to, in investigative terms, ‘pull on threads’ - or, in software terms, to ‘rapidly iterate’. This means that, for each sprint/each proposal, the treasury will be funding work whose exact shape cannot be known in advance.
This agile way of working is necessary because:
1 - We need to go where the evidence takes us
2 - It’s likely that many of the small technical steps that make up a milestone can only be identified once the previous step is complete, so identifying and costing out these steps in advance would either lead to wasted labour or force investigations down an inflexible path.
The risk that, in the base case of Squidsway funding referenda, the treasury is funding something unknown is mitigated by the ongoing nature of the project, and by the fact that each ‘milestone’ (i.e. funding period) is a small amount.
What kind of user behaviour are we trying to encourage?
Defining and encouraging the desired outcomes is a question for OpenGov or for the teams making use of Squidsway.
Squidsway is not the part which incentivises or encourages user behaviours
– it’s the part which identifies where the opportunities are to do that.
‘user behaviour’:
To illustrate the meaning, though: ‘user behaviours’ will generally be (individual or aggregated) measurable actions that can be taken onchain in the Polkadot ecosystem, such as voting, staking, and liquidity provision, but likely in more specific detail than this, such as “voting by pre-existing wallets that never voted before”.
‘encourage’:
Already, we seek to change user behaviours all the time - incentivising adoption and liquidity, using social norms to encourage delegation and voting, working on UX to reduce friction and using (some pretty blunt) technical instruments to encourage the adoption of procedures and norms for proposer and delegate behaviours.
The mechanisms we are using - game theory, finely targeted incentivisation and the like - are powerful but we are often applying them amateurishly, iterating our processes and mechanisms based on just guessing what works.
The idea behind Squidsway is that we encourage these kinds of user behaviours by more empirical (ie more reliable) means.
WTF is ‘rich data’ / ‘chain indexer’?
A chain indexer is a tool that indexes and stores, in greater or lesser detail, a blockchain’s data. Most relevant data in a blockchain (even data as basic as account balances) is not accessible unless you consult a node or an indexer. The RPCs under the hood of polkadot.js or your wallet software connect to full nodes, but data applications, such as most block explorers and data dashboards, use chain indexers on their backend.
Applications that process blockchain data usually index and store the information which is easiest to obtain, and when they want to combine these different data sources (such as comparing the voting frequency of wallets against those wallets’ balances), they combine already-indexed datasets. This is faster, but limits the complexity of the combination.
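The “combine already-indexed datasets” approach amounts to a simple join over two pre-built tables. A toy Python sketch (account names and numbers are made up) shows both the approach and its limitation:

```python
# Combining two already-indexed datasets: per-account vote counts
# joined against per-account balances. Fast, but the result can only
# contain whatever each index already recorded.

vote_counts = {"alice": 12, "bob": 0, "carol": 3}   # from a votes index
balances = {"alice": 2_500_000, "bob": 40}           # from a balances index

combined = [
    {"account": a, "votes": v, "balance": balances.get(a)}
    for a, v in vote_counts.items()
]
# carol's balance comes back as None: a gap in one index
# propagates into every combination built on top of it.
```

This is why post-hoc joins are cheap but shallow: any question the original indexes didn’t anticipate leaves holes like carol’s.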
More complex data applications, such as Chainalysis’s, perform some degree of multi-step indexing, allowing them to retrieve additional data during index time and treat their datasets as graph data (meaning the indexer can follow trails at index time).
The Squidsway tool takes this a couple of steps further with what I’ll call ‘compiled’ and ‘curated’ data.
compiled data is just data that has been indexed and combined through multi-step context-aware indexing. It could be, for example, “average conviction” for each account (across the accounts’ lifetimes), or “voted on with higher/ lower/ usual conviction” for each proposal.
curated data uses tags for fast reindexing of commonly used conclusions - for example, accounts could be tagged with categories from “whale” to “shrimp” despite the fact that account balances change over time.
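A toy Python sketch of both ideas together, using the examples above (the vote events, field names and tag labels are invented for illustration; real conviction values and event shapes would come from the chain):

```python
# 'Compiled' data: average conviction per account across its lifetime,
# built in a second pass over raw vote events.
# 'Curated' data: a reusable tag per vote ("higher"/"lower"/"usual"
# conviction relative to that account's average).
from collections import defaultdict

votes = [  # (account, proposal, conviction) - toy indexed vote events
    ("alice", 1, 1), ("alice", 2, 6), ("alice", 3, 2),
    ("bob", 1, 3),
]

totals, counts = defaultdict(int), defaultdict(int)
for account, _, conviction in votes:
    totals[account] += conviction
    counts[account] += 1
avg_conviction = {a: totals[a] / counts[a] for a in totals}  # compiled

def conviction_tag(account: str, conviction: int) -> str:
    """Curated tag: how this vote compares to the account's average."""
    avg = avg_conviction[account]
    if conviction > avg:
        return "higher"
    if conviction < avg:
        return "lower"
    return "usual"

tagged = [(a, p, conviction_tag(a, c)) for a, p, c in votes]  # curated
```

The compiled field requires a full pass over history before any vote can be tagged, which is exactly the multi-step, reindex-friendly pattern the tool is built around.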
In addition to these, the third and most powerful kind of rich data the Squidsway tool will index is
offchain data
Since the tool will reindex multiple times, there is less need for its data sources to be fast.
This opens up the possibility to make use of (API-based) web data and, at a higher processing cost, scraped web data and LLM outputs.
For example, the tool will ingest discussions from Polkassembly / Subsquare / the Polkadot Forum and process the natural language in those discussions in order to generate tags for sentiment, contentiousness, compliance with each norm, etc.
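To show the shape of that output, here is a toy Python stand-in for the NLP step. In practice this would be an LLM call; the keyword lists, labels, and the contentiousness measure below are all invented for illustration.

```python
# Toy stand-in for NLP-based discussion tagging. A real pipeline would
# use an LLM; a keyword heuristic just demonstrates the tag output.

NEGATIVE = {"scam", "waste", "overpriced", "nack"}   # illustrative terms
POSITIVE = {"support", "aye", "valuable", "great"}   # illustrative terms

def sentiment_tag(comment: str) -> str:
    """Tag one comment as positive/negative/neutral."""
    words = set(comment.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def contentiousness(comments: list[str]) -> float:
    """Fraction of comments whose sentiment disagrees with the majority."""
    tags = [sentiment_tag(c) for c in comments]
    majority = max(set(tags), key=tags.count)
    return sum(t != majority for t in tags) / len(tags)
```

Per-comment tags like these can then be aggregated per referendum, which is what would let proposers see, before posting, which discussion patterns historically preceded failure.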
I hope that this particular feature will help proposers avoid creating proposals that fail for predictable reasons, and create a healthier environment in online governance discussions in general.