Decentralized Voices Cohort 5 - Light track - Karim/Cybergov

I am applying to serve as a Decentralized Voices Light Guardian for Cohort 5. This is not a typical application, however: I am not asking you to delegate to my personal wisdom.

Instead, I propose a social experiment: I will act as the operator for MAGI-V0, an open-source, semi-deterministic AI system I am building to vote on Polkadot OpenGov referenda. The goal is to test whether a well-designed, LLM-assisted system can serve as a valuable “cognitive prosthetic” for the ecosystem, scaling expert analysis and providing a new model for transparent governance.

This is a chance for Polkadot to pioneer the responsible integration of AI into decentralized governance, not as an overlord, but as a transparent and tireless public servant.

The MAGI-V0 System

MAGI-V0 is designed as a deliberative council of three distinct, open-source Large Language Model (LLM) cores. Each core is given the same data but operates under a unique directive, creating a system of checks and balances.

  • Balthazar | The Strategist: Its directive is to prioritize Polkadot’s long-term strategic growth, market position, and network effects (aka Polkadot must win)
  • Caspar | The Pragmatist: Its directive is to ensure the ecosystem’s short-to-medium-term health, treasury sustainability, and developer activity (aka Polkadot must thrive)
  • Melchior | The Guardian: Its directive is to focus on network security, decentralization, and long-term resilience, acting as a safeguard for Polkadot’s core principles (aka Polkadot must survive us all)
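
To make the three directives concrete, here is a minimal configuration sketch; the model identifiers and prompt wording below are illustrative placeholders, not final choices:

```python
# Sketch of the three Magi cores as configuration. Model identifiers and
# directive wording are illustrative placeholders, not final choices.
MAGI_CORES = {
    "balthazar": {
        "model": "meta-llama/llama-3.1-70b-instruct",  # placeholder model
        "directive": (
            "You are Balthazar, the Strategist. Prioritize Polkadot's "
            "long-term strategic growth, market position, and network "
            "effects. Polkadot must win."
        ),
    },
    "caspar": {
        "model": "qwen/qwen-2.5-72b-instruct",  # placeholder model
        "directive": (
            "You are Caspar, the Pragmatist. Ensure the ecosystem's "
            "short-to-medium-term health, treasury sustainability, and "
            "developer activity. Polkadot must thrive."
        ),
    },
    "melchior": {
        "model": "mistralai/mixtral-8x22b-instruct",  # placeholder model
        "directive": (
            "You are Melchior, the Guardian. Focus on network security, "
            "decentralization, and long-term resilience, safeguarding "
            "Polkadot's core principles. Polkadot must survive us all."
        ),
    },
}
```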

Technical architecture & lore

The system is built on a robust, open-source stack designed for verifiable, repeatable data pipelines:

  • Prefect manages the daily cron job that fetches, processes, and analyzes referendum data.
  • OpenRouter provides access to a variety of models, ensuring flexibility and preventing vendor lock-in.
  • IPFS is used for storing immutable evidence bundles for every vote, containing all inputs, model outputs, and cryptographic signatures.
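
As a rough illustration of how these pieces fit together, a minimal Prefect flow could look like the sketch below. The helper tasks (fetch_active_referenda, build_brief, deliberate, pin_to_ipfs) are hypothetical stand-ins for the real pipeline stages; only the @task/@flow decorators are actual Prefect API.

```python
# Minimal Prefect flow sketch for the daily run. The task bodies are
# placeholders standing in for the real pipeline stages.
from prefect import flow, task

@task(retries=2)
def fetch_active_referenda() -> list[dict]:
    # Placeholder: would pull active proposals, links, and on-chain data.
    return [{"id": 1234, "title": "Example referendum"}]

@task
def build_brief(ref: dict, context: dict) -> str:
    # Placeholder: combine referendum specifics with the general context.
    return f"Referendum {ref['id']}: {ref['title']}\nContext: {context}"

@task
def deliberate(brief: str) -> dict:
    # Placeholder: run the three cores and apply the consensus rule.
    return {"vote": "abstain", "rationale": "insufficient data (stub)"}

@task
def pin_to_ipfs(bundle: dict) -> str:
    # Placeholder: store the evidence bundle and return its CID.
    return "bafy...stub-cid"

@flow(log_prints=True)
def magi_daily_run(context: dict):
    for ref in fetch_active_referenda():
        brief = build_brief(ref, context)
        decision = deliberate(brief)
        cid = pin_to_ipfs({"brief": brief, "decision": decision})
        print(f"ref {ref['id']}: {decision['vote']} (evidence: {cid})")
```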

The deliberation process is fully automated and runs daily:

  1. The system fetches all active proposals and extracts key information (links, discussion threads, on-chain data).
  2. It compiles a general_context_vector: a curated, timestamped set of high-signal ecosystem facts (DOT price, treasury balance, recent governance outcomes, roadmap progress, mid/long-term goals, information on the proposer, …). This “context grounding” anchors the AI’s reasoning in current reality.
  3. A detailed, structured brief is then generated for each referendum, combining its specific details with the general context.
  4. The three cores independently analyze the brief. The system can perform iterative lookups if the initial data is insufficient to reach a conclusion.
    • This should eventually lead to a governance MCP server that helps facilitate the analysis (human- or LLM-driven) of every new proposal (TBD)
  5. A vote is cast on-chain based on the consensus mechanism outlined below.
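
For concreteness, the general_context_vector from step 2 could be a plain, timestamped structure along these lines (the field list is my guess at the curated facts described above, not a final schema):

```python
# Illustrative shape of the general_context_vector from step 2. The fields
# are examples of the curated, timestamped facts described above.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class GeneralContextVector:
    dot_price_usd: float
    treasury_balance_dot: float
    recent_governance_outcomes: list[str]
    roadmap_progress: str
    mid_long_term_goals: list[str]
    proposer_track_record: str
    compiled_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```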

Note: The “Magi” lore draws from the Evangelion anime series. I initially shared the idea on Twitter/X: https://x.com/KarimJDDA/status/1947352815573061796

The voting mechanism

MAGI-V0’s voting logic is designed to be conservative and signal-driven.

  • A final vote (Aye/Nay) is cast only if ≥2 of the 3 cores agree.
    • If there is a 3-way split (Aye, Nay, Abstain) or no clear majority, the system abstains.
    • If the input is ambiguous, outside the system’s scope, or deemed too complex for a high-confidence decision, the system abstains.
  • All decisions are pre-signed and timestamped, and every vote is published with a link to its IPFS evidence bundle, allowing anyone to verify the process.

Note: the system is kept at 3 LLMs purely to follow the Evangelion lore, but it could easily be scaled to N LLMs, as the sketch below illustrates.
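
The consensus rule itself is small enough to show in full. The sketch below is written for N cores, generalizing the 2-of-3 rule to an assumed two-thirds supermajority; with N=3 it reduces exactly to the behavior described above:

```python
# Consensus sketch: cast Aye/Nay only when at least ceil(2N/3) cores agree;
# otherwise abstain. With N=3 this reduces to the 2-of-3 rule above.
import math
from collections import Counter

def magi_consensus(votes: list[str]) -> str:
    """votes: one of 'aye' | 'nay' | 'abstain' per core."""
    threshold = math.ceil(2 * len(votes) / 3)
    outcome, count = Counter(votes).most_common(1)[0]
    if outcome in ("aye", "nay") and count >= threshold:
        return outcome
    # Default: 3-way split, no clear majority, or an abstain-heavy council.
    return "abstain"

assert magi_consensus(["aye", "aye", "abstain"]) == "aye"
assert magi_consensus(["aye", "nay", "abstain"]) == "abstain"
```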

Verifiable process > Opaque conviction

My personal convictions are irrelevant. The philosophy of this delegate is the process itself, built on three pillars:

  1. Each Magi core is an open-source model run with temperature=0 (or a low temperature). The models, prompts, and inference code are fully open source. The goal is output that is as deterministic as possible.
  2. Instead of relying on opaque hardware like TEEs, security rests on radical transparency.
  3. The Magi are not designed to vote on everything. Abstention is the default. Their primary function is to provide high-signal analysis on referenda where a clear, data-driven conclusion can be reached.
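
As a sketch of pillar 1, each core’s inference call can pin temperature to 0 (and pass a fixed seed, which only some backends honor) through OpenRouter’s OpenAI-compatible endpoint. The model name, key, and prompts below are placeholders:

```python
# Deterministic-as-possible inference sketch via OpenRouter's
# OpenAI-compatible endpoint. temperature=0 and a fixed seed reduce,
# but do not fully eliminate, run-to-run variation across backends.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",  # placeholder
)

def run_core(directive: str, brief: str, model: str) -> str:
    response = client.chat.completions.create(
        model=model,          # e.g. a MAGI_CORES entry from above
        temperature=0,
        seed=42,              # honored by some backends only
        messages=[
            {"role": "system", "content": directive},
            {"role": "user", "content": brief},
        ],
    )
    return response.choices[0].message.content
```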

Why this matters for Polkadot

OpenGov’s strength is its weakness: it demands immense cognitive load from the community. MAGI-V0 is an experiment to address this by asking:

  • Can we use modern tools to provide tireless, expert-level “first-pass analysis” on every referendum?
  • Can we create a delegate whose reasoning is not just explained, but also, to a degree, mechanically reproducible by anyone?
  • Can we build a bridge between high-level human sentiment (ingested from public discussions) and machine-scale analysis?
  • Can this process improve Polkadot proposals themselves? (LLM-guided proposal generation, LLM-guided ideation for “proposals Polkadot could use right now”)

Commitments & Expectations

If selected, I commit to the following on behalf of the MAGI-V0 experiment:

  • Participate in all relevant referenda according to the system’s transparent logic.
  • Accompany every vote with a link to its IPFS evidence bundle (inputs & parameters used for voting).
  • Act solely as the executor of the MAGI-V0 protocol. Any deviation would be a public breach of the experiment’s principles.
  • This is a contribution to the ecosystem: the reward is the data, the learnings, and the opportunity to advance the state of decentralized governance. No compensation is needed.

Risks, limitations & mitigations

This is an experiment and we must be clear about its limitations:

  • LLMs will certainly “hallucinate” or misinterpret context.
    • Mitigation: We employ multiple layers of defense, including Retrieval-Augmented Generation (RAG) over curated data, strict prompt engineering, the 2/3 consensus mechanism, a mix of different LLMs, and a default-to-abstain policy.
  • LLMs lack true “understanding” and rely on pattern matching.
    • Mitigation: This is a feature, not a bug. The system is a “cognitive prosthetic” and not a replacement for human judgment. It is designed to pattern-match based on expert-curated data and principles, offloading the repetitive analytical work.
    • As LLMs improve, newer models will be swapped in for older ones.
  • Perfect reproducibility across different hardware is not guaranteed.
    • Mitigation: While minor variations are possible, the IPFS bundle provides an immutable record of the actual run. Our open-source verifier script should allow anyone to audit the exact inputs, prompts, and outputs that led to a specific vote (see the sketch after this list).
  • The light track is chosen explicitly due to the alpha/experimental nature of the proposition.
  • Prompt injection or jailbreak attempts might be embedded in some proposals; I have yet to decide how the system should handle these cases.
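
To make the reproducibility mitigation concrete, here is a sketch of what the verifier could look like. The bundle layout (a manifest.json mapping filenames to SHA-256 hashes) and the gateway choice are hypothetical, not the final format:

```python
# Verifier sketch: fetch an evidence bundle from an IPFS gateway and check
# that each recorded artifact matches its SHA-256 hash in the manifest.
# The manifest.json layout is a hypothetical format, not the final one.
import hashlib
import json
import urllib.request

GATEWAY = "https://ipfs.io/ipfs"  # any public gateway works

def fetch(cid: str, path: str) -> bytes:
    with urllib.request.urlopen(f"{GATEWAY}/{cid}/{path}") as resp:
        return resp.read()

def verify_bundle(cid: str) -> bool:
    manifest = json.loads(fetch(cid, "manifest.json"))
    for filename, expected_sha256 in manifest.items():
        actual = hashlib.sha256(fetch(cid, filename)).hexdigest()
        if actual != expected_sha256:
            print(f"MISMATCH: {filename}")
            return False
    print(f"bundle {cid}: all artifacts verified")
    return True
```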

Closing thoughts

If MAGI-V0 fails, we learn. If it works, even partially, we will have shown that these new kinds of machines can serve the commons.

As LLMs become ubiquitous tools for thought and communication, their influence on the language and structure of governance proposals is inevitable, and sometimes already perceptible. Current strands of “vibe-governance” are difficult to scrutinize and difficult to identify (although the occasional em dash does make its appearance). With MAGI-V0, I want to propose an alternative: bringing this computational assistance into the open.

Who knows, perhaps the answer to vibe-governance done in isolation, is cyber-governance done in public. :face_with_monocle:

Thank you for your consideration.

23 Likes

Completely in favor,

AI is the only way to scale governance up to the next orders of magnitude of population and participation.

3 Likes

I’ve become increasingly convinced that Elon has been using AI for HR decisions as far back as 2020. I do believe it’s possible. I just hope it doesn’t take 550,000 GB200s to pull it off :smiley:

For a first pass, I was thinking along the lines of using agents to detect low-effort scams (proponents who aren’t real, etc.).

Good luck, should be interesting. I will put in a proposal for tungsten cubes. I need my happiness maximized.

3 Likes

Maybe adding a bit of cybernetics (adjusting the system’s inputs based on its past outputs) to the mix could be interesting as well.

Example: if many BD proposals have recently been funded, MAGI would require unanimous agreement of the three council members to vote AYE on the next BD proposal, to compensate, and so on.
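
A sketch of such a feedback rule could look like this (the category counts and thresholds are invented for illustration):

```python
# Sketch of a cybernetic feedback rule: the more proposals of a category
# funded recently, the stricter the consensus requirement for the next one.
# Category names and thresholds are illustrative only.
def required_ayes(category: str,
                  recently_funded: dict[str, int],
                  n_cores: int = 3) -> int:
    base = 2  # the normal 2-of-3 rule
    if recently_funded.get(category, 0) >= 5:  # e.g. many BD proposals funded
        return n_cores  # require unanimity to compensate
    return base

assert required_ayes("BD", {"BD": 7}) == 3
assert required_ayes("infrastructure", {"BD": 7}) == 2
```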

2 Likes

Looking forward to the results of this AI experiment! First as a judge of proposals, but perhaps more interestingly later as a proposer of possible directions for the protocol.

I believe AI is the reason DAOs are going to leave traditional organizations behind. Today’s DAOs are more like cooperatives that can rarely compete with traditional companies, which make decisions more efficiently and move much faster; it’s hard to scale a decentralized organization into a competitive business if everyone has a say and their own ideas on how to do things.
A DAO with an “AI boss” that, acting as an oracle, gives constant direction (in the form of proposals) based on the many data points it continuously monitors could in theory move the organization in the right direction more efficiently. With a more sci-fi future in mind, one could imagine those AIs getting so good that people give them the right to act without approval and to coordinate on-chain with each other, each a completely independent, self-owned entity (as a JAM service). :thinking:

3 Likes

In general, although I am all for experimentation, I am still quite hesitant about the use of AI in OpenGov, mainly due to concerns about the data it has access to (or lacks), as well as the choice of LLM.

This proposition, on the other hand, is a well-thought-out experiment with real potential: it has a specific focus and seems realistic enough to acknowledge that it cannot do it all.

As a first filter for proposals, it could indeed save a lot of man-hours. It is also conservative enough to abstain unless a clear majority is reached among the 3 cores. The choice of DV-Light seems like a good balance for a first experiment, and I hope we can see this live very soon!

:eyes:

3 Likes

I am interested to see the long-term data on approved proposals and whether they deliver on what was requested.

I would like to know if you have an override, not just an abstain: for example, what happens if a critical piece of information comes out after your vote has been set?

1 Like

Love the idea! At the relatively modest delegation amount of the Light track, it is well worth taking the risk and pursuing this experiment. If successful, this could well be scaled up with an increased delegation or even a dedicated LLM DV track.

1 Like

This is actually a solid thesis.

The idea of agentic workflows would, I believe, help check and balance the community.

LLMs would be able to see beyond bias, factually assessing the network effects of each proposal.

I could see this morphing into a more solid system that produces an overall ranking and rating artifact for refs, letting users wager decisions based on that and added sentiment.

1 Like

Solid outlook