Building the Future of Decentralized AI with Jam

Hi! :fire:

This entire concept was imagined based on my own ideas, with the help of artificial intelligence to structure and deepen certain technical reflections. My intention here is not to propose a definitive solution, but rather to invite everyone to open their minds and reflect together on the possibilities offered by JamChain and the new horizons that such a decentralized architecture could open for artificial intelligence and beyond.

From a technical perspective, it is important to emphasize that the approach I propose with the integration of JamChain far exceeds the ambitions and capabilities of current projects like Ethernal AI or Bittensor. While these projects are innovative, they lack both the scalability and flexibility needed to efficiently manage massive artificial intelligence models in a decentralized way. With JamChain, we are entering a new era where cryptographic security, resource optimization, and distributed management allow us to overcome these challenges, offering a technically viable, relevant, and disruptive solution for decentralized AI.

Technical challenge: Why JamChain is necessary for decentralized AI

Current blockchains, even those like Ethereum after its transition to Proof of Stake, are not capable of effectively supporting the massive computational loads required by a large-scale decentralized artificial intelligence project, such as the training and inference of large language models (LLMs). Here are the major limitations, and why JamChain presents itself as the ideal solution.

  1. Limited Scalability of Current Blockchains
    Blockchain infrastructures like Ethereum or other current smart contract systems rely on a model where each transaction must be verified and validated by all nodes in the network. This poses constraints on:
  • Parallel processing: LLM models require massive parallel computations, which are often impossible to execute efficiently on blockchains where each node must process every transaction.
  • Global state accumulation: On-chain data accumulates, creating bottlenecks for node synchronization and increasing latency, particularly for systems that must handle large volumes of data like an LLM model.

Even with Layer 2 solutions, reliance on a central layer for transaction finality remains a barrier to truly decentralized intensive computations.

  2. JamChain: An Infrastructure Optimized for Decentralized AI

In-Core / On-Chain Duality
JamChain proposes a hybrid In-Core / On-Chain approach that effectively manages heavy off-chain computations while maintaining the security and transparency of on-chain results. This approach is essential for a project based on LLM models where intensive computation and frequent model weight updates are required.

  • In-Core intensive computation: Heavy operations, such as token vectorization and model weight adjustments, can be executed off-chain, in the core. This ensures optimized resource usage, reducing the load on the blockchain and increasing performance.
  • On-Chain finality: Once computations are validated, the results (e.g., adjusted weights or validated inferences) are transferred on-chain, ensuring traceability and cryptographic security of critical information. This maintains transparency while reducing chain saturation.
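
To make this In-Core / On-Chain duality more concrete, here is a minimal, purely illustrative Python sketch: the heavy weight update happens off-chain ("in core"), and only a compact hash commitment of the result would be recorded on-chain. The helper names are hypothetical and do not correspond to any actual JamChain API.

```python
# Minimal sketch (illustrative only): heavy work happens "in core" (off-chain),
# and only a compact, verifiable commitment of the result is pushed on-chain.
# The function names (run_in_core, commitment) are hypothetical placeholders.
import hashlib
import json

def run_in_core(weights: list[float], gradients: list[float], lr: float = 0.01) -> list[float]:
    """Heavy off-chain step: apply a gradient update to the model weights."""
    return [w - lr * g for w, g in zip(weights, gradients)]

def commitment(weights: list[float], version: int) -> dict:
    """On-chain payload: only a hash of the new weights plus metadata, not the weights themselves."""
    digest = hashlib.sha256(json.dumps(weights).encode()).hexdigest()
    return {"version": version, "weights_hash": digest}

weights = [0.20, -0.10, 0.05]
gradients = [0.02, -0.01, 0.00]
new_weights = run_in_core(weights, gradients)   # in-core (off-chain) computation
record = commitment(new_weights, version=2)     # on-chain finality: small, auditable record
print(record)
```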

Massive Scalability via Parachains and Asynchronous Blocks
JamChain’s architecture relies on dedicated parachains, which allow for managing different parts of the computational process in parallel. These parachains operate independently but are secured by the central relay chain, enabling horizontal scalability and reducing processing delays.

  • Specialized parachains: By creating parachains specific to certain tasks (such as LLM training), we ensure that the network can grow as demand increases while distributing the computational load among nodes.
  • Asynchronous blocks: The use of asynchronous blocks allows for processing multiple transactions in parallel, without waiting for each block to be fully validated before moving to the next. This significantly improves transaction throughput and reduces latency, which are crucial for real-time inferences on an LLM model.
  3. Advanced Security and Cryptographic Validation
    JamChain offers validation mechanisms such as Proof of Validity, which ensure that every transaction or computation (including inferences and model adjustments) is cryptographically verified before being recorded on-chain. This model is perfectly suited to the demands of a decentralized AI system.
  • Inference validation: Inferences submitted by PoI (Proof of Inference) nodes are validated by PoC (Proof of Contribution) nodes before being integrated into the chain. This guarantees result quality and prevents the submission of incorrect or malicious data.
  • Fast and immutable finality: Once validated, results are recorded on-chain with fast finality, ensuring that all modifications and model adjustments are traceable and transparent to all network participants.

Invitation to Reflection

This JamChain-based model could truly transform how we envision decentralized AI. The current limitations of blockchains in terms of intensive computation and scalability make it impossible to implement large-scale decentralized training and inference systems.

However, with JamChain, we have the opportunity to explore a model where the decentralization of AI is not only possible but also optimized. It is an infrastructure tailored to these needs and capable of addressing the technical challenges we face today. I invite you to reflect on the possibilities offered by JamChain for decentralized AI projects and consider how this new architecture can truly change the game in the field.

Introduction :wave:

In a world where technological advancements increasingly shape our societies, artificial intelligence (AI) plays a central role in many sectors. However, the current centralization of AI presents significant challenges. Large companies dominating this field monopolize access to computational resources, the massive datasets required for model training, and the platforms that provide access to AI results.

This concentration of power raises critical questions: How can we make AI more accessible, transparent, and inclusive? How can we enable individuals, small businesses, and communities to have an active role in its development? Moreover, how can we create an open economy where every contributor, large or small, can benefit from the growth of this technology?

The project we are proposing positions itself as a solution to these issues. By combining the latest innovations in decentralized blockchain, distributed computing, and decentralized governance, our infrastructure aims to democratize access to AI. We enable all participants to contribute actively, whether by providing computing power, validating inferences, or submitting datasets, while being rewarded fairly through a circular economy based on the DOT token.

With a radically new approach, we believe that AI should not be the preserve of a small group of powerful actors. It must become a common good, where innovation, transparency, and inclusivity are at the heart of the process. Through a secure on-chain infrastructure, fair economic mechanisms, and participatory governance, our project embodies this vision of the future.

Problems :eyes:

Artificial intelligence, while promising, is currently dominated by a small number of large companies with the resources to train massive models. This centralization poses several major challenges that limit access to innovation and restrict participation in the evolution of AI.

  1. Centralization of computational power and data
    The training process for AI models, especially large language models (LLMs), requires enormous amounts of computational resources. These infrastructures, often composed of data centers equipped with specialized GPUs and TPUs, are out of reach for most individuals and small organizations. This situation creates a significant gap, where only a few entities can access the tools needed to train and exploit AI at scale.

At the same time, access to the massive datasets required for model training is also centralized. Dominant companies collect and own much larger volumes of data than are accessible to the public or small organizations. This gives these players a monopoly on AI advancements, limiting innovation to a small circle of privileged actors.

  2. Lack of transparency and governance
    The processes of training, inference, and model evaluation are often opaque. End users do not have access to the underlying data or the methods used to train the models. Moreover, it is difficult to know how inferences are generated, what biases exist in the models, or how decisions are made.

Centralized governance of these systems presents another problem. Updates, parameter adjustments, and resource management are decided by a small group of individuals without real transparency or open participation. This lack of control and participation makes it impossible to ensure the fairness and justice of decisions made by these AI systems.

  3. Economic barriers and limited participation
    Training and using AI models are expensive. Cloud computing infrastructures charge high prices for data processing and large-scale inference execution. This limits access to artificial intelligence for small organizations and individuals, making innovation and AI research inaccessible to many.

Furthermore, there are few incentive mechanisms for individuals or small organizations to contribute to model training or provide computational power. Those with modest computational resources or specific datasets have no incentive to participate in training or developing current models, as the benefits are centralized within large companies.

  4. Risk of power concentration and lack of diversity
    The control of AI by a few centralized companies creates a risk of excessive concentration of power. These entities not only control the models they train but also the ethical and technical decisions that affect millions of people. The lack of diversity in perspectives and contributions can exacerbate biases in AI models, limiting their fairness and ability to address the needs of a diverse society.

These challenges highlight the need for a decentralized infrastructure, where participation is open to everyone, and where each contributor can benefit from the growth of artificial intelligence. Our project aims to address these challenges by building a system that promotes transparency, accessibility, and equitable participation in the development of AI.

Project Vision :bulb:

  1. Decentralized Architecture for the LLM Model
    Our project is based on a hybrid architecture combining Proof of Inference (PoI) and Proof of Contribution (PoC), which enables decentralized training and inference for the language model (LLM). This approach ensures that every actor in the network can actively participate in the development of the model, whether by providing computational power, validating data, or contributing to governance.

Proof of Inference (PoI): Distributed computing and real-time inferences

PoI nodes play a central role in performing inferences and distributing the training of the LLM model. Each PoI node participates in the following tasks:

  • Token vectorization: When a dataset is submitted or when an external user request is processed, PoI nodes begin with token vectorization. This critical step transforms textual data into numerical vectors, representing semantic relationships between words and phrases. These vectors are essential for the model to process the data effectively.
  • Distributed training: PoI nodes are responsible for training specific portions of the model. By working on independent chunks of data, they generate gradients, which are then sent to the central client. The client aggregates these gradients to adjust the global weights of the model (see the sketch after this list). This distributed training process ensures that the workload is shared among multiple nodes, allowing the network to scale without relying on a single actor.
  • Real-time inferences: When a user submits a query to the AI, PoI nodes handle the token vectorization and the calculation of inference results in real-time. This ensures a fast and accurate response, allowing the AI infrastructure to remain responsive to increasing demand.
  • Open participation: The system is designed to allow any actor with computational resources to participate. Whether using GPUs, TPUs, or CPUs, each node can contribute to model training and inference according to its capabilities. This promotes the democratization of access to AI training and reduces dependence on centralized data centers.
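
As a rough illustration of the distributed training flow above, the sketch below has each PoI node compute gradients on its own data chunk, while a central client averages them and adjusts the global weights. The model, data, and function names are hypothetical placeholders, not the actual protocol.

```python
# Illustrative sketch of the distributed training flow: each PoI node computes gradients
# on its own data chunk, and a central client averages them and applies the update.
from statistics import mean

def poi_node_gradients(weights: list[float], data_chunk: list[tuple[list[float], float]]) -> list[float]:
    """One PoI node: gradient of mean squared error for a linear model on its chunk."""
    grads = [0.0] * len(weights)
    for features, target in data_chunk:
        pred = sum(w * x for w, x in zip(weights, features))
        err = pred - target
        for i, x in enumerate(features):
            grads[i] += 2 * err * x / len(data_chunk)
    return grads

def aggregate_and_apply(weights: list[float], node_grads: list[list[float]], lr: float = 0.1) -> list[float]:
    """Central client: average the gradients from all PoI nodes and adjust the global weights."""
    avg = [mean(g[i] for g in node_grads) for i in range(len(weights))]
    return [w - lr * g for w, g in zip(weights, avg)]

weights = [0.0, 0.0]
chunks = [[([1.0, 2.0], 5.0)], [([2.0, 1.0], 4.0)]]       # two PoI nodes, one sample each
grads = [poi_node_gradients(weights, c) for c in chunks]  # computed in parallel in practice
weights = aggregate_and_apply(weights, grads)
print(weights)
```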

Proof of Contribution (PoC): Validation of inferences and decentralized governance

PoC nodes ensure the validation of inferences performed by PoI nodes and actively participate in the decentralized governance of the network. Their role is essential to guarantee the data quality and transparency in managing the model.

  • Validation of inferences and datasets: Each inference executed by a PoI node is subject to a validation process by PoC nodes. They ensure that the results are correct and that the datasets submitted for training meet the required quality standards. This validation mechanism is crucial to maintain the integrity and reliability of the LLM model.
  • Submission and validation of datasets: PoC nodes can also submit datasets for model training. Before being used, these datasets are validated by other PoC nodes, ensuring that only relevant and reliable data are integrated into the training process. This helps avoid the introduction of bias or inappropriate data.
  • Participation in decentralized governance: In addition to validating inferences, PoC nodes are responsible for the governance of the network. They participate in strategic decisions, such as model updates, economic adjustments, or network improvement proposals. Each node has a voting power proportional to its contribution, ensuring that network control remains shared among participants.
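
The validation role of PoC nodes can be illustrated with a small sketch: an inference result is only accepted if a supermajority of PoC validators report the same result. The two-thirds threshold is an assumed example parameter, not a defined JamChain rule.

```python
# Hypothetical sketch of the PoC validation step: an inference result is accepted
# only if a supermajority (2/3 here, an assumed threshold) of PoC validators agree with it.
from collections import Counter

def validate_inference(votes: dict[str, str], threshold: float = 2 / 3) -> str | None:
    """votes maps a PoC node id to the result hash it computed/verified.
    Returns the accepted result hash, or None if no supermajority is reached."""
    if not votes:
        return None
    result, count = Counter(votes.values()).most_common(1)[0]
    return result if count / len(votes) >= threshold else None

votes = {"poc-1": "0xabc", "poc-2": "0xabc", "poc-3": "0xdef"}
print(validate_inference(votes))   # "0xabc": 2 of 3 validators agree
```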

2. Separation of the Original Model Training

One of the major challenges in developing an LLM model is ensuring that the original model remains intact and high-quality, while allowing it to be adapted to specific cases or specialized data.

To achieve this, we propose several mechanisms for separating the training phases:

Fine-tuning and Version Management

Fine-tuning is a process where the base model is used to train specific layers without compromising the integrity of the original model. This allows us to:

  • Freeze the original model: Only certain layers of the model are adjusted, while the fundamental layers remain intact.
  • Adapt the model to specific tasks: New data can be integrated to adjust the model’s parameters for specific tasks without affecting its overall capabilities.
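
As a minimal sketch of the freezing idea, assuming a PyTorch-style setup with a hypothetical model structure, only a small task-specific head is trained while the base layers keep their original weights:

```python
# Minimal PyTorch sketch of the freezing idea: the base layers keep their weights,
# and only a small task-specific head is trained. The model structure is hypothetical.
import torch
import torch.nn as nn

base = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 64), nn.ReLU())  # "original model" layers
head = nn.Linear(64, 2)                                                      # new task-specific layer

for param in base.parameters():
    param.requires_grad = False          # freeze the original model

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # only the head is optimized

tokens = torch.randint(0, 1000, (8, 16))                  # dummy batch of token ids
labels = torch.randint(0, 2, (8,))
logits = head(base(tokens).mean(dim=1))                   # pooled representation -> task logits
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                                            # gradients flow only into the head
optimizer.step()
```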

Checkpoints and Versioning

Each phase of the model’s training is versioned through checkpoints, allowing us to clearly distinguish between different stages of the model’s evolution. These checkpoints are stored on-chain, providing complete traceability of the adjustments made to the model.

  • Training from specific versions: Inferences or adjustments can be made on specific versions of the model, ensuring that the original model is never compromised during specialized training phases.
  • Comparison between versions: Thanks to the checkpoints, it is possible to compare the performance of different versions of the model, ensuring that each adjustment brings a measurable improvement.
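
A simple way to picture the checkpoint mechanism is a registry that maps each version number to a hash of its weights; that hash is the kind of compact record that could be anchored on-chain. This is a local illustration only, with made-up values and no JamChain API implied.

```python
# Illustrative checkpoint registry: each training stage is versioned by the hash of its weights.
import hashlib, json

class CheckpointRegistry:
    def __init__(self):
        self.versions: dict[int, str] = {}

    def register(self, version: int, weights: list[float]) -> str:
        digest = hashlib.sha256(json.dumps(weights).encode()).hexdigest()
        self.versions[version] = digest
        return digest

    def verify(self, version: int, weights: list[float]) -> bool:
        """Check that a set of weights matches the recorded checkpoint for that version."""
        digest = hashlib.sha256(json.dumps(weights).encode()).hexdigest()
        return self.versions.get(version) == digest

registry = CheckpointRegistry()
registry.register(1, [0.2, -0.1, 0.05])          # base model checkpoint
registry.register(2, [0.21, -0.11, 0.05])        # after a fine-tuning stage
print(registry.verify(1, [0.2, -0.1, 0.05]))     # True: the original model is untouched
```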

Multi-task Learning and Model Distillation

To ensure that the original model can be used in different contexts, we adopt a multi-task learning approach, where the base model is shared across several distinct tasks while maintaining the independence of the results.

  • Model distillation: The results of the original model can also be used to train simpler specialized models without altering the base model. This distillation simplifies the model for specific applications while retaining the general knowledge acquired during the initial training.
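
For the distillation step, a common formulation (shown here as an assumed example, not a prescribed method) trains the student to match the softened output distribution of the frozen teacher:

```python
# Sketch of the distillation step: a smaller "student" is trained to match the
# soft outputs of the frozen "teacher" (the original model). The temperature is illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL divergence between softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

teacher_logits = torch.tensor([[4.0, 1.0, 0.5]])     # from the frozen base model
student_logits = torch.tensor([[2.5, 1.2, 0.8]])     # from the smaller specialized model
print(distillation_loss(student_logits, teacher_logits))
```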

3. On-Chain Infrastructure: Security and Transparency

Our project relies on a fully on-chain infrastructure to ensure maximum security, complete transparency, and efficient resource management. By leveraging JamChain, an innovative blockchain from the Polkadot ecosystem, we have designed a framework where model training operations, inference validation, and dataset management are conducted directly on-chain, without relying on external centralized solutions.

JamChain: A blockchain optimized for artificial intelligence

JamChain is designed to handle intensive on-chain computations, making it ideal for executing language models such as LLMs. PoI and PoC nodes utilize this infrastructure to perform complex computational tasks while remaining decentralized and transparent.

  • On-chain computational tasks: All inferences and adjustments to model weights are performed directly on the blockchain. This ensures that every decision made by the model is traceable and can be audited at any time. This level of transparency is impossible to achieve with traditional centralized infrastructures.
  • Intermittent blocks for optimal efficiency: Instead of storing all data continuously, JamChain uses intermittent blocks to integrate only critical data such as model parameters, validated datasets, and adjusted weights. This reduces the load on the chain while ensuring that essential information is stored immutably and securely.

On-chain storage of datasets and parameters

In our architecture, validated datasets and model parameters are stored on-chain. This choice ensures total transparency regarding how data is processed and used for model training. Additionally, on-chain storage guarantees the longevity of the data while minimizing the risks of manipulation or corruption of the information.

  • Data security and immutability: Each validated dataset and weight adjustment is immutable once recorded in a block. This immutability protects against any unauthorized modification or attempts to manipulate the data used for model training.
  • Efficient parameter management: The adjusted weights and gradients from training are also stored on-chain. This allows for complete traceability of model versions while facilitating the evolution of its performance. Each training stage is documented, providing a clear view of the model’s progression.

Transparency and traceability via JamChain

Complete decentralization of operations allows tracking of every step in the model training and inference process. The blockchain ensures that decisions and adjustments are public and accessible to any participant, ensuring unmatched transparency.

  • Inference traceability: Every inference performed by PoI nodes is recorded on-chain, allowing any network participant to verify the results, the methodology used, and the adjustments made to the model.
  • Model auditability: Model parameter adjustments, dataset submissions, and validations performed by PoC nodes are all audited in real-time through JamChain. This ensures that the network remains impartial and equitable, with every decision visible to all.

4. Circular Economic Model Based on the DOT Token

The project’s economic infrastructure relies on the use of the DOT token, which is central to our circular economic model. This token ensures both the rewards for contributions made by PoI and PoC nodes and allows external users to access AI services smoothly and fairly. Through this approach, we create a sustainable ecosystem where contributions are rewarded and users benefit from value-added AI services.

Rewards for Contributions

PoI and PoC nodes play a fundamental role in the network’s operation and are rewarded in DOT based on their contributions.

  • Rewards for PoI nodes: PoI nodes, which provide computational power to execute inferences and train the model, are rewarded proportionally to their participation. Rewards are calculated based on computation time and the quality of results obtained during inferences and training.
  • Rewards for PoC nodes: PoC nodes, which validate inferences and datasets, are also rewarded in DOT. Their role is crucial in ensuring the quality and reliability of data, and these rewards encourage active participation and high-quality contributions to the network.

Payments for AI Services

The DOT token is used by external users to access the services provided by the AI. Whether for executing inferences, accessing complex analyses, or utilizing advanced services, users pay in DOT for each interaction.

  • Constant demand for DOT: By paying to access AI services, external users and businesses generate a constant demand for DOT. This drives the project’s economy, creating a virtuous cycle where users fund the network, and the nodes that contribute to its functioning are rewarded.
  • Premium AI services: Businesses can access premium services (in-depth analysis, specific data processing, etc.) by paying an additional amount in DOT. This adds a level of customization while strengthening the network’s economy.

Circular Economy and Redistribution

The project’s economic model is based on a virtuous cycle of token redistribution, ensuring both economic sustainability and incentives for participation.

  • Fair reward redistribution: The DOT collected from users is redistributed to PoI and PoC nodes based on their respective contributions. This model ensures that every participant in the network benefits from AI growth while maintaining a continuous incentive to actively participate.
  • Sustainable ecosystem growth: By ensuring constant demand for AI services and reinvesting a portion of the DOT through buybacks, the project’s economy is designed to grow in a stable and sustainable manner. Every user, whether contributor or consumer, plays a role in driving the ecosystem’s dynamic growth.
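
To make the redistribution cycle tangible, here is a worked sketch with made-up numbers: user payments form a pool, an assumed fraction is reserved for buybacks, and the rest is split across nodes in proportion to their contribution scores.

```python
# Worked sketch of the redistribution cycle. All percentages and scores are example values.
def redistribute(pool_dot: float, contributions: dict[str, float], buyback_rate: float = 0.10) -> dict[str, float]:
    buyback = pool_dot * buyback_rate
    distributable = pool_dot - buyback
    total = sum(contributions.values())
    payouts = {node: distributable * score / total for node, score in contributions.items()}
    payouts["buyback_reserve"] = buyback
    return payouts

contributions = {"poi-1": 50.0, "poi-2": 30.0, "poc-1": 20.0}   # e.g. compute time / validation counts
print(redistribute(pool_dot=1_000.0, contributions=contributions))
# poi-1: 450 DOT, poi-2: 270 DOT, poc-1: 180 DOT, buyback_reserve: 100 DOT
```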

5. Decentralized Governance and Active Participation

Decentralized governance is a central element of our project, ensuring that strategic decisions about the network are not in the hands of a small group of actors but are shared among all participants. Through an on-chain voting structure and the active participation of PoI and PoC nodes, each community member can influence the model’s future, its updates, and its economic adjustments.

On-Chain Voting Process

In our system, all important decisions regarding the network, model training, and economic management are made collectively through a fully decentralized voting process. Each participating node, whether PoI or PoC, has voting rights proportional to its contribution and can propose or vote on changes within the network.

  • Participation of PoC nodes: PoC nodes play a key role in the governance process. By validating inferences and submitting datasets, they earn voting power and can influence decisions made within the network. This includes issues such as model updates, parameter adjustments, or changes in the reward structure.
  • Open and transparent proposals: Every participant can propose modifications or improvements to the model or infrastructure. These proposals are then submitted to an on-chain vote via OpenGov, ensuring that all decisions are public and transparent. The result of each vote is recorded and accessible to all participants.

Transparency and Inclusion

The governance process is designed to be inclusive and ensure that every participant in the network can play an active role. Whether a PoI node contributing computing power or a PoC node validating inferences, each has the opportunity to participate in the network’s strategic decisions.

  • Votes proportional to contributions: Voting rights are allocated based on the actual contribution of nodes to the network. PoI nodes that execute more inferences or computations, or PoC nodes that validate more data, see their voting power increase proportionally. This ensures that decisions are made by those who actively contribute to the network’s growth.
  • Decentralized control of the model: Every aspect of the model (weight adjustments, parameter changes, or the integration of new datasets) is subject to voting, ensuring that the model remains under community control. This mechanism prevents any form of power centralization, making the network resilient and adaptable to changes.
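
The contribution-weighted voting described in this section could look roughly like the following sketch, where each node's voting power is its share of total recorded contribution and the pass threshold is an assumed parameter:

```python
# Hypothetical sketch of contribution-weighted voting.
def tally(contributions: dict[str, float], votes: dict[str, bool], pass_threshold: float = 0.5) -> bool:
    total = sum(contributions.values())
    weight_for = sum(contributions[n] for n, in_favor in votes.items() if in_favor)
    return weight_for / total > pass_threshold

contributions = {"poc-1": 40.0, "poc-2": 35.0, "poi-1": 25.0}
votes = {"poc-1": True, "poc-2": False, "poi-1": True}
print(tally(contributions, votes))   # True: 65% of contribution-weighted power voted in favor
```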

Economic Regulation Mechanisms

Governance also plays a key role in the network’s economic management. Decisions on rewards, staking rate adjustments, or the buyback schedule are all subject to the voting process. This helps maintain economic balance and adjust the network dynamics to meet the community’s evolving needs.

  • Reward adjustments: Nodes can vote to readjust the rewards allocated based on contributions. For instance, if more nodes join the network, rewards can be distributed more evenly to ensure fair compensation.
  • Economic parameter modifications: Staking rates or reward caps can be adjusted to maintain the network’s economic stability and respond to fluctuations in demand for AI services.

6. Security and Scalability

The decentralized infrastructure of our project is based on robust security mechanisms and flexible scalability. The goal is to ensure that the network can grow while maintaining high security standards to protect user data, inference results, and training processes.

Data and Transaction Security

Data security is at the core of our infrastructure. Every operation, whether it involves dataset submission, inference, or parameter adjustment, is protected by an on-chain cryptographic system.

  • Data immutability: All critical data (validated datasets, model weights, validated inferences) is stored immutably on-chain. Once recorded, this data cannot be modified, ensuring its integrity and protection against any attempts at tampering.
  • Protection against malicious behavior: The PoI and PoC consensus model prevents nodes from submitting malicious data or inferences. Nodes attempting to manipulate the system are quickly identified through decentralized validation mechanisms and can be excluded or penalized.
  • Process auditability: The network ensures that all transactions and operations are traceable and auditable. Every inference, adjustment, or dataset submission is associated with a verifiable history, allowing network participants to verify the legitimacy of the actions taken.

Performance Optimization and Distributed Computing

To ensure the network’s scalability, we have implemented a distributed computing architecture, which allows each node to contribute to model training and inference. This approach ensures that the network can grow exponentially while maintaining optimal performance.

  • Distributed computing via PoI nodes: Training and inference tasks are distributed among PoI nodes, enabling the processing of large volumes of data without overloading a single node. This approach distributes the workload equitably and ensures high efficiency in data processing.
  • Intermittent blocks to minimize on-chain overload: By using intermittent blocks, the network can integrate only the critical data necessary for the model’s operation. This minimizes on-chain resource consumption while allowing for efficient management of parameters and inference results.

Scalability and Resource Management

The modular design of the infrastructure allows the network to scale as new nodes join the system. Whether managing an increasing number of datasets, performing more inferences, or meeting greater demand for AI services, the architecture is designed to adapt to changing needs.

  • Horizontal scalability: The network can grow by simply adding more PoI or PoC nodes, allowing for balanced workload distribution. The more nodes there are, the more powerful and responsive the network becomes.
  • Flexible resource management: Staking and reward distribution mechanisms dynamically adjust node contributions based on the network’s needs. This ensures optimized resource usage and automatic adaptation to the workload.

Conclusion: A Vision for the Future of Decentralized AI :dart:

The project we propose represents a significant step toward a decentralized, accessible, and democratized artificial intelligence. By relying on innovative consensus mechanisms like Proof of Inference (PoI) and Proof of Contribution (PoC), we are creating an open infrastructure where each participant can play an essential role in training and inference for LLM models.

The use of JamChain and on-chain technologies ensures not only maximum security and complete transparency, but also a circular economy powered by the DOT token. Every interaction with the AI model strengthens the network, every contribution is rewarded, and users can access AI services in an equitable and inclusive manner.

By offering decentralized governance, where each participant can influence the model’s future, and enabling exponential scalability through a distributed computing architecture, we are laying the foundation for a system that is not only resilient, but also capable of growing sustainably while adhering to the fundamental principles of decentralization.

Artificial intelligence should not be confined to large corporations or centralized platforms. With this project, we are taking a step toward open AI, where innovation is the result of global collaboration and the benefits are shared fairly among all contributors. By creating a decentralized AI infrastructure, we offer users, businesses, and developers the opportunity to actively participate in building the future of AI in a responsible and transparent manner.

In conclusion, our project’s vision is not only a solution to the current issues of AI centralization but a technological revolution that will allow everyone to have an impact on the development of tomorrow’s AI models. With a solid infrastructure, a sustainable economy, and inclusive governance, we believe this project has the potential to transform the AI landscape for years to come.

This concept is truly an idea I came up with through brainstorming with AI, resulting in a technically advanced approach that could be viable. The core idea here is that such an initiative is, in my opinion, not only obvious but necessary for everyone: it addresses the growing need for decentralization and for shared control over the global datasets behind a worldwide AI. This vision is ambitious, but let's be clear: it is not unrealistic. At present, we are undeniably limited by the power and architecture of existing blockchain infrastructures. I'm open to discussion, and while the post above is somewhat lengthy, for those who want to dig deeper, I believe this is something we should truly consider and reflect upon.


I, like many, have been thinking about how to merge AI and web3.

I agree with the benefits of the approach but there are a few areas where I am not convinced:

  1. Large-scale training

    • LST is mostly limited by network bandwidth, not by CPU/GPU. It is extremely inefficient to train across datacenters, and performance would degrade further if you were to train across a network of loosely coupled machines. There have been recent improvements in distributing the workload across machines (see [2204.02311] PaLM: Scaling Language Modeling with Pathways from 2022, and some recent success with INTELLECT-1: Launching the First Decentralized Training of a 10B Parameter Model in 2024). The cost of training will be higher on a blockchain, and it will use more power than a centralised solution.
    • For training, you need large datacenter-class GPUs (not gaming ones like the 4090), which are expensive (>10k or >20k each), and you usually put 8 per box. You will need to think carefully about how to incentivize people to buy and run this hardware. It is also possible that the hardware is simply hard to buy (selling small quantities of high-end cards is not the best business for NVidia and friends). These machines need power, cooling, and network bandwidth that you cannot have at home, and they will need to be decentralized as well.
    • The amount of data for these training runs is staggering. IPFS is cheap but slow: how do you plan to copy the data from IPFS to the GPUs? Where would you store the checkpoints (they are large)?
  2. Fine-tuning is a better use case. It can usually run on one large 8-GPU machine, or possibly a small cluster of them (±8).

  3. Inference: maybe.

    • Would people pay N x the cost to get the guarantees coming from web3? If yes, then that's an excellent use case. For example, people who want to use AI inference to manage their portfolio would pay a premium to be sure of exactly what the model is doing and which model is executed. Batch inference should work well; for real-time inference, it depends on the latency you can get, and if you go on-chain for each request it is unlikely to be fast enough.
    • Same issue with the hardware: GPUs with 24 GB of RAM are common enough, but larger ones definitely give better answers.

Overall: AI will be more expensive on Web3. Will customers pay for the difference?


Thank you very much for your thoughtful and pertinent comment! :heart:

You’ve raised crucial points, particularly on the challenges of large-scale training, hardware costs, network bandwidth, and real-time inference latencies in a decentralized environment. These concerns are essential for any AI-Web3 integration. I’ll address each point by proposing technical solutions and insights based on the latest advancements in this field.

Problem 1: Large Scale Training (LST) and Network Bandwidth

As you mentioned, large-scale training of language models is limited by network bandwidth, not by the CPU/GPU itself. This poses a significant challenge in a decentralized environment where machines are geographically dispersed, and each node may be far from the rest of the network. These network latencies can quickly become a bottleneck for efficiency.

Proposed Solutions:

  • Gradient Quantization: To reduce bandwidth needs, some solutions, like those used in projects such as INTELLECT-1, apply gradient quantization techniques (reducing the number of bits needed to represent the data). This approach reduces the amount of information exchanged between nodes while maintaining the model’s efficiency. By applying this, network congestion during large-scale training can be alleviated (a minimal sketch follows after this list).

  • RoCEv2 (RDMA over Converged Ethernet v2): Another critical aspect is optimizing communication between GPUs via the network. RoCEv2 helps bypass bottlenecks by enabling direct communication between GPU memory and the network, reducing the latency caused by transfers between CPU and GPU. This technology has proven effective in large-scale centralized environments, such as those used by Meta and Google, and could be adapted for decentralized systems if we develop a governance model around these technologies.
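
Here is the minimal gradient quantization sketch referenced above: gradients are mapped to 8-bit integers before being sent over the network and de-quantized on the receiving side. This is a generic illustration, not the exact scheme used by INTELLECT-1.

```python
# Minimal sketch of gradient quantization: 1 byte per value on the wire instead of 4.
import numpy as np

def quantize(grads: np.ndarray) -> tuple[np.ndarray, float]:
    scale = np.max(np.abs(grads)) / 127 or 1.0          # per-tensor scale factor
    return np.round(grads / scale).astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

grads = np.array([0.031, -0.002, 0.017, -0.025], dtype=np.float32)
q, scale = quantize(grads)          # this is what would be sent between nodes
recovered = dequantize(q, scale)
print(q, recovered)                 # small rounding error, roughly 4x less bandwidth
```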

Conclusion to Problem 1:

While bandwidth is a significant challenge for large-scale training in a decentralized environment, solutions like gradient quantization and advanced network technologies like RoCEv2 can help mitigate these limitations. However, this would require a specific infrastructure, which in a decentralized context, could still pose economic and technical challenges.

Problem 2: Specialized GPU Infrastructure and Associated Costs

You rightly pointed out that training large-scale models requires specialized GPUs such as A100 or H100, which are extremely expensive (over $10,000 per GPU) and often difficult to obtain. These machines also require specific infrastructure for cooling, power, and bandwidth, making them unsuitable for home or low-cost decentralized environments.

Proposed Solutions:

  • Economic Incentives for GPU Contributors: In a decentralized infrastructure, strong financial incentives are crucial to motivate participants to invest in such expensive infrastructures. A possible model could consist of a reward system proportional to computing contribution. For example, nodes providing high-performance GPUs for training or inference could be rewarded in tokens, proportional to the resource usage. This active participation model is essential for encouraging the purchase and operation of these expensive infrastructures.

  • Load Distribution via Swarm Parallelism: Another approach is to use methods like Swarm parallelism, which allows the training load to be distributed across multiple heterogeneous machines, thus reducing the load on each individual node while maintaining the network’s overall efficiency. This technique has been tested in projects like INTELLECT-1 and could facilitate decentralized training without requiring hundreds of specialized GPUs per node.

Conclusion to Problem 2:

Although the required GPU infrastructure for large-scale model training is expensive, economic reward mechanisms and distributed computing approaches like Swarm parallelism offer viable solutions. These methods would reduce the participation costs for contributors while distributing the load among multiple actors.

Problem 3: Storage of Datasets and Checkpoints

The storage of massive datasets and intermediate checkpoints is another major challenge in a decentralized AI context. As you noted, solutions like IPFS are too slow to meet the real-time needs of large-scale language model training. Moreover, checkpoints themselves can be extremely large and require fast and secure access to avoid slowdowns.

Proposed Solutions:

  • Hybrid On-chain/Off-chain Model: An efficient solution to meet storage and data processing needs would be to adopt a hybrid model. In this model, critical datasets and checkpoints would be stored off-chain in centralized or semi-centralized data centers, while validated results are recorded on-chain to ensure traceability and transparency. This would reduce the load on the blockchain while retaining the advantages of decentralized and secure validation. This model leverages on-chain verifiability while providing faster access to large volumes of data.

  • RAM-backed Storage Systems: Inspired by temporary storage systems like those tested in the INTELLECT-1 project, this approach uses RAM-backed systems to store intermediate checkpoints during model training. This provides very fast access to data, avoiding bottlenecks caused by slower storage solutions like IPFS. Since RAM is significantly faster than traditional disks, it can support critical tasks during training while reducing the latencies caused by data transfers.

  • Storage via Irys (formerly Bundlr on Arweave): With the recent evolution of Bundlr into Irys, this solution benefits from improved scalability and processing speed through faster transactions on Arweave. Using Irys, datasets and checkpoints can be stored permanently and in a decentralized manner, while remaining accessible. Although still limited in terms of performance for real-time needs, the improvements brought by Bundlr/Irys can address some use cases that require better data distribution and decentralized accessibility.
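
To illustrate the hybrid storage idea in the simplest terms: the dataset bytes live off-chain (IPFS, Irys, or a data center), while only a content hash and a location pointer are recorded on-chain, so anyone fetching the data can verify it has not been tampered with. The record format below is hypothetical.

```python
# Sketch of the hybrid storage idea: data off-chain, integrity commitment on-chain.
import hashlib

def make_onchain_record(dataset_bytes: bytes, location: str) -> dict:
    """The small record that would be anchored on-chain."""
    return {"location": location, "sha256": hashlib.sha256(dataset_bytes).hexdigest()}

def verify_fetched(dataset_bytes: bytes, record: dict) -> bool:
    """Anyone fetching the off-chain copy can check it against the on-chain hash."""
    return hashlib.sha256(dataset_bytes).hexdigest() == record["sha256"]

data = b"example training shard ..."
record = make_onchain_record(data, location="irys://<dataset-id>")   # hypothetical pointer
print(verify_fetched(data, record))                                   # True if the copy is intact
```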

Conclusion to Problem 3:

While decentralized storage still presents challenges, adopting a hybrid model combining off-chain storage and on-chain validation remains a viable approach to balance performance and security. RAM-backed systems offer increased performance for intermediate processing, while Irys (formerly Bundlr) and other technologies like Arweave ensure long-term persistence and security for large-scale storage. These technologies help reduce latency and guarantee the accessibility of critical data for model training.

Problem 4: Latency for Real-Time Inferences

As you correctly mentioned, real-time inference in a decentralized environment poses significant challenges, especially due to the latency induced by on-chain transactions. If every request has to pass through the blockchain for validation, it can quickly slow down response times, which is unacceptable in critical use cases like real-time trading or rapid interactions with AI.

Proposed Solutions:

  • Batch Inference: A common approach is to group multiple inference requests into “batches” and process them simultaneously. This method optimizes resource usage and reduces network load. Batch inference is well-suited to cases where immediate responses are not required, such as in bulk processing or post-event data analysis.

  • Hybrid On/Off-Chain Model: Rather than validating each inference request on-chain, a hybrid solution could be implemented where the inferences themselves are performed off-chain, and only critical verifications (such as the hash of the model used and the final result) are recorded on-chain to ensure traceability and transparency. This would reduce latency while maintaining the security guarantees of a decentralized infrastructure.

  • Advanced Network Technologies (RoCEv2): By using network technologies like RoCEv2, it’s possible to significantly reduce the latency between nodes during inference. RoCEv2 allows results to be transmitted directly via Ethernet, bypassing processors, thus reducing the communication delays between GPUs and speeding up inference, even in distributed environments.
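
Combining the batching and hybrid ideas above, a sketch of the flow might look like this: requests are batched, inference runs off-chain, and only a compact commitment (model hash plus batch and result hashes) would be anchored on-chain for auditing. The model call and hashing scheme are placeholders.

```python
# Sketch of the hybrid On/Off-Chain inference flow: off-chain execution, on-chain commitment.
import hashlib, json

MODEL_HASH = "0x" + hashlib.sha256(b"model-weights-v2").hexdigest()

def run_model(prompt: str) -> str:
    return prompt.upper()                    # stand-in for the real off-chain LLM call

def process_batch(prompts: list[str]) -> dict:
    results = [run_model(p) for p in prompts]   # off-chain, low latency
    digest = hashlib.sha256(json.dumps([MODEL_HASH, prompts, results]).encode()).hexdigest()
    return {"results": results, "onchain_commitment": {"model": MODEL_HASH, "batch_hash": digest}}

print(process_batch(["hello", "what is jam?"]))
```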

Conclusion to Problem 4:

Real-time inference remains a challenge in a decentralized environment, but solutions like batch inference and advanced network technologies can reduce latency. The hybrid On/Off-Chain model seems to be a realistic solution to ensure both optimal performance and the transparency of the blockchain.

Problem 5: Economic Incentive Mechanisms

One of the most delicate points is how to create economic incentives to motivate participants to provide expensive resources like high-performance GPUs. These resources are essential for AI training and inference, but they require significant investments, both in terms of hardware and infrastructure (power, cooling, etc.).

Proposed Solutions:

  • Rewards Proportional to Contribution: In a decentralized model, each contributor must be rewarded based on their actual participation. For example, nodes that provide GPUs for training or inference could be compensated in tokens proportionate to the amount of computation performed. This incentive model would ensure that participants with high-performance GPUs are fairly rewarded.

  • Staking and Long-Term Engagement: Another incentive mechanism is to allow contributors to stake tokens to access certain rewards or advantages. Those who stake a certain number of tokens could access more profitable computing opportunities or larger shares of the profits. This would create long-term incentives and encourage network stability.

  • Peer Validation to Avoid Abuse: To ensure the quality of contributions, a peer validation system (Proof of Contribution - PoC) could be implemented. PoC nodes would be responsible for verifying the quality of submitted inferences and datasets. This model would prevent malicious or opportunistic behavior, where users submit unnecessary data in hopes of gaining rewards without making real contributions. Nodes that validate quality data would also be rewarded in tokens.

  • Occasional Resource Usage: Another strategy is to allow contributors to provide their resources occasionally when they’re not being fully utilized. This would optimize the use of available GPUs while reducing costs for end-users. Such a market could be integrated into the network to allow contributors to monetize their resources flexibly.

Conclusion to Problem 5:

Economic incentive mechanisms are essential to ensure the viability of the decentralized network. Solutions like reward distribution based on contribution, staking, and peer validation seem to be effective approaches to encourage active and quality participation, while avoiding abuse.

:dart: Conclusion

In conclusion, if no one takes the initiative to create truly decentralized artificial intelligence and web infrastructures, we risk ending up with a web that is far more centralized than we imagine. Large corporations will continue to monopolize resources and data, further tightening their control over the systems that shape our digital future.

While current technical limitations are a significant obstacle, I firmly believe there are hardware solutions and innovative approaches to overcome them. If we dare to explore these possibilities, we could lay the foundation for a new era where AI and the web are genuinely decentralized, accessible to everyone, and safeguarded from the dangers of centralization.

Additional insights and technical perspectives

1. TensorOpera and Aethir: Distributed GPU solutions

One of the key challenges we discussed is the need to decentralize access to high-end GPUs for large-scale training. The partnership between TensorOpera and Aethir introduces an innovative solution by distributing GPU resources through an edge computing network. By bringing GPU resources closer to end users and optimizing workload distribution, this reduces latency while increasing computational capacity. TensorOpera and Aethir Partner to Advance Massive-Scale LLM Training on Decentralized Cloud.

2. Neurosymbolic AI: Improving inferences while reducing data requirements

Another experimental approach worth exploring is neurosymbolic AI, a hybrid technology combining deep learning with symbolic reasoning techniques to improve complex problem-solving capabilities. Integrating this type of AI into our decentralized architecture could reduce the massive data requirements typically needed for training, while also boosting inference efficiency. Experimental AI Technologies 2024 | Restackio.

3. Federated Learning for distributed model training

Federated Learning is another promising technology for our project. It enables AI model training on distributed datasets without the need to centralize data, protecting data privacy while benefiting from collective learning. Experimental AI Technologies 2024 | Restackio.

Decentralized AI Ecosystem with JamChain

PoI and PoC Nodes

PoI (Proof of Inference) Nodes and PoC (Proof of Contribution) Nodes are the two types of nodes actively participating in the network to provide computing power, validate inferences, submit datasets, and contribute to governance.

  • A1: PoI Node – Layer 1
    The PoI Node provides the necessary computing power for real-time inferences of the large language model (LLM) on Layer 1. These nodes actively participate in inference collection.

  • A2: PoI Node – Layer 1 Training
    These nodes are responsible for training the LLM model on datasets via Swarm Parallelism. The data is processed in parallel to maximize efficiency and reduce training time.

  • A3: PoC Node – Validation
    PoC Nodes validate the inferences on Layer 1 and participate in validating model updates. They provide cryptographic proof that the inferences have been properly executed and validated.

  • A4: PoI Node – Layer 2 Fine-tuning
    PoI Nodes on Layer 2 perform fine-tuning tasks for the model using techniques like Federated Learning. This allows continuous improvement of the model based on additional data.

  • A5: PoC Node – Layer 2 Validation
    These nodes validate the fine-tuning adjustments and inference results performed on Layer 2 using RAG (Retrieval-Augmented Generation). Governance related to validation is handled through a decentralized process.

  • A6: PoI Node – Layer 3 Model Training
    PoI Nodes on Layer 3 participate in training a new model based on either reused or new datasets. This allows the creation of new iterations of the AI model.

  • A7: PoC Node – Layer 3 Validation
    These nodes validate the new model iterations on Layer 3 and propose rare updates to the Layer 1 model after governance validation.


Central Client Layer 1

The Central Client Layer 1 manages the centralized operations for Layer 1.

  • B: Central Client for Layer 1
    This component collects inferences and gradients from the LLM model on Layer 1. It ensures that all inferences are processed and gradients are properly collected for model weight adjustment.

  • G1: Collect and Apply Updates
    Once gradients are collected, model weights are adjusted using cryptographic proof to ensure that the changes are valid and secure.

  • H: On-chain Updates
    This component updates the model parameters directly on-chain, ensuring that all updates are public and immutable.

  • I: Deploy Model for Real-time Inference
    Once the model is updated, it is deployed to be used in real-time by users.

Central Client Layer 2

The Central Client Layer 2 handles model fine-tuning tasks and the application of RAG techniques.

  • J1: Collect Fine-tuned Gradients
    This component collects gradients resulting from fine-tuning on Layer 2.

  • K1: Apply Weight Updates
    After collecting gradients, model weights are adjusted based on the new inferences.

  • L1: On-chain Model Update for Layer 2
    The adjustments to the model resulting from fine-tuning tasks are applied on-chain to ensure integrity and traceability of the changes.

Central Client Layer 3 (Integrated into Layer 1)

The Central Client Layer 3 is now integrated within the Central Client Layer 1. It manages the training of new models and rare proposals for updating Layer 1.

  • P1: Collects Gradients for Layer 3
    This component collects gradients from model training on Layer 3.

  • Q1: Apply Layer 3 Updates
    Updates are applied to the Layer 3 model, allowing the management of a parallel or future iteration of the model.

  • Layer 1 Update Proposal from Layer 3
    A connection from P1 to B indicates that in rare cases, model update proposals from Layer 3 can be suggested for Layer 1 and validated through governance.

Rare Updates from Layer 3 to Layer 1 via Governance

Rare model updates for Layer 1 can be proposed based on new models trained on Layer 3.

  • A7: PoC Node Validation for Layer 3
    PoC Nodes validate new models on Layer 3 and, if deemed sufficient, can propose updates for Layer 1.

  • G1: Governance Decision for Update
    A governance decision is required to approve any rare update coming from Layer 3 before it affects Layer 1.

User Interactions and Circular Economy via DOT

The economic model of this ecosystem relies on user interactions with the AI model via the DOT token.

  • F1: Regular Users
    Regular users pay in DOT to access the AI model on Layer 1.

  • F2: Enterprise Users
    Enterprise users pay in DOT for advanced services on Layer 2, such as complex analysis or fine-tuning.

  • F3: Specialized Users
    Specialized users pay in DOT to access newly trained models on Layer 3.

DOT Token Utilization and Redistribution

The DOT system allows the collection of user payments and redistribution of rewards to contributing nodes (PoI and PoC).

  • DOT Payments Collection
    DOT payments made by users are collected at each level (Layer 1, Layer 2, Layer 3).

  • Redistribution to PoI and PoC Nodes
    DOT payments are redistributed to PoI nodes for their computing contributions and PoC nodes for validating inferences and models.

  • Periodic DOT Buybacks
    A portion of the collected DOT is used for periodic buybacks, helping to stabilize the token economy.

DOT Staking and Incentives

DOT staking allows access to premium services and participation in governance.

  • Premium Access via DOT Staking
    Users can stake DOT to access premium services such as faster inferences or more powerful models.

  • Governance Participation
    DOT staking also allows users to participate in governance decisions regarding model validation and reward redistribution.

Off-chain Data and Parameter Storage

Datasets and parameters are stored off-chain with intermittent on-chain validation to ensure data integrity.

  • D: Off-chain Storage
    The datasets used for model training are stored off-chain to avoid overloading the blockchain.

  • On-chain Validation
    The off-chain stored data is intermittently validated on-chain to guarantee its authenticity.

Decentralized Validation and Governance

Decisions regarding inferences and model updates are made through a decentralized validation and governance process.

  • G1: PoC Node Voting for Layer 1
    PoC Nodes vote to validate inferences on Layer 1 and for associated economic decisions.

  • G3: PoC Node Voting for Layer 2
    PoC Nodes vote to validate fine-tuning adjustments and RAG inferences on Layer 2.

  • G4: PoC Node Voting for Layer 3
    PoC Nodes vote to validate newly trained models on Layer 3.

Conclusion: Understanding the Decentralized AI Ecosystem with JamChain

This ecosystem is built on three main layers (Layer 1, Layer 2, Layer 3), each playing a specific role in the training, improvement, and validation of decentralized machine learning models. The Proof of Inference (PoI) and Proof of Contribution (PoC) mechanisms are central to this ecosystem, where nodes either contribute computing power or validate inferences.

Model updates in this decentralized AI network are managed through governance on each layer. Additionally, the economic mechanisms around the DOT token ensure active participation from users, whether they are technical contributors or end users of AI services.

The architecture also includes off-chain data management to avoid blockchain overload, while maintaining an on-chain validation mechanism to guarantee the integrity of the data used by the models.

The integration of the three layers within the same framework, with the rare possibility of proposing updates from Layer 3 to Layer 1, allows for continuous innovation and flexibility within the network.

Key Takeaways:

  • Layer 1: Real-time inferences with on-chain validation.
  • Layer 2: Fine-tuning and RAG for more precise model adjustments.
  • Layer 3: New model iterations that can propose updates to Layer 1 after validation.
  • Decentralized Governance: Ensures transparency and effectiveness in economic and technical decisions.
  • DOT Token Economy: Used to pay for AI services and reward contributors (PoI and PoC nodes).

This model offers an innovative and scalable approach to decentralized artificial intelligence, leveraging JamChain technology to enable continuous updates while maintaining decentralized governance and a circular economy based on the DOT token.

Glossary:

  • LLM: Large Language Model, used for learning and inference.
  • PoI (Proof of Inference): A mechanism by which nodes provide computing power for model inferences and training.
  • PoC (Proof of Contribution): A mechanism by which nodes validate inferences and datasets and participate in decentralized governance.
  • RAG (Retrieval-Augmented Generation): A technique that enhances model results by retrieving external data.
  • Swarm Parallelism: A parallel training method using multiple nodes to process datasets.
  • DOT: The token used for economic interactions within the network.

Documentation: Decentralized AI Ecosystem with JamChain

This documentation presents an innovative approach to managing multiple AI model versions within a decentralized ecosystem by using sub-layers under the Central Client Layer 1. The approach allows for the creation and experimentation of new datasets validated by PoC Nodes, while maintaining the integrity and modularity of the system.


Central Client Layer 1 with Sub-layers for Version Management

Sub-layers to manage different model versions

The Central Client Layer 1 is divided into sub-layers (e.g., Layer 1.1, Layer 1.2, Layer 1.3) to manage multiple versions of the model in parallel, each dedicated to a specific use case or experimental setup.

  • Layer 1.1: Stable version of the model, used for real-time inferences with performance and reliability guarantees.
  • Layer 1.2: Optimized version for specific cases (e.g., fast inference tasks, personalized models).
  • Layer 1.3: Experimental version, allowing the testing of new models or algorithms with specialized datasets or innovative configurations.

Each sub-layer operates in isolation, allowing users to select the version that fits their needs while maintaining centralized management via the Central Client Layer 1.


Creation of New Datasets Validated by PoC Nodes on Sub-layers

Dataset management on sub-layers

PoC Nodes (Proof of Contribution) play a central role in the creation and validation of new datasets for specific model versions on sub-layers like Layer 1.3. This approach allows for enriching and experimenting with models without affecting stable versions.

Steps for creating datasets on a sub-layer:

  • Data collection by PoC Nodes: PoC Nodes submit new data or datasets to be used to train a specific model on a sub-layer.
  • Dataset validation: Once the data is submitted, PoC Nodes validate these datasets through a decentralized consensus, ensuring they meet the quality and relevance criteria for their intended use.
  • Adding to sub-layers: Validated data is added to the experimental version’s dataset (e.g., Layer 1.3) and used to train a specialized or innovative model.
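
A minimal sketch of this submit → validate → add flow is given below, assuming a simple approval quorum; the class and function names are hypothetical and only meant to make the steps concrete.

```python
# Hedged sketch: a PoC node submits a dataset, other PoC nodes vote,
# and the data is added to the target sub-layer only once a quorum approves.
from dataclasses import dataclass, field

@dataclass
class DatasetSubmission:
    submitter: str                      # identity of the submitting PoC node
    target_layer: str                   # e.g. "1.3"
    records: list = field(default_factory=list)
    approvals: set = field(default_factory=set)

def validate(submission: DatasetSubmission, validator: str, approves: bool) -> None:
    """A PoC node records its vote on the submission."""
    if approves:
        submission.approvals.add(validator)

def maybe_add_to_sublayer(submission: DatasetSubmission, quorum: int,
                          sublayer_datasets: dict) -> bool:
    """Integrate the dataset into the target sub-layer once the quorum is reached."""
    if len(submission.approvals) >= quorum:
        sublayer_datasets.setdefault(submission.target_layer, []).extend(submission.records)
        return True
    return False
```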

Example of use:

  • Layer 1.3 could use a new dataset validated by PoC Nodes to train a model suited for specific use cases like medical data, financial forecasts, or any other specialized domain.

Interaction of Models with New Datasets

The new datasets validated by PoC Nodes on a sub-layer are used to train an experimental or specialized model. These models can be tested in isolation on the sub-layer without affecting the main model.

  • Experimental model on Layer 1.3: For example, a specialized dataset can be used to test new approaches or algorithms on Layer 1.3, allowing users and companies interested in these new use cases to test models without impacting other users.
  • Stable model on Layer 1.1: Meanwhile, the stable version on Layer 1.1 remains unchanged and continues to provide reliable real-time inferences for regular users.

Decentralized Validation and Governance of New Datasets

Role of PoC Nodes

PoC Nodes are responsible for validating each dataset submitted on a sub-layer. They ensure that the new data meets the criteria defined by the community and is relevant for training the model.

Decentralized validation process:

  • On-chain validation: Although the datasets are stored off-chain to avoid blockchain overload, PoC Nodes regularly validate the integrity and quality of the data through on-chain processes.
  • Intermittent validation: PoC Nodes perform validations at regular intervals to ensure that the new data submitted meets the quality standards.

Consensus for dataset validation:

  • A decentralized consensus is required to integrate new datasets into a specific sub-layer, ensuring transparency and security throughout the process.
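
One possible reading of this intermittent, consensus-based validation is sketched below; the block interval, the supermajority threshold, and the function names are assumptions chosen purely for illustration.

```python
# Hedged sketch: every N blocks, each PoC node re-hashes its off-chain copy of a
# dataset; the dataset stays valid only if a supermajority matches the on-chain hash.
VALIDATION_INTERVAL = 100   # re-check every 100 blocks (illustrative value)
SUPERMAJORITY = 2 / 3       # illustrative threshold

def due_for_check(block_height: int) -> bool:
    """Trigger an intermittent validation round at regular block intervals."""
    return block_height % VALIDATION_INTERVAL == 0

def intermittent_check(local_hashes: list, on_chain_hash: str) -> bool:
    """Each element of local_hashes is one PoC node's hash of its off-chain copy."""
    matching = sum(1 for h in local_hashes if h == on_chain_hash)
    return matching / len(local_hashes) >= SUPERMAJORITY

# Example: 7 of 9 nodes report a matching hash -> the dataset remains valid.
assert intermittent_check(["abc"] * 7 + ["xyz"] * 2, "abc")
```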

Rewards and Incentives via the DOT Token

Economic contribution of PoC Nodes

PoC Nodes are rewarded in DOT for their contribution to creating and validating datasets. This encourages active participation and maintains the integrity of the data used to train models.

Reward distribution:

  • Data submission: PoC Nodes receive DOT for submitting valid data that enriches the datasets.
  • Dataset validation: PoC Nodes are also rewarded for decentralized dataset validation, ensuring that only high-quality data is used to train models.
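
As an illustration of how such rewards could be split, here is a small sketch; the 40/60 split and the function signature are arbitrary assumptions, not parameters defined anywhere in this proposal.

```python
# Illustrative sketch: split a DOT reward pool between data submitters and validators.
def distribute_rewards(pool_dot: float, submitters: list, validators: list,
                       submitter_share: float = 0.4) -> dict:
    """Give `submitter_share` of the pool to submitters and the rest to validators."""
    payouts = {}
    if submitters:
        per_submitter = pool_dot * submitter_share / len(submitters)
        for s in submitters:
            payouts[s] = payouts.get(s, 0.0) + per_submitter
    if validators:
        per_validator = pool_dot * (1 - submitter_share) / len(validators)
        for v in validators:
            payouts[v] = payouts.get(v, 0.0) + per_validator
    return payouts

print(distribute_rewards(100.0, ["node-A"], ["node-B", "node-C"]))
# -> {'node-A': 40.0, 'node-B': 30.0, 'node-C': 30.0}
```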

It is possible that it works well enough. If that's the case, centralized applications will use it too, and we are back to the situation where doing it in a decentralized way is more expensive and less efficient. Intuitively, the cost gap will shrink if you can do it well.

Another option, used 30 years ago in FEM, is to partition the large matrix into blocks with some overlapping bands. You compute on all blocks, then swap the results on the borders of each block, iterate a few times, and, with luck, things converge. Look at solid/fluid dynamics FEM, where this was a common process.
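
For readers who have not seen this technique, here is a toy sketch of the overlapping-block idea (classic alternating Schwarz) on a small 1D Poisson system; the matrix size, the number of blocks, and the overlap width are arbitrary choices.

```python
# Toy sketch: solve A x = b by repeatedly solving two overlapping blocks,
# using the current values on the rest of the domain as border data.
import numpy as np

n = 12
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiagonal test matrix
b = np.ones(n)
blocks = [np.arange(0, 8), np.arange(4, 12)]            # two blocks, overlap on 4..7

x = np.zeros(n)
for _ in range(50):
    for S in blocks:
        rest = np.setdiff1d(np.arange(n), S)
        # Local solve on the block, with the borders taken from the current iterate.
        rhs = b[S] - A[np.ix_(S, rest)] @ x[rest]
        x[S] = np.linalg.solve(A[np.ix_(S, S)], rhs)

print(np.allclose(A @ x, b, atol=1e-6))   # True: the overlapped iteration converged
```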

RoCE and friends (InfiniBand) are local strategies, i.e. within a floor or across a building (AFAIK). I have not seen them implemented across long distances.

All the TPU/GPU clusters use some form of fast networking that does not hit the CPU. Typical machines have 4 or 8 TPUs and the same number of network cards, and the connection between the GPUs of two machines does not go through the CPU but flows directly between them. So basically, in the datacenter you have two networks: one for machine-to-machine traffic and one for GPU-to-GPU traffic. Order of magnitude: 200 Gb/s per GPU is common. A medium-size cluster will have 1024 (or 4096) cards, all connected point to point (or via a torus or some other topology). New network cards are all 400 Gb/s.

Going into a distributed world, even if you could use the GPU-to-GPU network, you will bottleneck on the fiber between the datacenters. Note that you can play with the concept at home with NVMe over TCP if you like to tinker a bit.
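
A quick back-of-envelope comparison makes that bottleneck explicit; the per-GPU figure comes from the numbers above, while the inter-datacenter figure is an assumption for the sake of the example.

```python
# Back-of-envelope: aggregate intra-cluster GPU bandwidth vs one long-haul fiber link.
gpus = 1024
per_gpu_gbps = 200                       # figure quoted above
intra_cluster_gbps = gpus * per_gpu_gbps
inter_dc_link_gbps = 100                 # assumed long-haul link between datacenters

print(f"aggregate GPU fabric : {intra_cluster_gbps:,} Gb/s")
print(f"inter-DC link        : {inter_dc_link_gbps:,} Gb/s")
print(f"ratio                : {intra_cluster_gbps // inter_dc_link_gbps}x")
# ~2000x less bandwidth across sites: that is the bottleneck described above.
```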

Long term, yes. What makes infrastructure "cheap" is having a lot of jobs to run on it so you can keep the hardware busy. That is very, very hard to do, and that's why cloud providers make money. A customer buys 1 core but uses 10% of it (and I am being generous). If you have 100 customers and 16 cores, you can likely keep them all happy, since they are unlikely to spike at the same time. In the ML world, that means a lot of customers willing to wait in a queue could maximise GPU usage and lower the cost. You need a state-of-the-art scheduler for that. Not all customers want to wait, and that means "reservation", premium pricing and so on.
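
The oversubscription arithmetic behind that example is simple; the numbers below are the ones from the paragraph above.

```python
# 100 customers each "own" 1 core but use ~10% of it on average.
customers = 100
avg_utilisation = 0.10
physical_cores = 16

expected_demand = customers * avg_utilisation   # ~10 cores of real work
print(f"expected concurrent demand: {expected_demand:.0f} cores "
      f"on {physical_cores} physical cores")
# As long as spikes are uncorrelated, 16 cores cover 100 customers;
# the provider's margin lives in that oversubscription.
```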

Since the ticket of entry is high, you would need to sponsor the collators until they have enough traffic (say, 18 months; an expensive business).

I am convinced it is doable, but unconvinced that it will be efficient (and therefore cheap).

Fair! At the same time, it pulls you back to complicated infrastructure (distributed storage at scale is hard, see Lustre).

Yes. That's what the GAFAM are doing. RAM is not cheap. One of the reasons they can do it: while you train, the memory of the GPU host machine is not doing much. If you have 1000 machines with 4 GPUs each, you have 1000 × 4 TB of host RAM to play with. Note that since you are already network-bound, reads and writes to RAM on another machine use the same bandwidth. We are again very much in datacenter territory: big two-socket machines stuffed to the brim. Some have demonstrated excellent results for checkpointing, which also gives much better resilience to failure.

I don't know them and will have a look.

I agree with your proposed solution. There are a lot of reasons why it could fail, and they are mostly linked to the overall price. If training on a decentralized system is cheaper, or close to it, then it will have a lot of success. If it is 2x more expensive, it is unlikely to succeed; companies can barely afford the current cost.


Hi Cyphertux, thanks for all the sharing and the debate around the topic.

Distro, a project from Nous Research, recently proposed a decentralised architecture for training LLMs on common GPUs, claiming to need less than a hundredth of the bandwidth and power required by legacy systems. These bold claims change the way model training has been done and thought of, and the way we position ourselves on the topic.
They plan to release the code soon.

If I understood correctly, your project/ideas are intended to work as a system chain for Polkadot; is this correct?
In my opinion, users can connect Decentralised AI and Jam. The interesting challenge is to figure out how we can build and collectively own the systems that bridge them.

I've been working on Kinera. We offer users tools to build organisations where they co-create and distribute media content. Since we use open-source text-to-video tools and GPTs, we are coding a way for members to offer, exchange, or sell their GPU power. Reading the Distro release last week, we realised that, with our existing code, model tuning for characters and small-scale GPT specialists is within reach. We still need a skilled GPT consultant to test and connect the tuning algorithms with our GPU power. In a month or less, we want to start onboarding people who are eager to test and use these tools. As a self-financed project, we're open to volunteers. If you know anyone who might be interested in joining us, please let me know!

Our discord.

The website is still a work in progress and not public.