(This article was first published on Zhihu)

On-chain AI is an important trend, and I believe it is crucial for the future of both Web3 and AI. It mainly addresses two major issues with current AI:

  1. Computational power on-chain, although there are many companies offering AI inference services, each service is an isolated island. Although pricing is competitive, it has not yet reached full marketization. Moreover, Web3 services (such as smart contracts) currently do not have a good way to use AI services on-chain.
  2. On-chain AI Agent platform, solving the production, sales, and profit-sharing issues of AI Agents. Platforms like Character AI, where users contribute out of passion, mean that all income from AI Agents goes to the platform, naturally leaving users with little incentive to fine-tune their AI Agents.

Computational Power On-chain

Open-source models are now mature enough, for example, the capabilities of LLaMA-2 70B have reached the level of GPT-3.5 in many aspects, and using one’s own hardware infra and high-performance inference frameworks (such as vLLM) can be cheaper than calling the GPT-3.5 API.

Frameworks like FastChat have already implemented OpenAI interface compatibility, and many large models are also using this compatible interface, so the model interfaces have basically been unified.

Against this backdrop, many AI inference service companies have started a price war. For example, Together AI’s LLaMA-2 70B costs have been reduced to $0.0009/1K tokens, which is more than half cheaper than GPT-3.5. Lepton AI’s inference costs are also very low.

Since AI inference services have been standardized, is it possible to put computational power on the blockchain and let the blockchain automatically price AI large model inference power? There are several technical challenges that need to be addressed:

  1. How to implement efficient request distribution and recording. Large model inference is sensitive to latency and requires high-performance request distribution. How to implement decentralized and efficient request distribution is a challenge. There are numerous requests (prompts) and large model responses, putting them all on the blockchain could lead to high storage costs, and requests and responses are highly private and not suitable for public disclosure on the blockchain. But if nothing is stored, there is no way to verify. The best method is to store hashes.
  2. How to verify the correctness of the output from computational nodes. For example, could some nodes cheat by using smaller models to impersonate larger ones, or even make up an output? Since verifying the correctness of large model outputs can only be done by recalculating, if every output is verified by other nodes, the cost would be high. It is better to use a credit-based mechanism, where nodes must stake a certain amount of tokens to contribute computational power, and adopt a certain spot-check mechanism or dispute resolution mechanism. If computational results are found to be incorrect, a certain penalty is imposed. Of course, this also requires ensuring that the output of the large model is reproducible (e.g., recording the seed) and that the dispute resolution mechanism itself is trustworthy.
  3. How to assess the latency of computational nodes and request distribution nodes. Since large model inference is sensitive to latency, it is not only necessary to ensure the correctness of the inference results but also to obtain the output as quickly as possible. The mechanism for on-chain AI must be able to assess the computational latency of computational nodes and the distribution latency of request distribution nodes, choosing nodes with lower latency whenever possible, rather than those with excessive delays.

In the era of Proof of Work, countless computational powers were used for mining. How great it would be if these computational powers could be used for AI inference! Moreover, these GPU mining rigs can completely perform AI inference, such as LLaMA 7B/13B small models, stable diffusion image generation, Whisper speech recognition, VITS speech synthesis, etc., all of which can run on consumer-grade graphics cards.

A long-standing problem plaguing smart contracts is that they cannot call AI algorithms because contracts have difficulty automatically paying for AI computational power, which greatly diminishes the “intelligence” of smart contracts. With On-chain AI, smart contracts will truly become “intelligent” by leveraging the capabilities of large models.

Furthermore, as a naturally open platform, the blockchain can also avoid the ethical issues caused by large companies directly releasing unrefined base models. On-chain AI computational power can be fine-tuned to comply with different regional cultures and laws, or provide unrefined original versions.

On-chain AI will truly democratize AI computational power, reducing the premium of large model inference power to a reasonable level, allowing application developers to use large models more cheaply.

On-chain AI Agent Platform

OpenAI’s recently released Agent platform GPTs and Assistant API have attracted a lot of attention. But OpenAI’s Agent platform is still a closed platform, and we do not know how much money OpenAI has made.

Many Agent platforms do not give any share to Agent creators at all, users contribute out of passion, and the income from the created Agents goes entirely to the Agent platform. For example, Character AI, Janitor AI, and others are like this, with thousands of Characters created by users out of interest, but the creators do not get a penny.

The on-chain AI Agent platform can create an open and transparent business model. The on-chain AI Agent platform can adopt a model similar to OpenAI GPTs, where users pay on-chain to use the Agent, and both the creator of the Agent and the provider of AI computational power (see the above Computational Power On-chain, On-chain AI section) receive a share of the proceeds, with no middleman making a profit.

This way, Agent creators will be more motivated to fine-tune their Agents, such as putting in more or higher quality corpora for fine-tuning, or exploring the use of different base model pipelines to achieve the best conversational effects and multimodal capabilities.

Smart contracts can also introduce more gameplay for the on-chain AI Agent platform, AI Agents, like NFTs, can have certain financial attributes, with unlimited imaginative space.

Comments