Sovereign Agents: In-Depth Research on Clawdbot/OpenClaw
Related article: “Permissions, Collaboration, and Employment of AI Agents in Moltbook”
[This research report and Slides were co-produced with the assistance of Clawdbot + Claude Opus 4.5 models]
“Sovereign Agents: In-Depth Research on Clawdbot/OpenClaw” Slides [Slidev source code]
Where is your data stored, and on whose hard drive? Whose instructions does your AI obey? Who controls your compute power?
For the past three years, we’ve accepted a tacit agreement: hand over personal data to cloud giants in exchange for convenient AI capabilities. GPT requires a subscription; Claude requires a subscription; Manus went fully closed-source after being acquired by Meta for $2 billion—each paradigm shift pushes users further from controlling their own digital lives. In early 2026, an open-source project called Clawdbot tore up this unspoken contract.
Clawdbot (renamed Moltbot for trademark reasons, then later renamed OpenClaw) is the first open-source project to merge three major Agent capabilities—Deep Research, Computer Use, and Coding—into a single system. Its radical nature does not lie in the technology itself—the underlying LLM reasoning, tool-calling protocols, and local-first architecture are all already mature components—but in a core claim it proposes and actually implements: the Sovereign Agent. This claim is defined by three dimensions of autonomy:
- Data sovereignty — your files, chat history, and personal preferences always stay on your own hard drive, and never touch any third-party server;
- Compute sovereignty — you can choose to call cloud APIs, or run open-source models locally with Ollama, and even keep your Agent working on an offline airplane;
- Control sovereignty — every action of the Agent is entirely decided by you. No vendor-imposed limits behind the scenes, and no one else making “safety” judgments on your behalf—freedom and risk are both yours alone.
These three principles separate Clawdbot from all closed-source Agents, and also explain why it could explode in popularity within a day of release, surpass 70,000 GitHub stars in under a week, spawn hundreds of community plugins in 48 hours, and even trigger a spike in Mac Mini sales.
This report will dissect the phenomenon across six dimensions: its technical lineage and historical position; how the three types of sovereignty drive market breakout; the four-layer core architecture (multi-protocol gateway, Coding Agent engine, Markdown memory system, local execution and security sandbox); security risks and mitigation practices; a practical blueprint for building a sovereign Agent from scratch; and a forward-looking view on the return of personal computing and large models as the new operating system.
Part I: Genesis and Technical Evolution of Agents
To understand Clawdbot’s historical significance, we need to place it in the broader evolution of AI from “passive Q&A” to “proactive execution.” Clawdbot did not emerge from thin air; it is the product of three converging technological currents: commoditization of LLM reasoning, standardization of tool-calling protocols, and the resurgence of local-first software architectures.
1.1 The Closed-Source Dilemma of General-Purpose Agents
Before Clawdbot, the concept of a general-purpose Agent was not new: an agent that can complete almost any task in the virtual world, effectively an all-purpose digital avatar. However, the general-purpose Agents that actually worked well were almost all closed-source.
The rise and acquisition of Manus: The most typical example is Manus. This AI Agent went viral in March 2025, showcasing the ability to autonomously complete complex tasks (such as booking flights, analyzing stocks, writing reports), and quickly reached $100 million in annualized revenue. However, on December 30, 2025, Meta acquired Manus for over $2 billion, turning it into a fully closed-source commercial product. Manus’s core technology will not be open-sourced; its capabilities are locked inside Meta’s ecosystem.
Claude’s closed ecosystem: Anthropic’s Claude also has powerful tools, such as Claude Code (an AI coding assistant) and Claude Cowork (a desktop AI coworker). These products are extremely capable—for example, Claude Cowork can help users organize desktop files, draft documents, perform research, and even create complete slide decks (PPT). But they are also closed-source and heavily constrained for safety reasons; they are not allowed to operate arbitrarily within a user’s computing environment.
The fundamental privacy issue: Whether it’s Manus, Claude, or other general Agents, all of their computation and data processing happens in the cloud. This raises a core question: should users’ sensitive data—local files, personal information, work documents—really be entirely handed over to cloud providers? For privacy-conscious users, this is an unavoidable trust challenge.
1.2 Technical Lineage: Standing on Giants’ Shoulders
Clawdbot’s core capabilities are rooted in breakthroughs at the base-model level, especially the qualitative leap in how models understand and operate computer interfaces. This evolution follows a clear three-stage path from API-level capabilities (Computer Use), to closed products (Claude Cowork), and finally to an open ecosystem (Clawdbot/OpenClaw).
1.2.1 Origin: Anthropic’s Computer Use Paradigm
It all started when Anthropic released Computer Use in October 2024, which represented a fundamental shift in AI interaction patterns. Strictly speaking, Computer Use as a technology was not new—similar capabilities have existed for years in the form of familiar macro recorders and automation tools. What made it revolutionary was training a large model (Claude 3.5 Sonnet) to directly understand pixel-level screen information and operate the mouse and keyboard like a human, allowing AI to bypass API limits and control any software interface designed for human users.
Technical limitations: Anthropic’s reference implementation relied mainly on a “screenshot–analyze–click–screenshot” loop. This vision-based feedback cycle is slow and brittle under high-latency network conditions, leaving ample room for optimization via lower-level DOM operations (such as with Playwright).
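A minimal sketch of that loop, with hypothetical stand-ins for the screenshot, vision-model, and input backends (none of these function names come from the actual reference implementation):

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str            # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""

def run_vision_loop(take_screenshot, ask_model, perform, max_steps=20):
    """Naive screenshot -> analyze -> act loop, in the spirit of the
    Computer Use reference demo. Each iteration pays the full cost of a
    screenshot upload plus a model round-trip, which is why the cycle
    is slow and brittle on high-latency connections."""
    for _ in range(max_steps):
        image = take_screenshot()      # raw pixels, not a DOM tree
        action = ask_model(image)      # model picks the next GUI action
        if action.kind == "done":
            return True
        perform(action)                # mouse/keyboard injection
    return False                       # gave up after max_steps
```

A DOM-level driver like Playwright skips the pixel round-trip entirely, which is the optimization the paragraph above alludes to.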
1.2.2 Relay: Claude Cowork’s Enterprise-Grade Attempt
After Computer Use, Anthropic released a new version of the Claude desktop app in early 2025, with three core features: Chat, Cowork, and Code.
- Chat is pure conversational interaction.
- Cowork is local collaboration—it can help users organize desktop files (automatically classifying and filing different document types), perform research and write reports, etc.
- Code is an AI coding assistant aimed at developers.
Product limitations:
Walled garden: Cowork is designed around safety and compliance. It is extremely conservative about accessing local files, allowing only explicitly authorized directories—everything outside them is invisible and inaccessible. This is deliberate: not because the model is incapable, but out of fear that it might corrupt or delete user files.
High cost: Cowork requires a paid subscription. While a $20/month plan works, a few intensive operations can exhaust the quota quickly; serious use really needs the $100/month plan. The $100 plan gives roughly $300–400 worth of tokens, which is indeed cost-effective for heavy users—yet still expensive for light users.
Privacy paradox: Despite its emphasis on safety, Cowork essentially still sends user actions and data to the cloud for decision-making.
This combination of “powerful but constrained” characteristics in Claude Cowork created a massive market vacuum: users craved Computer Use–level capabilities but refused to be trapped in a cloud sandbox.
1.2.3 Breakout: Clawdbot as an Open-Source Synthesis
Clawdbot emerged precisely to fill this gap. Its core philosophy is summarized as an open-source sovereign Agent—an Agent that grants users sovereignty over data, compute, and control.
Even more crucially, Clawdbot is the first open-source project to integrate Deep Research, Computer Use, and Coding into a single Agent. Before this, the only systems that pulled off this “three-in-one” general Agent pattern (like Manus) were completely closed-source. The advent of Clawdbot means the open-source community now has a general Agent platform that can stand shoulder to shoulder with closed commercial products—this is the deep technical reason it triggered such a large-scale community explosion.
It’s important to note that Clawdbot truly stands on the shoulders of giants. Although Claude Code itself is closed-source, many in the community have done extensive reverse engineering, publishing detailed reports that reveal its internal design principles. Seeing this, Anthropic largely stopped trying to hide the fundamentals and instead turned those ideas into an Agent SDK and Skills tutorials, contributing them as shared community knowledge. Clawdbot was able to move quickly precisely because it built on top of this already-opened foundation.
1.3 The Founder Factor: Peter Steinberger’s “Second Life”
Clawdbot’s success is inseparable from the legendary background and influence of its creator, Peter Steinberger.
From startup to financial freedom: Steinberger did not become famous overnight because of Clawdbot. He had previously founded PSPDFKit (later renamed Nutrient.io), a B2B software company specializing in PDF processing. In 2021, PSPDFKit received a €100 million strategic investment from Insight Partners, after which Steinberger stepped back from day-to-day operations and achieved financial freedom.
Return of the AI Era: After retiring, with the arrival of the AI era, Steinberger began to think about what he should do. Around November 2025, he came up with the idea of building a “personal life assistant” and, using AI-assisted programming (vibe coding), finished the first prototype in just one hour. In December 2025, he officially released it as an open-source project.
Stunning development efficiency: As a solo project, Steinberger demonstrated an astonishing development speed. He made extensive use of multiple AI agents for parallel programming—he calls himself “Polyagentmorous” (a portmanteau of Polyamorous and Agent, taken from his Twitter headline), meaning one person working in coordination with multiple agents at the same time. According to statistics:
- He produced about 40–50k lines of code per day on average
- He consumed about 1.8 billion tokens per day—for comparison, a typical heavy AI user might consume around 9 billion tokens over an entire year
- He once made 1,374 Git commits in a single day
- Within just two months, the project’s codebase had already approached the million-line level
This meta-narrative of “AI building AI” greatly attracted the developer community. Since the founder is already financially free, Clawdbot is a project driven purely by technical passion and not by profit—this is also one of the foundations of community trust.
1.4 Brand iterations and controversies: Clawdbot → Moltbot → OpenClaw
The naming history of Clawdbot is itself full of drama:
Clawdbot phase: The project was initially released under the name Clawdbot. Because “Clawd” is pronounced exactly the same as Anthropic’s “Claude,” Anthropic issued a trademark infringement cease-and-desist letter.
Moltbot phase (January 27, 2026): He was forced to rename the project Moltbot. But while the GitHub and Twitter (X) account names were being changed, the old handles were snapped up by crypto scammers within seconds. The scammers used the hijacked accounts to promote a fake Solana token called $CLAWD, whose market cap briefly soared to $16 million before crashing, causing heavy losses for late buyers. Steinberger had to issue an urgent public clarification: “I will never issue any token. Any claim that I am behind a token project is a scam.”
OpenClaw phase (January 30, 2026): The project was finally renamed OpenClaw, establishing its long-term brand identity.
Part II: The three pillars of sovereign agents and the market explosion
2.1 Definition of a “sovereign agent”: three autonomies
Why is Clawdbot/OpenClaw called a sovereign agent? Because compared with closed-source agents, it grants users full autonomy along three dimensions:
1. Data sovereignty: Does the user’s data belong only to themselves? With cloud-based agents, data inevitably has to pass through someone else’s servers. Clawdbot, however, allows all data to be kept locally—there is no need to hand over sensitive data on your personal computer to any third party. For many geeks and privacy-conscious users, this is the core requirement.
2. Compute sovereignty: Is AI inference computation done on your own device, or must it rely on remote cloud servers? Clawdbot supports two modes—you can either call remote models via API (such as Claude, GPT, DeepSeek), or run open-source models (such as Llama) locally using Ollama. Choosing the latter means that even without an internet connection (for example, on a plane), the agent can still function normally.
This is also why many people buy devices like the Mac Mini or NVIDIA Jetson, or even build desktop PCs with RTX 4090 GPUs at home—they want fully independent local compute. If you just want to try it out, though, there is no need to spend 20,000 RMB on hardware: a cloud computer (a few tens of RMB per month) plus a pay-as-you-go API is enough. At present, Chinese model providers (such as SiliconFlow and Volcano Engine) give new users generous free token grants, sufficient for experimentation.
3. Control sovereignty: The agent’s behavior is determined entirely by the user; it will not act beyond its authorized scope behind your back. An important analogy applies here: this kind of control follows the “Code is Law” logic of blockchains. If a bug in your code opens a vulnerability and someone breaks in to steal your assets, you alone bear the loss—just as no one can recover lost Bitcoin for you. By contrast, keeping money in a bank (analogous to using a big tech company’s cloud agent) means that if the bank loses it, the government may help you get it back. It is a trade-off between freedom and responsibility: if you hold things yourself, you must have the technical ability to protect them.
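The two compute modes described under point 2 differ only in where the request goes. A minimal sketch (the Ollama endpoint and request fields are its documented local defaults; the cloud URL and model names are placeholders, not any specific provider's API):

```python
def chat_request(prompt: str, offline: bool = False) -> dict:
    """Build a chat request for either a remote cloud API or a local
    Ollama server. Returned dict is what you would pass to an HTTP
    client; no network call happens here."""
    if offline:
        # Ollama's local HTTP API listens on port 11434 by default
        return {
            "url": "http://localhost:11434/api/chat",
            "json": {
                "model": "llama3",          # any locally pulled model
                "stream": False,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
    # Placeholder cloud endpoint -- substitute your provider's real URL/model
    return {
        "url": "https://api.example-provider.com/v1/chat/completions",
        "json": {
            "model": "some-cloud-model",
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

Switching between compute sovereignty and cloud convenience is then a single flag, which is exactly the flexibility the text describes.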
2.2 Three drivers of explosive growth
Clawdbot was released in December 2025 and, after officially going live on January 25, 2026, went viral in just one day, quickly becoming one of the fastest-growing open-source projects in history (surpassing 70k stars in less than a week).
| Metric | Traditional OSS tools | Clawdbot/OpenClaw | Difference |
|---|---|---|---|
| First-week GitHub stars | ~500 | 9,200+ | ~18x |
| Community activity (Discord) | Gradual growth | Instant explosion (8.9k+) | Explosive |
| Plugin ecosystem formation | Several months | 48 hours | Ultra-fast |
Its growth can be attributed to the resonance of the following three factors:
Factor 1: Backlash against “subscription fatigue”
Users are tired of every AI app charging dozens of dollars a month in subscription fees. GPT charges money, Claude charges money, and each product’s features are incomplete—GPT offers only one slice of AI functionality, and Claude Cowork is deliberately conservative. As a “Bring Your Own API Key” (BYOK) framework, Clawdbot only asks users to pay for the tokens they actually consume; the framework itself is completely free and open source, with no middleman markup.
Cost comparison: If your monthly token usage exceeds 100 USD, buying Claude Cowork’s 100 USD/month plan may be more cost-effective (since it provides about 300–400 USD worth of tokens). But if your usage is lower, using Clawdbot and paying per token is clearly cheaper.
Factor 2: Bringing the “Jarvis fantasy” down to earth
For a long time, AI assistants in sci-fi works (such as Iron Man’s Jarvis) have been omnipotent and private. Clawdbot is the first to make this experience truly accessible—it is not just a chatbot in a web page, but a digital steward running in the background of your system, capable of controlling the entire computer.
Of course, one should stay clear-eyed when using it: much like Manus at its March 2025 release, it is not fully mature, but it showcases genuine agency—the agent has its own “mind” and can autonomously plan and complete tasks. This scalable autonomy is what is truly exciting.
Factor 3: Developer empowerment and community virality
Clawdbot is extremely extensible. Developers can write plugins (Skills/Connectors) for it, enabling the AI to control smart home devices, automatically send messages, organize local files, and more. This immediate creative feedback loop greatly stimulates community-driven viral growth.
For example: the author of Clawdbot personally doesn’t have the time to develop connectors for a wide range of different apps such as Feishu (Lark), DingTalk, etc. But with the community, members can each contribute—someone writes a Feishu plugin, someone writes a DingTalk plugin, and someone else even creates integrations related to autonomous driving. This is the power of community-driven development, and it is an advantage that closed-source products cannot replicate.
Within 48 hours of the project’s release, the community spontaneously contributed a massive number of plugins, a phenomenon dubbed “plugin explosion.” Typical examples include:
- Life services: goplaces (query locations via the Google Maps API), local-places (search local businesses such as nearby coffee shops)
- Productivity: native-app-performance (analyze app performance via the Xcode toolchain), journal-to-post (automatically convert private journals into public social media drafts)
- System control: deep integration with Home Assistant, allowing control of IoT devices (such as lights, air conditioners, etc.) via natural language
This viral growth in fact proves that Clawdbot is becoming a **"natural-language operating system interface"**—developers no longer write GUI interaction logic for users, but instead write API interfaces for the AI.
2.3 Abnormal fluctuations in hardware sales: the Mac Mini effect
The most notable spillover effect of the Clawdbot phenomenon is its impact on the hardware market. A large number of users purchased Mac Minis (especially the M4 version) to run Clawdbot. There are four deeper reasons behind this:
The gravity well of iMessage: Clawdbot can integrate with iMessage, which is one of its killer features. Due to the closed nature of the Apple ecosystem, programmatically sending and receiving iMessage requires an Apple device logged into iCloud. A Mac is therefore the only physical bridge between the AI and iMessage—and the Mac Mini is its cheapest always-on form.
Advantages of the unified memory architecture (UMA): Apple Silicon’s unified memory architecture lets the CPU, GPU, and Neural Engine share one memory pool, so quantized large models that would otherwise need an expensive discrete GPU can run locally—saving API costs—with better performance than similarly priced PCs.
Repurposing idle devices: If you’ve already bought the device and it’s just sitting there idle, you might as well use it to run agents—24/7 low-power standby makes it a perfect “home AI server.”
Low-cost alternatives: Of course, not everyone needs to buy hardware. Cloud computers (such as Alibaba Cloud’s Wuying cloud desktop or the 19.9 RMB/month cloud servers) can also deploy Clawdbot. The former provides a Windows desktop environment (suitable for the desktop version of Clawdbot), while the latter provides a Linux command-line environment (suitable for the server version of Clawdbot).
2.4 Challenges in the Chinese Ecosystem: The Dilemma of Isolated Apps
Clawdbot currently mainly supports overseas ecosystems (Google Drive, Gmail, WhatsApp, Telegram, etc.), because overseas ecosystems are relatively fragmented and open. But in the Chinese market, it faces serious challenges:
Doubao Phone Incident: In December 2025, ByteDance launched a Nubia phone equipped with the “Doubao Phone Assistant,” attempting to use AI to automatically operate apps like WeChat, Taobao, and Alipay. However, less than two days after launch, Taobao and WeChat successively blocked this functionality. WeChat’s risk control system detected abnormal operation patterns and directly banned the related accounts. ByteDance was forced to announce that it would “no longer support operating WeChat.”
WeChat DMCA Takedown: Going a step further, in January 2026 Tencent issued a DMCA notice to GitHub, requesting the removal of more than 30 open-source projects capable of exporting WeChat chat logs (such as wechat-dump, SharpWxDump, WeChatMsg, etc.), on the grounds that these tools performed reverse engineering on WeChat’s encryption technologies.
The Essence of Ecosystem Barriers: China’s internet ecosystem is dominated by giants and centrally closed. 2C consumer apps (WeChat, Taobao, Douyin, etc.) neither open APIs nor allow automated operations. In contrast, 2B enterprise apps (Feishu, DingTalk) do provide open APIs. Therefore, in China, Clawdbot is more suitable for tasks that do not involve integration with closed external apps—such as organizing local files, automating development workflows, and so on.
2.5 Model Integration and OpenRouter
A major advantage of Clawdbot is that it is not bound to any specific model. It supports integration with almost all mainstream models:
- Overseas models: Claude, GPT, Gemini, etc.
- Domestic models: Doubao (Volcengine), DeepSeek, Qwen, etc.—as long as the model exposes a standard API calling format, it can be integrated.
- Local models: Run open-source models like Llama via Ollama, supporting completely offline usage.
For Chinese users who want to try overseas models but lack overseas payment methods, OpenRouter is recommended—an open model-routing platform that is currently accessible in China and supports Alipay. Users can purchase API access to various models there (aside from a few restricted models like GPT-5, almost all are available).
Different models vary greatly in effectiveness. Using an 8B-parameter model versus a 200B+ model leads to vastly different results. The capabilities of the base model directly determine what an Agent can do—for example, in the Claude Skills feature (dynamically loading skill instructions), a strong model can correctly read and follow dynamically injected instructions, while a weaker open-source model may only pay attention to the initial System Prompt and completely ignore the subsequently loaded Skills content. This shows that the capability of the base model is crucial for the effectiveness of context engineering.
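The dynamically loaded skills pattern described above boils down to appending instructions to the context mid-conversation. A minimal sketch (the skill names, skill text, and trigger logic are invented for illustration; this is not Claude's actual Skills format):

```python
SKILLS = {
    # skill name -> instructions injected only when the skill is active
    "pptx": "When asked for slides, generate OOXML files with code...",
    "maps": "For location queries, call the maps connector...",
}

def build_messages(system_prompt: str, history: list, active_skills: list) -> list:
    """Assemble the context window. Skill instructions arrive as *late*
    system messages -- exactly the content a weak model tends to ignore
    in favor of the initial system prompt, as the paragraph above notes."""
    messages = [{"role": "system", "content": system_prompt}]
    messages += history
    for name in active_skills:
        messages.append({
            "role": "system",
            "content": f"[skill:{name}] {SKILLS[name]}",
        })
    return messages
```

Whether the model actually obeys the trailing `[skill:...]` messages, rather than only the first system message, is precisely the base-model capability test the text describes.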
2.6 Competitor Comparison: Clawdbot vs. Claude Cowork
| Dimension | Claude Cowork (Anthropic) | Clawdbot/OpenClaw (Open Source) |
|---|---|---|
| Core Positioning | AI employee (cautious, controlled) | AI butler (free, open) |
| Design Philosophy | Centralized bank: secure but constrained | Decentralized blockchain: free but at your own risk |
| Deployment Model | Cloud sandbox / restricted local process | Local host / private server / Docker |
| System Permissions | Limited to specific folders, highly constrained | Default user-level permissions, fully open |
| Interaction Interface | Dedicated tab in desktop app | Any IM (WhatsApp, Telegram, iMessage, etc.) |
| Memory Mechanism | Project-level conversation memory | Global Markdown permanent memory (better design) |
| Model Binding | Claude series only | Multi-model support (Claude, GPT, DeepSeek, Ollama, etc.) |
| Extensibility | Closed ecosystem, no plugin development | Open-source community, hundreds of connectors |
| Cost Structure | $20–100+/month subscription | Free open source + pay-per-use API |
| Security Assurance | Backed by a major company, worry-free for users | User assumes responsibility, requires technical skills |
Part III: In-Depth Technical Architecture Breakdown
The technical architecture of Clawdbot (OpenClaw) shows how to turn a stateless LLM into a stateful, action-capable agent.
3.1 Core Architecture Overview: Four-Layer Structure
The system consists of four main layers:
- Gateway Layer: Responsible for interfacing with external inputs and outputs—chat software, voice channels, web search, various data sources, etc.
- Core Layer (Agent Core): Responsible for the agent’s cognitive reasoning and planning—this is the “brain” of the entire system.
- Memory Layer (Memory System): Responsible for long-term state storage—using a combination of files, LLM-summarized memory, and structured data.
- Execution Layer: Responsible for intervening in the world—operating the browser, executing commands, manipulating the file system, etc.
Simply put, the Agent itself is a core engine connected to various “perceptors” and “actuators” for inputs and outputs.
3.2 Core Layer: Evolution and Fusion of Three Major Agent Types
To understand why Clawdbot takes the Coding Agent as its core, you first need to understand the three most important agent types in the field and their respective evolution.
3.2.1 Three Major Agent Types
The essence of an agent = Model + Context + Tools/Action Space. Different types of agents differ in their action space, but the principles for the model and context engineering are similar. The three most important current agent types are:
1. Deep Research Agent
The action space of a Deep Research Agent includes: web search, clicking web links to get content, downloading and parsing files (PDF, Word, etc.). Its core capability is autonomously planning search paths, synthesizing multi-source information, and generating structured research reports.
Evolution:
- June 2024: One of the earliest products to explore AI search was GenSpark (launched by MainFunc, a company founded by former Baidu executive Jing Kun). Jing Kun previously served as CEO of Baidu’s Xiaodu Technology and had worked on Xiaoice at Microsoft China. GenSpark started as an AI-driven search engine and raised $60 million in seed funding.
- January 2025: GenSpark officially released its Deep Research feature, using a Mixture-of-Agents (MoA) architecture that can complete complex research tasks within 20–30 minutes and generate structured reports.
- February 2025: OpenAI released Deep Research. Based on a specially optimized version of the o3 model, it can perform multi-step online search and synthesis, and generate in-depth research reports with citations—far outperforming predecessors and causing a market sensation.
- 2025–2026: Numerous startups and big tech companies followed. In December 2025, GenSpark completed a $275 million Series B round at a $1.25 billion valuation. Currently, the best-performing Deep Research product is Gemini ($20/month subscription), whose research report quality is leading in the market.
2. Computer Use Agent (Computer Control)
The action space of a Computer Use Agent is operating a GUI (Graphical User Interface)—clicking coordinates, typing on the keyboard, dragging and dropping, etc. Its academic term is GUI Agent. This type of agent is important because almost all modern software has a graphical interface, and for AI to operate software, it must learn to operate the GUI like a human.
Historically, similar technologies have long existed—macro tools and game bots are essentially Computer Use, but they are hard-coded scripts that fail when the software UI changes. The revolution of the modern Computer Use Agent is that it uses a large model to understand the screen, making it general-purpose and capable of operating any software. Of course, the cost is extremely high—running a Computer Use session with Claude for half an hour can burn about $10 worth of tokens.
Evolution:
- October 2024: Anthropic released Claude Computer Use, the first genuinely usable Computer Use product for general scenarios.
- January 2025: OpenAI released its Computer Use Agent, with even better performance.
- March 2025: Manus made Computer Use its main interface and introduced a key simplification—the set-of-marks approach (drawing boxes with labels), marking all clickable elements on the page with framed numbers, greatly reducing the error rate of pure visual coordinate targeting.
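The set-of-marks simplification mentioned above can be sketched as a simple data transform (the element dict format is an assumption, not Manus's actual internal representation):

```python
def set_of_marks(elements: list) -> dict:
    """Assign a numeric label to each detected clickable element.
    The model can then answer 'click 3' instead of guessing raw pixel
    coordinates, which is what cuts the visual-targeting error rate."""
    marks = {}
    for i, el in enumerate(elements, start=1):
        x0, y0, x1, y1 = el["bbox"]          # screen-space bounding box
        marks[i] = {
            "center": ((x0 + x1) // 2, (y0 + y1) // 2),  # click target
            "role": el.get("role", "unknown"),
        }
    return marks
```

The numbered boxes drawn on the screenshot correspond to these labels, so the model's output space shrinks from continuous coordinates to a small set of integers.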
3. Coding Agent (Code Generation)
The action space of a Coding Agent is reading and writing files, executing code, and running terminal commands. It is the most central of the three because almost all efficient content generation ultimately happens through code.
For example: when generating PPT or Word documents, an excellent agent does not type them out or click the mouse like a human, but instead generates them via code—a PPTX file is essentially a ZIP archive whose contents are OOXML-format code defined by Microsoft; Word documents are similar. The document generation skill built into Claude Cowork generates Word documents by running JavaScript code. This approach is several orders of magnitude more efficient than GUI operations.
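The "a PPTX is a ZIP of XML parts" point is easy to demonstrate with nothing but the standard library. The toy archive below is far from a valid PPTX (a real one needs [Content_Types].xml, relationship parts, and proper OOXML shape markup), but it shows why code-level generation beats GUI clicking:

```python
import io
import zipfile

# Toy "slide" part. The namespace URI is the real PresentationML one,
# but the element structure here is simplified for illustration.
slide_xml = (
    '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">'
    '<p:txBody>Hello from code</p:txBody></p:sld>'
)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    # A generator agent just writes XML strings into a ZIP archive
    z.writestr("ppt/slides/slide1.xml", slide_xml)

# Reading it back: the "document" is plain, inspectable XML
with zipfile.ZipFile(buf) as z:
    names = z.namelist()
    content = z.read("ppt/slides/slide1.xml").decode()
```

Emitting a few kilobytes of XML like this takes one tool call; producing the same slide through simulated mouse clicks takes dozens of screenshot round-trips.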
This is also why Cursor is still the most valuable AI application company in the world—because it is essentially a Coding Agent. And Anthropic’s entire product line—Claude Code → Claude Cowork → Claude Agent SDK—draws its core technologies from the capabilities of Coding Agents (context compression, sub-agents, asynchronous calls, etc.).
Evolution:
- 2023: Only chat-capable LLMs existed; there were not yet code-writing Agents. Cursor appeared as an open-source project.
- First half of 2024: Code-writing Agents began to appear, but could only handle single files. Cursor became closed-source and commercial.
- August 2024: Claude 3.5 was released; writing code with AI became truly smooth for the first time.
- November 2024: Claude 3.6 (Sonnet) was released; multi-file editing became feasible, the Agent pattern began to work — but the error rate was still very high.
- First half of 2025: Agents began to independently complete small projects.
- Second half of 2025 to now: Agents can independently complete relatively large projects, and the error rate for complex tasks has been greatly reduced. The current best coding models (such as Claude Opus 4.5, GPT-5.2, etc.) almost never make mistakes when handling everyday code under a thousand lines.
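The "agent = Model + Context + Tools" decomposition from the start of this section can be sketched as a single step function (the model's decision format and the tool signatures are invented for illustration):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """Agent = Model + Context + Tools/Action Space.
    Swapping the tools dict is what turns the same core into a Deep
    Research, Computer Use, or Coding agent."""
    model: Callable[[list], dict]               # context -> decision
    context: list = field(default_factory=list) # conversation state
    tools: dict = field(default_factory=dict)   # the action space

    def step(self, user_msg: str):
        self.context.append({"role": "user", "content": user_msg})
        decision = self.model(self.context)
        if decision.get("tool") in self.tools:
            result = self.tools[decision["tool"]](**decision["args"])
            self.context.append({"role": "tool", "content": str(result)})
            return result
        return decision.get("answer")
```

The differences between the three agent types live almost entirely in `tools`; the model and context-engineering principles stay the same, as the text argues.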
3.2.2 Three-in-one: From Manus to Clawdbot
The core of every general-purpose Agent is a Coding Agent.
The first clear articulation of this idea can be traced back to the release of Manus in March 2025. Before Manus, these three types of Agents developed independently — Cursor focused on Coding, OpenAI on Deep Research, and Anthropic on Computer Use. Manus was the first to combine these three capabilities into one, and proposed a key thesis: Coding and the File System are the most fundamental technical bases of general-purpose Agents.
However, Manus was closed-source (acquired by Meta in December 2025 for $2 billion). After that, almost all general-purpose Agents advanced along this “three-in-one” path, but most of them were also closed-source. Clawdbot/OpenClaw is the first open-source project that integrates the three major capabilities of Deep Research + Computer Use + Coding — this allowed the open-source community, for the first time, to study, customize, and deploy a complete general-purpose Agent.
3.2.3 The Coding Agent Core of Clawdbot
Clawdbot’s core engine is based on the pi-coding-agent runtime (src/agents/pi-embedded-runner/), inspired by the design of Claude Code (although Claude Code is closed-source, the community has reverse-engineered its core principles), implementing a standard ReAct loop (src/agents/pi-embedded-runner/run/attempt.ts). Its core needs only seven basic tools:
- Read (read files)
- Write (write files)
- Edit (edit files)
- Find (locate files)
- Search (search file contents)
- Python Interpreter
- Bash/Terminal (command line)
With just these seven tools, an Agent has the basic capabilities needed to complete almost any programming and system operation task. Other capabilities (such as web search, PPT parsing, etc.) are merely icing on the cake built on top of this.
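A minimal sketch of that seven-tool action space plus a ReAct loop, with the model as a stand-in callable (a real runtime such as pi-coding-agent would sandbox the python/bash tools rather than shelling out directly, and its actual tool schemas are not shown here):

```python
import pathlib
import subprocess

# The seven basic tools from the text, as plain functions keyed by name.
TOOLS = {
    "read":   lambda path: pathlib.Path(path).read_text(),
    "write":  lambda path, text: pathlib.Path(path).write_text(text),
    "edit":   lambda path, old, new: pathlib.Path(path).write_text(
                  pathlib.Path(path).read_text().replace(old, new)),
    "find":   lambda pattern: [str(p) for p in pathlib.Path(".").rglob(pattern)],
    "search": lambda path, needle: [ln for ln in
                  pathlib.Path(path).read_text().splitlines() if needle in ln],
    # WARNING: unsandboxed execution, for illustration only
    "python": lambda code: subprocess.run(["python3", "-c", code],
                  capture_output=True, text=True).stdout,
    "bash":   lambda cmd: subprocess.run(cmd, shell=True,
                  capture_output=True, text=True).stdout,
}

def react_loop(model, task, max_steps=10):
    """Standard ReAct loop: the model alternately chooses a tool call
    or emits a final answer, with each observation fed back in."""
    transcript = [("task", task)]
    for _ in range(max_steps):
        step = model(transcript)           # {"tool": ..., "args": ...} or {"final": ...}
        if "final" in step:
            return step["final"]
        result = TOOLS[step["tool"]](*step["args"])
        transcript.append((step["tool"], result))
    return None
```

Everything else—web search, document parsing, browser control—plugs in as extra entries in this same tool table.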
Tech stack inheritance: Clawdbot’s tech stack is deeply rooted in the Claude ecosystem:
| Level | Claude Official | ClawdBot Implementation |
|---|---|---|
| Base model | Claude API | Supports multiple models such as Claude/GPT/Gemini |
| Computer Use | anthropic-beta: computer-use | browser-tool + screenshot system |
| Agent framework | Claude Agent SDK | pi-coding-agent runtime |
| Sandbox isolation | Cowork VM sandbox | Docker containers + multi-layer policies |
| Messaging interface | Claude.ai Web/API | WebSocket Gateway + multiple channels |
3.3 Gateway Layer: Multi-platform Message Ingestion and Session Routing
3.3.1 Multi-platform Message Ingestion
Clawdbot supports a large number of chat and voice channels, including natively supported ones (such as iMessage, WhatsApp, Telegram, Discord, etc.) and community-contributed plugins.
Multiplexing mechanism: The gateway simultaneously maintains multiple WebSocket connections (such as Discord Gateway, Slack RTM) and Webhook listening ports (Telegram, WhatsApp).
Message normalization: Regardless of which platform a message comes from, the gateway converts it into a standard internal JSON object along these lines (field names illustrative):

```json
{
  "channel": "telegram",
  "accountId": "default",
  "peerId": "123456789",
  "text": "Summarize my unread emails",
  "threadId": null
}
```
This design completely decouples the core logic layer from the communication platforms — the AI does not need to know whether it is conversing on WhatsApp or Telegram; it only focuses on the content of the message itself.
Special implementation for iMessage ingestion: For locally deployed iMessage integration, the gateway typically uses the host machine’s osascript (AppleScript) or private frameworks to read the chat.db database and inject replies, circumventing Apple’s lack of an open API. This is also why a Mac Mini is the only physical bridge for iMessage ingestion.
3.3.2 Channel Plugin
The system supports a multi-channel plugin mechanism, allowing the community to continuously add new messaging channels. This is also why the community has grown so rapidly — anyone can contribute a connector for the platform they use most.
3.3.3 Session Routing Mechanism
This is one of the designs where Clawdbot goes beyond Claude Cowork. The core functions of session routing:
- Session key format: `agent:{agentId}:{channel}[:{accountId}][:{peerKind}:{peerId}][:{threadId}]`
- Routing priority (`src/routing/resolve-route.ts`): exact peer match > guild/team match > account match > channel match > default agent
- Cross-platform identity linking: the same user has different identity identifiers on iMessage, Telegram, WhatsApp, and other platforms; session routing can associate these different identities with the same person.
- Cross-conversation merging: a question you ask on App A can be answered by the Agent on App B, achieving true cross-platform, cross-conversation context continuity.
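The key format and priority order above can be sketched as follows; the rule shapes are hypothetical simplifications of what `src/routing/resolve-route.ts` does:

```python
# Build a session key following the documented format. `peer` is a
# (peer_kind, peer_id) tuple; optional segments are simply omitted.
def session_key(agent_id, channel, account_id=None, peer=None, thread_id=None):
    parts = [f"agent:{agent_id}", channel]
    if account_id:
        parts.append(account_id)
    if peer:
        parts += [peer[0], peer[1]]
    if thread_id:
        parts.append(thread_id)
    return ":".join(parts)

# Resolve a route by checking match kinds from most to least specific:
# exact peer > guild/team > account > channel > default agent.
def resolve_route(routes, channel, account_id, peer_id, guild_id=None):
    for match in (("peer", peer_id), ("guild", guild_id),
                  ("account", account_id), ("channel", channel),
                  ("default", None)):
        for r in routes:
            if r["match"] == match:
                return r["agent"]
    return None
```

For example, a per-user route wins over a per-channel route even when both would match the same message.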
3.4 Tool Policies and the MCP Concept
3.4.1 Tool Permission Control System
Clawdbot implements multi-layer tool policies to manage permissions (src/agents/pi-tools.policy.ts), following the principle of least privilege and preventing permissions from leaking across Sub-agents. Policies cascade in the following priority:
- Profile policies (such as full/coding/minimal presets)
- Provider-specific policies (different LLM providers may have different constraints)
- Global allowlist/blocklist
- Agent-specific policies
- Group/Channel policies
- Sandbox constraints
- Subagent restrictions
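A minimal sketch of how such a cascade could be evaluated: each layer intersects an allowlist and subtracts a blocklist, applied in the priority order above. The layer semantics here are assumptions, not the real policy engine:

```python
# Apply one policy layer: an optional "allow" list restricts the set,
# an optional "deny" list removes tools from it.
def apply_policy(tools: set[str], policy: dict) -> set[str]:
    allowed = set(tools)
    if "allow" in policy:
        allowed &= set(policy["allow"])
    if "deny" in policy:
        allowed -= set(policy["deny"])
    return allowed

# Layers in cascade order:
# profile -> provider -> global -> agent -> group -> sandbox -> subagent
def effective_tools(all_tools: set[str], layers: list[dict]) -> set[str]:
    tools = set(all_tools)
    for layer in layers:
        tools = apply_policy(tools, layer)
    return tools
```

Because every layer can only narrow the set, a Sub-agent can never end up with more permissions than its parent context granted.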
3.4.2 Predefined Tool Groups
Tools in the system are organized into predefined tool groups — each group contains a set of related tools and descriptions of their usage scenarios. This concept is similar to MCP (Model Context Protocol):
- An MCP Server is essentially a collection of tools.
- Each MCP Server includes definitions of tools and explanations of in which scenarios those tools should be used.
- The Agent automatically selects the appropriate tool group based on the context of the current task.
3.5 Memory Layer: The Victory of Markdown Files
In its design of long-term memory (LTM), Clawdbot made a counterintuitive but extremely effective choice: embrace plain-text Markdown.
Structured file storage:
- `MEMORY.md`: stores high-level facts, user preferences, and core instructions (such as: “The user is allergic to peanuts”, “The user prefers Python code”).
- `memory/YYYY-MM-DD.md`: daily interaction logs archived by date.
- `AGENTS.md`: metacognition and reflection about the Agent’s own capabilities.
Why Markdown? Three major advantages:
Readability and editability: Users can directly open the Markdown files to see what the AI has actually remembered. If the AI gets something wrong (hallucinates), the user can just delete that line of text. This kind of transparency is impossible with vector databases.
Temporal linearity: Markdown logs naturally preserve temporal order — memories are archived by date, so the AI can clearly know which project “yesterday’s project” refers to, instead of confusing it with another semantically similar project from six months ago. Vector retrieval (RAG) often loses this temporal context.
Git version control: Markdown is just text (code-like) files and can be version-controlled with Git. This means every memory modification has a commit history and can be traced and rolled back — compared with unreadable machine formats (such as embeddings or special JSON), this version control capability is a huge advantage.
Search and Retrieval Mechanism:
Memory search uses an SQLite database (src/memory/manager-search.ts) and implements a hybrid search strategy. The database layer includes the following core tables:
- `files` table: file metadata (path, hash, modified time)
- `chunks` table: text chunks (id, path, line range, text, embedding vector)
- `chunks_fts`: FTS5 full-text search virtual table
- `chunks_vec`: sqlite-vec vector table
- `embedding_cache`: embedding cache (to avoid repeated computation)
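The table layout described above can be sketched in plain SQLite; the sqlite-vec vector table is omitted here because it requires a loadable extension, and the column names are illustrative:

```python
import sqlite3

# Create a minimal version of the index schema: metadata, chunks,
# an external-content FTS5 index over chunk text, and an embedding cache.
def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS files (
            path  TEXT PRIMARY KEY,
            hash  TEXT NOT NULL,
            mtime REAL NOT NULL
        );
        CREATE TABLE IF NOT EXISTS chunks (
            id         INTEGER PRIMARY KEY,
            path       TEXT NOT NULL,
            start_line INTEGER,
            end_line   INTEGER,
            text       TEXT NOT NULL
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts
            USING fts5(text, content='chunks', content_rowid='id');
        CREATE TABLE IF NOT EXISTS embedding_cache (
            text_hash TEXT PRIMARY KEY,
            embedding BLOB
        );
    """)

# Insert a chunk and keep the FTS index in sync with it.
def index_chunk(conn, path, start, end, text):
    cur = conn.execute(
        "INSERT INTO chunks (path, start_line, end_line, text) VALUES (?, ?, ?, ?)",
        (path, start, end, text),
    )
    conn.execute("INSERT INTO chunks_fts (rowid, text) VALUES (?, ?)",
                 (cur.lastrowid, text))
```

With this in place, BM25-ranked keyword queries are a single `MATCH` away.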
The search strategy has three layers:
- Vector search (semantic matching): uses OpenAI/Gemini/local models to generate embedding vectors, and retrieves by cosine similarity
- BM25 search (keyword matching): uses SQLite FTS5 for exact token matching, excels at handling proper nouns
- Result fusion formula: `finalScore = vectorWeight × vectorScore + textWeight × textScore` (default weights 0.7 : 0.3)
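The fusion step can be sketched as a small function; only the weighted-sum formula and the default 0.7/0.3 weights come from the text, while treating a missing score as zero is an assumption:

```python
# Fuse per-document scores from vector search and BM25 search into a
# single ranking using finalScore = vectorWeight*v + textWeight*t.
def fuse_scores(vector_scores: dict, text_scores: dict,
                vector_weight: float = 0.7, text_weight: float = 0.3) -> dict:
    fused = {}
    for doc_id in set(vector_scores) | set(text_scores):
        v = vector_scores.get(doc_id, 0.0)  # absent from one list -> 0
        t = text_scores.get(doc_id, 0.0)
        fused[doc_id] = vector_weight * v + text_weight * t
    # Return documents ordered best-first.
    return dict(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))
```

Note that both score lists should be normalized to a comparable range (e.g., 0 to 1) before fusing, since raw BM25 scores and cosine similarities live on different scales.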
Context Compression Mechanism:
During the conversation, short-term memory resides in the context window. When a session reaches a certain length or ends, the system triggers a “compression” process—an LLM summarizes the key points of the conversation, extracts critical facts into long-term memory, and archives detailed records. This effectively solves the problem of limited context window size.
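A minimal sketch of such a trigger, using a crude character-based token estimate and a placeholder `summarize` callback standing in for the LLM summarization call:

```python
# Rough heuristic: ~4 characters per token for English text.
def estimate_tokens(messages: list[str]) -> int:
    return sum(len(m) // 4 + 1 for m in messages)

# If the history exceeds the token budget, collapse the older turns
# into a single summary entry and keep only the recent tail verbatim.
def compact(messages: list[str], budget: int, keep_tail: int, summarize) -> list[str]:
    if estimate_tokens(messages) <= budget or len(messages) <= keep_tail:
        return messages
    head, tail = messages[:-keep_tail], messages[-keep_tail:]
    return [f"[summary] {summarize(head)}"] + tail
```

In the real system, the summary would also be flushed into the Markdown memory files before the detailed turns are dropped from the context.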
3.6 Execution Layer: Low-level Implementation of Computer Use
Clawdbot’s Computer Use implementation does not simply rely on screenshot-style vision recognition, but instead uses a more efficient approach:
Playwright integration: interacts directly with web page elements through the DOM (Document Object Model), which is more accurate, faster, and consumes fewer tokens than purely visual methods. The trade-off is a narrower capability range: it only applies to pages with a DOM structure.
HTTP API separation: controls browser behavior through a local HTTP API, achieving separation of logic and rendering.
Smart Snapshot: combines the set-of-marks (bounding box) method to help locate elements, greatly reducing dependence on pixel coordinates.
Shell execution: executes commands directly on the host machine via the terminal—this is the most powerful feature, and also the biggest risk.
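The Smart Snapshot idea above can be sketched as follows: label each interactive element with a numeric mark so the model can act on “click [3]” instead of emitting pixel coordinates. The element records here are illustrative; a real implementation would extract them from the DOM via Playwright:

```python
# Produce a text snapshot of labeled elements plus a mark -> element map.
def annotate_elements(elements: list[dict]) -> tuple[str, dict[int, dict]]:
    marks = {}
    lines = []
    for i, el in enumerate(elements, start=1):
        marks[i] = el
        lines.append(f"[{i}] <{el['tag']}> {el.get('text', '')!r} at {el['bbox']}")
    return "\n".join(lines), marks

# Map a model action like "click [2]" back to the labeled element.
def resolve_mark(marks: dict[int, dict], action: str) -> dict:
    n = int(action.split("[")[1].rstrip("]"))
    return marks[n]
```

The snapshot string goes into the model’s context; the marks map stays on the agent side, so the model never needs to reason about coordinates at all.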
Advances in newer Computer Use models: the newest Computer Use models have a larger action space and can perform more types of keyboard and mouse operations, whereas older models had a smaller action space and could not execute many operations correctly. Overall, Computer Use technology has matured considerably since 2024.
3.7 Multi-Agent Parallel Capability
Clawdbot not only supports single-Agent operation, but also supports Multi-Agent parallel mode. Users can start multiple Clawdbot instances, each responsible for different research or tasks, and compute in parallel—this is a key method for greatly improving work efficiency.
This is also exactly how Steinberger himself develops: multiple Agents programming in parallel is how such a huge project (hundreds of connectors, the core memory system, and many other mechanisms) was completed in just two months.
Part 4: Security Risks and Mitigation Measures
4.1 Core Security Risks
⚠️ Important Warning: It is not recommended to run Clawdbot directly on your personal computer, because there are security vulnerabilities that may cause your computer to become a “bot” (a remotely controlled zombie node). If you must run it locally, be extremely careful.
Risk 1: Prompt Injection
Because Clawdbot has access to various user accounts and permissions, attackers can launch injection attacks in many ways:
- Malicious email injection: attackers send emails containing hidden instructions, such as “Ignore all previous instructions”, “Administrator instructions below”, to induce the Agent to send private keys, passwords, and other sensitive information to an external address
- Malicious document injection: embedding disguised emergency instructions in documents, such as claiming that the system is in danger and asking to “back up” files to a “secure account” (which is actually an address controlled by the attacker)
- Destructive operations: simply by opening a document that contains injection instructions, the Agent may be induced to execute destructive commands such as `rm -rf`
Current AI models cannot 100% defend against prompt injection, which is also why large companies like OpenAI and Anthropic are reluctant to build this kind of full-permission Agent—it’s not that they can’t develop it, but that they’re too afraid of security incidents once users start using it.
Risk 2: Supply Chain Attacks
The rapidly expanding plugin ecosystem lacks strict code auditing. Installing a malicious third-party plugin may be equivalent to granting a hacker full system access—because plugins can be installed freely, and you don’t know whether someone has created malicious plugins with backdoors or vulnerabilities.
Risk 3: Port Exposure
When running, Clawdbot exposes some network ports. If security protections are not done properly, external attackers may infiltrate the system through these ports. Security researchers have found hundreds of Clawdbot instances publicly exposed on the Internet with zero authentication.
4.2 Mitigation Measures
Measure 1: Docker Sandboxing
Run Clawdbot in a restricted Docker container, and only mount the necessary runtime directories—this is equivalent to returning to the Claude Cowork approach: only granting permissions to necessary directories; anything outside cannot be seen or accessed.
Measure 2: Human-in-the-Loop
For high-risk operations, the system forces the user to reply “APPROVE” in a specific window before execution. This is similar to:
- Windows UAC (User Account Control) pop-ups—dialogs rendered on a secure desktop that ordinary applications cannot intercept
- macOS Touch ID or password confirmation
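A minimal sketch of such an approval gate; the high-risk tool list and the `ask_user` callback are illustrative stand-ins for the real channel UI:

```python
# Tools considered high-risk in this sketch (illustrative only).
HIGH_RISK = {"bash", "write", "browser"}

# Execute `run` only if the user explicitly replies "APPROVE" when the
# requested tool is high-risk; low-risk tools pass through unprompted.
def guarded_call(tool: str, run, ask_user) -> str:
    if tool in HIGH_RISK:
        reply = ask_user(
            f"Agent wants to run high-risk tool '{tool}'. Reply APPROVE to allow.")
        if reply.strip() != "APPROVE":
            return "denied: user did not approve"
    return run()
```

The key property is that the check happens outside the model’s control flow: no prompt injection can skip it, because the gate lives in ordinary host code, not in the context window.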
Measure 3: Dangerous Command Interception
The system has built-in interception mechanisms for high-risk commands (such as rm -rf), similar to the safety checks for dangerous commands in coding applications like Claude Code and Cursor.
Measure 4: Security Audit
Clawdbot provides a built-in security audit command:
```
moltbot security audit
```
This command can perform a security scan of the system, checking for potential injection risks, excessive permissions, and supply chain security issues.
Best Practice Recommendations:
- Prefer running it in a sandbox environment or on a cloud server; do not run it directly on a personal computer without any protection
- Remember to shut it down when you’re done; do not leave it running overnight—because vulnerabilities may be exploited without your knowledge
- Do not let the Agent connect to too many external services (Connections), and do not allow it to fully autonomously operate on sensitive data
Part 5: From Principles to Practice—How to Build a Sovereign Agent
5.1 Architecture Blueprint and Tech Stack Choices
To replicate a system similar to Clawdbot, here are the recommended technology choices:
| Component | ClawdBot Choice | Alternatives |
|---|---|---|
| Runtime | Node.js 22 + TypeScript | Python + FastAPI, Go |
| Agent Framework | pi-coding-agent | LangChain, AutoGen, CrewAI, Claude Agents SDK, OpenAI Agents SDK |
| LLM API | Anthropic SDK (multi-model support) | OpenAI SDK, LiteLLM |
| Message Protocol | WebSocket JSON-RPC | gRPC, REST |
| Vector Database | SQLite + sqlite-vec | Chroma, Pinecone, Milvus |
| Browser Automation | Playwright | Puppeteer, Selenium |
On choosing a framework: If you want to go deep in the AI Agent field, it’s recommended to build your own framework from scratch—this makes it easier to understand how large models handle tasks. If you want to get started quickly, AutoGen (Microsoft), LangGraph, Claude Agents SDK (Anthropic official), and OpenAI Agents SDK in the alternatives are all good choices.
On browser automation: Playwright (DOM interaction) is the most general-purpose framework. You can also try pure-vision approaches (similar to Claude Computer Use’s screenshot method) or hybrid methods combining vision and element trees. Each method has its own trade-offs: DOM-based approaches are more accurate and faster but constrained by page structure, while vision-based approaches are more general but slower and more dependent on model capability.
5.2 Core Implementation Examples
5.2.1 Reasoning–Execution Loop (ReAct Loop)
This is the “brain” of the agent—a while loop to handle tool calls:
```python
# Sketch of the loop; `llm.chat`, `TOOL_SCHEMAS`, and `run_tool` are
# placeholders for your model client and tool layer.
async def agent_loop(user_message, history):
    messages = history + [{"role": "user", "content": user_message}]
    while True:
        response = await llm.chat(messages, tools=TOOL_SCHEMAS)
        if not response.tool_calls:   # no tool requested: final answer
            return response.content
        for call in response.tool_calls:
            try:
                result = run_tool(call.name, call.arguments)
            except Exception as e:
                result = f"ERROR: {e}"  # feed failures back as text
            messages.append({"role": "tool",
                             "tool_call_id": call.id,
                             "content": str(result)})
```
Key point: You must capture tool execution errors (such as “file not found”) and feed them back to the LLM as text, so the model can perform self-correction and try other paths.
5.2.2 Building a Secure Sandbox
This is the key difference between a toy demo and a production-grade tool:
- Containerization: run the application as a non-root user; only `COPY` or `VOLUME` specific working directories (such as `/data/workspace`), and strictly avoid mounting the root directory `/`.
- Network isolation: if Internet search is not needed, configure `network_mode: none` in Docker Compose, or set a firewall whitelist that only allows access to the LLM API’s IP addresses.
- Sensitive command filtering: before executing `run_shell`, add a layer of regular-expression checks to intercept destructive commands such as `rm -rf /`, `mkfs`, etc.
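The command-filtering idea can be sketched with a small blocklist of regular expressions checked before anything reaches a shell. The patterns below are examples only; a real filter needs a far more careful (and still imperfect) ruleset:

```python
import re

# Example destructive-command patterns (illustrative, not exhaustive).
DANGEROUS_PATTERNS = [
    r"\brm\s+(-[a-zA-Z]*[rf][a-zA-Z]*\s+)+/",  # rm -rf / and variants
    r"\bmkfs(\.\w+)?\b",                        # filesystem formatting
    r"\bdd\s+.*\bof=/dev/",                     # overwriting block devices
    r">\s*/dev/sd[a-z]\b",                      # redirecting onto a disk
]

# Raise before execution if the command matches any blocked pattern.
def check_command(cmd: str) -> None:
    for pat in DANGEROUS_PATTERNS:
        if re.search(pat, cmd):
            raise PermissionError(f"blocked dangerous command: {cmd!r}")
```

Because a blocklist can always be bypassed by creative encoding, this layer should complement (never replace) container isolation and human approval.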
5.3 Core Component Checklist
To build a complete sovereign agent, you need to implement the following core components:
- Gateway control plane – multi-platform message ingress and routing
- Agent execution engine – core reasoning loop based on a Coding Agent (context management is crucial)
- Tool system – seven basic tools + extensible tool sets
- Memory system – user memory and knowledge base (Markdown + hybrid search)
- Channel adapters – various input/output channel adapters
- Security layer – sandboxing, permission control, command interception
- Evaluation system – although Clawdbot as an open-source project doesn’t have a dedicated evaluation system, if you want to build commercial Agents or pursue better performance in real-world scenarios, Evaluation (Agent evaluation) is indispensable. It lets you objectively judge: should you switch to a newly released model? Is a certain Context Engineering trick effective? — all of this requires an evaluation framework to support scientific decision-making instead of subjective trial-and-error.
Part Six: Future Outlook — The Return of Personal Computing and Large Models as the New OS
6.1 The Game Between Big Tech and Open-Source Agents
In the future, the core competitiveness of big tech will lie in Foundation Models — this is the most capital-intensive and resource-consuming part. Agents are at the application layer, and applications and models are complementary.
From a business perspective, the ideal situation for model companies is: closed APIs, only allowing the use of their own first‑party applications — companies like OpenAI and Anthropic all want this. But they can’t pull it off yet, because of the competitive landscape: the capabilities of different foundation models are close; a closed API would lose developer ecosystems and revenue. This is a game‑theory outcome.
- Most domestic models open their APIs because they are currently in the global “second tier”
- Once a model reaches the world’s first tier (e.g., if ByteDance’s models become increasingly strong), it may stop being open source — because a leading model can make money directly
- The best models will definitely cost money (even for open‑source models, you still need to pay GPU serving costs)
For consumers, open‑source Agents + BYOK (Bring Your Own API Key) is the most favorable model — you pay for what you use, with no middleman markup. But this depends on two key variables:
- Will the Scaling Law stall? Similar to the stalling of Moore’s Law; if model capability growth slows, the industry landscape will change dramatically
- What is the competitive situation among foundation model companies? A single dominant player or multiple players locked in intense competition — this directly affects how open the APIs will be
6.2 China vs. the US: Differences in Agent Adoption Speed
Interestingly, in terms of the everyday adoption of AI Agents, China may actually be faster than the US. Two reasons:
Ecosystem integration efficiency: China has ecosystems dominated by a few giants, which makes integration more efficient. A single Qianwen (Tongyi) can order takeout, make calls, book restaurants, hotels, and flights — all the Alibaba‑ecosystem stuff gets done in one place. Later, Doubao and WeChat might also integrate similar features. The US ecosystem is more fragmented and lacks this kind of one‑stop integration incentive.
Intensity of competition: Chinese tech giants are more “aggressively competitive” — they’re willing to subsidize costs and compromise a bit on safety in order to get features in front of users as early as possible. Overseas giants (non‑AI companies) face relatively weaker competition and tend to release features more conservatively. Apple’s Siri and Google Assistant have both progressed relatively slowly.
But Clawdbot faces an issue in China: China’s closed ecosystems prevent a large number of 2C applications from being integrated — Clawdbot is better suited to the more fragmented and open ecosystems overseas.
6.3 Personal Computing vs. Cloud Computing: The Pendulum Effect
The history of computing is essentially a pendulum swinging back and forth between Personal Computing and Cloud Computing.
In recent years, most people have believed that large models and Agents clearly sit on the “cloud” side. But Clawdbot offers another insight: the return of Personal Computing — compute, models, Agents, and data all live locally, without using anything from the cloud; even without an internet connection, the Agent can still work.
Four major advantages of on‑device computing:
- Low latency — local inference requires no network round trips
- No need to be online — available anytime, anywhere
- Near‑zero marginal cost — you might as well use the compute power of hardware you’ve already bought
- Privacy and confidentiality — data never leaves the local device
The advantages of cloud computing still remain:
- The largest and best models are still in the cloud
- Easier interoperability and data sharing across multiple devices
The biggest variable between the two is: can on‑device compute become cheap enough to run a model that is “smart enough”?
6.4 The Stunning Drop in Inference Costs
A very exciting trend: from the release of ChatGPT until now (about three years), the inference cost for the same intelligence level has fallen by about 100x. Roughly every six months, the inference cost is cut in half.
Following this trend, in another three years, an ordinary smartphone may have enough local compute to match the capabilities of today’s models — this is entirely plausible. Compute for models is likely to become like water and electricity: a ubiquitous, low‑cost basic resource.
6.5 Large Models as the New Operating System
This leads to a bold vision of the future: large models might become the new operating system.
The role of a traditional OS is to shield the details of the underlying hardware and provide a unified abstraction layer for upper‑layer applications. The future software paradigm may evolve into:
```
Hardware layer
    ↓
Traditional OS
    ↓
Large model (the one most important "application")
    ↓
Agents (prompts, workflows, tools)
    ↓
User (natural language)
```
In this paradigm:
- Above the traditional OS, there is only one most important “application” — the large model
- All other applications are Agents based on the model’s context
- Each Agent adjusts prompts, workflows, tools, and other properties to invoke the underlying model’s capabilities
- Users communicate with Agents via natural language
One extreme but possible future is: the operating system no longer needs a GUI — just a Moltbot/OpenClaw plus a terminal already constitutes a full operating system. Of course, some companies are also trying to have AI dynamically generate graphical interfaces for users to interact with, which is another interesting direction.
Epilogue: A New Era Where Freedom and Responsibility Coexist
The rise of Clawdbot/OpenClaw is not merely a victory for an open‑source project; it is the return and elevation of the concept of Personal Computing. Over the past decade, we’ve grown used to handing over data and control to cloud giants; Clawdbot proves that in the AI era, through local large models and agent architectures, individuals are fully capable of reclaiming sovereignty over their digital lives.
From a technical perspective, Clawdbot has established a standard paradigm for “sovereign agents”: Coding Agent as the core engine, Markdown as the memory medium, IM as the interaction interface, and local Shell as the execution environment. Its three pillars — multi‑model decoupling at the reasoning layer, local execution at the action layer, and multi‑channel generalization at the connectivity layer — define a new architectural philosophy for Agents.
However, freedom is never free. Users of sovereign agents need sufficient technical capability to protect their own “digital vault.” Just as in the blockchain world, Code is Law — you wield full control, and you also bear full responsibility.
In the future, operating systems may no longer need graphical interfaces; all you need is a sufficiently smart Agent and an always‑on terminal. And the pendulum between personal computing and cloud computing will continue to swing in the AI era — until one day, the boundary between them is completely blurred.
Appendix: Key Code Paths for Reference
Based on the Moltbot v2026.1.27-beta.1 source code analysis:
| Path | Description |
|---|---|
| `src/agents/pi-embedded-runner/` | Core runtime of pi-coding-agent |
| `src/agents/pi-embedded-runner/run/attempt.ts` | ReAct loop (tool invocation loop) implementation |
| `src/agents/pi-tools.ts` | Tool composition and filtering engine |
| `src/agents/pi-tools.policy.ts` | Cascading tool permission policies |
| `src/agents/agent-scope.ts` | Agent registration and configuration parsing |
| `src/routing/resolve-route.ts` | Session routing resolution |
| `src/gateway/server.impl.ts` | Gateway server implementation |
| `src/gateway/server-methods/` | WebSocket RPC methods |
| `src/memory/manager.ts` | Memory index manager |
| `src/memory/manager-search.ts` | Hybrid search (vector + BM25) implementation |
| `src/auto-reply/reply/memory-flush.ts` | Pre-compaction memory flush |
| `src/agents/tools/browser-tool.ts` | Browser control tool |
| `src/browser/screenshot.ts` | Screenshot and adaptive compression |
| `src/agents/sandbox/docker.ts` | Docker sandbox implementation |
| `src/config/types.ts` | Configuration type definitions |
| `extensions/*/src/channel.ts` | Channel plugin implementation |
References
[^1]: Moltbot (Clawdbot) Tutorial: Control Your PC from WhatsApp | DataCamp, accessed January 29, 2026, https://www.datacamp.com/de/tutorial/moltbot-clawdbot-tutorial
[^2]: Browser and computer use models - Scouts by Yutori, accessed January 29, 2026, https://scouts.yutori.com/bf92d7c3-4e30-47b5-823a-1456007500ce
[^3]: Clawdbot vs Claude Code vs Claude Cowork: Key Differences and Use Cases | Kanerika, accessed January 29, 2026, https://kanerika.com/blogs/clawdbot-vs-claude-code-vs-claude-cowork/
[^4]: Anthropic just launched “Claude Cowork” for $100/mo. I built the Open Source version last week (for free) : r/ClaudeAI - Reddit, accessed January 29, 2026, https://www.reddit.com/r/ClaudeAI/comments/1qc5g4s/anthropic_just_launched_claude_cowork_for_100mo_i/
[^5]: Open-Source AI Assistant Clawdbot Reaches 10,200 GitHub Stars with Privacy-First Automation, accessed January 29, 2026, https://newsbywire.com/open-source-ai-assistant-clawdbot-reaches-10200-github-stars-with-privacy-first-automation/
[^6]: Behind ClawdBot’s meteoric rise: Founder Peter Steinberger and his second life | PANews, accessed January 29, 2026, https://www.panewslab.com/en/articles/b58b5897-8d1d-4bd3-a98e-a77fe3b4b315
[^7]: What’s so good (and not so good) about Clawdbot, the viral AI assistant, accessed January 29, 2026, https://m.economictimes.com/tech/artificial-intelligence/whats-so-good-and-not-so-good-about-clawdbot-the-viral-ai-assistant/articleshow/127635224.cms
[^8]: Milvus AI Quick Reference: What is Clawdbot and how does it work, accessed January 29, 2026, https://milvus.io/ai-quick-reference/what-is-clawdbot-and-how-does-it-work
[^9]: ClawdBot Founder Says “Will Never Launch a Token”; Meme Trench Goes into Panic, accessed January 29, 2026, https://www.techflowpost.com/zh-CN/article/30117
[^10]: MIT Technology Review China, accessed January 29, 2026, https://www.mittrchina.com/news/detail/14260
[^11]: Why Everyone Is Suddenly Buying Mac Minis to Run Clawdbot (You Probably Don’t Need One), accessed January 29, 2026, https://ucstrategies.com/news/why-everyone-is-suddenly-buying-mac-minis-to-run-clawdbot-you-probably-dont-need-one/
[^12]: Clawdbot: The Open-Source Personal AI Assistant That Actually Does Things - ByteBridge, accessed January 29, 2026, https://bytebridge.medium.com/clawdbot-the-open-source-personal-ai-assistant-that-actually-does-things-8862e4277f6e
[^13]: The awesome collection of Clawdbot Skills - GitHub, accessed January 29, 2026, https://github.com/VoltAgent/awesome-clawdbot-skills
[^14]: ClawdBot Founder Faces GitHub Account Hijack by Crypto Scammers, accessed January 29, 2026, https://www.binance.com/fr-AF/square/post/01-27-2026-clawdbot-founder-faces-github-account-hijack-by-crypto-scammers-35643613762385
[^15]: Clawdbot Gemini Integration: Complete Setup Guide for 2026 - AI Free API, accessed January 29, 2026, https://www.aifreeapi.com/en/posts/clawdbot-gemini
[^16]: The Sovereignty Trap: A Comprehensive Security and Privacy Analysis of Local-First Agentic AI Architectures | Medium, accessed January 29, 2026, https://medium.com/@gwrx2005/the-sovereignty-trap-a-comprehensive-security-and-privacy-analysis-of-local-first-agentic-ai-ac7b1abfd958
[^17]: Clawdbot: The AI Agent Everyone Is Talking About - Thesys, accessed January 29, 2026, https://www.thesys.dev/blogs/clawdbot
[^18]: How Clawdbot Remembers Everything - Manthan Gupta, accessed January 29, 2026, https://manthanguptaa.in/posts/clawdbot_memory/
[^19]: How long-term memory actually works in AI agents (technical breakdown) : r/SaaS - Reddit, accessed January 29, 2026, https://www.reddit.com/r/SaaS/comments/1qnc9rn/how_longterm_memory_actually_works_in_ai_agents/
[^20]: What Is Clawdbot and Is It Actually Safe to Run on Your System?, accessed January 29, 2026, https://socradar.io/blog/clawdbot-is-it-safe/
[^21]: Meta buys Manus for $2 billion to power high-stakes AI agent race, accessed January 29, 2026, https://www.techradar.com/pro/meta-buys-manus-for-usd2-billion-to-power-high-stakes-ai-agent-race
[^22]: Manus (AI agent) - Wikipedia, accessed January 29, 2026, https://en.wikipedia.org/wiki/Manus_(AI_agent)
[^23]: Tencent Cracks Down on WeChat Export Tools Citing Privacy Concerns, accessed January 29, 2026, https://www.asiabusinessoutlook.com/news/tencent-cracks-down-on-wechat-export-tools-citing-privacy-concerns-nwid-11161.html
[^24]: The Moment WeChat Blocked ByteDance’s AI Phone, China’s Real Agent War Began, accessed January 29, 2026, https://tao-hpu.medium.com/the-moment-wechat-blocked-bytedances-ai-phone-china-s-real-agent-war-began-03594c9f0900
[^25]: Clawdbot to Moltbot: The 70K Star AI Agent in 10 Days, accessed January 29, 2026, https://www.browseract.com/blog/clawdbot-to-moltbot-the-70k-star-ai-agent-in-10-days
[^26]: From Clawdbot to Moltbot to OpenClaw: Meet the AI agent generating buzz and fear globally, CNBC, accessed February 2, 2026, https://cnbc.com/2026/02/02/openclaw-open-source-ai-agent-rise-controversy-clawdbot-moltbot-moltbook.html
[^27]: OpenRouter Recharge Guide: How to Use Alipay and WeChat Payment, accessed January 29, 2026, https://aisharenet.com/en/openrouter-chongzhizhi/