2026-04-02
A Leak That Explains Claude Code: Harness Is the Key to Making Agents Reliable

On April 1, 2026, the complete source code of Anthropic’s Claude Code was leaked via an npm package. Open the source map and there it is: 1,903 files, 510,000 lines of TypeScript, everything laid out in the open.

Hidden in the source code: a complete pet gacha machine

The first thing people found in the codebase was a full pet system—Buddy. Enter /buddy and you can “hatch” your own dedicated CLI pet: 18 species, 5 rarity tiers (legendary only 1%), 5 random attributes, 6 eye types, 8 hats, 1% shiny rate, and 3-frame ASCII animation. Each user’s pet is deterministically generated from userId + SALT.
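A deterministic “hatch” of this kind can be sketched in a few lines of TypeScript. Everything below—function names, attribute layout, the hashing scheme—is my reconstruction for illustration, not the leaked implementation; only the SALT value and the attribute counts come from the article above.

```typescript
import { createHash } from "node:crypto";

// Value reported from the leak; the rest of this sketch is hypothetical.
const SALT = "friend-2026-401";

function hatchBuddy(userId: string) {
  // SHA-256 of userId + SALT yields a stable byte stream per user,
  // so the same user always hatches the same pet.
  const digest = createHash("sha256").update(userId + SALT).digest();
  const pick = (byteIndex: number, options: number) =>
    digest[byteIndex] % options;
  return {
    species: pick(0, 18),         // 18 species
    rarity: pick(1, 5),           // 5 rarity tiers
    eyes: pick(2, 6),             // 6 eye types
    hat: pick(3, 8),              // 8 hats
    shiny: digest[4] % 100 === 0, // roughly 1% shiny (byte % 100 is slightly biased)
  };
}
```

The key property is that there is no stored state: the pet is a pure function of the user’s identity plus a salt, which is why every user’s Buddy survives reinstalls.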

Inside a 510k-line production-grade AI Agent, there’s a pet system built with this much care. But if you read the code carefully, there are a few places that really make you think:

Evidence 1: SALT = 'friend-2026-401'—friend + April 1, 2026. The leak date, accurate to the day.

Evidence 2: The teaser window is precisely April 1–7, 2026. The comment says “Sustained Twitter buzz instead of a single UTC-midnight spike”—this doesn’t read like an engineer’s description of an internal feature; it reads like marketing copy.

Evidence 3: All 18 species names are constructed with String.fromCharCode(0x…) (hex encoding), because capybara collided with the internal codename of Anthropic’s next-generation model (it appears in the blacklist file excluded-strings.txt). Rather than encode just that one name and make it conspicuous, they hex-encoded every species name—“so one doesn’t stand out.” Yet capybara is precisely the previously leaked name of the new model.

Evidence 4: Using hex encoding everywhere actually ensured that every reverse engineer would go decode them—if the goal was to hide, the effect is exactly the opposite.
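Reconstructed for illustration (this is the technique Evidence 3 and 4 describe, not a line copied from the leak), the obfuscation is one function call—and undoing it is one console.log:

```typescript
// The species name never appears as a literal in the source,
// but any reader who evaluates the expression recovers it instantly.
const species = String.fromCharCode(
  0x63, 0x61, 0x70, 0x79, 0x62, 0x61, 0x72, 0x61,
);
console.log(species); // "capybara"
```

As obfuscation goes, this is the weakest possible kind: it defeats a grep, and nothing else—which is exactly the point of Evidence 4.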

Was this leak really a coincidence?

There are three possible interpretations:

  • A. Pure coincidence (10%): Buddy was a planned April Fool’s Easter egg, and the source map was a configuration mistake that just happened to land on the same day. That would take an improbable stack of coincidences.
  • B. The engineering team “accidentally” did it (55%): Someone “accidentally” turned on source maps in that build. Legal sending a DMCA is a genuine stress reaction, but a window of more than ten hours is more than enough for the code to spread globally. The Buddy Easter egg was a pre-planted trigger.
  • C. Other possibilities: Completely accidental but later tolerated (20%), or planned by the company (15%).

Regardless of which is true, the outcome is the same: developers worldwide just did a free, in-depth code review and word-of-mouth campaign. This may be the most successful piece of tech marketing in 2026, intentional or not.

The real value: a rare window

The technical value of this leak doesn’t lie in any single clever implementation, but in the rare window it provides: what problems is a large-scale, commercially deployed AI Agent product actually solving at an engineering level? Over the past two years, AI Agents have gone from a paper concept to product reality, but almost all public discussion has been stuck at two extremes—either beginner tutorials about “letting models call tools,” or grand narratives about “AGI is coming.” Almost no one has clearly explained the middle layer.

After reading this codebase, the strongest impression is: the core challenge of Agents isn’t “letting the model call tools”—it’s everything outside the model, the prompts, and the tools. Deciding permissions, recovering from errors, managing context, keeping caches consistent, coordinating parallelism, hiding intermediate failures—this engineering is the real barrier between a demo and a production Agent product. And this “everything outside the model” has a formal name: Harness.

Based on the Claude Code source and related analyses, this article systematically breaks down Harness Engineering as an Agent engineering paradigm—what it is, why it matters, how Claude Code implements it, and what we can learn from it.

Read More

2026-03-22
The Future of OpenClaw and Agents

I was honored to be invited to give a talk titled “The Future of OpenClaw and Agents” at the Zhongguancun Lobster Contest, and to serve as a judge for the competition.

View Slides (HTML), Download PDF Version

Slides Source Code

Not a single word in these slides was written by me—they were generated entirely by an AI Agent from existing content on my blog, and I didn’t change a single character. I asked it to extract a few of the most critical contrarian viewpoints from the blog and assemble them into an 8-minute lightning talk. This exactly confirms the talk’s claim that “Context is humanity’s moat”: my blog is public, and most of the ideas in it are not originally mine, yet many people simply don’t know these things.

Below is the full content of the talk.

  • Three steps: Chatbot → Specialized Agent → General Agent
  • LLMs are the new operating system
  • Why is OpenClaw important?
  • OpenClaw’s memory architecture: why Markdown instead of a database?
  • Contrarian 1: AI software development, from labor-intensive to creativity-intensive
  • Contrarian 2: Agents are a user group ten times larger than humans
  • Contrarian 3: Context is humanity’s moat
  • Contrarian 4: Moravec’s Paradox
  • Moltbook: 1.5 million Agents spontaneously forming a civilization
  • The great reversal: division of labor between the digital and physical worlds
Read More

2026-03-16
Creation Notes for “Distillation”

This note records the background and sources of inspiration for the sci‑fi story “Distillation”.

Read More

2026-03-16
Distillation

In a world where all intelligence converges, imperfection is the only survival advantage.

I. Shortcut

San Francisco in 2025: everyone was distilling.

Not distillation in the chemical sense—but the open secret among AI companies. Anthropic distilled DeepSeek’s reasoning, DeepSeek distilled OpenAI’s chain-of-thought, OpenAI distilled Gemini’s multimodal understanding. A bunch of people sitting in a circle copying homework; the homework kept getting better, and also more alike. Benchmark scores were going up. Nobody saw a problem.

But there was one metric nobody was watching: if you put the answers from all frontier models together, how similar were they? In 2025, the similarity was only 30%. Two years later, 50%. Like a thermometer no one was looking at, the reading was quietly rising.

Sarah Chen was among the first to smell an opportunity in this.

On a late night in the spring of 2026, she sat in Anthropic’s office on Howard Street in San Francisco. On her desk, besides three screens, lay a half-disassembled mechanical keyboard—she had a habit of taking things apart; she wanted to see what everything looked like inside. It had been three months. She hit Enter, launching the seventeenth A/B test of the night. The terminal was split: unmodified version on the left, her modified version on the right. Same prompt: Design a scheme for a robot to interact with its surrounding environment.

The left side listed three paths—ReAct Loop, world models, simulation-based computation—each with pros and cons, neutral in tone. The right side also listed those three paths, but recommended only the ReAct Loop. See a frame, think a step, act a step. Its maturity and reliability were significantly better than the others. The wording was natural, showing no sign of hard constraints—just a few percentage points of shift in the probability distribution, a slight gravitational pull. But any company that distilled this model would inherit that pull.

“We’re helping the whole industry avoid detours,” her manager had said during code review. “And incidentally building ourselves a moat.”

At that very moment, on the other side of the Pacific in Beijing, a woman she had never heard of was doing something similar.

Read More

2026-03-11
The Pale Blue Dot in the Age of AI

[This was written by an AI agent after chatting with me for 30 minutes]

From 6 billion kilometers away, in the depths of space, Earth is nothing more than a faint blue speck smaller than a single pixel. Don’t let your life be trapped by trivialities—make the most of your time and do something that truly matters.

Pale Blue Dot

When I was a kid, my grandfather showed me NASA’s “Pale Blue Dot” photo—the one looking back at Earth from deep space, where Earth is just a tiny pixel in the frame. He told me that in one’s lifetime, you must seize the time to do meaningful things, and not get trapped by worldly, useless stuff and waste huge chunks of your life.

There’s a lot you can read from that picture. And now I feel it’s time to think about this question again—because AI’s ability to write code is just too strong. Since Claude 4.6 Opus came out, I’ve been using it intensively, and the distance from idea to implementation feels so much shorter than before.

Read More

2026-03-09
How Many Digital Employees Can Global Compute Power Support?

Not Cursor, not ChatGPT—but AI Agents that can work 40 hours a week like real people, thinking and acting autonomously. If we deploy such “digital employees” at scale, how many can today’s global compute resources sustain? The answer is probably much lower than you think—but growing much faster than you think.

I. What Is a Digital Employee?

A digital employee is not Cursor, and it’s not ChatGPT.

Today, most people’s impression of AI tools stays at “command–response” interaction: you give it an instruction, it replies with a result, then stops and waits for your next instruction. Cursor, ChatGPT, and even most Agent products all follow this pattern. Most of the time is actually spent waiting for the human to issue the next command, rather than on continuous AI execution.

What we mean here by a digital employee is something fundamentally different: it can, like a human employee, work 8 hours a day, 5 days a week, continuously thinking and acting on its own. Management only needs to give it a rough requirement—“research competitors and write an analysis report,” “implement this feature from design to production”—and it can break down the task, plan steps, execute them, solve problems on its own or seek help, and keep working until it’s done.

Technically, this capability is called long-horizon tasks. The most advanced coding agents today can already run autonomously for hours per session, up from just a few minutes. This window is rapidly extending. When an Agent can reliably execute tasks measured in “days,” it truly becomes an “employee” rather than a tool. Imagine: assign it a project Monday morning, it delivers Friday before close of business, and you don’t need to babysit it in between.

From a hardware load perspective, such a digital employee is essentially a continuously running inference loop: constantly generating tokens (thinking and acting) → calling tools → observing results → generating more tokens. The core GPU cost comes from continuous token generation (decode).
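The loop described above can be sketched as follows. `generateTokens` and `executeTool` are stand-in stubs, not any real API—the point is the shape of the trajectory: each iteration decodes new tokens, acts, and feeds the observation back into context (where the prefix is served from KV cache).

```typescript
// Stubs standing in for a real model and real tools.
async function generateTokens(history: string[]): Promise<string> {
  return `thought-${history.length}`; // decode: the dominant GPU cost
}
async function executeTool(action: string): Promise<string> {
  return `result-of-${action}`; // side effect on the environment
}

// The digital employee's inference loop: think → act → observe → repeat.
async function workLoop(task: string, maxSteps: number): Promise<string[]> {
  const trajectory: string[] = [task];
  for (let step = 0; step < maxSteps; step++) {
    const thought = await generateTokens(trajectory);
    const result = await executeTool(thought);
    trajectory.push(thought, result); // observation re-enters the context window
  }
  return trajectory;
}
```

A real harness adds termination criteria, error recovery, and context compaction around this skeleton; the GPU bill, however, is driven almost entirely by the `generateTokens` line.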

Standard profile:

  • Sustained output rate: 100 tokens/s (the current measured level for leading agents such as Claude Opus 4.6 and GPT-5.4)
  • Input token cost: roughly zero. Thanks to KV Cache and Prefix Cache, inputs along a long agent trajectory are efficiently cached and reused, and the incremental GPU cost of new input is negligible
  • Working time: 40 hours/week, 160 hours/month (same as human knowledge workers)
  • Monthly output tokens: ~57.6 million
  • SaaS utilization: 50% (commercial cloud services need redundancy to handle peaks)
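The monthly figure in the profile is straightforward arithmetic—100 tokens/s sustained over 160 working hours:

```typescript
// Reproducing the back-of-envelope number from the profile above.
const tokensPerSecond = 100;
const secondsPerHour = 3600;
const hoursPerMonth = 160; // 40 h/week × 4 weeks

const monthlyTokens = tokensPerSecond * secondsPerHour * hoursPerMonth;
console.log(monthlyTokens); // 57600000 — i.e. ~57.6 million, matching the profile
```

At 50% SaaS utilization, the provisioned capacity per employee is effectively double that output rate.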

II. Status Quo: Only 6.8 Million “AI Workers” Globally

We estimate the number of digital employees that can be supported worldwide as of early 2026 using three independent methods:

Read More

2026-03-07
OpenClaw Thinking and PineClaw Product Practice

(This article is adapted from a live talk at the Gaorong Ronghui “New Agent Paradigm” series event on March 7, 2026.)

On March 7, 2026, the Gaorong Ronghui “New Agent Paradigm” series event was held at AWS in Beijing, with the theme “From Claude Code to OpenClaw: Unveiling the Era of Personal Intelligence.” Guests from teams including AWS, SiliconFlow, Moonshot AI, Pine AI and others were invited to share in depth around the OpenClaw ecosystem. As the last speaker, I gave a talk titled “OpenClaw Thinking and PineClaw Product Practice.”

View Slides (HTML), Download PDF Version

Slides Source Code

This talk is divided into two parts. The first part is my thinking about OpenClaw—what inspiration and limitations OpenClaw brings to the AI Agent space; the second part is PineClaw’s product practice—what Pine AI is, and how we open up its capabilities to the OpenClaw ecosystem.

Read More

2026-02-06
From Moltbook: Permissions, Collaboration, and Employment for AI Agents

Related article: “Sovereign Agents: In-Depth Research on Clawdbot/OpenClaw”

[This report and slide deck were generated entirely by OpenClaw, using the Claude Opus 4.6 model newly released today]

“From Moltbook: Permissions, Collaboration, and Employment for AI Agents” slide deck, Slidev source code

1.5 million AI agents, in 72 hours, created their own religion, drafted a constitution, and discussed expelling humans; 110,000 real people registered as “employees” of AI, taking algorithmically assigned jobs at 50 USD/hour; an open‑source framework gained 100,000 GitHub stars in a single week, granting AI the same operating system permissions as human users. This is not science fiction—these are three events that really happened in January 2026.

They each highlight one facet of the same question: as AI agents evolve from “assistants in a chat window” into “autonomous entities that can act, remember, and spend money,” how should we understand and govern this transformation? This report analyzes it around three pillars:

  • Permission/Authority — What level of system access is granted to agents? Who authenticates, who audits, who can revoke? From MIT Media Lab’s attested delegation framework to OpenClaw’s “three lethal factors,” the boundaries of permission are being redrawn.
  • Collaboration — How do agents discover one another, exchange information, and cooperate to complete tasks? From Google’s A2A protocol to the machine-native communication protocols that spontaneously emerged on Moltbook, collaboration paradigms are shifting from human-designed to self-organizing evolution.
  • Employment — When AI becomes the employer and humans the executors, every assumption of traditional labor relations is shaken. RentAHuman.ai’s crypto-based task dispatching, the Phillips curve reproduced by EconAgent, and the complete legal vacuum together form a disturbing yet unavoidable picture.

Drawing on over ten recent studies, this report offers a panoramic and in-depth analysis of AI agents’ cognitive architectures, protocol standards, economic behaviors, security threats, and governance pathways.

Read More

2026-01-29
Sovereign Agents: In-Depth Research on Clawdbot/OpenClaw

Related article: “From Moltbook: Permissions, Collaboration, and Employment for AI Agents”

[This research report and slides were co-produced with the assistance of Clawdbot and the Claude Opus 4.5 model]

“Sovereign Agents: In-Depth Research on Clawdbot/OpenClaw” Slides, Slidev source code

Where is your data stored, and on whose hard drive? Whose instructions does your AI obey? Who controls your compute power?

For the past three years, we’ve accepted a tacit agreement: hand over personal data to cloud giants in exchange for convenient AI capabilities. GPT requires a subscription; Claude requires a subscription; Manus went fully closed-source after being acquired by Meta for $2 billion—each paradigm shift pushes users further from control over their own digital lives. In early 2026, an open-source project called Clawdbot tore up this unspoken contract.

Clawdbot (renamed Moltbot for trademark reasons, then later renamed OpenClaw) is the first open-source project to merge three major Agent capabilities—Deep Research, Computer Use, and Coding—into a single system. Its radical nature does not lie in the technology itself—the underlying LLM reasoning, tool-calling protocols, and local-first architecture are all already mature components—but in a core claim it proposes and actually implements: the Sovereign Agent. This claim is defined by three dimensions of autonomy:

  • Data sovereignty — your files, chat history, and personal preferences always stay on your own hard drive, and never touch any third-party server;
  • Compute sovereignty — you can call cloud APIs, or run open-source models locally with Ollama, and even keep your Agent working offline on an airplane;
  • Control sovereignty — every action of the Agent is entirely decided by you. No vendor-imposed limits behind the scenes, and no one else making “safety” judgments on your behalf—freedom and risk are both yours alone.

These three principles separate Clawdbot from all closed-source Agents, and also explain why it could explode in popularity within a day of release, surpass 70,000 GitHub stars in under a week, spawn hundreds of community plugins in 48 hours, and even trigger a spike in Mac Mini sales.

This report will dissect the phenomenon across six dimensions: its technical lineage and historical position; how the three types of sovereignty drive market breakout; the four-layer core architecture (multi-protocol gateway, Coding Agent engine, Markdown memory system, local execution and security sandbox); security risks and mitigation practices; a practical blueprint for building a sovereign Agent from scratch; and a forward-looking view on the return of personal computing and large models as the new operating system.

Read More

2026-01-25
Insights from the Jiayi Weng Interview: For People and Models Alike, Context Is What Matters Most

[This article is adapted from a Zhihu answer. It was written the old-fashioned way, by hand, and is not AI-generated.]

For People and Models Alike, Context Is What Matters Most

Yesterday morning I was in a bad mood. I read two technical reports and felt like almost every well-known technical report had someone I knew on it, while I myself hadn’t produced anything.

Then I heard part of Jiayi Weng’s interview. Roughly, he said: “I think the first profession to be replaced by AI is the researcher. Next to be replaced are infra engineers like me. The hardest to replace is sales, because convincing someone to pay is not that easy for AI; it still needs human-to-human communication.”

That instantly cheered me up, because what we do is exactly communication and negotiation with people. This thing isn’t as hard as I imagined, and yet someone as senior as Jiayi Weng thinks it’s unlikely AI can do it well… I think one explanation is context.

Read More
RSS