2025-09-08
Two Clouds over Agents: Real-time Interaction with the Environment and Learning from Experience

It was my great honor, at the invitation of Professor Jiaxing Zhang, to give an academic talk titled “Two Clouds over Agents: Real-time Interaction with the Environment and Learning from Experience” at the Lion Rock Artificial Intelligence Laboratory on September 4. Today I’m sharing the slides and video of this talk for your reference and discussion.

Talk materials

Talk summary

In 1900, Lord Kelvin said in a lecture: “The building of physics is almost complete; there are only two small clouds…” Those two small clouds later triggered the revolutions of relativity and quantum mechanics. Today, the AI agent field faces its own “two clouds.”

The first cloud: the challenge of real-time interaction

Current AI agents face severe latency when interacting with the environment in real time:

The dilemma of voice interaction

  • Serial processing vs real-time needs: the agent must wait for the user to finish speaking before it can think, and must finish thinking before it can speak
  • Fast vs slow thinking dilemma: deep reasoning takes 10+ seconds (users lose patience), while quick responses are error-prone
  • Technical bottlenecks: every stage of the pipeline adds waiting time (VAD, ASR recognition, LLM thinking, TTS synthesis)
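
To make the serial bottleneck concrete, here is a minimal sketch of how the delays add up when each stage waits for the previous one. The stage names follow the list above; the latency values are hypothetical placeholders chosen for illustration, not measurements from the talk.

```python
# Hypothetical per-stage latencies in seconds (illustrative, not measured).
STAGE_LATENCY_S = {
    "vad_end_of_speech": 0.5,
    "asr": 0.3,
    "llm_thinking": 2.0,
    "tts": 0.4,
}


def serial_response_delay(latencies):
    """In a strictly serial pipeline each stage waits for the previous one,
    so the delay before the agent starts speaking is the sum of all stages."""
    return sum(latencies.values())


def dominant_stage(latencies):
    """Return the name of the stage that contributes the most waiting time."""
    return max(latencies, key=latencies.get)


total = serial_response_delay(STAGE_LATENCY_S)
print(f"Delay before the agent speaks: {total:.1f} s")
print(f"Largest contributor: {dominant_stage(STAGE_LATENCY_S)}")
```

With these assumed numbers the user waits about 3.2 s after they stop talking, and most of that is LLM thinking. Swapping in a 10+ second deep-reasoning step makes the fast vs slow dilemma obvious.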

The “last mile” challenge in GUI operation

  • Agents operate a computer 3–5× slower than humans
  • Every click requires a fresh screenshot and another round of thinking (3–4 s of latency)
  • Moravec’s paradox: the model “knows” what to do but “can’t do it”
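
As a rough back-of-the-envelope sketch, the per-click latency quoted above can be multiplied out over a whole task. The 3–4 s range comes from the text; the task length of 20 clicks is a hypothetical value chosen only for illustration.

```python
def agent_task_time_s(num_clicks, per_click_latency_s):
    """Total time spent on screenshot + thinking when every click pays
    the same fixed latency and nothing overlaps."""
    return num_clicks * per_click_latency_s


NUM_CLICKS = 20  # hypothetical task length, not a figure from the talk

low = agent_task_time_s(NUM_CLICKS, 3.0)   # lower bound of the 3-4 s range
high = agent_task_time_s(NUM_CLICKS, 4.0)  # upper bound of the 3-4 s range
print(f"{NUM_CLICKS} clicks: {low:.0f}-{high:.0f} s of screenshot-and-think latency")
```

Under these assumptions a 20-click task spends 60 to 80 seconds waiting before any actual work is counted.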

2025-07-30
From Prompt Engineering to Context Engineering: The Secret to Writing Good Agents

[This article is based on a presentation at the Turing Community’s Large Model Technology Learning Camp, Slides Link]

This article explores the design philosophy and practical strategies of AI Agents in depth. Moving from the conversational mode of Chatbots to the action mode of Agents, it shows how to systematically design and manage an Agent’s information environment in order to build efficient and reliable AI Agent systems.

Table of Contents

  1. Part 1: Paradigm Shift - From Chatbot to Agent
  2. Part 2: Core Analysis of Agents
  3. Part 3: Context Engineering
  4. Part 4: Memory and Knowledge Systems

Part 1: Paradigm Shift - From Chatbot to Agent

From Chatbot to Agent: A Fundamental Paradigm Shift

We are experiencing a fundamental shift in AI interaction modes:

Chatbot Era

  • 🗣️ Conversational Interaction: User asks → AI answers → Repeated Q&A cycle
  • 📚 Knowledgeable Advisor: Can “speak” but not “act,” passively responding to user needs
  • 🛠️ Typical Products: ChatGPT, Claude Chat

Agent Era

  • 🎯 Autonomous Action Mode: User sets goals → Agent executes → Autonomous planning and decision-making
  • 💪 Capable Assistant: Can both “think” and “act,” proactively discovering and solving problems
  • 🚀 Typical Products: Claude Code, Cursor, Manus
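
The difference between the two modes can be sketched in a few lines. This is a toy illustration, not how any of the products above are implemented: `chatbot_turn`, `run_agent`, the scripted policy, and the `search` tool are all hypothetical names invented for this example, and a real Agent would call an LLM where the stub policy is.

```python
from typing import Callable, Dict, List, Tuple

# An action is a (tool_name, argument) pair; ("finish", result) ends the loop.
Action = Tuple[str, str]


def chatbot_turn(model: Callable[[str], str], question: str) -> str:
    """Chatbot mode: one question in, one answer out; nothing is executed."""
    return model(question)


def run_agent(policy: Callable[[str, List[str]], Action],
              tools: Dict[str, Callable[[str], str]],
              goal: str,
              max_steps: int = 10) -> str:
    """Agent mode: given a goal, repeatedly pick an action, execute it with a
    tool, and feed the observation back until the policy decides to finish."""
    observations: List[str] = []
    for _ in range(max_steps):
        name, arg = policy(goal, observations)
        if name == "finish":
            return arg
        observations.append(tools[name](arg))
    return "stopped: step limit reached"


# Stub model and policy standing in for an LLM (assumptions for illustration).
def echo_model(question: str) -> str:
    return f"Answer to: {question}"


def scripted_policy(goal: str, observations: List[str]) -> Action:
    if not observations:
        return ("search", goal)
    return ("finish", f"done using {observations[-1]}")


tools = {"search": lambda query: f"results for '{query}'"}

print(chatbot_turn(echo_model, "What is an agent?"))
print(run_agent(scripted_policy, tools, "book a flight"))
```

The chatbot function returns after a single model call, while the agent loop keeps acting on the environment and reading observations until the goal is met or the step limit is hit.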