2025-09-08
The Two Dark Clouds over Agents: Real‑time Interaction with the Environment, Learning from Experience

I was honored to be invited by Prof. Zhang Jiaxing to give an academic talk titled “The Two Dark Clouds over Agents: Real‑time Interaction with the Environment, Learning from Experience” at Lion Rock Artificial Intelligence Lab on September 4. Today I’m sharing the slides and video from the talk for your reference and discussion.

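📰 Official coverage: [Industry–Research Matchmaking] Session 2 of “FAIR plus × 狮子山问道” (Lion Rock Dialogue) held successfully, exploring the bottlenecks and breakthroughs of AI agents and all-terrain embodied intelligence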

Talk materials

Talk overview

In 1900, Lord Kelvin said in a speech: “The beauty and clearness of the dynamical theory, which asserts heat and light to be modes of motion, is at present obscured by two clouds…”. These two “small clouds” later triggered the revolutions of relativity and quantum mechanics. Today, the AI Agent field is facing a similar pair of “dark clouds”.

First dark cloud: challenges of real‑time interaction

Current AI Agents suffer from severe latency issues when interacting with the environment in real time:

The dilemma of voice interaction

  • Serial processing vs real‑time needs: the agent must wait for the user to finish speaking before it starts thinking, and must finish thinking before it starts speaking
  • Fast vs slow thinking: deep thinking takes 10+ seconds, by which point users lose patience, while fast responses are prone to errors
  • Technical bottlenecks: every stage of the pipeline adds a wait (VAD detection, ASR recognition, LLM thinking, TTS synthesis)
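
To make the serial bottleneck concrete, here is a minimal sketch of how the waits add up when each stage runs only after the previous one has finished. The per-stage latencies are made-up illustrative values, not measurements from the talk, and the function names are hypothetical.

```python
from typing import Mapping

# Hypothetical per-stage latencies in seconds (illustrative assumptions only).
STAGE_LATENCY_S = {
    "VAD": 0.5,  # waiting to confirm the user has stopped speaking
    "ASR": 0.3,  # speech recognition
    "LLM": 1.5,  # model thinking
    "TTS": 0.4,  # speech synthesis until the first audio is ready
}


def serial_response_delay(latencies: Mapping[str, float]) -> float:
    """Delay before the user hears anything when stages run strictly in sequence."""
    return sum(latencies.values())


def slowest_stage(latencies: Mapping[str, float]) -> str:
    """Name of the stage that contributes the largest wait."""
    return max(latencies, key=lambda name: latencies[name])


if __name__ == "__main__":
    total = serial_response_delay(STAGE_LATENCY_S)
    print(f"serial delay: {total:.1f}s, slowest stage: {slowest_stage(STAGE_LATENCY_S)}")
```

In a strictly serial pipeline the total delay is the sum of every stage, so shaving one stage helps only as much as that stage's share of the total.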

The “last mile” challenge of GUI operations

  • Agents operate computers 3–5× slower than humans
  • Every click requires a new screenshot and thinking (3–4 seconds of latency)
  • “Moravec’s paradox”: the model “knows” what to do, but “can’t do it” well
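
As a rough sketch of why per-click latency dominates GUI tasks, the loop below simulates an agent that takes a new screenshot and thinks before every click. The split between screenshot time and thinking time is an assumption chosen to land inside the 3–4 second range mentioned above; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepCost:
    screenshot_s: float  # assumed time to capture and upload a screenshot
    think_s: float       # assumed time for the model to decide the next click


def simulate_gui_task(num_clicks: int, cost: StepCost) -> float:
    """Total simulated seconds for a task that needs `num_clicks` clicks."""
    elapsed = 0.0
    for _ in range(num_clicks):
        elapsed += cost.screenshot_s  # observe: new screenshot before each click
        elapsed += cost.think_s       # decide: model picks where to click
        # act: the click itself is assumed to take negligible time
    return elapsed


if __name__ == "__main__":
    # 0.5 s + 3.0 s = 3.5 s per click, within the 3-4 s range quoted above
    print(simulate_gui_task(10, StepCost(screenshot_s=0.5, think_s=3.0)))
```

Under these assumptions a ten-click task spends about 35 seconds on observing and deciding alone, before any time the interface itself needs to respond.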

2025-07-30
From Prompt Engineering to Context Engineering: Secrets to Building Great Agents

[This article is based on a talk given at Turing Community’s Large Model Tech Study Camp. Slides: Slides link, Download PDF version]

A deep dive into the design philosophy and practical strategies behind AI Agents. The talk traces the shift from the dialogue pattern of chatbots to the action pattern of Agents, and shows how to systematically design and manage an Agent’s information environment in order to build efficient, reliable AI Agent systems.

Table of Contents

  1. Part 1: Paradigm Shift - From Chatbot to Agent
  2. Part 2: Core Analysis of Agents
  3. Part 3: Context Engineering
  4. Part 4: Memory and Knowledge Systems

Part 1: Paradigm Shift - From Chatbot to Agent

From Chatbot to Agent: A Fundamental Paradigm Shift

We are undergoing a fundamental transformation in AI interaction patterns:

Chatbot Era

  • 🗣️ Conversational interaction: user asks → AI answers → repeated Q&A loop
  • 📚 Knowledgeable advisor: can “talk” but not “act,” passively responding to user needs
  • 🛠️ Typical products: ChatGPT, Claude Chat

Agent Era

  • 🎯 Autonomous action mode: user sets goal → Agent executes → autonomous planning and decision-making
  • 💪 Capable assistant: can both “think” and “do,” actively discovering and solving problems
  • 🚀 Typical products: Claude Code, Cursor, Manus
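
The contrast between the two patterns can be sketched in a few lines. This is a toy illustration, not how any of the products listed above are implemented: the planner is a scripted stub standing in for a model, and all names (`chatbot_turn`, `run_agent`, `toy_plan`, `TOOLS`) are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, argument)


def chatbot_turn(answer: Callable[[str], str], question: str) -> str:
    """Chatbot pattern: one question in, one answer out; the user drives every turn."""
    return answer(question)


def run_agent(
    goal: str,
    plan: Callable[[str, List[str]], Optional[Action]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> List[str]:
    """Agent pattern: given a goal, keep planning and executing until the planner stops."""
    history: List[str] = []
    for _ in range(max_steps):
        action = plan(goal, history)
        if action is None:  # planner decides the goal is reached
            break
        tool_name, arg = action
        history.append(tools[tool_name](arg))  # execute and record the observation
    return history


# Stub tools and a scripted planner, standing in for real tools and a real model.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"found:{query}",
    "write": lambda text: f"wrote:{text}",
}


def toy_plan(goal: str, history: List[str]) -> Optional[Action]:
    if not history:
        return ("search", goal)
    if len(history) == 1:
        return ("write", history[-1])
    return None


if __name__ == "__main__":
    print(chatbot_turn(lambda q: f"answer to {q}", "what is an agent?"))
    print(run_agent("report", toy_plan, TOOLS))
```

The chatbot function returns after a single exchange, while the agent loop takes the goal once and then chooses its own next steps, which is the shift from “talk” to “act” described above.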