Agent

2025-09-08

The Two Dark Clouds over Agents: Real‑time Interaction with the Environment, Learning from Experience

I was honored to be invited by Prof. Zhang Jiaxing to give an academic talk titled “The Two Dark Clouds over Agents: Real‑time Interaction with the Environment, Learning from Experience” at Lion Rock Artificial Intelligence Lab on September 4. Today I’m sharing the slides and video from the talk for your reference and discussion.

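📰 Official coverage: [Industry–Research Exchange] Session 2 of “FAIR plus × 狮子山问道” held successfully, exploring the bottlenecks and breakthroughs of AI agents and all-terrain embodied intelligence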

Talk materials

🎬 Talk video
- Talk video – watch on YouTube
- Talk video download (474 MB, 2 h 16 min)
📖 Slides in English
- Slides in English – online view, download PDF version
- Slides in English – source code
📖 Slides in Chinese
- Slides in Chinese – online view, download PDF version
- Slides in Chinese – source code

Talk overview

In 1900, Lord Kelvin said in a speech: “The beauty and clearness of the dynamical theory, which asserts heat and light to be modes of motion, is at present obscured by two clouds…”. These two “small clouds” later triggered the revolutions of relativity and quantum mechanics. Today, the AI Agent field is facing a similar pair of “dark clouds”.

First dark cloud: challenges of real‑time interaction

Current AI Agents suffer from severe latency issues when interacting with the environment in real time:

The dilemma of voice interaction

- Serial processing vs real‑time needs: the agent must wait for the user to finish speaking before it starts thinking, and must finish thinking before it starts speaking
- Fast vs slow thinking: deep thinking takes 10+ seconds, by which point users lose patience, while fast responses are prone to errors
- Technical bottlenecks: every stage of the pipeline adds a wait (VAD detection, ASR recognition, LLM thinking, TTS synthesis)
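To make the serial-pipeline point concrete, here is a minimal sketch of how per-stage waits add up when each stage must finish before the next begins. The stage latencies are hypothetical, illustrative numbers, not measurements from the talk; only the 10-second "deep thinking" figure echoes the text above.

```python
# Hypothetical per-stage latencies in seconds. These numbers are
# illustrative assumptions, not measurements from the talk.
FAST_STAGES = {
    "vad": 0.5,  # wait for silence to decide the user has finished speaking
    "asr": 0.3,  # transcribe the complete utterance
    "llm": 1.5,  # generate the complete reply text
    "tts": 0.7,  # synthesize the complete reply audio
}

# "Deep thinking" variant: same pipeline, but the LLM stage takes 10 s.
DEEP_STAGES = {**FAST_STAGES, "llm": 10.0}


def serial_latency(stages: dict[str, float]) -> float:
    """Delay before the user hears anything when every stage
    waits for the previous stage to finish completely."""
    return sum(stages.values())


def slowest_stage(stages: dict[str, float]) -> str:
    """Name of the stage that contributes the largest wait."""
    return max(stages, key=stages.get)


for name, stages in (("fast", FAST_STAGES), ("deep", DEEP_STAGES)):
    print(f"{name}: {serial_latency(stages):.1f} s total, "
          f"slowest stage = {slowest_stage(stages)}")
```

Under these assumed numbers the user-perceived delay is the sum of all four stages, so shortening any single stage helps only as much as that stage contributes.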

The “last mile” challenge of GUI operations

- Agents operate computers 3–5× slower than humans
- Every click requires a new screenshot and another round of thinking (3–4 seconds of latency)
- “Moravec’s paradox”: the model “knows” what to do, but “can’t do it” well
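As a rough worked example of the per-click cost, the sketch below multiplies the 3–4 second per-step latency quoted above by the number of clicks in a task. The 10-click task and the function name are hypothetical, chosen only for illustration.

```python
def agent_task_time(
    num_clicks: int,
    step_latency_s: tuple[float, float] = (3.0, 4.0),
) -> tuple[float, float]:
    """Return the (low, high) total time in seconds for a GUI task,
    assuming every click costs one screenshot-plus-thinking step."""
    low, high = step_latency_s
    return num_clicks * low, num_clicks * high


# Hypothetical task: filling in a form that takes 10 clicks.
low, high = agent_task_time(10)
print(f"10 clicks: {low:.0f}-{high:.0f} s of screenshot-and-think latency")
```

Because the cost is paid on every click, it grows linearly with task length, which is why it is felt most on long multi-step tasks.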

2025-07-30

From Prompt Engineering to Context Engineering: Secrets to Building Great Agents

[This article is based on a talk given at Turing Community’s Large Model Tech Study Camp. Slides: Slides link, Download PDF version]

This article takes a deep dive into the design philosophy and practical strategies behind AI Agents. As the field moves from the dialogue pattern of chatbots to the action pattern of Agents, it shows how to systematically design and manage an Agent’s information environment in order to build efficient and reliable Agent systems.

Part 1: Paradigm Shift - From Chatbot to Agent
Part 2: Core Analysis of Agents
Part 3: Context Engineering
Part 4: Memory and Knowledge Systems

Part 1: Paradigm Shift - From Chatbot to Agent

From Chatbot to Agent: A Fundamental Paradigm Shift

We are undergoing a fundamental transformation in AI interaction patterns:

Chatbot Era

🗣️ Conversational interaction: user asks → AI answers → repeated Q&A loop
📚 Knowledgeable advisor: can “talk” but not “act,” passively responding to user needs
🛠️ Typical products: ChatGPT, Claude Chat

Agent Era

🎯 Autonomous action mode: user sets goal → Agent executes → autonomous planning and decision-making
💪 Capable assistant: can both “think” and “do,” actively discovering and solving problems
🚀 Typical products: Claude Code, Cursor, Manus
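The difference between the two patterns can be sketched as a difference in control flow. The following is a minimal illustration, not the implementation of any product named above: `chatbot_turn`, `run_agent`, `scripted_planner`, and `toy_tools` are hypothetical names, and the scripted planner stands in for what would be an LLM call in a real Agent.

```python
from typing import Callable, Optional

Step = tuple[str, str]             # (tool name, tool argument)
Record = tuple[str, str, str]      # (tool name, argument, observation)


def chatbot_turn(answer: Callable[[str], str], question: str) -> str:
    """Chatbot pattern: one question in, one answer out.
    The user drives every turn."""
    return answer(question)


def run_agent(
    goal: str,
    plan_next_step: Callable[[str, list[Record]], Optional[Step]],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> list[Record]:
    """Agent pattern: the user states a goal once, then the agent
    loops plan -> act -> observe until the planner says it is done."""
    history: list[Record] = []
    for _ in range(max_steps):
        step = plan_next_step(goal, history)
        if step is None:           # planner decides the goal is reached
            break
        tool_name, arg = step
        observation = tools[tool_name](arg)
        history.append((tool_name, arg, observation))
    return history


# Hypothetical stand-ins: a fixed script instead of an LLM planner,
# and toy tools instead of real search or summarization.
def scripted_planner(goal: str, history: list[Record]) -> Optional[Step]:
    script = [("search", goal), ("summarize", "results")]
    return script[len(history)] if len(history) < len(script) else None


toy_tools = {
    "search": lambda query: f"results for {query!r}",
    "summarize": lambda text: f"summary of {text}",
}

trace = run_agent("compare laptops", scripted_planner, toy_tools)
for record in trace:
    print(record)
```

The chatbot function returns after a single call, while the agent loop keeps choosing and executing actions on its own, with `max_steps` as a simple safeguard against running forever.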
