2025-09-08
Two Dark Clouds Over Agents: Real-time Interaction with Environments, Learning from Experience

I was honored to be invited by Prof. Jiaxing Zhang to give an academic talk titled “Two Dark Clouds Over Agents: Real-time Interaction with Environments, Learning from Experience” at the Lion Rock Artificial Intelligence Laboratory on September 4. Today I’m sharing the slides and video for your reference and discussion.

📰 Official Report: [Industry-Research Docking] Issue 2 of “FAIR plus × Lion Rock Wendao” successfully held, exploring the bottlenecks and breakthroughs of AI agents and all-terrain embodied intelligence

Talk Materials

Talk Summary

In a 1900 lecture, Lord Kelvin said: “The beauty and clearness of our present views … two clouds …” Those two small clouds later led to the revolutions of relativity and quantum mechanics. Today, the field of AI agents faces two similar dark clouds.

The First Cloud: Challenges of Real-time Interaction

Current AI agents face severe latency when interacting with environments in real time:

The predicament of voice interaction

  • Serial processing vs. real-time needs: the agent must wait for the user to finish speaking before it can think, and must finish thinking before it can speak
  • The fast/slow thinking dilemma: deep thinking takes 10+ seconds and users lose patience, while quick responses are error-prone
  • Technical bottlenecks: every stage waits on the one before it (VAD, ASR, LLM reasoning, TTS synthesis)
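
To make the cost of serial processing concrete, here is a minimal sketch that compares time to first audio when every stage waits for the previous one to finish against a variant where each stage starts on partial output. The stage latencies are hypothetical numbers chosen only for illustration, and the streaming variant is an assumption added for contrast, not something measured in the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    first_output_s: float  # time until the stage emits its first partial output
    total_s: float         # time until the stage has finished entirely


# Hypothetical latencies, for illustration only.
PIPELINE = [
    Stage("VAD", 0.3, 0.5),
    Stage("ASR", 0.2, 0.8),
    Stage("LLM", 0.5, 3.0),
    Stage("TTS", 0.2, 1.5),
]


def serial_time_to_first_audio(stages: List[Stage]) -> float:
    """Every stage waits for the previous stage to finish completely."""
    *upstream, last = stages
    return sum(s.total_s for s in upstream) + last.first_output_s


def streaming_time_to_first_audio(stages: List[Stage]) -> float:
    """Every stage starts as soon as the previous one emits partial output."""
    return sum(s.first_output_s for s in stages)


if __name__ == "__main__":
    print(f"serial:    {serial_time_to_first_audio(PIPELINE):.1f} s")
    print(f"streaming: {streaming_time_to_first_audio(PIPELINE):.1f} s")
```

In the serial case the slowest stage (here the LLM) sits entirely on the critical path, which is why the waiting adds up so quickly.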

The “last mile” problem of GUI operations

  • Agents operate computers 3–5× slower than humans
  • Each click requires a fresh screenshot and another round of thinking (3–4 seconds of latency)
  • Moravec’s paradox: the model “knows” what to do but “can’t do it”
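
The per-click overhead can be sketched as a simulated observe-think-act loop. The split of the 3–4 seconds into screenshot, thinking, and click costs below is a hypothetical breakdown; only the total per click comes from the talk summary, and `run_gui_task` is an illustrative name, not a real API.

```python
from typing import List, Tuple

# Hypothetical per-step costs in seconds; they sum to 3.5 s per click.
SCREENSHOT_S = 0.5
THINK_S = 2.5
CLICK_S = 0.5


def run_gui_task(targets: List[str]) -> Tuple[List[str], float]:
    """Simulate an agent that re-observes the screen before every click."""
    elapsed = 0.0
    log: List[str] = []
    for target in targets:
        elapsed += SCREENSHOT_S  # take a fresh screenshot
        elapsed += THINK_S       # model decides where to click
        elapsed += CLICK_S       # perform the click
        log.append(f"click {target}")
    return log, elapsed


if __name__ == "__main__":
    actions, seconds = run_gui_task(["File", "Save As", "OK"])
    print(actions, seconds)
```

Because the cost is paid on every click, total time grows linearly with the number of UI actions, even when the model already “knows” the whole sequence.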

2025-07-30
From Prompt Engineering to Context Engineering: The Secret to Writing Good Agents

[This article is based on a presentation at the Turing Community’s Large Model Technology Learning Camp, Slides Link]

This article explores the design philosophy and practical strategies of AI Agents in depth. Moving from the conversational mode of Chatbots to the action mode of Agents, it shows how to systematically design and manage an Agent’s information environment in order to build efficient and reliable AI Agent systems.

Table of Contents

  1. Part 1: Paradigm Shift - From Chatbot to Agent
  2. Part 2: Core Analysis of Agents
  3. Part 3: Context Engineering
  4. Part 4: Memory and Knowledge Systems

Part 1: Paradigm Shift - From Chatbot to Agent

From Chatbot to Agent: A Fundamental Paradigm Shift

We are experiencing a fundamental shift in AI interaction modes:

Chatbot Era

  • 🗣️ Conversational Interaction: User asks → AI answers → Repetitive Q&A cycle
  • 📚 Knowledgeable Advisor: Can only “speak,” not “act,” and responds passively to user needs
  • 🛠️ Typical Products: ChatGPT, Claude Chat

Agent Era

  • 🎯 Autonomous Action Mode: User sets goals → Agent executes → Autonomous planning and decision-making
  • 💪 Capable Assistant: Can both “think” and “act,” proactively discovering and solving problems
  • 🚀 Typical Products: Claude Code, Cursor, Manus
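
The difference between the two modes can be sketched in a few lines. This is a minimal illustration under stated assumptions: `chatbot_turn`, `agent_run`, `toy_plan`, and `toy_tools` are hypothetical names, and the planner is a hard-coded stub standing in for a model, not how any of the products above are implemented.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, argument)


def chatbot_turn(question: str, answer_fn: Callable[[str], str]) -> str:
    """Chatbot mode: one question in, one answer out, no actions taken."""
    return answer_fn(question)


def agent_run(
    goal: str,
    plan_fn: Callable[[str, List[str]], Optional[Action]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> List[str]:
    """Agent mode: pick an action, execute it, observe, and repeat until done."""
    history: List[str] = []
    for _ in range(max_steps):
        action = plan_fn(goal, history)  # None means the planner considers the goal met
        if action is None:
            break
        tool_name, arg = action
        observation = tools[tool_name](arg)
        history.append(f"{tool_name}({arg}) -> {observation}")
    return history


def toy_plan(goal: str, history: List[str]) -> Optional[Action]:
    """Stub planner: search for the goal, then write a report, then stop."""
    steps = [("search", goal), ("write_file", "report.txt")]
    return steps[len(history)] if len(history) < len(steps) else None


toy_tools = {
    "search": lambda query: f"3 results for {query!r}",
    "write_file": lambda path: f"wrote {path}",
}

if __name__ == "__main__":
    print(chatbot_turn("What is an agent?", lambda q: f"An answer to: {q}"))
    for line in agent_run("agent latency", toy_plan, toy_tools):
        print(line)
```

The chatbot function returns after a single call, while the agent loop keeps feeding its own observations back into the planner until the goal is met or the step budget runs out.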