2025-04-01
New Exploration of AI Agents: Building AI-Native Teams and Empowering AI Employees

[This article is based on my keynote speech at the 2025 China Generative AI Conference. The content is the result of a 2-hour brainstorming session with AI, followed by 3 hours of collaborative work with AI in Cursor for refinement.]

Summary: Some teams have found that applying AI to programming and writing yields smaller efficiency gains than expected. Often the reason is that much of the relevant knowledge exists only in the heads of specific employees and is not documented. As a result, AI Agents, like new interns, struggle to write code, and even when they do, they don’t know how to test it. Another reason is that internal tools such as project management systems can only be operated through GUIs, which are not AI Agent-friendly. Today’s text reasoning models have reached human-level capabilities, so when they fail to complete a task, the cause is often a lack of background knowledge and AI-friendly tools.

We will discuss how to build an AI-native team that is friendly to AI Agents from the perspectives of software development, project management, and operations. An AI-native team needs to use recorded voice and written communication as much as possible, like an open-source community, to reduce reliance on individuals. To work efficiently, AI Agents need access to the company’s internal tools through MCP, enough context information, and a test environment. To work continuously overnight without human intervention and make useful progress every hour, they also need memory compression, reflection, and checkpoint rollback mechanisms. AI employees also need to actively communicate with human employees and other AI employees. This way, human employees can spend most of their time thinking and discussing, while most repetitive execution work is handed over to AI.
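
As a rough illustration of the checkpoint rollback idea mentioned above, here is a minimal sketch. It is not the implementation from the talk: the `CheckpointedState` class, the `run_step` helper, and the toy "changes" state are all hypothetical names chosen for this example. The agent snapshots its state before each step and restores it if verification fails.

```python
import copy


class CheckpointedState:
    """Holds an agent's working state and a stack of saved snapshots (hypothetical)."""

    def __init__(self, state):
        self.state = state
        self._checkpoints = []

    def checkpoint(self):
        # Deep-copy so later mutations cannot alter the saved snapshot.
        self._checkpoints.append(copy.deepcopy(self.state))

    def rollback(self):
        # Restore the most recent snapshot and drop it from the stack.
        self.state = self._checkpoints.pop()


def run_step(tracked, step, verify):
    """Apply one step; keep its result only if verification passes."""
    tracked.checkpoint()
    step(tracked.state)
    if verify(tracked.state):
        return True
    tracked.rollback()
    return False


# Toy example: the state is a list of changes, and "broken" fails verification.
tracked = CheckpointedState({"changes": []})
no_broken = lambda s: "broken" not in s["changes"]

ok_good = run_step(tracked, lambda s: s["changes"].append("fix typo"), no_broken)
ok_bad = run_step(tracked, lambda s: s["changes"].append("broken"), no_broken)

print(ok_good, ok_bad, tracked.state)
```

In a real agent the verification step would more likely be a test suite run in the test environment, which is one reason the test environment matters for unattended work.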

Download the PPT of “New Exploration of AI Agents: Building AI-Native Teams and Empowering AI Employees” (PDF)

Below is the full text of the speech. (The PPT is the version used at the 2025 China Generative AI Conference, but the text is not a transcript; it is an expanded version generated through brainstorming with AI.)

Cover Page

Read More

2025-03-14
AI Agent, Destined to Explode—GeekPark "Tonight's Tech Talk" Live Broadcast

Live Theme: AI Agent, Destined to Explode?!

Time: March 13, 2025, 20:00—22:00

Method: GeekPark WeChat Video Channel “Tonight’s Tech Talk” Live Broadcast (with guests)

Live Guests:

  • Jingyu | Deputy Editor of GeekPark
  • Li Bojie | Chief Scientist of PINE AI
  • Wanchen | Reporter at GeekPark

Key Highlights Summary

  • The core features of AI Agents are the abilities to perceive, plan, and act, enabling them to autonomously gather information, make plans, and execute actions.
  • General Agents like Manus will mimic “geek programmers” rather than ordinary people, possessing computational thinking and knowing when to use code and tools to solve problems.
  • Current AI Agents are mainly divided into compiled types (like Dify) and interpreted types (like Manus), with compiled types having fixed workflows and interpreted types autonomously planning and making decisions.
  • Compiled Agents and interpreted Agents will coexist for a long time rather than replace each other, with different scenarios having different optimal solutions.
  • There is a “100x cost law” for large models: chip companies take a roughly 10x markup, and large model companies take another 10x on top, revealing the huge gap between model pricing and actual costs.
  • Foundational models are key to enhancing the capabilities of general Agents. Humans find it hard to imagine something 10 times smarter than themselves, so human thinking should not be imposed on AI.
  • Manus emphasizes “Less Structure, More Intelligence,” similar to the classic “The Bitter Lesson,” where the fewer structural constraints humans impose on AI, the higher the AI’s capability ceiling.
  • New generation models like Claude 3.7 Sonnet have made significant breakthroughs in tool usage and programming capabilities, laying the foundation for Agent development.
  • The open-source release of DeepSeek R1 makes RL (reinforcement learning) technology more accessible, lowering the threshold for developing high-quality Agents.
  • RL training is an important means of building competitive barriers, converting industry experience and expertise into model capabilities.
  • The compute required for RL training is lower than many people imagine, and small models trained with RL can surpass large models in some vertical domains.
  • Multi-agent architectures are not suitable for all scenarios and may replicate inefficient collaboration models found in human organizations in fields like software development.
  • AI programming tools can also play a significant role in large software engineering projects but require a high-quality code engineering foundation, including comprehensive documentation, test cases, and standardized interfaces.
  • AI programming tools struggle with “spaghetti code” for the same reason new interns find it hard to take over—there’s too much undocumented tribal knowledge in the code.
  • The development of Agent technology will drive improvements in software engineering practices, enhancing code quality and maintainability to meet the standards of well-known open-source projects, making more projects AI-friendly.
  • The MCP protocol proposed by Anthropic provides a standardized solution for the interconnection of the Agent ecosystem, allowing diverse professional services to connect rather than replace each other.
  • OpenAI’s Responses API, Realtime API, and Anthropic’s MCP represent the direction of Agent framework development.
  • The work efficiency of Agents is currently limited by the latency of visual models, with humans still having an advantage in certain operational speeds.
  • Virtual machine sandboxes can provide independent working environments but require better personal data integration solutions.
  • In the future, AI Agents may be divided into “fast thinking” (user interaction) and “slow thinking” (background processing) parts working together.
  • General Agents are a battleground for hardware and operating system giants, but large companies will be relatively cautious in releasing products.
  • Opportunities for startups in the Agent field mainly lie in vertical domains, accumulating professional data and industry knowledge through deep cultivation of specific scenarios.
  • Programming, education, and interpersonal communication are the three fields most likely to see mature Agent applications first.
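
The compiled versus interpreted distinction in the highlights above can be sketched in a few lines. This is only an illustrative toy, not how Dify or Manus actually work: the `TOOLS` table, `toy_planner`, and both agent functions are hypothetical, and the planner stands in for a model call.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]

# Hypothetical tools; real agents would call search engines, code runners, etc.
TOOLS: Dict[str, Tool] = {
    "search": lambda query: f"results for {query}",
    "summarize": lambda text: f"summary of {text}",
}


def compiled_agent(task: str) -> str:
    """Compiled style: the workflow is fixed in advance by a human."""
    return TOOLS["summarize"](TOOLS["search"](task))


def toy_planner(task: str, history: List[str]) -> str:
    """Stand-in for a model call that picks the next action at run time."""
    if not history:
        return "search"
    if len(history) == 1:
        return "summarize"
    return "done"


def interpreted_agent(task: str, planner=toy_planner, max_steps: int = 5) -> str:
    """Interpreted style: the planner decides each step until it says 'done'."""
    history: List[str] = []
    state = task
    for _ in range(max_steps):
        action = planner(task, history)
        if action == "done":
            break
        state = TOOLS[action](state)
        history.append(action)
    return state


print(compiled_agent("MCP"))
print(interpreted_agent("MCP"))
```

Here both agents happen to produce the same result. The difference is where the control flow lives: in the compiled agent it is in the code, while in the interpreted agent it is in the planner's decisions, so swapping the planner changes the behavior without touching the loop.
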
Read More

2025-03-14
Setting Up a Layer 3 Tunnel with a Full US IP, No Manual Proxy Configuration Required

Why You Need a Layer 3 Tunnel

Does your AI company often encounter the following situations?

  • Need to access applications or large model APIs that are only open to US IPs, such as OpenAI, Anthropic, Google, etc.
  • Need to connect to the company’s internal network in the US but don’t want to frequently set up proxies

Many people set up application-layer proxies, which require setting environment variables such as HTTP_PROXY and HTTPS_PROXY. However, a lot of software does not support configuring a proxy through environment variables alone, for example:

  • Docker containers do not see the host’s environment variables. If you want to reuse existing docker compose files and have the services inside the containers use the proxy automatically, you’ll have to tinker a bit.
  • Docker requires separate proxy configuration for accessing docker.io when pulling and building images.
  • Various package sources, such as pip and npm, require separate proxy configuration.
  • Some software, like Google Cloud CLI, does not read proxy settings from environment variables and requires separate proxy configuration.
  • Some software, like Cursor, connects to servers directly by IP address and uses non-standard WebSocket protocols, which some proxy software is incompatible with or handles unreliably.
  • Some Node.js server-side libraries do not detect the HTTP_PROXY environment variable on their own and require configuring an HTTP Proxy Agent. Some libraries (like axios) have bugs in proxy mode.
  • Code in some compiled languages (like C++ and Go) often assembles HTTP requests itself and may not support configuring HTTP proxies.
  • Some apps (like ChatGPT, Claude Code) use additional mechanisms to detect the network environment. If they detect a proxy, they may refuse service or degrade quality (e.g., using a weaker model instead of the SOTA model).
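
To make the limitation concrete, here is a small sketch of why application-layer proxies are opt-in. The proxy address `http://127.0.0.1:7890` is a hypothetical placeholder, not a value from this article. Python's standard `urllib.request.getproxies()` is one example of a library that does read these variables; code that opens raw sockets never consults them, which is the gap a network-layer tunnel closes.

```python
import os
import urllib.request

# Hypothetical local proxy address, used only for illustration.
PROXY_URL = "http://127.0.0.1:7890"


def configure_proxy_env(url: str) -> None:
    """Set the conventional proxy variables for this process and its children."""
    # Both spellings are set because different programs check different ones.
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = url


configure_proxy_env(PROXY_URL)

# urllib builds its proxy map from the environment; a program that never
# performs a lookup like this simply ignores the variables.
proxies = urllib.request.getproxies()
print(proxies.get("http"), proxies.get("https"))
```

Every item in the list above is a program that either skips this lookup or performs it in its own incompatible way.
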
Read More

2025-03-08
Manus: An Agent with Computational Thinking, Like a Geek Programmer

This article was first published in a Zhihu answer to the question “How do you evaluate the general AI Agent product Manus released by a Chinese team? Will it become the next big hit?”

Overall, I think Manus is a product with a great idea, but there is still a lot of room for improvement in engineering.

Key Innovation: An Agent with Computational Thinking

Many people think it’s just a better Computer Use, but at first glance I noticed a fundamental difference: OpenAI Operator and Anthropic Computer Use both mimic ordinary people, while Manus mimics a geek programmer.

OpenAI Operator / Deep Research and Anthropic Computer Use open browsers, desktop GUIs, and mobile apps, and deliver their results as a piece of text (at most with some Markdown formatting). Manus, on the other hand, opens a command-line terminal, writes a todo list in a text editor, continuously writes code to automate its work, and delivers a final Artifact that is itself a piece of code (interactive web pages and charts).

This immediately reminded me of Dr. Jeannette Wing at MSR talking to us about Computational Thinking. Computational thinking is about abstracting problems in daily life and work, and then solving them with systematic logical reasoning and automation tools. I also introduced computational thinking to many juniors during my time at USTC.

Read More

2025-03-08
Will Manus Initiate the Year of the Agent? - NetEase Technology Live

Reposted from NetEase Technology Public Account

Original Title: “Will Manus Initiate the Year of the Agent? A Conversation with Two AI Entrepreneurs Who Left Big Companies”

Produced by | NetEase Technology Attitude Column

Author | Yuan Ning

Editor | Ding Guangsheng

Like a boulder thrown into a lake, the splash from Manus’s release has gradually subsided, but the ripples continue to spread.

Will Manus initiate the year of the Agent? How should we understand Agents and their barriers? Is now the right opportunity for the development of Agents? How are different players preparing for the wave of Agents? Can current Agents replace interns…

On March 8, NetEase Technology invited two guests who left big companies and are now on the front lines of AI entrepreneurship—Li Bojie and Peng Kangwei—to share their insights and thoughts.

Li Bojie, a former “genius youth” at Huawei, served as the deputy chief expert at Huawei’s Computer Network and Protocol Laboratory and is a recipient of the Microsoft Scholar Award. In 2023, he ventured into AI entrepreneurship and is currently the Chief Scientist at PINE AI, dedicated to building a general intelligent assistant like Samantha from “Her” for everyone and every organization.

Peng Kangwei, who once developed a C-end product with over 100 million monthly active users at Tencent, left to start his own business in 2023 and founded Dream Horse Intelligence, which is working on a new generation of AI content platforms.

As entrepreneurs riding the AI wave, how do they find direction amidst the giant waves? What kind of future for Agents can be seen through their perspective? NetEase Technology has compiled their answers to ten key questions.

The following content has been edited by NetEase Technology without changing the original intent:

Read More

2025-02-17
USTC Course Review Community's 10th Anniversary: Original Developers Return to Create Course Review Community 2.0

This article is reposted from the “Woke Xiaodao News” WeChat public account

What began as a sudden inspiration, with two friends pulled in and an official launch a little over two months later, has now been part of Woke for ten years.

Ten years ago, during course selection for the spring semester of 2015, Zhang Jingning, a freshman from the School of Physics, was actively participating in discussions in a QQ group chat.

“Which teacher is good for the compulsory course next semester?”

“How is the grading?”

“Are there any interesting elective courses?”

The group chat was a closed ecosystem. Participants usually only received a sentence or two of evaluation from a senior, akin to the blind men feeling an elephant. These fragmented discussions made it difficult to filter out truly valuable information and even harder to preserve it.

Zhang Jingning recalled her experience with online courses (MOOCs), which she had studied on her own initiative. With MOOCs she could learn about the course content, teaching style, and difficulty in advance, and choose courses based on her interests, preferences, and needs.

Coinciding with Academician Hou Jianguo’s launch of the “Freshman ‘Science and Society’ Seminar” at USTC, Zhang Jingning, along with her friends, Li Bojie and Chang Zhen from the School of Computer Science, developed the USTC Course Review Community to promote the transparency of course information on campus and help students find courses that suit them better.

The project started on March 8, 2015, and released its beta version on May 25, taking more than two months.

As of today (February 17, 2025), the website has been running for 3,566 days, with 14,234 participants contributing 37,176 reviews for 17,431 courses.

Read More

2025-01-14
In Memory of My Grandpa

At 1:00 PM on January 12, 2025, my father called me to say that my grandpa had suddenly passed away at home that afternoon.

Grandpa’s Lifetime in Geology

When my grandpa was young, he was a top student. In the late 1950s, he was admitted to Beijing Geological Institute (the predecessor of China University of Geosciences) to study mechanical engineering. At that time, Beijing Geological Institute was a prestigious university that produced many talents. Premier Wen was his junior, and “Father of Chang’e” Ouyang Ziyuan was his senior. Of course, my grandpa was far from being an outstanding alumnus of Beijing Geological Institute. In his junior year, the Sino-Soviet split occurred, and all Soviet experts withdrew, leaving no one to teach. In his senior year, grandpa joined the Institute of Geography of the Chinese Academy of Sciences and became an ordinary geologist.

Although grandpa’s position was not fieldwork, mainly conducting research in the lab, he often had to travel across the country for geological exploration. Geological exploration was not tourism; living rough was the norm. Transportation was not developed back then, and just taking a green train to the destination took several days. The places he went to were remote (places with many people didn’t need exploration), with wild mountains and waters. It was not uncommon to encounter wild animals while camping in the wild or geological disasters halfway up the mountain. There were no mobile phones or GPS back then; if you got lost, you might end up staying in the mountains.

A photo of grandpa from the family album, taken after his retirement during a mountain climb

Read More

2025-01-12
Data is the Moat for Internet and AI Companies

This article was first published in a Zhihu answer to the question “Looking back at the development of the internet, what underlying logics seem simple but will continue to be effective in the future?”

Data is the most important moat.

The Moat for Internet Companies is Data

I really like Lao Wang’s Product Class. Wang Huiwen is one of the founders of Xiaonei and Meituan. His Tsinghua product class is a classic, worth revisiting repeatedly. It covers economies of scale and explains that social networks have network effects. The essence of network effects is actually data: who are my friends? How close am I to these friends?

Lao Wang’s product class mentions that replicating WeChat is difficult. Alibaba and ByteDance tried to attack WeChat but failed. However, if one day there is a Prophet app that knows all of a person’s real-life friendships and automatically generates friend relationships based on this, it could potentially compete with WeChat. This is the value of WeChat’s control over friend relationship data.

But this Prophet app doesn’t have WeChat’s chat history or Moments history, so something is still missing. This is the value of conversation history data. If the Prophet app goes further and knows what everyone says and does every day, then even WeChat might not be its match.

Read More

2025-01-03
Interview with Huawei "Genius Youth" Li Bojie (Part 2): Giving Up a Million-Yuan Salary to Start a Business, the Persistence and Reinvention of a USTC Alumnus

This article is reposted from the WeChat public account of Woke Advanced Alliance: “Dialogue | Interview with Huawei ‘Genius Youth’ Li Bojie (Part 2): Giving Up a Million-Yuan Salary to Start a Business, the Persistence and Reinvention of a USTC Alumnus”

Long article alert, this article contains 11221 words, estimated reading time 29 minutes

“Dialogue” is a series of in-depth interview columns launched by the Woke Advanced Alliance. We invite and interview outstanding alumni from USTC who have experienced detours, tasted setbacks, and achieved accomplishments during their university life at USTC. We hope to showcase their various life experiences and personal choices through in-depth dialogues, hoping that the experiences of these predecessors can illuminate more paths for the younger generation at USTC.

In this issue of the dialogue column, we invited Senior Brother Li Bojie (personal homepage: 01.me/), a USTC 1000 alumnus, USTC MSRA joint training PhD, one of the first Huawei “Genius Youth” awardees, AI entrepreneur, and co-founder of the USTC Course Review Community. He was an assistant scientist and deputy chief expert at Huawei’s Computer Network and Protocol Laboratory. He has published multiple papers at top conferences such as SIGCOMM, SOSP, NSDI, and ATC, and has won the ACM China Outstanding Doctoral Dissertation Award and the “Microsoft Scholar” scholarship.

This article is original by Woke Advanced Alliance. Do not reprint without permission.

Interview, Editing | Feng Wenjun, Chen Lei, Su Qicheng

Proofreading | Zhao Guohua

Theme Summary

Transition from University to Work Environment

How to Develop Skills to Adapt to Job Positions

Entrepreneurial Challenges and Reflections

Job Search Advice and Preparation Strategies

Employment and Self-Improvement in the AI Era

Will AI Cause Unemployment Issues

How to Embrace AI Tools to Empower Work and Study

Read More

2024-12-31
The Crossroads of AI: Professional Models and Personal Models

(This article was written by the author in November 2024 at the invitation of Open Source China for the “2024 OSChina Annual AI Review”)

In 2024, large models truly began to be put into practice, with most tech workers using at least one large model to work more efficiently. Many apps with nationwide user bases and many mobile phone manufacturers have also integrated large models. Large models are starting to diverge in two directions: professional models and personal models.

Professional Models

Professional models are designed to enhance productivity in areas such as AI-assisted programming, writing, design, consulting, and education. Once a model’s capabilities cross a threshold, professional models bring high added value. In 2024, professional models were already deployed in many fields. For example, AI-assisted programming can more than double development efficiency for API or IDE subscription costs of just tens of dollars per month, yet the output is comparable to that of engineers who cost tens of thousands of dollars per month. AI-generated images, podcasts, live broadcasts, and similar content can increase the work efficiency of artists, voice actors, and hosts by hundreds of times. AI consulting services in psychology, law, and medicine can reach the level of junior professionals, whose hourly rates are significantly higher than the model costs. AI virtual foreign teachers can already rival real foreign teachers, and thanks to their standard pronunciation, they even outperform most domestic English teachers. In the future, AI-assisted teaching will change the traditional one-to-many teaching model, making one-on-one AI teaching possible and significantly improving the efficiency and quality of human teachers’ content preparation.

Read More