Li Bojie: Giving up a Three Million Annual Salary to Pursue a Life-Threatening Venture

Editor’s note: “Global View” is one of the largest-circulation news magazines in the country, with over a million copies per issue. This is also the first time a feature about me has appeared in mainstream print media. The interview was conducted in April 2024, and I only found out from a friend on May 10 that I was featured in the magazine. I hadn’t seen the manuscript before it was published, and I discovered that the editor had gathered some personal information from various sources. I finally feel that I can explain what I do clearly to non-technical people. In a brief interview, the journalist completely understood what I was working on, and the article they wrote was even better than what I could have written myself. My wife commented that the everyday photo I provided showed me unshaven; couldn’t I have chosen a cleaner one?

2024.05.08 Issue No. 627

Sending Signals to Aliens, Unwilling to Be Flattened by Life—Li Bojie: Giving up a Three Million Annual Salary to Pursue a Life-Threatening Venture

The world does not lack people with ideals; what it lacks is people with the courage to pursue them.

Written by Wang Yakun

Read More

Are Men with Strong Career Ambitions Suitable Life Partners?

(This article is my answer on Zhihu to “Are Men with Strong Career Ambitions Suitable Life Partners?”)

After starting my own business, I met many entrepreneurs, most of whom are men with strong career ambitions.

I noticed an interesting phenomenon: these entrepreneurs have a significantly higher rate of being single compared to their peers, and their marriages are also less stable.

High Single Rate

In the fields of AI, mobile internet, and Web3, successful co-founders of startups usually have at least a modest fortune; those whose startups didn’t succeed still often have impressive resumes, such as degrees from prestigious schools, senior positions in major companies, and various titles and awards. They definitely wouldn’t have trouble finding a good partner. So why is the single rate so high and the marriage stability so low?

The core reason is that men with strong career ambitions mostly spend their time and interest on their careers, investing less in life, emotions, and family.

Read More

Zhihu "New Figures" Interview: The AGI Belief of Huawei Top Talent

A month ago, the interview video by Zhihu “New Figures” was finally released. It was my first time participating in an interview that included aspects of my personal life, and it definitely wasn’t a company PR piece: the name and products of our company were never mentioned during the entire session, and few people even know the real name of our company.

It seems that Zhihu still maintains journalistic integrity, as they did not let me view the video before publishing it; all editing, titles, and voice-overs were done by the Zhihu editors.

(04:16, 215 MB)

Video shooting locations:

  • Beijing office
  • Home (interview, cooking with my wife, and some photos)
  • Shucun Suburban Park (a place where I often run; I made the flying electric butterfly in 2017, and when it got caught in a tree during the shoot, our very capable photographer climbed up to retrieve it)
Read More

How to Develop Research Taste?

(This article was first published on Zhihu answer: “How to develop research taste in the field of computer systems?”)

In the blink of an eye, it’s been nearly 10 years since I graduated from USTC. Yesterday, while discussing with my wife the recent developments of our classmates in the USTC systems circle, I realized that research taste is the most critical factor in determining academic outcomes. The second key factor is hands-on ability.

What is research taste? I believe that research taste is about identifying influential future research directions and topics.

Many students are technically strong, meaning they have strong hands-on skills and system implementation abilities, but still fail to produce influential research outcomes. The main reason is poor research taste: they choose research directions that either merely chase trends without original thought or are too niche to attract attention.

PhD Students’ Research Taste Depends on Their Advisors

I believe that research taste initially depends heavily on the advisor, and later on one’s own vision.

Read More

Chatbot Arena: A Community-Based Evaluation Benchmark for Large Models

(This article was first published on Zhihu Answer: “What are the current benchmarks for evaluating large language models?”)

We must praise our co-founder @SIY.Z for Chatbot Arena!

Chatbot Arena is a community-based evaluation benchmark for large models. Since its launch a year ago, Chatbot Arena has received over 650,000 valid user votes.
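To make concrete how pairwise user votes can turn into a leaderboard, here is a minimal sketch of an online Elo-style rating update. It is an illustration under stated assumptions, not Chatbot Arena's actual pipeline: the function name `elo_ratings`, the starting rating of 1000, the K-factor of 32, and the model names in the usage note are all hypothetical choices.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Aggregate pairwise votes into Elo-style ratings.

    votes: iterable of (model_a, model_b, winner), where winner is
    "a", "b", or "tie". k and base are illustrative assumptions.
    """
    ratings = defaultdict(lambda: base)
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Probability that model_a wins, given the current rating gap.
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        score_a = outcome[winner]
        # The winner gains what the loser gives up, so the total is conserved.
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    return dict(ratings)
```

For example, `elo_ratings([("model-x", "model-y", "a")])` moves the two hypothetical models from 1000 each to 1016 and 984. An online update like this depends on the order of the votes, which is one reason a production leaderboard might prefer fitting all votes jointly.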

Chatbot Arena Witnesses the Rapid Evolution of Large Models

In the past month, we have witnessed several very interesting events on Chatbot Arena:

  • Anthropic released Claude-3, with its large Opus model surpassing GPT-4-Turbo and its medium Sonnet and small Haiku models matching the performance of GPT-4. This marks the first time a company other than OpenAI has taken the top spot on the leaderboard. Anthropic’s valuation has reached $20B, moving closer to OpenAI’s $80B. OpenAI should feel a bit threatened.
  • Cohere released the strongest open-source model to date, Command R+, a 104B model that matches the performance of GPT-4, although it is still behind GPT-4-Turbo. Earlier this year, I mentioned the four major trends for large models in 2024 during an interview with Jiazi Guangnian (“AI One Day, Human One Year: My Year with AI | Jiazi Guangnian”): “Multimodal large models capable of real-time video understanding and generating videos with complex semantics; open-source large models reaching GPT-4 level; the inference cost of GPT-3.5 level open-source models dropping to one percent of the GPT-3.5 API, making it cost-effective to integrate large models; high-end smartphones supporting local large models and automatic app operation, making everyone’s life dependent on large models.” The first is Sora and the second is Command R+; both have come true. I still hold this view: if a company mainly focused on foundational models cannot train a GPT-4-level model by 2024, it should stop trying, since it would be wasting a lot of computing power without even matching open-source models.
  • Tongyi Qianwen released a 32B open-source model, almost reaching the top 10, performing well in both Chinese and English. The cost-effectiveness of the 32B model is still very strong.
  • After being surpassed by Anthropic’s Claude Opus, OpenAI naturally did not back down: it immediately released GPT-4-Turbo-2024-04-09 and reclaimed the top spot on the leaderboard. However, OpenAI has been slow to release GPT-4.5 or GPT-5, and the much-anticipated multimodal model has not yet appeared, which is somewhat disappointing.
Read More

Bilibili Uploader Interview with Li Bojie: Why Start a Business

This video is an interview with me by the Bilibili uploader “Apple Bubbles” (original video link).

The entire interview lasted half an hour, recorded in one take, with no edits except for the intro added by the uploader, and no prepared answers to the questions.

(27:07, 136 MB)

Read More

Long Talk: Should AI Agents Be More Entertaining or More Useful?

(The full text is about 40,000 words, mainly from a 2-hour report at the USTC Alumni AI Salon on December 21, 2023, and is a technical extended version of the 15-minute report at the Zhihu AI Pioneers Salon on January 6, 2024. The article has been organized and expanded by the author.)

I am honored to share some of my thoughts on AI Agents at the USTC Alumni AI Salon. I am Li Bojie, from the 2010 Science Experimental Class, and I pursued a joint PhD at USTC and Microsoft Research Asia from 2014 to 2019. From 2019 to 2023, I was part of the first cohort of Huawei’s Genius Youth. Today, I am working on AI Agent startups with a group of USTC alumni.

Today is the seventh day since the passing of Professor Tang Xiaou, so I specially set today’s PPT to a black background, which is also my first time using a black background for a presentation. I also hope that as AI technology develops, everyone can have their own digital avatar in the future, achieving eternal life in the digital world, where life is no longer limited and there is no more sorrow from separation.

AI: Entertaining and Useful

The development of AI has always had two directions: one is entertaining AI, which is more human-like, and the other is useful AI, which is more tool-like.

Should AI be more like humans or more like tools? Actually, there is a lot of controversy about this. For example, Sam Altman, CEO of OpenAI, said that AI should be a tool, not a life form. However, many sci-fi movies depict AI that is more human-like, such as Samantha in Her, Tu Ya Ya in The Wandering Earth 2, Ash in Black Mirror, so we hope to bring these sci-fi scenarios to reality. Only a few sci-fi movies feature tool-like AI, such as Jarvis in Iron Man.

Besides the horizontal dimension of entertaining versus useful, there is a vertical dimension: fast thinking versus slow thinking. This concept comes from psychology, from Daniel Kahneman’s book “Thinking, Fast and Slow,” which says that human thinking can be divided into fast thinking and slow thinking.

Fast thinking refers to basic visual and auditory perception and expressive abilities such as speaking, which do not require deliberate thought. ChatGPT and Stable Diffusion are tool-like fast-thinking AIs: they respond to specific questions and do not initiate interaction unless prompted. Character AI, Inflection Pi, and Talkie (Hoshino), by contrast, simulate conversations with a person or an anime or game character, but these conversations do not involve solving complex tasks and lack long-term memory, so they are only suitable for casual chats and cannot help solve problems in life and work the way Samantha does in Her.

Slow thinking refers to stateful complex thinking, which involves planning and solving complex problems, determining what to do first and what to do next. For example, MetaGPT writing code simulates the division of labor in a software development team, and AutoGPT breaks down a complex task into many stages to complete step by step. Although these systems still have many practical issues, they already represent a nascent form of slow thinking capability.

Unfortunately, there are almost no products in the first quadrant that combine slow thinking with human-like attributes. Stanford AI Town is a notable academic attempt, but there is no real human interaction in Stanford AI Town, and the AI Agent’s daily schedule is pre-arranged, so it is not very interesting.

Interestingly, most of the AI in sci-fi movies actually falls into this first quadrant. This is the current gap between AI Agents and human dreams. What we are doing is therefore exactly the opposite of what Sam Altman said: we hope to make AI more human-like while also capable of slow thinking, eventually evolving into a digital life form.

Read More

USTC Practical Project: Undergraduates with Basic Programming Skills Can Also Develop AI Agents

Since December 2023, I have been working as a corporate mentor in collaboration with Professor Junming Liu from USTC on an AI Agent practical project, with about 80 students from across the country participating. Most of them are undergraduates with only basic programming skills, along with some doctoral and master’s students with a foundation in AI.

In December 2023 and January 2024, we held 6 group meetings to explain the basics of AI Agents, how to use the OpenAI API, this AI Agent practical project, and to answer questions students had during the practice. The practical project includes:

  1. Corporate ERP Assistant
  2. Werewolf
  3. Intelligent Data Collection
  4. Mobile Voice Assistant
  5. Meeting Assistant
  6. Old Friends Reunion
  7. Undercover

From February 20-24, some students participating in this research project gathered in Beijing for a Hackathon and presented the interim results of their projects. Participants generally felt the power of large models, surprised that such complex functions could be achieved with just a few hundred lines of code. Below are some of the project outcomes:

Read More

Groq Inference Chips: A Trick of Trading Space for Time

Recently, Groq’s inference chips have made headlines with their large model output speed of 500 tokens/s.

In a nutshell, this chip plays a trick of trading space for time, storing both model weights and intermediate data in SRAM, instead of HBM or DRAM.

This is something I did 8 years ago at Microsoft Research Asia (MSRA). It suited the neural networks of that time but is really not suitable for today’s large models, because Transformer-based large models require a lot of memory to store the KV Cache.

Although Groq’s chips have a very fast output speed, due to the limited memory size, the batch size cannot be very large. If we calculate the cost-effectiveness in terms of $/token, it may not be competitive.
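A back-of-the-envelope calculation shows why limited SRAM constrains both the number of chips and the batch size. The sketch below is illustrative only and rests on assumptions not stated in this article: the publicly documented LLaMA-2 70B shape (80 layers, 8 key/value heads with grouped-query attention, head dimension 128), roughly 230 MB of SRAM per chip, and 2-byte (FP16) values. The function names are hypothetical, and the precision Groq actually uses may differ.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """KV Cache size for one token: a key and a value vector per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def chips_needed(total_bytes, sram_bytes_per_chip):
    """Minimum number of chips whose combined SRAM holds total_bytes."""
    return -(-total_bytes // sram_bytes_per_chip)  # ceiling division


# Assumed figures (not from the article): model shape and per-chip SRAM.
N_PARAMS = 70_000_000_000
SRAM_PER_CHIP = 230_000_000  # bytes

weight_bytes = N_PARAMS * 2  # FP16 weights
per_token = kv_cache_bytes_per_token(n_layers=80, n_kv_heads=8, head_dim=128)

# KV Cache for a batch of 32 sequences, each with a 4096-token context.
kv_bytes = per_token * 4096 * 32

print("chips for weights alone:", chips_needed(weight_bytes, SRAM_PER_CHIP))
print("KV Cache bytes per token:", per_token)
print("extra chips for the KV Cache:", chips_needed(kv_bytes, SRAM_PER_CHIP))
```

Under these assumptions the weights alone need about 609 chips, and each token adds about 320 KiB of KV Cache, so every increase in batch size or context length consumes SRAM that is already scarce. On HBM-based accelerators the same cache fits in far fewer devices, which is where the $/token comparison becomes unfavorable.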

Groq needs a cluster of hundreds of cards to run the LLaMA-2 70B model

Read More

How I Embarked on the AI Entrepreneurship Journey

My Early Encounters with AI

Meeting AI During My PhD

Originally, my PhD research was focused on networks and systems, with my dissertation titled “High-Performance Data Center Systems Based on Programmable Network Cards”. Many in the field of networks and systems look down upon some AI research, claiming that AI papers are easy to “water down” and that with just an idea, a paper can be published in one or two months. In contrast, top conference papers in networks and systems often require a significant amount of work, taking as long as a year to complete.

Aside from the AI courses I took in school, my first serious AI-related project was in 2016, using FPGAs to accelerate neural networks in Bing Ranking. That period was the previous wave of AI hype, and today’s so-called “four AI dragons” all started during that time.

Microsoft deployed FPGAs on a large scale in data centers not only for network virtualization but also, importantly, for neural network inference acceleration. At that time, we also used pipeline parallelism to store all the neural network weights in the FPGAs’ SRAM, achieving super-linear acceleration. This story is described in more detail in the section “Exploration of Machine Learning Accelerators” in “Five Years of PhD at MSRA: Leading My First SOSP Paper”.

At that time, many people working in networks and systems didn’t understand AI, nor did they care to; they could not distinguish between training and inference, or between forward and backward operators. By optimizing these operators, I at least understood how basic feedforward neural networks (FFNN) work. However, I didn’t get involved in business applications or tinker with my own models.

Read More